Git forked

Forgive me for the title. Mentally I’m 12.

When I started my current day job, I certainly didn’t expect to write this many blog posts about Git. I don’t fancy myself an expert by any means.

Today I entered a new realm.

Here’s what I know about Git forks:

  1. Git is a fully distributed source control system, so no repository is inherently the central one.
  2. Therefore, pick a repository that you want to be your central, or upstream, repo. This is made easier by some of the magic that GitHub offers.
  3. Fork the repository (make a downstream copy) into your personal GitHub account. It is now linked to the upstream repo by some of that GitHub magic.
  4. You do all your work in your fork (which is an exact copy) of the upstream repository in GitHub.
  5. From time to time, you synchronize your fork with the upstream repo.
  6. If you want to make any changes to the upstream repository, you have to create a pull request that must be merged.
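
Steps 3 to 5 can be tried out without touching GitHub. What follows is a minimal sketch, not the actual Microsoft Docs setup: two local bare repositories stand in for the upstream repo and the fork, a bare clone plays the part of the Fork button, and the remote names `origin` and `upstream` and the branch name `main` are assumptions.

```shell
# Scratch area: local bare repos stand in for GitHub (all names are made up).
work=$(mktemp -d)
git init -q --bare "$work/upstream.git"
git -C "$work/upstream.git" symbolic-ref HEAD refs/heads/main

# Seed the stand-in upstream with one commit.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/main
git -C "$work/seed" config user.name demo
git -C "$work/seed" config user.email demo@example.com
echo one > "$work/seed/a.txt"
git -C "$work/seed" add a.txt
git -C "$work/seed" commit -q -m "first"
git -C "$work/seed" push -q "$work/upstream.git" main

# "Fork" it (a bare copy), then clone the fork; the clone's origin is the fork.
git clone -q --bare "$work/upstream.git" "$work/fork.git"
git clone -q "$work/fork.git" "$work/local"
git -C "$work/local" config user.name demo
git -C "$work/local" config user.email demo@example.com

# Point the clone at the upstream repo as well.
git -C "$work/local" remote add upstream "$work/upstream.git"

# Someone else lands a change upstream.
echo two > "$work/seed/b.txt"
git -C "$work/seed" add b.txt
git -C "$work/seed" commit -q -m "second"
git -C "$work/seed" push -q "$work/upstream.git" main

# Step 5: synchronize the fork with upstream.
git -C "$work/local" fetch -q upstream
git -C "$work/local" rebase -q upstream/main
git -C "$work/local" push -q origin main

# Compare the tip of the local clone with the tip of the fork.
local_tip=$(git -C "$work/local" log -1 --format=%s)
fork_tip=$(git -C "$work/fork.git" log -1 --format=%s main)
echo "local: $local_tip, fork: $fork_tip"
```

Step 6 has no plain-Git equivalent: a pull request is a GitHub feature, so that part still happens on GitHub.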

We use this pattern for Microsoft Docs. We have a private version of the public sql-docs repository. Each of us has forked this repository and we do our changes there before creating pull requests to merge our changes into the upstream repo.

Two colleagues were working on a project earlier this week. Before they reached out to me, I had no knowledge of the project or the repo they were working in, and I was pretty busy today when they first started panicking. I’ve always said I’m no expert, but I do know how useful git reflog is for figuring out what broke, as long as you don’t panic. Or force push upstream.

(Is this foreshadowing?)

In one case, one of my colleagues discovered that files that didn’t exist, and shouldn’t exist, suddenly showed up when they rebased a branch off the sql-docs upstream repo.

The other colleague discovered a second repository inside their local clone of their sql-docs private fork.

Both had followed the same instructions to perform a task, and mentioned they’d had the same error message. So what happened?

The instructions said to use the gh repo fork command to create a downstream fork of a repository they needed to work in. Let’s call that repository “project-sql-docs”. Both colleagues saw an error that said “repository already exists”. As a result, they tried to resolve the issue independently. Both approaches failed in different ways.

It took an hour to figure out what happened (they didn’t force push upstream).

The short version is that they’d tried to fork a repository that was already forked from the sql-docs private repo. That isn’t a supported operation in GitHub. All that magic they offer relies on your fork being based on your chosen upstream repo. Since they had already forked that upstream repo, it caused mayhem on both machines. In one example, it caused one colleague’s sql-docs fork to be renamed to project-sql-docs, and when they tried to fix it, a “feature” of the gh command had set the wrong upstream repository as the default repository (I hate that “default repository” feature for different reasons, but here’s a new one).

So, those files that shouldn’t exist? They ostensibly came from the sql-docs upstream repo, but the branch had actually ended up rebasing off the project-sql-docs repo, despite the upstream remote reference being correct.
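
When a rebase pulls in history from somewhere unexpected, it helps to list everything Git and gh believe about the clone before changing anything. This is a minimal sketch under assumptions: the throwaway repository and both URLs are made up, the remote is assumed to be named `upstream`, and the gh line only runs if the GitHub CLI happens to be installed.

```shell
# Throwaway repo with made-up remote URLs, so nothing real is touched.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" remote add origin https://github.com/example-user/sql-docs.git
git -C "$repo" remote add upstream https://github.com/example-org/sql-docs.git

# 1. Every remote and the URL it really points at.
git -C "$repo" remote -v

# 2. The URL behind one specific remote.
upstream_url=$(git -C "$repo" remote get-url upstream)
echo "upstream -> $upstream_url"

# 3. The repository gh treats as the default for this clone, if gh is present.
#    (It may print an error or nothing here, because the URLs above are fake.)
if command -v gh >/dev/null 2>&1; then
  (cd "$repo" && gh repo set-default --view) || true
fi
```

In a real clone, `git branch -vv` is also worth a look, because it shows which remote branch each local branch is tracking.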

As for the colleague who had a repo inside a repo? They had run the gh repo fork command from inside the sql-docs folder on their machine. It cloned as a “submodule” in the way that Git can be very helpful without being helpful at all.

Is there a moral to this meandering story? First off, make sure that you don’t run any gh commands while you’re in a repo folder unless it’s intentional. Secondly, don’t run gh repo fork. If you need to fork a repo, do it in the web UI. It stops you in a more obvious way. Thirdly, you can’t fork from the child of an upstream repo that you’ve already forked from.
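
The first point is easy to check before running anything. Here is a minimal sketch, where `inside_repo` is a hypothetical helper name and the scratch directory is only there for the demonstration.

```shell
# Hypothetical helper: succeeds only when the given directory is inside a Git work tree.
inside_repo() {
  [ "$(git -C "$1" rev-parse --is-inside-work-tree 2>/dev/null)" = "true" ]
}

# Demo in a scratch directory: first a plain folder, then a repo.
scratch=$(mktemp -d)
if inside_repo "$scratch"; then before=yes; else before=no; fi
git init -q "$scratch"
if inside_repo "$scratch"; then after=yes; else after=no; fi
echo "plain folder: $before, after git init: $after"

# For the third point, gh can report whether a repository is itself a fork.
# Not run here because it needs network access and a login; OWNER/REPO is a placeholder.
#   gh repo view OWNER/REPO --json isFork,parent
```

Calling `inside_repo .` before a gh command is a cheap way to notice that you are about to run it from inside an existing clone.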

Finally, the git reflog command is amazing. Learn how to use it. The Git client Fork (the name is a coincidence) is also amazing, and it runs on both Windows and macOS.
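
For a small taste of what git reflog can do, here is a minimal sketch that loses a commit on purpose and gets it back. It runs in a scratch repository with made-up file names and commit messages, so nothing real is at risk.

```shell
# Scratch repo so the "accident" is harmless.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name demo
git -C "$repo" config user.email demo@example.com
echo one > "$repo/notes.txt"
git -C "$repo" add notes.txt
git -C "$repo" commit -q -m "first"
echo two >> "$repo/notes.txt"
git -C "$repo" commit -q -am "second"
good=$(git -C "$repo" rev-parse HEAD)

# The accident: discard the latest commit.
git -C "$repo" reset -q --hard HEAD~1

# The reflog still lists the positions HEAD has been in.
git -C "$repo" reflog -n 3

# HEAD@{1} is where HEAD pointed just before the reset, so go back there.
git -C "$repo" reset -q --hard 'HEAD@{1}'
recovered=$(git -C "$repo" rev-parse HEAD)
[ "$recovered" = "$good" ] && echo "recovered the lost commit"
```

The reflog is local to each clone, which is why it can rescue you on your own machine but not after a force push has rewritten someone else's remote.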

Bonus: You probably shouldn’t run potentially destructive commands on your computer if you don’t know what they do. If you’re writing instructions for Git operations, test the commands first. Make the commands idempotent (safe to run more than once with the same result).
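
As one example of an idempotent instruction, here is a minimal sketch for adding a remote. `ensure_remote` is a hypothetical helper name and the URL is made up; the point is only that running the step twice leaves the clone in the same state as running it once.

```shell
# Hypothetical helper: add a remote if it is missing, otherwise make sure its URL matches.
ensure_remote() {
  # $1 = repo path, $2 = remote name, $3 = URL
  if git -C "$1" remote get-url "$2" >/dev/null 2>&1; then
    git -C "$1" remote set-url "$2" "$3"
  else
    git -C "$1" remote add "$2" "$3"
  fi
}

repo=$(mktemp -d)
git init -q "$repo"
url=https://github.com/example-org/sql-docs.git   # made-up URL

ensure_remote "$repo" upstream "$url"
ensure_remote "$repo" upstream "$url"   # second run changes nothing and does not fail

remote_count=$(git -C "$repo" remote | grep -c '^upstream$')
echo "upstream remotes: $remote_count"
```

A bare `git remote add` in the same instructions would fail on the second run because the remote already exists, which is exactly the kind of error that sends people off to improvise.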
