More fun with Git: git restore

The setup

My day job involves babysitting a lot of Git repositories hosted on GitHub. The vast majority of the commits, merges, and squashes I run on a daily basis involve short-lived branches, and I rebase and squash them all the time. It’s cool. It’s even cooler when you can run git reflog to undo something silly.

Recently we had a conference. The conference was where SQL Server 2025 GA was announced. GA means “generally available” or “general availability”, or “release to manufacturing” (RTM) if you’re old enough. It’s the first official release of a product version after all the betas and release candidates are done. In fact, in many software shops, “GA” is just the label you stick on the latest internal release candidate build and call it done.

When we announced SQL Server 2025, and took the “preview” label off a bunch of articles, we published roughly 2,000 new and updated articles. In the Data Docs team, we were working across three main long-lived (or “release”) branches in Git. We keep these release branches in sync by merging in the main branch twice a day. This keeps merge conflicts slightly easier to manage.

Sometimes we have to remove content from these release branches before publication. Features and products can get cut, and we have to move those articles into their own respective branches, waiting for the product group to decide when they finally get a release date.

Sometimes we have to review all the changes in one sitting. For example, let’s say my colleague adds a new product name to the metadata of 2,000 existing articles. When I validate the change (by looking at every file one by one and reading the diff), it’s far more convenient looking at them on a fast client written in C++ that runs on my local machine, as opposed to slogging through them in a web browser using the GitHub UI. It gets pretty hairy with 20 files, never mind 2,000.

The challenge is that after several months of twice-daily merges, it’s almost impossible to do this in a way that doesn’t make people cry.

Enter git restore.

Restore files, but with Git

Now look, I know I’m linking to Git documentation with the expectation that you’ll read it, but let’s be honest: Git’s documentation is … well, every time I try to read it, I become a different person. Reading Git documentation is an exercise in building character. You might say it prioritizes completeness over approachability.

So allow me to summarize how I use git restore to compare changes across hundreds of commits and merges without using git rebase.

If you’ve ever watched my presentation on temporal tables, you know that Back to the Future is one of my favourite films. It coincidentally is a great way to explain how Git works.

If you think of a Git commit as a specific point in time that you can get to with a time-travelling DeLorean, then when you rebase and squash commits, you’re rewriting history and creating an alternate timeline. It’s the slightly better timeline where Marty doesn’t crash his truck and become a loser.

But a release branch can end up being like the alternate 1985 from Back to the Future Part II. Biff Tannen has taken over Hill Valley, and you just want to fix the timeline and put it back the way it’s supposed to be. However, in my analogy, there are hundreds of little changes that have taken place, and you need to be very careful to figure out which changes need to stay, and which changes need to go.

As regular Git users know, everything starts and ends (literally) with a commit hash. This sequence of characters tells you the point in time you want to reset the universe to.

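(You can usually get away with using just the first few characters of a commit hash, as long as that prefix is unique within the repository. Git needs at least four characters, and a very large repository may need more than seven before a prefix is unambiguous. For safety, I use the first seven, but in a small repository it will probably work with the first five. Not important.)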
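If you'd rather not count characters by hand, Git can abbreviate a hash for you and confirm that a prefix resolves to exactly one commit. Here's a minimal sketch in a throwaway repository; the file name and commit message are made up for illustration.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo "hello" > article.md
git add article.md
git commit -q -m "First commit"

# The full hash of the current commit.
full=$(git rev-parse HEAD)

# Ask for at least 7 characters. Git hands back more if 7 would be ambiguous.
short=$(git rev-parse --short=7 HEAD)

# Resolve the abbreviation back to a full hash. This fails if the prefix
# is ambiguous or doesn't name a commit.
resolved=$(git rev-parse --verify "${short}^{commit}")

echo "full:     $full"
echo "short:    $short"
echo "resolved: $resolved"
```

If the last two commands agree on the same commit, the short form is safe to use in that repository.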

Combined with git diff, git restore lets me take a starting commit hash and a finishing commit hash, and examine all the changes between those two points, by creating a new branch with all those changes in one place. Then, because it’s just a branch, we can stage and commit those changes if we want. Again, not important. What’s important is that it’s all in one place. All those merges from main are effectively squashed out into a single commit.

How it works

Assume I’m working in a release branch called release-branch.

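Firstly, we need to know the commit hash when the branch was created. We infer this from where main and release-branch diverged. (One caveat: git merge-base reports the most recent common ancestor of the two branches, so if main has been merged into release-branch since it was created, check that the hash you get back really is the branch point you expect.)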

git merge-base main release-branch

We make a note of the abbreviated hash from the response, eight characters this time: 87f53fb5.
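To see what git merge-base reports, here's a small sketch in a throwaway repository. The branch names match the example above, and the file names are made up. It also shows the answer changing once main has been merged into the release branch, which is worth checking for on a long-lived branch.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git config user.name "Demo"
git config user.email "demo@example.com"

echo "v1" > article.md
git add article.md
git commit -q -m "Initial article"
branch_point=$(git rev-parse HEAD)

# Create the release branch and commit to it.
git switch -q -c release-branch
echo "preview" > feature.md
git add feature.md
git commit -q -m "Add feature article"

# Meanwhile, main moves on.
git switch -q main
echo "v2" > article.md
git commit -q -am "Update article on main"
main_tip=$(git rev-parse HEAD)

# Before any merge, merge-base reports the commit where the branches diverged.
base_before=$(git merge-base main release-branch)

# Merge main into the release branch, like the twice-daily sync.
git switch -q release-branch
git merge -q --no-edit main

# After the merge, the most recent common ancestor is the merged main commit.
base_after=$(git merge-base main release-branch)

echo "branch point: $branch_point"
echo "before merge: $base_before"
echo "after merge:  $base_after"
```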

Secondly, we need to know the commit hash when the last commit was made to the release branch. There are many ways to get the latest commit hash from a branch. I use git rev-parse:

git rev-parse release-branch

We make a note of the first seven characters from the response: b54aa61.

Now we want to know what files have changed in the lifespan of the release branch.

git diff --name-only 87f53fb5 b54aa61 '*.md' '*.yml'

Wow, over 2,000 files. I guess we need to keep track of that somewhere. Let’s worry about it later.
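One way to keep track of the list is to write it to a file and count the lines. This is only a sketch in a throwaway repository, with made-up file names standing in for the real articles.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git config user.name "Demo"
git config user.email "demo@example.com"

mkdir docs
echo "one" > docs/one.md
echo "title: toc" > docs/toc.yml
echo "notes" > notes.txt
git add .
git commit -q -m "Start"
start=$(git rev-parse HEAD)

echo "one, updated" > docs/one.md
echo "two" > docs/two.md
echo "more notes" > notes.txt
git add .
git commit -q -m "Changes"
end=$(git rev-parse HEAD)

# Save the Markdown and YAML files that differ between the two commits.
# The list lives in a temporary file, outside the working tree.
list=$(mktemp)
git diff --name-only "$start" "$end" -- '*.md' '*.yml' > "$list"

# Count the entries (tr strips the padding some versions of wc add).
count=$(wc -l < "$list" | tr -d ' ')
echo "$count files changed"

# --name-status adds a letter per file: A (added), M (modified), D (deleted).
git diff --name-status "$start" "$end" -- '*.md' '*.yml'
```

The text file notes.txt changes too, but the pathspecs filter it out, and the unchanged docs/toc.yml never appears.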

Now that we know which files changed between the beginning and end of the release branch, we can work out how to collect every change made to those files since release-branch was created into a single branch that is based on the latest main.

There are lots of ways to do this. Because I use git rebase a lot, this is a pattern that I understand, so I’m going to emulate the same functionality here. Another method is to use something called a soft reset. One of my colleagues prefers this technique, and here it is:

git reset --soft $(git merge-base upstream/main HEAD)

(You’re left with a branch where all the changes are staged but not committed. Do with it what you will.)
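To see the state a soft reset leaves behind, here's a sketch in a throwaway repository. It uses a local main rather than upstream/main because there's no remote, and the file names are made up.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git config user.name "Demo"
git config user.email "demo@example.com"

echo "base" > base.md
git add base.md
git commit -q -m "Base"

# Two commits on a release branch.
git switch -q -c release-branch
echo "a" > a.md
git add a.md
git commit -q -m "Add a"
echo "b" > b.md
git add b.md
git commit -q -m "Add b"

# Move the branch pointer back to the merge base. The index and the
# working tree are left exactly as they were.
git reset -q --soft "$(git merge-base main HEAD)"

staged=$(git diff --cached --name-only)      # in the index, not yet committed
unstaged=$(git diff --name-only)             # modified but not staged
ahead=$(git rev-list --count main..HEAD)     # commits on this branch beyond main

echo "staged:"
echo "$staged"
echo "commits ahead of main: $ahead"
```

Both files show up as staged, nothing is unstaged, and the branch no longer has any commits of its own.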

My technique is a little bit more complicated, because I want to start at the point where release-branch was created, before the first twice-daily main merge took place.

We need to know what the current commit hash is for the main branch. That’s our starting point. We’re reproducing the same effect as a git rebase on the latest main, except we’re using git restore to do it. A quick git rev-parse gets us our rebase point.

git rev-parse main

We make a note of the first seven characters from the response: 422d565.

Now comes the fun, wild, fast technique that creates a brand new version of release-branch on our local machine, with all changes since the beginning of time, using the latest version of main as the starting point. Let’s use a name with an underscore prefix so we can remember what’s what.

git switch -c _release-branch 422d565
git restore --source=b54aa61 -- $(git diff --name-only 87f53fb5 b54aa61 '*.md' '*.yml')
git add . && git commit -m "Squashed release branch commit"

These three commands do the following steps:

Check out a new branch, based on the current commit hash of main, called _release-branch.
Run git restore, using the latest commit hash of release-branch as its source. Notice the --. That tells git restore that everything after it is a file path, not an option. We provide the list of files using command substitution, which is the $ character followed by parentheses: the shell runs the command inside and passes its output along as arguments. Recognise that git diff command inside it? It’s our list of all the files that changed between two commits.
Stage the files, and commit them with a message “Squashed release branch commit”.
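The three steps can be tried end to end in a throwaway repository. The sketch below is a variation rather than the exact commands above: it pipes a NUL-delimited file list into git restore with --pathspec-from-file, which I'm assuming is available (Git 2.25 or later), so that file names containing spaces survive the trip. The file names are made up.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git config user.name "Demo"
git config user.email "demo@example.com"

echo "v1" > "release notes.md"
echo "v1" > overview.md
git add .
git commit -q -m "Initial"
start=$(git rev-parse HEAD)             # where release-branch begins

git switch -q -c release-branch
echo "v2 for release" > "release notes.md"
echo "new" > whats-new.md
git add .
git commit -q -m "Release edits"
end=$(git rev-parse HEAD)               # latest commit on release-branch

git switch -q main
echo "v2 on main" > overview.md
git commit -q -am "Main moves on"

# Step 1: new branch based on the latest main.
git switch -q -c _release-branch main

# Step 2: restore every changed file from the tip of release-branch.
# -z and --pathspec-file-nul keep names with spaces intact.
git diff --name-only -z "$start" "$end" -- '*.md' '*.yml' |
  git restore --source="$end" --pathspec-from-file=- --pathspec-file-nul

# Step 3: stage and commit everything as one commit.
git add .
git commit -q -m "Squashed release branch commit"

# Show which files the single squashed commit touched.
git show --name-only --format= HEAD
```

The new branch ends up one commit ahead of main. The release branch's edits are in that commit, and overview.md keeps the newer content from main because the release branch never touched it.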

There’s some output. It looks sort of like this:

[_release-branch 581fc263001] Squashed release branch commit
2015 files changed, 4391 insertions(+), 4848 deletions(-)
delete mode 100644 random-file1.md
create mode 100644 random-file2.md
Updating files: 100% (5050/5050), done.

And we’re done. We now have a local branch called _release-branch, with all the changes from the actual release-branch, but in one place. I can now open Fork and use my keyboard arrows to run through the changed files quickly, looking for anything that looks odd.

The cool thing is that I can customize this to the nth degree. Depending on which main commit hash I use, my new branch might “delete” changes that were made after the fact. Counterintuitively, this is extremely useful for finding out why a file didn’t have a merge conflict (which happened when preparing for the conference).

Let’s say you have ten release branches you’re juggling (this is hypothetical because it definitely didn’t happen at Ignite 2025, no you shut up), and some of them were created long before others. It would be very easy to make a change to a file in one release branch, and the same file in another release branch, and have the newer change take precedence and not create a merge conflict, even though the “older” change is the correct one.

By using this technique to reset the starting point of each release branch to the latest version of main, we might see a file in the list that is unexpected, which in turn would cause a manual review.
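A related check, sketched here with two hypothetical branches called release-a and release-b, is to list the files each branch changed and look for names that appear in both. This isn't part of the technique above, just one way to surface candidates for that manual review.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main"
git config user.name "Demo"
git config user.email "demo@example.com"

echo "v1" > shared.md
echo "v1" > a.md
echo "v1" > b.md
git add .
git commit -q -m "Initial"

# release-a edits shared.md and a.md.
git switch -q -c release-a
echo "from a" > shared.md
echo "v2" > a.md
git commit -q -am "Edits on release-a"

# release-b starts from main and edits shared.md and b.md.
git switch -q main
git switch -q -c release-b
echo "from b" > shared.md
echo "v2" > b.md
git commit -q -am "Edits on release-b"

# Sorted list of files each branch changed since it left main.
list_a=$(mktemp)
list_b=$(mktemp)
git diff --name-only "$(git merge-base main release-a)" release-a | LC_ALL=C sort > "$list_a"
git diff --name-only "$(git merge-base main release-b)" release-b | LC_ALL=C sort > "$list_b"

# comm -12 prints only the lines that appear in both sorted lists.
overlap=$(LC_ALL=C comm -12 "$list_a" "$list_b")
echo "changed in both branches:"
echo "$overlap"
```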

If only I had discovered this technique before 18 November 2025…
