Git Fetch | How Git Fetch Works

Introduction

Git was built around a distributed model to give collaborators freedom. You and anyone else working on a project can make changes offline and later push those changes to the remote repository, so everyone else can view and access them.

To support a distributed architecture, Git’s developer, Linus Torvalds, designed a repository system that stores Git’s internal objects in a local object database. The git fetch command downloads commits from a remote repository into this database and records them under remote-tracking branches. The git pull command takes it one step further by merging those downloaded commits into your current branch and working copy.

Continue reading to learn more about how git fetch works, how git fetch compares to git pull, and how to use git fetch effectively.

What is Git Fetch?

Before discussing git fetch, it helps to understand how Git manages commits between local and remote repositories. When you clone a remote repository, a local copy is created on your machine which contains the full set of the repository’s commits (and other Git objects such as blobs and trees). However, only the default branch (usually master or main) is checked out as a local branch and set up to track the remote’s version of that branch.

Git does this by creating a “remote-tracking branch” in the local repo, which you can think of as an intermediate version of the branch that Git uses to keep the local and remote copies of the branch in sync.

Git separates your personal local repository commits and the remote-tracking branches using branch references, also known as refs. These are both stored in the hidden .git/ folder at the following paths:

  • Local branches are stored on the path ./.git/refs/heads/
  • Remote-tracking branches are stored on the path ./.git/refs/remotes/
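To see both kinds of refs side by side, here is a minimal sketch that builds a throwaway sandbox and lists them with git for-each-ref. The directory names (upstream, local), the file readme.txt, the branch name main and the Demo identity are all made up for illustration. Git may also keep refs in a packed-refs file, so asking Git for them is more reliable than listing the folders directly.

```shell
set -e
sandbox=$(mktemp -d)
remote_repo="$sandbox/upstream"   # stand-in for a remote repository (hypothetical)
local_repo="$sandbox/local"       # our clone (hypothetical)

# Create the stand-in remote with one commit on a branch named main.
git init -q "$remote_repo"
git -C "$remote_repo" symbolic-ref HEAD refs/heads/main
git -C "$remote_repo" config user.name "Demo"
git -C "$remote_repo" config user.email "demo@example.com"
echo "hello" > "$remote_repo/readme.txt"
git -C "$remote_repo" add readme.txt
git -C "$remote_repo" commit -q -m "Initial commit"

# Clone it; Git names the remote "origin" by default.
git clone -q "$remote_repo" "$local_repo"

# List local branches and remote-tracking branches by their full ref names.
local_refs=$(git -C "$local_repo" for-each-ref --format='%(refname)' refs/heads)
remote_refs=$(git -C "$local_repo" for-each-ref --format='%(refname)' refs/remotes)
echo "Local branches:"
echo "$local_refs"
echo "Remote-tracking branches:"
echo "$remote_refs"
```

In this sandbox the local list contains refs/heads/main, and the remote-tracking list includes refs/remotes/origin/main.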

Each time you use the git fetch command, Git downloads any non-local commits from the remote repo into the local repo. These fetched commits are stored in your object database so they exist locally, but are not merged into your current active branch. Therefore Git fetching is useful when you want to keep your repository up to date, but do not want the updated files to interfere with the current files you are working on. You must later merge to integrate these fetched commits into your current branch.

In a nutshell, git fetch only updates your local object database and remote-tracking branches with new remote commits. Your working directory and your local branches remain unaffected.
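The following sketch illustrates that behavior in a disposable sandbox. It assumes a made-up remote (upstream), a clone (local), a file called notes.txt and a branch named main. After a new commit lands on the remote, git fetch moves the remote-tracking branch but leaves HEAD and the working file alone.

```shell
set -e
sandbox=$(mktemp -d)
remote_repo="$sandbox/upstream"   # hypothetical stand-in for the remote
local_repo="$sandbox/local"       # hypothetical clone

git init -q "$remote_repo"
git -C "$remote_repo" symbolic-ref HEAD refs/heads/main
git -C "$remote_repo" config user.name "Demo"
git -C "$remote_repo" config user.email "demo@example.com"
echo "v1" > "$remote_repo/notes.txt"
git -C "$remote_repo" add notes.txt
git -C "$remote_repo" commit -q -m "First commit"
git clone -q "$remote_repo" "$local_repo"

# Someone adds a commit to the remote after we cloned.
echo "v2" > "$remote_repo/notes.txt"
git -C "$remote_repo" add notes.txt
git -C "$remote_repo" commit -q -m "Second commit"

head_before=$(git -C "$local_repo" rev-parse HEAD)
git -C "$local_repo" fetch -q origin

head_after=$(git -C "$local_repo" rev-parse HEAD)          # local branch: unchanged
tracking=$(git -C "$local_repo" rev-parse origin/main)     # remote-tracking branch: updated
file_now=$(cat "$local_repo/notes.txt")                    # working file: unchanged
pending=$(git -C "$local_repo" rev-list --count HEAD..origin/main)

echo "HEAD before/after fetch: $head_before / $head_after"
echo "notes.txt still contains: $file_now"
echo "Commits fetched but not yet merged: $pending"
```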

Git Fetch vs Git Pull

Both git fetch and git pull are used for updating your local repo’s object database with commits from a remote repo. On an active project, the central (remote) repository may receive new commits daily. Remote-tracking branches are only updated when you use git fetch or git pull. The longer you wait between updating your remote-tracking branches, the more outdated they become.

How Git Fetch Works

Git fetch is often useful when you do not want to impact files sitting in your Git working directory or in the staging area. This command will not modify or destroy your ongoing work. You can fetch as often as you want, and it will never harm your workflow.

Using git fetch allows for a more careful approach to merging remote-tracking branches. Once you’ve fetched the updates, you can check for the differences between your local branches and the remote-tracking branches, using the git diff command. This enables you to verify that these changes will not conflict with your working files, before merging.
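As a rough illustration of that review step, the sketch below fetches and then asks two questions before any merge: which commits are new, and which files do they touch. The sandbox layout, the file notes.txt, the branch main and the commit messages are assumptions made for the example.

```shell
set -e
sandbox=$(mktemp -d)
remote_repo="$sandbox/upstream"   # hypothetical stand-in for the remote
local_repo="$sandbox/local"       # hypothetical clone

git init -q "$remote_repo"
git -C "$remote_repo" symbolic-ref HEAD refs/heads/main
git -C "$remote_repo" config user.name "Demo"
git -C "$remote_repo" config user.email "demo@example.com"
echo "v1" > "$remote_repo/notes.txt"
git -C "$remote_repo" add notes.txt
git -C "$remote_repo" commit -q -m "First commit"
git clone -q "$remote_repo" "$local_repo"

echo "v2" > "$remote_repo/notes.txt"
git -C "$remote_repo" add notes.txt
git -C "$remote_repo" commit -q -m "Second commit"

git -C "$local_repo" fetch -q origin

# Commits that exist on the remote-tracking branch but not on our branch.
incoming=$(git -C "$local_repo" log --format=%s HEAD..origin/main)
# Files that differ between our current commit and the fetched branch.
changed=$(git -C "$local_repo" diff --name-only HEAD origin/main)

echo "Incoming commits: $incoming"
echo "Files they change: $changed"
```

Running git diff HEAD origin/main without --name-only shows the full line-by-line changes instead of just the file names.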

How Git Pull Works

Newer users are probably more familiar with git pull because it does a lot of the heavy lifting for you.

Under the hood, the git pull command by default simply runs git fetch followed by git merge in a single step.

But git pull has a completely different endpoint than git fetch. When you use git pull you are updating your currently checked-out branch. The updates are not just downloaded to your object database like with git fetch, but merged into your working files.
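To make the relationship concrete, this sketch updates two clones of the same made-up remote: one with git pull, the other with git fetch followed by git merge. All names (upstream, clone_a, clone_b, notes.txt, main) are hypothetical, and --no-rebase is passed explicitly so the result does not depend on any pull configuration on your machine.

```shell
set -e
sandbox=$(mktemp -d)
remote_repo="$sandbox/upstream"   # hypothetical stand-in for the remote

git init -q "$remote_repo"
git -C "$remote_repo" symbolic-ref HEAD refs/heads/main
git -C "$remote_repo" config user.name "Demo"
git -C "$remote_repo" config user.email "demo@example.com"
echo "v1" > "$remote_repo/notes.txt"
git -C "$remote_repo" add notes.txt
git -C "$remote_repo" commit -q -m "First commit"

git clone -q "$remote_repo" "$sandbox/clone_a"
git clone -q "$remote_repo" "$sandbox/clone_b"

echo "v2" > "$remote_repo/notes.txt"
git -C "$remote_repo" add notes.txt
git -C "$remote_repo" commit -q -m "Second commit"

# Clone A: fetch and merge in one step.
git -C "$sandbox/clone_a" pull -q --no-rebase origin main

# Clone B: the same update done by hand.
git -C "$sandbox/clone_b" fetch -q origin
git -C "$sandbox/clone_b" merge -q origin/main

head_a=$(git -C "$sandbox/clone_a" rev-parse HEAD)
head_b=$(git -C "$sandbox/clone_b" rev-parse HEAD)
echo "clone_a HEAD: $head_a"
echo "clone_b HEAD: $head_b"
```

Both clones end up on the same commit with the same working files. The only difference is that clone B had a chance to inspect the fetched commits before merging.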

Since git pull attempts to merge the pulled branch into the active branch, you may end up having to resolve a merge conflict. A clean working directory does not prevent conflicts between committed changes, but it does keep uncommitted edits from blocking or complicating the merge, so make sure your working directory is clean before running git pull. You can temporarily set aside your uncommitted changes using the git stash command.
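A minimal sketch of the stash-then-pull routine follows, using a throwaway sandbox. The repository names, the files notes.txt and other.txt, and the branch main are assumptions for the example. The remote change touches a different file than the local edit, so the stash can be restored without conflict.

```shell
set -e
sandbox=$(mktemp -d)
remote_repo="$sandbox/upstream"   # hypothetical stand-in for the remote
local_repo="$sandbox/local"       # hypothetical clone

git init -q "$remote_repo"
git -C "$remote_repo" symbolic-ref HEAD refs/heads/main
git -C "$remote_repo" config user.name "Demo"
git -C "$remote_repo" config user.email "demo@example.com"
echo "v1" > "$remote_repo/notes.txt"
git -C "$remote_repo" add notes.txt
git -C "$remote_repo" commit -q -m "First commit"

git clone -q "$remote_repo" "$local_repo"
git -C "$local_repo" config user.name "Demo"
git -C "$local_repo" config user.email "demo@example.com"

# The remote gains a new file while we have an uncommitted local edit.
echo "changelog" > "$remote_repo/other.txt"
git -C "$remote_repo" add other.txt
git -C "$remote_repo" commit -q -m "Add other.txt"
echo "draft" >> "$local_repo/notes.txt"

git -C "$local_repo" stash -q                          # set the edit aside
status_after_stash=$(git -C "$local_repo" status --porcelain)
git -C "$local_repo" pull -q --no-rebase origin main   # fast-forwards cleanly
git -C "$local_repo" stash pop -q                      # bring the edit back
status_after_pop=$(git -C "$local_repo" status --porcelain)

echo "Status after stash: '$status_after_stash' (empty means clean)"
echo "Status after pop:   '$status_after_pop'"
```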

Git Fetch Command Options

Like most Git commands, fetch has many useful command-line options and flags:

  • git fetch <remote>: Fetches all commits and related objects from all branches in the specified remote. If unspecified, the default remote is origin.
  • git fetch <remote> <branch>: Fetches all commits and related objects from the specified remote branch.
  • git fetch --all: Fetches all commits and related objects from all registered remotes and their associated branches.
  • git fetch --dry-run: The --dry-run option will output the actions that will happen if you use the fetch command, without actually running the fetch command.
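The options above can be tried safely in a sandbox. This sketch, which assumes a made-up remote (upstream), a clone (local) and a branch named main, checks where the remote-tracking branch points after a dry run and again after a real fetch.

```shell
set -e
sandbox=$(mktemp -d)
remote_repo="$sandbox/upstream"   # hypothetical stand-in for the remote
local_repo="$sandbox/local"       # hypothetical clone

git init -q "$remote_repo"
git -C "$remote_repo" symbolic-ref HEAD refs/heads/main
git -C "$remote_repo" config user.name "Demo"
git -C "$remote_repo" config user.email "demo@example.com"
echo "v1" > "$remote_repo/notes.txt"
git -C "$remote_repo" add notes.txt
git -C "$remote_repo" commit -q -m "First commit"
git clone -q "$remote_repo" "$local_repo"

echo "v2" > "$remote_repo/notes.txt"
git -C "$remote_repo" add notes.txt
git -C "$remote_repo" commit -q -m "Second commit"
remote_head=$(git -C "$remote_repo" rev-parse HEAD)

before=$(git -C "$local_repo" rev-parse origin/main)

# Report what a fetch would update, without moving any refs.
git -C "$local_repo" fetch --dry-run origin
after_dry_run=$(git -C "$local_repo" rev-parse origin/main)

# Fetch a single branch from a single remote.
git -C "$local_repo" fetch -q origin main
after_fetch=$(git -C "$local_repo" rev-parse origin/main)

# Fetch from every registered remote (only origin exists here).
git -C "$local_repo" fetch -q --all

echo "origin/main before:        $before"
echo "origin/main after dry run: $after_dry_run"
echo "origin/main after fetch:   $after_fetch"
```

The dry run leaves origin/main where it was, while the real fetch moves it to the remote's latest commit.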

How to Use Git Fetch?

There is a general workflow that is recommended when using git fetch. Start with git fetch, then check the differences between repositories, and finally merge the fetched changes into your desired branch. To learn the workflow, follow the steps below:

1. Update your Local Repository using Git Fetch

Before using git fetch you may need to add one or more remote repositories depending on where you want to fetch from. You do this with the git remote command:

$ git remote add sample_repo git@bitbucket.org:sample/sample_repo.git

Now you can use the remote repository name with the git fetch command:

$ git fetch sample_repo

In this case, all commits from all branches of sample_repo are now downloaded to your local Git object database. If you only want a specific branch, you can include the branch name after the repo name, as follows:

$ git fetch sample_repo debug_branch 

If you want to integrate this branch into your local working copy, use git checkout debug_branch to create a local branch from the commits that were fetched. You can then merge it into any branch of your choosing by checking out the branch you want to merge into and running git merge debug_branch.

2. Compare Changes Using Git Diff

However, before merging, you may want to examine the actual fetched code changes. The git diff command is a useful way to check code changes between your local branch and remote-tracking branches that were fetched, before proceeding with the merge.

So, continuing the example from above, the git diff command to compare our current commit with the fetched changes on the remote-tracking branch is shown below. Naming HEAD first makes the output read from our current commit to the fetched branch:

$ git diff HEAD sample_repo/debug_branch

diff --git a/debug.txt b/debug.txt
index 15827f4..8115e72 100644
--- a/debug.txt
+++ b/debug.txt
@@ -1,5 +1,5 @@
 Err 123
 Err 123
 Err 404
 Err 404
-Err 500
+Err 203

Note that this is not representative of an actual debug log, but we are using it for demonstration purposes. In the outdated version of debug.txt, line 5 read “Err 500”. In the updated version of debug.txt, line 5 has been changed to “Err 203”. Look through the git diff output and make sure the changes are what you expect. Sort out any unexpected differences before moving on to step 3.

3. Merge Using Git Merge

Once you’ve reviewed the differences between the remote-tracking branch and your working copy, you can move on by using git merge to integrate the two:

$ git checkout master
$ git merge debug_branch

Git merge will produce output that displays the files changed and the number of insertions and deletions:

Updating 15827f4..8115e72
Fast-forward
 debug.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Now that you’ve fetched and merged in changes from a remote repo, you’ve essentially learned how git pull works by doing it the manual way!

Summary

Git fetch is a powerful command to add to your Git toolkit. Git fetch is safer than git pull, so use it freely and often to download commits to your object database. Your local working directory is completely untouched by the fetching process. Once you’ve verified the file changes using git diff, you should move forward with merging, which will ultimately lead to the same effect as git pull.

Next Steps

If you’re interested in learning more about how Git works under the hood, check out our Baby Git Guidebook for Developers, which dives into Git’s code in an accessible way. We wrote it for curious developers to learn how Git works at the code level. To do this, we documented the first version of Git’s code and discussed it in detail.

We hope you enjoyed this post! Feel free to shoot me an email at jacob@initialcommit.io with any questions or comments.
