At its core, git is simply a persistent map from hashes to content. Git calculates a hash for each piece of content and saves that content in its object database under the hash. Everything else that git does is built on top of this basic functionality. The git database is made of four types of objects: blob, tree, commit and tag.

blob

The content of every tracked file is saved as a blob. A blob object is named with the SHA-1 hash calculated from its content. It is worth mentioning that files with identical content are represented by the same blob object, even when they have different names and are saved in different folders.
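To make the naming rule concrete, here is a minimal Python sketch of how a blob name can be computed. It assumes a repository using the default SHA-1 object format; the function name `blob_id` is made up for this illustration.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object name git would give a blob with this content."""
    # Git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Two "files" with different names but identical bytes map to one blob id,
# because the file name is not part of what gets hashed.
copy_a = b"test content\n"
copy_b = b"test content\n"
print(blob_id(copy_a))
print(blob_id(copy_a) == blob_id(copy_b))
```

You can compare the result with what `git hash-object <file>` prints for a file holding the same bytes.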

tree

A snapshot of the directory structure is saved as a tree object in a specific format. A tree object can hold references to other tree objects and to blob objects. There is one top-level tree for the base directory where git is initialized, and further tree objects for its subdirectories. Again, the name of a tree object is calculated from its content, so new tree objects are created as updates are made.

Here is an example of a tree object, shown in the readable form that git prints.

040000 tree d8329fc1cc938780ffdd9f94e0d364e0ea74f579      bak
100644 blob fa49b077972391ad58037050f2a75f74e3671e92      new.txt
100644 blob 1f7a7a472abf3dd9643fd615f6da379c4acb3e3a      test.txt
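The listing above is the human-readable view. As a rough sketch of the "specific format" behind it, the Python below serializes the same three entries the way git stores a tree and hashes the result. The helper names (`tree_body`, `tree_id`) are invented for this example, and it assumes the default SHA-1 object format.

```python
import hashlib

# Entries copied from the listing above: (mode, name, hex object id).
ENTRIES = [
    ("040000", "bak", "d8329fc1cc938780ffdd9f94e0d364e0ea74f579"),
    ("100644", "new.txt", "fa49b077972391ad58037050f2a75f74e3671e92"),
    ("100644", "test.txt", "1f7a7a472abf3dd9643fd615f6da379c4acb3e3a"),
]


def _sort_key(entry):
    mode, name, _ = entry
    # Git orders entries by name, comparing directories as if they ended in "/".
    is_dir = mode.lstrip("0") == "40000"
    return (name + "/" if is_dir else name).encode("utf-8")


def tree_body(entries) -> bytes:
    """Serialize entries as: mode, space, name, NUL byte, 20 raw id bytes."""
    body = b""
    for mode, name, hex_id in sorted(entries, key=_sort_key):
        # Modes are stored without leading zeros ("40000", not "040000").
        stored_mode = mode.lstrip("0").encode("ascii")
        body += stored_mode + b" " + name.encode("utf-8") + b"\0" + bytes.fromhex(hex_id)
    return body


def tree_id(entries) -> str:
    """Hash the 'tree <size>' header plus the serialized entries."""
    body = tree_body(entries)
    header = b"tree " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


print(tree_id(ENTRIES))
```

Because the entry names and the ids of the children are part of the hashed bytes, renaming a file or changing anything below a directory produces a different tree name all the way up to the top-level tree.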

commit

A commit object has a reference to the top-level tree of the snapshot, references to its parent commits (none for the first commit, more than one for a merge) and author/committer information. Again, the commit object name is a SHA-1 hash calculated from its content. As each commit except the first has a parent commit, we can follow the chain back to the first commit from any commit in a repository.

Here is an example of a commit object. Notice it does not reference a parent commit as it is the first commit.

tree d8329fc1cc938780ffdd9f94e0d364e0ea74f579
author Scott Chacon <schacon@gmail.com> 1243040974 -0700
committer Scott Chacon <schacon@gmail.com> 1243040974 -0700

first commit
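As an illustration, the sketch below rebuilds the text of the commit shown above and hashes it the same way as the other object types. The function names are hypothetical, and the exact bytes of the original commit (for example its trailing newline) are an assumption, so the printed hash is not guaranteed to match the one in the original repository.

```python
import hashlib


def commit_bytes(tree, author, committer, message, parents=()):
    """Build the text of a commit object (without the object header)."""
    lines = ["tree " + tree]
    # A root commit has no parent line; a merge commit has several.
    lines += ["parent " + p for p in parents]
    lines.append("author " + author)
    lines.append("committer " + committer)
    # A blank line separates the header lines from the message.
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")


def commit_id(body: bytes) -> str:
    """Hash the 'commit <size>' header plus the commit text."""
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


who = "Scott Chacon <schacon@gmail.com> 1243040974 -0700"
root = commit_bytes("d8329fc1cc938780ffdd9f94e0d364e0ea74f579", who, who, "first commit")
print(root.decode("utf-8"))
print(commit_id(root))

# A second commit names the first one as its parent, which is what lets us
# walk from any commit back to the start of the history.
child = commit_bytes("d8329fc1cc938780ffdd9f94e0d364e0ea74f579", who, who,
                     "second commit", parents=[commit_id(root)])
```

Since the parent id is part of the hashed text, changing any earlier commit changes the name of every commit that follows it.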

tag

The tag object is similar to a commit object. It references a commit object and holds the tagger information and a name for the tag. A tag is simply a pointer to a commit and gives us an easier way to get to a specific commit.

We have not talked about branches yet, but a branch in git is nothing more than a pointer to a commit. A branch is similar to a tag, except that it always points to the last commit on a chain and moves as new commits are made. Also note that “HEAD” is a pointer to the currently checked out branch.
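On disk these pointers are small text files. The Python sketch below follows HEAD to a branch and then to a commit id. It is a simplification that assumes loose ref files only (it ignores `packed-refs`), and the function name `resolve_head` is made up for this example.

```python
from pathlib import Path


def resolve_head(git_dir):
    """Return (branch_ref_or_None, commit_id) for the given .git directory."""
    git_dir = Path(git_dir)
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Usual case: HEAD names a branch, and the branch file holds a commit id.
        ref = head[len("ref: "):]
        return ref, (git_dir / ref).read_text().strip()
    # Detached HEAD: the file holds a commit id directly.
    return None, head
```

Calling `resolve_head(".git")` in a repository with at least one commit would return something like `("refs/heads/<branch>", "<commit id>")`. Making a new commit only rewrites the branch file; HEAD keeps pointing at the branch.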

Git Areas

Now that we understand the git object model, let us talk about the 3 main areas that we should know about: the “working directory”, the “staging area or index” and the “repository”. The repository is where git saves all the objects that we discussed above. Inside the repository, blob and tree objects represent snapshots of the working directory at different times. The staging area, or index, sits between the working directory and the repository. It is a holding place of sorts before changes are committed and become part of the history.

In order to understand any git command, we need to understand the following:

  • How it moves information between the different areas
  • What it does to the repository
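To see both questions answered for two familiar operations, here is a toy Python model of the three areas. It is only a sketch: the class `ToyRepo` and its methods are invented for this article, and a real commit points to a tree object rather than to a flat dictionary.

```python
import hashlib


def blob_id(content: bytes) -> str:
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class ToyRepo:
    """Toy model of the three areas; not how git stores data on disk."""

    def __init__(self):
        self.working_dir = {}  # path -> bytes (the files you edit)
        self.index = {}        # path -> blob id (what the next commit will hold)
        self.objects = {}      # blob id -> bytes (the repository's object store)
        self.commits = []      # list of snapshots, each a dict of path -> blob id

    def add(self, path):
        # Working directory -> index: store the blob and record its id as staged.
        content = self.working_dir[path]
        oid = blob_id(content)
        self.objects[oid] = content
        self.index[path] = oid

    def commit(self):
        # Index -> repository: freeze the staged snapshot as part of history.
        self.commits.append(dict(self.index))
        return len(self.commits) - 1


repo = ToyRepo()
repo.working_dir["test.txt"] = b"version 1\n"
repo.add("test.txt")
repo.working_dir["test.txt"] = b"version 2\n"  # edited again, but not staged
first = repo.commit()
print(repo.objects[repo.commits[first]["test.txt"]])
```

The committed snapshot holds version 1, because a commit records what is in the index and not what currently sits in the working directory.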

In part 2 of this article, we will start looking at basic commands that every developer should know.

Happy coding.