I’ve read the amazing book Building Git by James Coglan and, inspired by it, decided to try some of its examples in Elixir.

In this post I want to explore inflating Git objects, in particular, commit objects. But first, what are Git objects?

Git Objects

Git stores its data in an object database, which you can find under the .git/objects directory. The three main types of objects are trees, commits, and blobs (a fourth type, the tag object, backs annotated tags and is not covered here).

  • Git creates a “tree object” for every directory in the project. It lists the directory’s entries, each with its mode, name, and hash.
  • “Commit objects” contain the information related to a commit: the root tree (directory), the author and committer, the parent commits, and the commit message.
  • “Blob objects” hold the contents of your files, stored as binary data.

Taking a look at .git/objects

Let’s create a new folder

mkdir elixir-inflate
cd elixir-inflate

Initialize a git repository and add a commit with a test file

git init
echo "world" > "hello.txt"
git add hello.txt
git commit -m "My first commit"

Now let’s print the hash of the commit

git log --format="format:%H"

In my case, it’s 92a4b34fabdcf29a310372e709f737a3c645b3b6

If we look at the contents of .git/objects we’ll see something like this:

.git/objects
├── 4b
│   └── 825dc642cb6eb9a060e54bf8d69288fbee4904
├── 7f
│   └── 2dbfa479cbe99062de2ef82b713f044c4406d8
├── 92
│   └── a4b34fabdcf29a310372e709f737a3c645b3b6
├── cc
│   └── 628ccd10742baea8241c5924df992b5c019f71
├── info
└── pack

I used the tree tool to list the contents of the directory.

Notice that one folder, together with the file inside it, spells out our commit hash

├── 92
│   └── a4b34fabdcf29a310372e709f737a3c645b3b6

and our commit hash is 92a4b34fabdcf29a310372e709f737a3c645b3b6

That’s the object for our commit. Git uses the first two characters of the hash as the folder name and the rest as the file name. Let’s look at the content.

cat .git/objects/92/a4b34fabdcf29a310372e709f737a3c645b3b6

And we get gibberish

'''K' ]s'w
41ƕ+=''%R'
s'^j''s<SS֔'''P'''m,''c'wj''!\N+v''WA~'b-WPvr'y'
                                               ''R't'''dn}''#''9=%
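As an aside, the naming rule described above (first two characters for the folder, the rest for the file) can be sketched in Elixir. The module name ObjectPath and the function for_hash are made up for this illustration, and the sketch assumes the object is stored loose (not inside a pack file).

```elixir
defmodule ObjectPath do
  # Hypothetical helper: builds the path of a loose object from its hex hash.
  # The first two characters become the folder, the remainder the file name.
  def for_hash(<<dir::binary-size(2), file::binary>>, git_dir \\ ".git") do
    Path.join([git_dir, "objects", dir, file])
  end
end

IO.puts(ObjectPath.for_hash("92a4b34fabdcf29a310372e709f737a3c645b3b6"))
# prints .git/objects/92/a4b34fabdcf29a310372e709f737a3c645b3b6
```

The binary pattern in the function head does the splitting, so no string slicing is needed.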

Inflating Git objects with Elixir

Git stores objects compressed with zlib (the DEFLATE algorithm in a zlib wrapper, which is not the same container as gzip), so we need to inflate them before inspecting them. We’ll create an Elixir script to do just that.

Create the file inflate.exs at the root of the elixir-inflate directory and add the following:

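# Read the raw compressed bytes from standard input.
# (On Elixir versions before 1.13, use :all instead of :eof.)
compressed = IO.binread(:stdio, :eof)

z = :zlib.open()
:zlib.inflateInit(z)
# inflate/2 returns iodata, possibly in several chunks, so join them.
decompressed = z |> :zlib.inflate(compressed) |> IO.iodata_to_binary()
:zlib.inflateEnd(z)
:zlib.close(z)

IO.puts(decompressed)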

Let’s use our inflate script to inflate the commit object

alias inflate='elixir inflate.exs'
cat .git/objects/92/a4b34fabdcf29a310372e709f737a3c645b3b6 | inflate

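and now we can see the commit information. The leading commit is the object’s type, taken from its header, and the hash on the first line identifies the tree the commit points to.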

commit 7f2dbfa479cbe99062de2ef82b713f044c4406d8
author Luis Ferreira <test@example.com> 1647778013 -0300
committer Luis Ferreira <test@example.com> 1647778013 -0300

My first commit
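Before its content, every inflated Git object carries a short header: the type, a space, the content size in bytes, and a null byte. The null byte is invisible in a terminal, which is why the raw output can look run together. Here is a minimal sketch of splitting that header off; the module name GitObject and the sample data are made up for this illustration.

```elixir
defmodule GitObject do
  # Hypothetical helper: splits inflated object data into {type, size, body}.
  def parse(data) do
    # The header ends at the first null byte; everything after it is the body.
    [header, body] = :binary.split(data, <<0>>)
    [type, size] = String.split(header, " ", parts: 2)
    {type, String.to_integer(size), body}
  end
end

# Hand-written sample: a blob whose content is "world\n" (6 bytes).
IO.inspect(GitObject.parse("blob 6\0world\n"))
# → {"blob", 6, "world\n"}
```

The same function works for commit objects, since only the type in the header differs.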

Let’s analyze the script

Let’s go bit by bit.

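compressed = IO.binread(:stdio, :eof)

First, we read the raw bytes from standard input. We use IO.binread rather than IO.read because the compressed data is binary, not text.

z = :zlib.open()
:zlib.inflateInit(z)

We open a zlib stream with Erlang’s :zlib module and initialize it to decompress (or inflate) the data.

decompressed = z |> :zlib.inflate(compressed) |> IO.iodata_to_binary()
:zlib.inflateEnd(z)
:zlib.close(z)

IO.puts(decompressed)

Finally, we inflate what we read from standard input, join the resulting chunks into a single binary, release the zlib stream, and print the result on the screen.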
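If you don’t need the streaming API, Erlang’s :zlib.uncompress/1 inflates zlib-wrapped data in a single call. The sketch below is one possible variation, not the approach from the book: it takes the hash as a command-line argument and reads the object file directly. The module name Inflate and the function read_object are made up for this illustration, and it assumes the object is stored loose under .git/objects.

```elixir
defmodule Inflate do
  # Hypothetical helper: reads a loose object by hash and inflates it.
  def read_object(hash, git_dir \\ ".git") do
    <<dir::binary-size(2), file::binary>> = hash

    [git_dir, "objects", dir, file]
    |> Path.join()
    |> File.read!()
    # uncompress/1 opens, inflates, and closes the zlib stream for us.
    |> :zlib.uncompress()
  end
end

case System.argv() do
  [hash] -> IO.binwrite(:stdio, Inflate.read_object(hash))
  _ -> IO.puts(:stderr, "usage: elixir inflate_by_hash.exs <hash>")
end
```

Saved as, say, inflate_by_hash.exs (an arbitrary name), it could be run from the repository root as elixir inflate_by_hash.exs 92a4b34fabdcf29a310372e709f737a3c645b3b6.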

Conclusion

Git stores all its information in objects under the .git/objects folder. The three main types of objects are commits, trees, and blobs. We focused on the first type, “commit objects”, and created a short Elixir script to inflate them. As we saw, a commit object holds everything Git records about a commit: its tree, author, committer, and message. Try inflating the other types of objects to see the results.
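If you try this with a tree object, the output will still look partly garbled, because each tree entry stores its hash as 20 raw bytes rather than as hex text. An entry has the form mode, space, name, null byte, then the 20-byte hash. Here is a minimal sketch of decoding such entries; the module name TreeEntries is made up, and the sample entry is hand-built for illustration (it assumes hello.txt is a regular file with mode 100644 pointing at the blob cc628c… from the listing above).

```elixir
defmodule TreeEntries do
  # Hypothetical helper: decodes the body of a tree object, that is, the
  # bytes that follow the "tree <size>\0" header.
  def parse(<<>>), do: []

  def parse(body) do
    # Each entry is "<mode> <name>\0" followed by a 20-byte binary SHA-1.
    [mode_and_name, <<sha::binary-size(20), rest::binary>>] =
      :binary.split(body, <<0>>)

    [mode, name] = String.split(mode_and_name, " ", parts: 2)
    [{mode, name, Base.encode16(sha, case: :lower)} | parse(rest)]
  end
end

# Hand-built sample entry, assumed for illustration only.
sha = Base.decode16!("cc628ccd10742baea8241c5924df992b5c019f71", case: :lower)
IO.inspect(TreeEntries.parse("100644 hello.txt\0" <> sha))
# → [{"100644", "hello.txt", "cc628ccd10742baea8241c5924df992b5c019f71"}]
```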