Inflating Git objects with Elixir
I’ve read the amazing book Building Git by James Coglan and, inspired by it, decided to try some of its examples in Elixir.
In this post I want to explore inflating Git objects, in particular, commit objects. But first, what are Git objects?
Git Objects
Git stores its data in the object database, which you can find under the .git/objects directory. This post looks at three types of objects: Tree, Commit, and Blob (Git also has a fourth type, the tag object, used for annotated tags).
- Git creates a “Tree object” for every directory in the project. It lists the contents of that directory, with the name and the hash of each entry.
- “Commit objects” contain the information related to the commit like the tree (directory), commit message, author, parent commits, etc.
- And “Blob objects” are the contents of your files stored as binary data.
Taking a look at .git/objects
Let’s create a new folder
mkdir elixir-inflate
cd elixir-inflate
Initialize a git repository and add a commit with a test file
git init
echo "world" > "hello.txt"
git add hello.txt
git commit -m "My first commit"
Now let’s print the hash of the commit
git log --format="format:%H"
In my case 92a4b34fabdcf29a310372e709f737a3c645b3b6
If we look at the contents of .git/objects
we’ll see something like this:
.git/objects
├── 4b
│ └── 825dc642cb6eb9a060e54bf8d69288fbee4904
├── 7f
│ └── 2dbfa479cbe99062de2ef82b713f044c4406d8
├── 92
│ └── a4b34fabdcf29a310372e709f737a3c645b3b6
├── cc
│ └── 628ccd10742baea8241c5924df992b5c019f71
├── info
└── pack
I used the tree tool to list the contents of the directory.
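The names in this listing are not arbitrary: Git names each object after the SHA-1 hash of a short header plus the object's content. As a minimal sketch (the module and function names are hypothetical, and it assumes hello.txt holds exactly "world" followed by a newline), this recomputes the ID of the blob for our test file, which should correspond to the cc/628ccd… entry above.

```elixir
defmodule BlobId do
  # Hypothetical helper: computes the object ID Git gives to a blob.
  def for_content(content) do
    # Header: object type, a space, the content size in bytes, a NUL byte.
    header = "blob #{byte_size(content)}" <> <<0>>

    :crypto.hash(:sha, header <> content)
    |> Base.encode16(case: :lower)
  end
end

IO.puts(BlobId.for_content("world\n"))
```

The same scheme applies to trees and commits, with the type in the header changed accordingly.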
Notice that one folder name and the file name inside it, joined together, match our commit hash
├── 92
│ └── a4b34fabdcf29a310372e709f737a3c645b3b6
and our commit hash is 92a4b34fabdcf29a310372e709f737a3c645b3b6
That’s the object for our commit. Git uses the first two characters of the hash as the folder name and the rest as the file name. Let’s look at the content.
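This mapping from hash to path can be sketched in a few lines of Elixir. The module and function names below are hypothetical, and the sketch only covers objects stored as individual files under .git/objects.

```elixir
defmodule ObjectPath do
  # Hypothetical helper: builds the path of an object file from its hash.
  def for_hash(hash, git_dir \\ ".git") do
    # First two characters are the folder name, the rest is the file name.
    <<dir::binary-size(2), file::binary>> = hash
    Path.join([git_dir, "objects", dir, file])
  end
end

IO.puts(ObjectPath.for_hash("92a4b34fabdcf29a310372e709f737a3c645b3b6"))
```

With the commit hash from this post, it prints the same path we pass to cat.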
cat .git/objects/92/a4b34fabdcf29a310372e709f737a3c645b3b6
And we get gibberish
'''K' ]s'w
41ƕ+=''%R'
s'^j''s<SS֔'''P'''m,''c'wj''!\N+v''WA~'b-WPvr'y'
''R't'''dn}''#''9=%
Inflating Git objects with Elixir
Git objects are stored compressed with zlib (the DEFLATE algorithm), so we need to inflate the objects before inspecting them. We’ll create an Elixir script to do just that.
Create the file inflate.exs at the root of the elixir-inflate directory and add the following:
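# Read the raw bytes from standard input (:eof needs Elixir 1.13+; older versions use :all)
compressed = IO.binread(:stdio, :eof)
z = :zlib.open()
:zlib.inflateInit(z)
# inflate/2 returns iodata that may hold several chunks, so join them into one binary
decompressed = IO.iodata_to_binary(:zlib.inflate(z, compressed))
:zlib.inflateEnd(z)
:zlib.close(z)
IO.puts decompressed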
Let’s use our inflate script to inflate the commit object
alias inflate='elixir inflate.exs'
cat .git/objects/92/a4b34fabdcf29a310372e709f737a3c645b3b6 | inflate
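and now we can see the commit information. The inflated data starts with a short header (the word commit, the content size in bytes, and a NUL byte), which is followed by the commit content itself:
tree 7f2dbfa479cbe99062de2ef82b713f044c4406d8
author Luis Ferreira <test@example.com> 1647778013 -0300
committer Luis Ferreira <test@example.com> 1647778013 -0300

My first commit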
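Every inflated object begins with a header of the form "type size" terminated by a NUL byte. A minimal sketch of separating that header from the content follows; the module and function names are hypothetical, and the sample input is a hand-built blob rather than data read from the repository.

```elixir
defmodule ObjectHeader do
  # Hypothetical helper: splits an inflated object into {type, size, content}.
  def parse(inflated) do
    # The header ends at the first NUL byte; everything after it is content.
    [header, content] = :binary.split(inflated, <<0>>)
    [type, size] = String.split(header, " ")
    {type, String.to_integer(size), content}
  end
end

# A hand-built blob object for a file containing "world" and a newline.
sample = "blob 6" <> <<0>> <> "world\n"
IO.inspect(ObjectHeader.parse(sample))
```

The size in the header counts only the bytes after the NUL byte, so it can be used to check that an object was read completely.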
Let’s analyze the script
Let’s go bit by bit.
compressed = IO.binread(:stdio, :eof)
First, we read everything from standard input as raw bytes, since the compressed object is binary data rather than text.
z = :zlib.open()
:zlib.inflateInit(z)
We open a zlib stream with Erlang’s zlib module and initialize it to decompress (or inflate) the data.
decompressed = IO.iodata_to_binary(:zlib.inflate(z, compressed))
IO.puts decompressed
Finally, we inflate what we read from standard input, join the resulting chunks into a single binary, and print it on the screen.
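For small objects like these, the open, init, inflate, and close steps can probably be collapsed into a single call to :zlib.uncompress/1, which takes the whole compressed binary at once. The sketch below is an alternative to the script above, not a replacement for it; the module and function names are hypothetical, and the demo uses in-memory data so it runs without a repository.

```elixir
defmodule Inflate do
  # Hypothetical helper: inflates a zlib-compressed binary in one call.
  def inflate(compressed), do: :zlib.uncompress(compressed)

  # Hypothetical helper: reads an object file at `path` and inflates it.
  def inflate_file(path), do: path |> File.read!() |> inflate()
end

# Round trip: compress a hand-built object, then inflate it again.
original = "blob 6" <> <<0>> <> "world\n"
restored = original |> :zlib.compress() |> Inflate.inflate()
IO.inspect(restored == original)
```

Inside the elixir-inflate directory, Inflate.inflate_file/1 could be given the path of the commit object used earlier. The streaming calls in the original script remain the better fit when the data arrives in pieces.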
Conclusion
Git stores all the information in objects under the .git/objects folder. We looked at three types of objects: commit, tree, and blob. We focused on the first type, “commit objects”, and created a short Elixir script to inflate them. As we can see, all the information related to commits is stored in commit objects. Try inflating the other types of objects to see the results.