Storing SHA1 Hashes As Binary
Storing SHA1 hashes as strings is not very space efficient. A SHA1 as a string requires at least 40 bytes, an additional byte to store the encoding, and perhaps more space depending on the internals of PostgreSQL.
On the other hand, if one were to store a SHA1 as binary one would only need 20 bytes for the actual SHA1, and 1 or 4 bytes of additional space (again depending on database internals). This means that in the best case scenario we can reduce the space usage by 50%.
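The size difference can be checked in plain Ruby. The snippet below is only an illustration, not part of ShaAttribute; the constant names HEX_SHA and BINARY_SHA are made up for this example, and the measurement covers only the hash payload, not any per-value overhead added by the database.

```ruby
# Example SHA1 as a 40-character hexadecimal string.
HEX_SHA = '88c60307bd1f215095834f09a1a5cb18701ac8ad'

# Pack the hex digits into raw bytes: every two hex characters become one byte.
BINARY_SHA = [HEX_SHA].pack('H*')

puts HEX_SHA.bytesize    # 40
puts BINARY_SHA.bytesize # 20
```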
To make this easier to work with, you can include the concern ShaAttribute into a model and define a SHA attribute using the sha_attribute class method. For example:
class Commit < ActiveRecord::Base
  include ShaAttribute

  sha_attribute :sha
end
This allows you to use the value of the sha attribute as if it were a string, while storing it as binary. This means that you can do something like this, without having to worry about converting data to the right binary format:
commit = Commit.find_by(sha: '88c60307bd1f215095834f09a1a5cb18701ac8ad')
commit.sha = '971604de4cfa324d91c41650fabc129420c8d1cc'
commit.save
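Conceptually, the attribute has to translate between the hexadecimal string you see in Ruby and the raw bytes kept in the column. The following is a minimal sketch of that round trip in plain Ruby, not the actual ShaAttribute implementation; the helper names sha_to_binary and binary_to_sha are hypothetical.

```ruby
# Hypothetical helper: converts a hex SHA string into its raw binary form,
# roughly what would need to happen before the value is written to the column.
def sha_to_binary(hex)
  return nil if hex.nil?

  [hex].pack('H*')
end

# Hypothetical helper: converts raw bytes read from the column back into
# a lowercase hex string.
def binary_to_sha(bytes)
  return nil if bytes.nil?

  bytes.unpack1('H*')
end

stored = sha_to_binary('971604de4cfa324d91c41650fabc129420c8d1cc')
puts binary_to_sha(stored) # 971604de4cfa324d91c41650fabc129420c8d1cc
```

Handling nil in both directions keeps the sketch usable for nullable columns.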
There is however one requirement: the column used to store the SHA must be a binary type. For Rails this means you need to use the :binary type instead of :text or :string.