I’d like to implement the logic that generates the Nix hash of a directory.
I want to get the hash that `nix hash path /path/to/dir` gives you, but without relying on Nix (I want to re-implement that logic in a different programming language).
It looks to me like the C++ implementation is here:
static time_t dump(const Path & path, Sink & sink, PathFilter & filter)
{
    checkInterrupt();
    auto st = lstat(path);
    time_t result = st.st_mtime;
    sink << "(";
    if (S_ISREG(st.st_mode)) {
        sink << "type" << "regular";
        if (st.st_mode & S_IXUSR)
            sink << "executable" << "";
        dumpContents(path, st.st_size, sink);
    }
    else if (S_ISDIR(st.st_mode)) {
        sink << "type" << "directory";
        /* If we're on a case-insensitive system like macOS, undo
(This file has been truncated.)
But is that algorithm explained/documented somewhere?
NobbZ
May 3, 2022, 8:29am
I remember having seen it explained somewhere, though I am not sure whether that was in official documentation.
The gist of what I read was:
1. Strip all file attributes, permissions, and ownership that can’t exist in the store.
2. Store the folder in a NAR (Nix archive).
3. Get the hash of that.
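To make the first step concrete, here is a minimal Python sketch of which metadata survives. `snapshot` is a hypothetical helper, not part of Nix. It assumes, going by the `dump` function quoted above and by the thesis, that only the node type, the owner-executable bit, file contents, symlink targets, and entry names matter. It also reads whole files into memory, which is a simplification.

```python
import os
import stat


def snapshot(path):
    """Hypothetical helper: keep only what a NAR is assumed to record."""
    st = os.lstat(path)  # lstat, so symlinks are not followed
    if stat.S_ISLNK(st.st_mode):
        return {"type": "symlink",
                "target": os.fsencode(os.readlink(path))}
    if stat.S_ISDIR(st.st_mode):
        # Only entry names are kept; ordering is decided when serialising.
        return {"type": "directory",
                "entries": {os.fsencode(name): snapshot(os.path.join(path, name))
                            for name in os.listdir(path)}}
    if stat.S_ISREG(st.st_mode):
        with open(path, "rb") as f:
            contents = f.read()
        # The owner-executable bit is the only permission that is kept.
        # mtime, owner, group and the other mode bits are dropped.
        return {"type": "regular",
                "executable": bool(st.st_mode & stat.S_IXUSR),
                "contents": contents}
    raise ValueError(f"unsupported file type: {path}")
```

Two directories that differ only in timestamps, ownership, or non-executable permission bits produce the same snapshot, so they should also produce the same hash.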
Then, the next question is how to generate a NAR. I found this gist:
nar-spec.md
This is taken directly from Figure 5.2 in http://nixos.org/~eelco/pubs/phd-thesis.pdf. It is presented here to be more directly linkable.
```
serialise(fso) = str("nix-archive-1") + serialise1(fso)
serialise1(fso) = str("(") + serialise2(fso) + str(")")
serialise2(type=Regular, exec, contents) =
  str("type") + str("regular")
  + (
```
(This file has been truncated.)
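For anyone who wants to try the serialisation without digging through the PDF, here is a rough Python sketch over an in-memory tree. The dict layout (`type`, `executable`, `contents`, `target`, `entries`) and the function names are my own assumptions. The symlink and directory cases are written from my reading of Figure 5.2 rather than from the excerpt above, so check them against the thesis before relying on them.

```python
import struct


def nar_int(n: int) -> bytes:
    # int(n): unsigned 64-bit little-endian integer
    return struct.pack("<Q", n)


def nar_str(s: bytes) -> bytes:
    # str(s): length prefix, then s zero-padded to a multiple of 8 bytes
    padding = (8 - len(s) % 8) % 8
    return nar_int(len(s)) + s + b"\0" * padding


def serialise(fso) -> bytes:
    return nar_str(b"nix-archive-1") + serialise1(fso)


def serialise1(fso) -> bytes:
    return nar_str(b"(") + serialise2(fso) + nar_str(b")")


def serialise2(fso) -> bytes:
    kind = fso["type"]
    if kind == "regular":
        out = nar_str(b"type") + nar_str(b"regular")
        if fso.get("executable"):
            out += nar_str(b"executable") + nar_str(b"")
        return out + nar_str(b"contents") + nar_str(fso["contents"])
    if kind == "symlink":
        return (nar_str(b"type") + nar_str(b"symlink")
                + nar_str(b"target") + nar_str(fso["target"]))
    if kind == "directory":
        out = nar_str(b"type") + nar_str(b"directory")
        # Entries are emitted sorted by name (bytewise, since names are bytes).
        for name in sorted(fso["entries"]):
            out += (nar_str(b"entry") + nar_str(b"(")
                    + nar_str(b"name") + nar_str(name)
                    + nar_str(b"node") + serialise1(fso["entries"][name])
                    + nar_str(b")"))
        return out
    raise ValueError(f"unknown node type: {kind!r}")
```

Usage: `serialise({"type": "regular", "executable": False, "contents": b"hello"})` returns the NAR bytes for a single non-executable file. Names, targets, and contents are `bytes` so that sorting and padding work on raw bytes rather than on decoded text.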
I’m not sure if the details have changed, but NobbZ might be thinking of Eelco’s thesis. I think there’s a section in there on the hashing at least: https://edolstra.github.io/pubs/phd-thesis.pdf
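I implemented the NAR serialization algorithm from https://edolstra.github.io/pubs/phd-thesis.pdf (Figure 5.2, page 101) in Clojure:
;; NAR serialization algorithm
;; See https://edolstra.github.io/pubs/phd-thesis.pdf (Figure 5.2, page 101)
;; Imports needed by this excerpt (the file's own ns form is not shown here).
;; `bs` is presumably an alias for the byte-streams library.
(import '(java.nio ByteBuffer ByteOrder))
(defn int-n
  "NAR serialization algorithm
  int(n): n as a 64-bit little-endian integer"
  [n]
  (-> (ByteBuffer/allocate Long/BYTES)
      (.order ByteOrder/LITTLE_ENDIAN)
      ;; putLong fills all 8 bytes; putInt would only write the low 4
      (.putLong (long n))
      (.array)))
(defn pad
  "NAR serialization algorithm
  pad(s): write the zero bytes that round a length n up to a multiple of 8"
  [n sink]
  (let [pad (mod (- 8 n)
                 8)]
    (when-not (zero? pad)
      (bs/transfer (byte-array pad) sink))))
(This file has been truncated.)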
The hash of the NAR is the hash of the directory, i.e. what `nix hash path` reports.
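For the last step, a small Python sketch of turning NAR bytes into a printable hash. It assumes SHA-256 and the SRI-style `sha256-<base64>` rendering, which is what `nix hash path` printed for me by default. `sri_sha256` is a hypothetical name, and other renderings (for example Nix's own base32) are not covered here.

```python
import base64
import hashlib


def sri_sha256(nar: bytes) -> str:
    """Hypothetical helper: SHA-256 of the NAR bytes in SRI form."""
    digest = hashlib.sha256(nar).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")
```

Feed it the complete NAR, starting with the `nix-archive-1` header. For large trees it would be better to stream the serialisation into `hashlib.sha256().update(...)` chunk by chunk instead of building one big byte string.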
Thanks to you both @NobbZ @abathur