Decoding Base64 in the Nix language

I ran into a problem where it actually seemed like a good idea to decode a Base64 string containing ASCII inside a Nix expression. I’m not going to bother explaining how I got to that point, as it’s mostly irrelevant, but I will say that it’s not a result of wanting to use Nix as a general-purpose programming language; it was just an issue I was having plumbing data between multiple things.

I was kind of surprised to find that, as far as I can tell, there was no existing implementation of this anywhere. I’m pretty bad at functional programming and at the Nix language in general, but I figured it would be a decent learning experience to try to write such an expression anyway. So I did. Here is my attempt, as it stands right now:

let
  # Helpers
  charAt = index: builtins.substring index 1;
  chunkBase64 = base64:
    (builtins.genList
      (index: builtins.substring (index * 4) 4 base64)
      (((builtins.stringLength base64) + 3) / 4)
    );
  concatStrings = builtins.concatStringsSep "";

  # Bitwise math
  truncateToByte = builtins.bitAnd 255;

  # ASCII decoding
  asciiTable = [
    ""  ""  ""   ""  ""  ""  ""  ""  ""  "\t" "\n" ""  ""  "\r" ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""  ""   ""  ""  ""
    " " "!" "\"" "#" "$" "%" "&" "'" "(" ")"  "*"  "+" "," "-"  "." "/" "0" "1" "2" "3" "4" "5" "6" "7" "8" "9" ":" ";" "<"  "=" ">" "?"
    "@" "A" "B"  "C" "D" "E" "F" "G" "H" "I"  "J"  "K" "L" "M"  "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z" "[" "\\" "]" "^" "_"
    "`" "a" "b"  "c" "d" "e" "f" "g" "h" "i"  "j"  "k" "l" "m"  "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z" "{" "|"  "}" "~" ""
  ];
  byteToAscii = value:
    if value < 128
    then
      let
        asciiValue = builtins.elemAt asciiTable value;
      in
        if asciiValue != ""
        then
          asciiValue
        else
          builtins.abort "unsupported character code ${toString value}"
    else
      builtins.abort "unsupported non-ascii byte ${toString value}";

  # Base64 decoding
  base64Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  base64Table = builtins.listToAttrs
    (map
      (character: { name = character.value; value = character.index; })
      (builtins.genList
        (index: { inherit index; value = (charAt index base64Alphabet); })
        (builtins.stringLength base64Alphabet)
      )
    );
  base64CharToValue = character:
    if builtins.hasAttr character base64Table
    then
      base64Table.${character}
    else
      builtins.abort "invalid base64 character ${character}";
  base64CalcByte = left: right: offset: chunk:
    (builtins.bitOr
      (truncateToByte ((base64CharToValue (charAt (offset) chunk)) * left))
      (truncateToByte ((base64CharToValue (charAt (offset + 1) chunk)) / right))
    );
  base64ChunkToBytes = chunk:
    if (
      ((builtins.stringLength chunk) == 2) ||
      (
        ((builtins.stringLength chunk) == 4) &&
        ((builtins.substring 2 2 chunk) == "==")
      )
    )
    then
      [
        (base64CalcByte 4 16 0 chunk)
      ]
    else
      if (
        ((builtins.stringLength chunk) == 3) ||
        (
          ((builtins.stringLength chunk) == 4) &&
          ((builtins.substring 3 1 chunk) == "=")
        )
      )
      then
        [
          (base64CalcByte 4 16 0 chunk)
          (base64CalcByte 16 4 1 chunk)
        ]
      else
        if (builtins.stringLength chunk) == 4
        then
          [
            (base64CalcByte 4 16 0 chunk)
            (base64CalcByte 16 4 1 chunk)
            (base64CalcByte 64 1 2 chunk)
          ]
        else
          builtins.abort "invalid base64 chunk ${chunk}";
  base64ToAscii = base64:
    concatStrings
      (map
        byteToAscii
        (builtins.concatMap
          base64ChunkToBytes
          (chunkBase64 base64)
        )
      );
in {
  base64Decode = base64ToAscii;
}

I have to say, I am pretty sure this is awful. That said, it does basically work.
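
To make the arithmetic in base64CalcByte a bit more concrete, here is a rough sketch that decodes the single chunk "TWFu" by hand. The sextet values are looked up manually from the alphabet, and the names c0 to c3 and byte exist only for this illustration; they are not part of the code above.

```nix
let
  # Positions of "T", "W", "F", "u" in the Base64 alphabet (A = 0, a = 26).
  c0 = 19;
  c1 = 22;
  c2 = 5;
  c3 = 46;

  # Same idea as truncateToByte above: keep only the low 8 bits.
  byte = builtins.bitAnd 255;
in
[
  # Multiplying by 4/16/64 shifts left by 2/4/6 bits;
  # dividing by 16/4/1 shifts right by 4/2/0 bits.
  (builtins.bitOr (byte (c0 * 4)) (byte (c1 / 16)))   # 76 | 1  = 77  ("M")
  (builtins.bitOr (byte (c1 * 16)) (byte (c2 / 4)))   # 96 | 1  = 97  ("a")
  (builtins.bitOr (byte (c2 * 64)) (byte (c3 / 1)))   # 64 | 46 = 110 ("n")
]
```

Evaluating this (for example with nix-instantiate --eval --strict) should give [ 77 97 110 ], which byteToAscii then maps to "Man".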

I have a few reasons for posting this:

  1. I thought it was an interesting challenge, given that Nix isn’t particularly well-suited to the task (or maybe it is, and I am just ignorant).
  2. I suspect someone may find this useful some day, so it may as well be on the Internet instead of just sitting here locally.
  3. If I’m lucky, I figure someone will kindly give me hints as to how to better utilize Nix and the functional programming paradigm in this particular case.

In any case, here it is.


Gave it a go myself, though mine doesn’t handle padding!

let
  lib = import <nixpkgs/lib>;
  testString = "TWFueSBoYW5kcyBtYWtlIGxpZ2h0IHdvcmsu";

  base64Table = builtins.listToAttrs
    (lib.imap0 (i: c: lib.nameValuePair c i)
    (lib.stringToCharacters "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"));

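  # Generated using python3 (only bytes 1..127 matter here; if the file is
  # written as UTF-8, code points above 127 take two bytes and won't line up):
  # print(''.join([ chr(n) for n in range(1, 256) ]), file=open('ascii', 'w'))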
  ascii = builtins.readFile ./ascii;

  decode = str:
    let
      # List of base-64 numbers
      numbers64 = map (c: base64Table.${c}) (lib.stringToCharacters str);

      # List of base-256 numbers
      numbers256 = lib.concatLists (lib.genList (i:
        let
          v = lib.foldl'
            (acc: el: acc * 64 + el)
            0
            (lib.sublist (i * 4) 4 numbers64);
        in
        [
          (lib.mod (v / 256 / 256) 256)
          (lib.mod (v / 256) 256)
          (lib.mod v 256)
        ]
      ) (lib.length numbers64 / 4));

    in
    # Converts base-256 numbers to ascii
    lib.concatMapStrings (n:
      # Can't represent the null byte in Nix..
      lib.substring (n - 1) 1 ascii
    ) numbers256;

in
decode testString

Nice! That looks significantly better. Thanks for sharing.

Ah, that’s also quite clever.

There seem to be a couple of tiny issues in the posted code, though. Here’s my diff.

--- a/base64.nix
+++ b/base64.nix
@@ -21,21 +21,21 @@ let
           v = lib.foldl'
             (acc: el: acc * 64 + el)
             0
-            (lib.sublist (i * 4) 4 numbers);
+            (lib.sublist (i * 4) 4 numbers64);
         in
         [
           (lib.mod (v / 256 / 256) 256)
           (lib.mod (v / 256) 256)
           (lib.mod v 256)
         ]
-      ) (lib.length numbers / 4));
+      ) (lib.length numbers64 / 4));
 
     in
-    # Converts base-265 numbers to ascii
+    # Converts base-256 numbers to ascii
     lib.concatMapStrings (n:
       # Can't represent the null byte in Nix..
       lib.substring (n - 1) 1 ascii
-    ) grouped;
+    ) numbers256;
 
 in
 decode testString

And of course, this makes for a nice exercise, as I am now thinking about how to handle padding/the remainder with this approach to Base64 decoding.
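
One possible way to handle the remainder with the fold-based approach, as a minimal sketch rather than a finished implementation: drop the "=" characters, treat missing sextets in the last group as 0, and keep only the bytes that are fully covered (n sextets carry n - 1 whole bytes). The names decodeToBytes, groupBytes and at are hypothetical, it uses only builtins, and it returns byte values instead of a string, so it would still need something like byteToAscii or the ascii file lookup on top. It also makes two simplifications: "=" is removed wherever it appears, and a trailing group with a single sextet is silently ignored instead of being rejected.

```nix
let
  alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

  # Split a string into a list of one-character strings.
  chars = s: builtins.genList (i: builtins.substring i 1 s) (builtins.stringLength s);

  # Map each alphabet character to its 6-bit value.
  table = builtins.listToAttrs
    (builtins.genList
      (i: { name = builtins.substring i 1 alphabet; value = i; })
      (builtins.stringLength alphabet));

  decodeToBytes = str:
    let
      # Sextets without padding characters (simplification: "=" is dropped anywhere).
      sextets = map (c: table.${c}) (builtins.filter (c: c != "=") (chars str));
      count = builtins.length sextets;

      groupBytes = i:
        let
          start = i * 4;
          size = if count - start < 4 then count - start else 4;
          # Sextets missing from the last group count as 0.
          at = k: if k < size then builtins.elemAt sextets (start + k) else 0;
          # Four sextets folded into one 24-bit number.
          v = builtins.foldl' (acc: k: acc * 64 + at k) 0 [ 0 1 2 3 ];
          bytes = [
            (v / 65536)
            (builtins.bitAnd (v / 256) 255)
            (builtins.bitAnd v 255)
          ];
        in
        # Keep only the bytes fully covered by real input.
        builtins.genList (builtins.elemAt bytes) (size - 1);
    in
    builtins.concatMap groupBytes (builtins.genList (i: i) ((count + 3) / 4));
in
{
  full = decodeToBytes "TWFu";     # [ 77 97 110 ]
  padded = decodeToBytes "TWE=";   # [ 77 97 ]
  unpadded = decodeToBytes "TWE";  # [ 77 97 ]
}
```

The padded and unpadded inputs give the same result because the padding carries no data; it only signals how many sextets the last group has, which this sketch reads off the length instead.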

Ahh, that was stupid. I changed the names at the last second without testing it again, haha. Thanks!