Dict: offline version?

tl;dr: any tips on setting up an offline dictionary with nix?

I found the Dict - NixOS Wiki writeup, but it talks about using dict.org. Isn’t there a way to configure dict so you can look up words in the dictionary when you’re offline? I seem to recall it took a few packages on Debian, but I don’t recall exactly which (a quick web search suggests at least two: apt install dict dictd).

Anyway, testing with nix-shell -p dict and trying a lookup results in an error:

'dict.conf' doesn't specify any dict server

So I’m guessing the nix package has no default offline setup.

Hm… I saw that the nixos package search showed some dictdDB... packages underneath the main dictd package. I tried installing them (kind of guessing at both which ones and how to install them), so now the ellipsis in my environment.systemPackages = with pkgs; [ ... ] includes these lines:

dict
dictdDBs.wordnet
dictdDBs.wiktionary

Rebuilt with no errors (well, other than this unrelated thing), but the dict command still gives the same error.

Looks like the wiki is (as usual) outdated or incomplete. You can use this option to create a local server, but you’ll want to remove the suggestions from the wiki first: NixOS Search

Huh, I think that worked. I enabled it and rebuilt, and now I can look up words (and can indeed see responses from several of the DBs I installed), but during the rebuild I saw multiple errors output from some build process:

Basename is /nix/store/gfbza56cr66mhka9rnzc2s1ksb939iza-dict-db-wiktionary-20220420/share/dictd/wiktionary-en
Locale is en_US.UTF-8
/nix/store/7cni7ndy2pm18ysl5znq6znb30sxp156-stdenv-linux/setup: line 1626: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8): No such file or directory
/nix/store/m36d29gn5gm9bk0g7fcln1v8171hvn95-bash-5.2-p15/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
awk: cmd. line:4: fatal: cannot open file `en_US.UTF-8' for reading: No such file or directory
dictfmt: option '--locale' requires an argument
dictfmt-1.13.1
Copyright 1997-2000 Rickard E. Faith (faith@cs.unc.edu)
Copyright 2002-2007 Aleksey Cheusov (vle@gmx.net)

Usage: dictfmt -c5|-t|-e|-f|-h|-j|-p [-u url] [-s name] [options] basename
       dictfmt -i|-I [options]
Create a dictionary database and index file for use by a dictd server

# qsu@ snipped _long_ help output from dictfmt

I don’t see anything obvious in https://github.com/NixOS/nixpkgs/blob/e5f018cf150e29aac26c61dac0790ea023c46b24/nixos/modules/services/misc/dictd.nix: it has no references to dictfmt or awk.

Here's the full output of the rebuild, in case the above snippet isn't clear:
$ sudo --reset-timestamp nixos-rebuild switch --upgrade
[sudo] password for qsu:
unpacking channels...
building Nix...
building the system configuration...
trace: warning: mdadm: Neither MAILADDR nor PROGRAM has been set. This will cause the `mdmon` service to crash.
these 20 derivations will be built:
  /nix/store/200m9ajjj53vscmwp795rw85sdrnqhfn-dictd-dbs.drv
  /nix/store/ajq02in86wmw5np7hgxmhcc7bvlnc676-system-path.drv
  /nix/store/dl8w7mpvpskyx9nmqcz6hrijcnqryqrw-X-Restart-Triggers.drv
  /nix/store/5d03f6k6xrv94pjxyhjp478lfkmlkxwa-unit-polkit.service.drv
  /nix/store/5hksp8i3annfd1m42ycch11yb55gwg13-unit-script-dictd-start.drv
  /nix/store/7alayslwnrkmbpdgyfz3cri698n9nrjl-dbus-1.drv
  /nix/store/fxffiip2zvd2xrcdmvh0hxpb928w1wy1-X-Restart-Triggers.drv
  /nix/store/9a8mvkj1qn1g8kypchsq732fppkqkwrp-unit-dbus.service.drv
  /nix/store/66fvph86fwzbc0c8zfd45gzqj14na3n7-user-units.drv
  /nix/store/7bmk2rf2lakba9sw72y8k8hj729l16bx-unit-dbus.service.drv
  /nix/store/s6f16h3vyh8az4h1qqigvi2am4q6r53k-unit-accounts-daemon.service.drv
  /nix/store/z8h5p821j1ywz8wlk6p1cca2f91hb9wk-unit-dictd.service.drv
  /nix/store/9x827sr5fcm63h2l0578fsfy400xfibj-system-units.drv
  /nix/store/c3p7zdy0m2jb3hz95bwzii72s93xsmqz-etc-dict.conf.drv
  /nix/store/k7h48ln52vcykpnzxh1ivr5frghgj1k7-set-environment.drv
  /nix/store/c44b30x3jzmvzbc71sk2ls1vhiw6033w-etc-profile.drv
  /nix/store/mbyfn7zngqb13xa6pxcmpa497yd2ynfn-etc-pam-environment.drv
  /nix/store/nvwjbv4x96h7i4jqa7fkr5qp9pr3bwhp-etc.drv
  /nix/store/q8l78ijzxqjjsvl4lc2lz509a6glrvrn-users-groups.json.drv
  /nix/store/w5fy9c9zcisi2s0nsphc63xd18a3q66b-nixos-system-qsulaptop-23.11pre524605.3a2786eea085.drv
this path will be fetched (0.05 MiB download, 0.20 MiB unpacked):
  /nix/store/iwbwvxdkj5zhxwkh27cihx90q5cw0kix-libfaketime-0.9.10
building '/nix/store/ajq02in86wmw5np7hgxmhcc7bvlnc676-system-path.drv'...
building '/nix/store/c3p7zdy0m2jb3hz95bwzii72s93xsmqz-etc-dict.conf.drv'...
copying path '/nix/store/iwbwvxdkj5zhxwkh27cihx90q5cw0kix-libfaketime-0.9.10' from 'https://cache.nixos.org'...
building '/nix/store/200m9ajjj53vscmwp795rw85sdrnqhfn-dictd-dbs.drv'...
warning: collision between `/nix/store/gfbza56cr66mhka9rnzc2s1ksb939iza-dict-db-wiktionary-20220420/share/dictd/locale' and `/nix/store/psj541qixnmwj96ca5ylk7vlzqqyz88q-dict-db-wordnet-542/share/dictd/locale'
patching sources
updateAutotoolsGnuConfigScriptsPhase
configuring
no configure script, doing nothing
building
no Makefile or custom buildPhase, doing nothing
installing
Got store path /nix/store/gfbza56cr66mhka9rnzc2s1ksb939iza-dict-db-wiktionary-20220420
Directory reference: /nix/store/gfbza56cr66mhka9rnzc2s1ksb939iza-dict-db-wiktionary-20220420/share/dictd
Basename is /nix/store/gfbza56cr66mhka9rnzc2s1ksb939iza-dict-db-wiktionary-20220420/share/dictd/wiktionary-en
Locale is en_US.UTF-8
/nix/store/7cni7ndy2pm18ysl5znq6znb30sxp156-stdenv-linux/setup: line 1626: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8): No such file or directory
/nix/store/m36d29gn5gm9bk0g7fcln1v8171hvn95-bash-5.2-p15/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
awk: cmd. line:4: fatal: cannot open file `en_US.UTF-8' for reading: No such file or directory
dictfmt: option '--locale' requires an argument
dictfmt-1.13.1
Copyright 1997-2000 Rickard E. Faith (faith@cs.unc.edu)
Copyright 2002-2007 Aleksey Cheusov (vle@gmx.net)

Usage: dictfmt -c5|-t|-e|-f|-h|-j|-p [-u url] [-s name] [options] basename
       dictfmt -i|-I [options]
Create a dictionary database and index file for use by a dictd server

-c5       headwords are preceded by a line containing at least
                5 underscore (_) characters
-t        implies -c5, --without-headword and --without-info options
-e        file is in html format
-f        headwords start in col 0, definitions start in col 8
-j        headwords are set off by colons
-p        headwords are preceded by %p, with %d on following line
-i        reformat stdin having three-column .index file format
-u <url>  URL of site where database was obtained
-s <name> name of the database
--license
-L        display copyright and license information
--version
-V        display version information
-D        debug
--utf8    for creating utf-8 dictionary
--quiet
--silent
-q        quiet operation
--help    display this help message
--locale   <locale> specifies the locale used for sorting.
           if no locale is specified, the "C" locale is used.
--allchars all characters (not only alphanumeric and space)
           will be used in search if this argument is supplied
--headword-separator <sep> sets headword separator which allows
                     several words to have the same definition
                     Example: autumn%%%fall can be used
                     if '--headword-separator %%%' is supplied
--index-data-separator <sep> sets index/data separator which allows
                     to explicitly set fourth column in .index file,
                     the default is "\034"
--break-headwords    multiple headwords will be written on separate lines
                     in the .dict file.  For use with '--headword-separator.
--index-keep-orig    fourth column in .index file stores original headword
                     which is returned by MATCH command
--case-sensitive     Create .index/.dict files for case sensitive search
--without-headword   headwords will not be copied to .dict file
--without-header     header will not be copied to DB info entry
--without-url        URL will not be copied to DB info entry
--without-time       time of creation will not be copied to DB info entry
--without-info       DB info entry will not be created.
                     This may be useful if 00-database-info headword
                     is expected from stdin (dictunformat outputs it).
--columns            Set the number of columns for wrapping text
                     before writing it to .dict file.
                     If it is zero, wrapping is off.
--default-strategy  Sets the default search strategy for the database.
                    Special entry 00-database-default-strategy is created
                    for this purpose.
--mime-header       Sets MIME header stored in .data file which
                    prepend definition
                    when client sent OPTION MIME to `dictd'
--without-ver      do not create 00-database-dictfmt-<VER> entry in .index
/nix/store/m36d29gn5gm9bk0g7fcln1v8171hvn95-bash-5.2-p15/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
dictfmt: option '--locale' requires an argument
# snipped: dictfmt version banner and help output, repeated verbatim from the first occurrence above
gawk: cmd. line:25: fatal: cannot open file `en_US.UTF-8' for reading: No such file or directory
Got store path /nix/store/psj541qixnmwj96ca5ylk7vlzqqyz88q-dict-db-wordnet-542
Directory reference: /nix/store/psj541qixnmwj96ca5ylk7vlzqqyz88q-dict-db-wordnet-542/share/dictd
Basename is /nix/store/psj541qixnmwj96ca5ylk7vlzqqyz88q-dict-db-wordnet-542/share/dictd/wn
Locale is en_US.UTF-8
/nix/store/7cni7ndy2pm18ysl5znq6znb30sxp156-stdenv-linux/setup: line 1626: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
created 21360 symlinks in user environment
gtk-update-icon-cache: Cache file created successfully.
gtk-update-icon-cache: Cache file created successfully.
building '/nix/store/dl8w7mpvpskyx9nmqcz6hrijcnqryqrw-X-Restart-Triggers.drv'...
building '/nix/store/7alayslwnrkmbpdgyfz3cri698n9nrjl-dbus-1.drv'...
building '/nix/store/mbyfn7zngqb13xa6pxcmpa497yd2ynfn-etc-pam-environment.drv'...
building '/nix/store/k7h48ln52vcykpnzxh1ivr5frghgj1k7-set-environment.drv'...
building '/nix/store/s6f16h3vyh8az4h1qqigvi2am4q6r53k-unit-accounts-daemon.service.drv'...
/nix/store/m36d29gn5gm9bk0g7fcln1v8171hvn95-bash-5.2-p15/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
dictfmt: option '--locale' requires an argument
# snipped: dictfmt version banner and help output, repeated verbatim from the first occurrence above
awk: cmd. line:4: fatal: cannot open file `en_US.UTF-8' for reading: No such file or directory
/nix/store/m36d29gn5gm9bk0g7fcln1v8171hvn95-bash-5.2-p15/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
dictfmt: option '--locale' requires an argument
# snipped: dictfmt version banner and help output, repeated verbatim from the first occurrence above
building '/nix/store/fxffiip2zvd2xrcdmvh0hxpb928w1wy1-X-Restart-Triggers.drv'...
building '/nix/store/c44b30x3jzmvzbc71sk2ls1vhiw6033w-etc-profile.drv'...
building '/nix/store/5d03f6k6xrv94pjxyhjp478lfkmlkxwa-unit-polkit.service.drv'...
gawk: cmd. line:25: fatal: cannot open file `en_US.UTF-8' for reading: No such file or directory
post-installation fixup
shrinking RPATHs of ELF executables and libraries in /nix/store/c1056hai4jj3rmc1csg9l0dpvidxzypz-dictd-dbs
checking for references to /build/ in /nix/store/c1056hai4jj3rmc1csg9l0dpvidxzypz-dictd-dbs...
patching script interpreter paths in /nix/store/c1056hai4jj3rmc1csg9l0dpvidxzypz-dictd-dbs
building '/nix/store/7bmk2rf2lakba9sw72y8k8hj729l16bx-unit-dbus.service.drv'...
building '/nix/store/9a8mvkj1qn1g8kypchsq732fppkqkwrp-unit-dbus.service.drv'...
building '/nix/store/5hksp8i3annfd1m42ycch11yb55gwg13-unit-script-dictd-start.drv'...
building '/nix/store/q8l78ijzxqjjsvl4lc2lz509a6glrvrn-users-groups.json.drv'...
building '/nix/store/z8h5p821j1ywz8wlk6p1cca2f91hb9wk-unit-dictd.service.drv'...
building '/nix/store/66fvph86fwzbc0c8zfd45gzqj14na3n7-user-units.drv'...
building '/nix/store/9x827sr5fcm63h2l0578fsfy400xfibj-system-units.drv'...
building '/nix/store/nvwjbv4x96h7i4jqa7fkr5qp9pr3bwhp-etc.drv'...
building '/nix/store/w5fy9c9zcisi2s0nsphc63xd18a3q66b-nixos-system-qsulaptop-23.11pre524605.3a2786eea085.drv'...
stopping the following units: accounts-daemon.service
activating the configuration...
setting up /etc...
reloading user units for qsu...
setting up tmpfiles
reloading the following units: dbus.service
restarting the following units: polkit.service
starting the following units: accounts-daemon.service
the following new units were started: dictd.service

Ah, found two matches in https://github.com/search?q=repo%3ANixOS%2Fnixpkgs%20dictfmt&type=code

  1. dictfmt -p wiktionary-en --locale en_US.UTF-8 --columns 0 -u http://en.wiktionary.org from https://github.com/NixOS/nixpkgs/blob/a29cf4aece7ed0f497f600faec9614c6feb5159b/pkgs/servers/dict/wiktionary/wiktionary2dict.py#L770
  2. dictfmt_index2word --locale $locale < "$base".index > "$base".word || true from https://github.com/NixOS/nixpkgs/blob/a29cf4aece7ed0f497f600faec9614c6feb5159b/pkgs/servers/dict/dictd-db-collector.nix#L63

So based on the errors from above:

/nix/store/7cni7ndy2pm18ysl5znq6znb30sxp156-stdenv-linux/setup: line 1626: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8): No such file or directory
/nix/store/m36d29gn5gm9bk0g7fcln1v8171hvn95-bash-5.2-p15/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
awk: cmd. line:4: fatal: cannot open file `en_US.UTF-8' for reading: No such file or directory
dictfmt: option '--locale' requires an argument

I’m guessing it’s the second script that’s failing, as $locale must not be set.

edit: correction: $locale is set (the log prints "Locale is en_US.UTF-8"), but some relevant file is missing (per nearby log lines). Looking at the awk error triggered by this dictfmt_index2word script and the dictfmt -i command it calls, I’m unclear on what’s happening. Maybe the index2word script is somehow dropping the value of the --locale argument, and it would work if the nix script used --locale=$locale instead? Reason for this hypothesis: this while loop iterates over whitespace-separated arguments, meaning the $locale value reaches that loop as a separate argument all on its own and gets consumed by that break clause?
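To make that hypothesis concrete, here is a minimal sketch of how an option loop of that general shape could split the two tokens apart. This is a hypothetical parser written only for illustration, not the actual dictfmt_index2word source; the function name parse and its output format are made up.

```shell
#!/bin/sh
# Hypothetical option loop (NOT the real dictfmt_index2word code).
# Every token starting with "-" is collected as a pass-through option;
# the loop stops at the first token that does not start with "-",
# and whatever remains is treated as input file names.
parse() {
    opts=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -*) opts="$opts $1"; shift ;;
            *)  break ;;
        esac
    done
    printf 'options=[%s] files=[%s]\n' "${opts# }" "$*"
}

locale=en_US.UTF-8

# Space-separated form: the value is a separate token and is left behind.
parse --locale $locale     # prints: options=[--locale] files=[en_US.UTF-8]

# "=" form: option and value travel together as one token.
parse --locale=$locale     # prints: options=[--locale=en_US.UTF-8] files=[]
```

If the real script behaves anything like this, the first call would match both symptoms in the log: --locale handed on with no value ("option '--locale' requires an argument"), and en_US.UTF-8 handed to awk as if it were a file name.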


I guess the root cause is line 1626: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8): No such file or directory, but I don’t know enough about setlocale, or about where that missing-file logic lives, to debug it… Any ideas?
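One small thing that can be checked from a normal shell is whether the locale is known there at all. The sketch below assumes the standard locale utility is installed; the helper name locale_listed is made up. It only describes the shell it runs in, so it says nothing certain about the environment the dictd-dbs derivation is built in, which may differ.

```shell
#!/bin/sh
# Return 0 if the locale named in $1 appears in the list read from stdin.
# Compares case-insensitively and ignores "-", so "en_US.UTF-8" also
# matches the "en_US.utf8" spelling that `locale -a` commonly prints.
locale_listed() {
    want=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d '-')
    tr 'A-Z' 'a-z' | tr -d '-' | grep -qxF "$want"
}

# Check what the current shell environment reports, if `locale` exists.
if command -v locale >/dev/null 2>&1; then
    if locale -a 2>/dev/null | locale_listed en_US.UTF-8; then
        echo "en_US.UTF-8 is available in this shell"
    else
        echo "en_US.UTF-8 is NOT available in this shell"
    fi
fi
```

If it reports the locale as available here while the build still warns, that would point at the build environment rather than the system configuration.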

(sorry for the multiple posts; should probably just research for longer before sharing my notes…)

Yet another answer-to-self:

k, out of ideas for the moment. input welcome!


edit:

Okay, seriously, last thought: this ^ is probably what’s happening. I wonder if there’s some way for me to test this theory? I want to try this:

-      dictfmt_index2word --locale $locale < "$base".index > "$base".word || true
+      dictfmt_index2word --locale=$locale < "$base".index > "$base".word || true
-      dictfmt_index2suffix --locale $locale < "$base".index > "$base".suffix || true
+      dictfmt_index2suffix --locale=$locale < "$base".index > "$base".suffix || true

Can I somehow quickly edit this file myself on my local machine (and re-run everything) to prove my hypothesis? This is very much a nix question that's out of my depth…