2023-03-25 Nix team meeting #133
Agenda
- builder work
- GSoC
- Regression Triage
Builder/scheduler refactoring, delegation of work and reviewing
-
We all agree the builder/scheduler logic is convoluted today
-
@rhelmot and @l-as are both working on extensions of it (BSD sandboxing, WASM) that would not fit well with the current architecture
-
Both have also started working on refactors to make the code nicer, so that we have “complexity budget” left over to support these new things
-
@ericson2314 would like to use this opportunity to work on our codeownership and contributor onboarding goals:
-
Team outlines criteria for good design
-
@rhelmot and @l-as work on fleshing out and executing that vision, including figuring out an order of smaller refactors that is reusable
-
@ericson2314 acts as codeowner and point person on this task and assesses the plan and small PRs w.r.t. the team’s criteria. He can merge things without having to re-involve the rest of the team.
-
If this sort of process goes well, we should consider it if/when @pennae gets back to overhauling the evaluator.
-
@edolstra: sounds good to me
-
@tomberek: it would still be good if those two come to this meeting occasionally and present their progress, but as a way to build communication rather than as a blocking step.
-
Add @ericson2314 as libstore codeowner? (waiting for @thufschmitt)
Windows plans
-
@qknight has NLnet funding to work on this
-
GSoC project for std::filesystem::path has applicants publicly expressing interest
-
Getting "Build unit tests with MinGW" (NixOS/nix Pull Request #8901, by Ericson2314) merged would unblock/accelerate both of these things. @Ericson2314 thus feels more urgency about getting it reviewed and ready to go.
Triage
possible regression: "stack overflow (possible infinite recursion)" (NixOS/nix Issue #9672)
builtins.getFlake regression, claims flakeref is unlocked when narHash not supplied (NixOS/nix Issue #10297)
Made impure intentionally because the current implementation has impurities.
New design: take advantage of Git tree hashes for tarball verification
Locking:
- Both types of tarballs can be considered locked
- tree hash can be verified, no problem
- commit hash can be mapped to tree hash, and then everything else is done via tree hash.
- Use the tree hash to let GitHub generate a tree-based tarball. This should be pure.
- Use the tree hash to verify the download
- Store unpacked tarball in a git repo cache, verify tree hash
- Try to also put the commit object in that git repo cache, so we have a light proof of the commit hash → tree hash mapping. (See Merkle proof - Computer Science Wiki)
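The "verify tree hash" step in the locking list above can be illustrated with a small sketch. It relies only on the standard git object format (SHA-1 over a `<type> <size>` header, a NUL byte, and the body). The helper names `git_object_id`, `tree_id`, and `verify_unpacked` are made up for illustration and are not Nix or libgit2 APIs. The sketch only handles a flat directory of regular, non-executable files; a real implementation would also need subdirectories, symlinks, and executable bits.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Git object id: SHA-1 over '<type> <size>', a NUL byte, then the body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_id(files: dict) -> str:
    """Tree hash of a flat directory mapping file name -> file contents.

    Assumption for this sketch: every entry is a regular file with mode 100644.
    """
    body = b""
    # For plain files, git orders tree entries bytewise by name.
    for name in sorted(files, key=lambda n: n.encode()):
        blob_sha = bytes.fromhex(git_object_id("blob", files[name]))
        # Each entry is '<mode> <name>', a NUL byte, then the raw 20-byte hash.
        body += b"100644 " + name.encode() + b"\x00" + blob_sha
    return git_object_id("tree", body)


def verify_unpacked(files: dict, expected_tree: str) -> bool:
    """Compare the recomputed tree hash of unpacked content with the locked one."""
    return tree_id(files) == expected_tree


# Usage: any change to the unpacked content changes the tree hash.
unpacked = {"flake.nix": b"{ }\n", "README": b"hello\n"}
locked_tree = tree_id(unpacked)
print(verify_unpacked(unpacked, locked_tree))
print(verify_unpacked({**unpacked, "README": b"tampered\n"}, locked_tree))
```

Because the tree hash covers only content, names, and modes, it does not depend on how the tarball was packed, which is what makes it attractive for verifying a download.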
Fetching:
- Fetch by the recorded tree hash if available
- Otherwise see locking
submodules?
- proposal: immediate fallback to git
- no change from current supported behavior
- decision: no impact to this discussion
proposal: configuration option
- “trust-github = true”
- “trust-X”
- “extra-trusted-fetchers = github gitlab”
- “extra-trusted-fetchers = github-rev github-ref”
- with fallback to git
- defaults to true
- can cause more user confusion regarding trust settings + errors
github:org/repo/COMMIT_HASH
- currently is considered unlocked
- Nix will ask GitHub for the relevant tree hash,
- or will use a provided narHash
- proposal: make this also considered locked, but include a trusted lookup step
- this may not work
- examined behavior, considered usable, requires API usage
- GitHub / GitLab / SourceHut? each needs a unique approach
- fallback to using libgit provides assurances
- must trust GitHub for mapping
- implied by usage of “github:” scheme without narHash
- proof is that the commit object is correct
- assumption: SHA-1 collisions not considered
- steps:
- fetch commit object
- fetch by tree hash
- lock via tree hash
- proposal: only use git API
- performance / latency
- benchmark?
- decision: keep using github tarball endpoint
- new lockfile version? backwards compat?
- yes + breaking (see dirtyRev)
- proposal: backport a “tolerance” to older versions to avoid breaking
- at least consider back to 2.18
- this may not work
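The steps listed above for a commit-hash reference (fetch commit object, fetch by tree hash, lock via tree hash) could look roughly like the following sketch. It shows why holding the commit object is enough to prove the commit hash → tree hash mapping: the commit object must hash to the requested commit, and its first line names the tree. The `fetch_commit_object` callable and the `treeHash` lock field are hypothetical placeholders, not the actual fetcher interface or lockfile schema, and SHA-1 collisions are ignored as in the notes.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Git object id: SHA-1 over '<type> <size>', a NUL byte, then the body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_of_commit(commit_body: bytes, expected_commit: str) -> str:
    """Check a commit object against its hash and return the tree it names."""
    if git_object_id("commit", commit_body) != expected_commit:
        raise ValueError("commit object does not match the requested commit hash")
    # A commit object starts with the line 'tree <40 hex digits>'.
    first_line = commit_body.split(b"\n", 1)[0]
    kind, _, tree = first_line.partition(b" ")
    if kind != b"tree" or len(tree) != 40:
        raise ValueError("malformed commit object")
    return tree.decode()


def lock_github_rev(rev: str, fetch_commit_object) -> dict:
    """Resolve a commit hash to a tree hash and build a lock entry.

    fetch_commit_object is a hypothetical transport (for example a forge API
    call or a plain git fetch) that returns the raw commit object bytes.
    """
    commit_body = fetch_commit_object(rev)       # step 1: fetch commit object
    tree = tree_of_commit(commit_body, rev)      # proof: commit hash -> tree hash
    # Steps 2 and 3: later fetches and the lock entry refer to the tree hash.
    return {"type": "github", "rev": rev, "treeHash": tree}
```

With this shape, the forge is trusted only to deliver bytes: a wrong commit object is rejected by the hash check, so the trust question reduces to whether the commit hash itself is the intended one.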
Options
- use existing github fetcher, API based
- sub-class under git fetcher
- re-frame?: github protocol is subclass of git fetching
- additional transport “git+github”, can leverage github API for perf + fallback
github:org/repo/TREE_HASH
- locked
- still has edge cases?
Next steps + TODO
Is using the git+https API not rate-limited?
- release notes clarifying the change
- needs changes to fetcher
- implement “github” fetcher changes
- update lockfile format, breaking!
- alternative: revert + fix later?
- estimates:
- time to confirm+understand: 6h work / 4d calendar
- fallback to config option / revert + block stabilize
- time to fix: 18h work / 2w calendar
- people who can help? @DavHau, coordinate with @roberth
- revert: @thufschmitt
Decision: [@tomberek]: look into faster GitHub runners + Foundation support
Decision: if no progress, devote Wed meeting to this
Decision: must be fixed prior to stabilization