Design preview: PAM rule ordering and targets

I’m working on security.pam and pam.nix. I’d like to preview a new feature and get comments on the design.

Problem: pam.nix isn’t modular

Two big strengths of the NixOS module system are that you can compose modules and merge multiple module configurations. The nginx module configures systemd options, those options are used to configure environment.etc, and so on; this is composition. Many modules can define systemd.services or environment.etc; this is merging.

security.pam has a problem: In practice, you cannot merge different PAM configurations from different client modules. If you use both fprintd and systemd-homed, you want each one to contribute PAM rules using security.pam, just like multiple modules can use environment.etc. But security.pam can’t (practically speaking) do this.

You can define entirely new PAM services using security.pam. For example, desktop managers like gdm and sddm do this. But they are defining the entire rule stack, and in a way that prohibits other modules from defining rules for those same services.

The issue with merging rules into the same service is that PAM rules are ordered. I won’t go into the details here, but suffice it to say the ordering of rules in a PAM service file can be really important. Should the fprintd rule go before the systemd-homed rule? What about rssh and gnupg? Where should those go in relation to unix?

The way it works today is that pam.nix defines all of those rules in a big list, and other modules (e.g. services.fprintd) get to toggle those rules on and off. But the ordering is fixed by pam.nix.

But technically

One of my earlier changes (#255547) made it technically possible for different modules to define rules for the same PAM service. But it doesn’t enable decomposing pam.nix in practice. Why not?

I added an order integer option to each rule. The rules are then sorted by their order value. So if you want to put fprintd before systemd-homed, you can configure:

{
  security.pam.services.login.rules.fprintd.order =
    config.security.pam.services.login.rules.systemd-homed.order - 1;
}

What if you now want rssh to go between the two? Gah, there’s no room left! And maybe you have multiple other rules you want to put your rule before or after, and you don’t know what order those rules will be in.

So while it’s technically possible with this (hidden, experimental) order option to compose rules from multiple modules, it doesn’t really work out. And not a single NixOS module has been extracted from pam.nix using it.

Topological sorting

What if we let rules define multiple ordering relationships relative to other rules? Something like:

{
  security.pam.services.login.rules.auth.securetty = {
    control = "required";
    modulePath = "pam_securetty.so";
    after.rule.rootok = true;
    before.rule.unix = true;
    before.rule.unix-early = true;
  };
}

This isn’t a new idea. But now I’ve implemented it! (PR coming soon.)

The big list of rules in pam.nix remains. But it’s now used to create these before/after relationships between rules in the list. As rules are extracted from pam.nix to separate modules, the idea is that they would explicitly define the essential ordering relationships.
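To make the idea concrete, here is a minimal sketch of how after/before declarations could feed a topological sort. It leans on lib.toposort from nixpkgs and flattens each rule into a hypothetical { name, after, before } record with plain name lists, so it is an illustration under those assumptions rather than the implementation from the upcoming PR.

```nix
let
  # Assumes <nixpkgs> is available on NIX_PATH.
  lib = import <nixpkgs/lib>;

  # Hypothetical, simplified rules: a name plus lists of other rule names.
  rules = [
    { name = "unix";      after = [ ];          before = [ ]; }
    { name = "securetty"; after = [ "rootok" ]; before = [ "unix" ]; }
    { name = "rootok";    after = [ ];          before = [ ]; }
  ];

  # a must precede b if a declares "before b" or b declares "after a".
  mustPrecede = a: b:
    builtins.elem b.name a.before || builtins.elem a.name b.after;

  sorted = lib.toposort mustPrecede rules;
in
  # lib.toposort returns { result = [ ... ]; } on success, or
  # { cycle = [ ... ]; loops = [ ... ]; } when the constraints contradict.
  if sorted ? result
  then map (r: r.name) sorted.result
  else throw "ordering cycle among: ${toString (map (r: r.name) sorted.cycle)}"
```

With these three rules the constraints leave only one valid order, so the expression evaluates to [ "rootok" "securetty" "unix" ] regardless of the order in which the rules were declared.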

Targets

But, ugh, I wrote that and I still haven’t extracted anything from pam.nix.

Let’s take fprintd for example. Its auth rule sits right before systemd_home-early and right after p9. So if I enable fprintd on my system, where will that rule go? Well, I don’t have either systemd_home-early or p9, so I have no idea! We have to look further up and down the rule stack to find an ordering relationship.

It also comes before unix-early, which I have because I use GDM. That’s probably an important ordering. But what if someone doesn’t have unix-early? Everyone will have unix, right? Well, unless they disabled unixAuth and use ldap. But fprintd should come before that too!

It seems like every new rule would end up defining a huge list of before or after rules referencing all the other rules in NixOS. That’s not very modular.

So I’m prototyping targets for security.pam. Targets are well-known points in a rule stack that we can order rules around, but which otherwise don’t affect PAM behavior. These are similar conceptually to systemd targets, but otherwise unrelated.

We can define targets:

{
  security.pam.services.login.targets.auth = {
    root = {};
    early = {
      after.target.root = true;
      before.target.main = true;
    };
    main = {};
  };
}

and then order rules with respect to the targets:

{
  security.pam.services.login.rules.auth.securetty = {
    control = "required";
    modulePath = "pam_securetty.so";
    before.target.root = true;
  };
  security.pam.services.sudo.rules.auth.rssh = {
    after.target.early = true;
    before.target.main = true;
  };
}

The idea is that pam.nix will offer an opinionated set of targets. Other NixOS modules can define new rules with respect to those targets. Users can modify those orderings to customize their setups. (To unset an ordering relationship, you can set it to false.)
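As a rough sketch of how targets could take part in sorting without affecting PAM behavior: treat targets and rules as nodes in the same graph, sort them together, then drop the targets before rendering. The flattened node records, the isTarget flag, and the single shared namespace are assumptions made for brevity, and the example puts securetty and rssh in one stack even though the snippet above places them in different services.

```nix
let
  # Assumes <nixpkgs> is available on NIX_PATH.
  lib = import <nixpkgs/lib>;

  # Hypothetical flattened view of one auth stack. Targets and rules are
  # both nodes, and every constraint here points at a target name.
  nodes = [
    { name = "main";      isTarget = true;  after = [ ];         before = [ ]; }
    { name = "rssh";      isTarget = false; after = [ "early" ]; before = [ "main" ]; }
    { name = "early";     isTarget = true;  after = [ "root" ];  before = [ "main" ]; }
    { name = "securetty"; isTarget = false; after = [ ];         before = [ "root" ]; }
    { name = "root";      isTarget = true;  after = [ ];         before = [ ]; }
  ];

  # a must precede b if a declares "before b" or b declares "after a".
  mustPrecede = a: b:
    builtins.elem b.name a.before || builtins.elem a.name b.after;

  sorted = lib.toposort mustPrecede nodes;
in
  # Targets only anchor the ordering; they emit no PAM line, so filter them out.
  map (n: n.name) (builtins.filter (n: !n.isTarget) sorted.result)
```

This evaluates to [ "securetty" "rssh" ]. Neither rule mentions the other; their relative order falls out of the targets they attach to.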

Some open design questions

Ambiguous ordering

It’s possible to configure rules such that more than one ordering is valid. Should this be allowed?

If yes, then it’s possible that an innocent config change or nixpkgs update could cause the actual ordering of PAM rules for a user to change.

If no, then when we detect an ambiguous ordering, we would fail at evaluation time and tell the user to specify an ordering. This case is easy to detect, but I don’t know yet how to generate a helpful error message.

I’m leaning no.
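One way the detection could work, sketched with plain builtins: a set of constraints admits exactly one order only if every adjacent pair in a sorted result is tied together by a direct constraint. Any adjacent pair without one can be swapped to produce a second valid order. The edge list and rule names below are hypothetical example data, and listing the unpinned pairs is only a possible starting point for an error message.

```nix
let
  # Direct "from must come before to" constraints (hypothetical data).
  edges = [
    { from = "root";    to = "fprintd"; }
    { from = "root";    to = "u2f"; }
    { from = "fprintd"; to = "early"; }
    { from = "u2f";     to = "early"; }
  ];

  # One valid topological order of those constraints.
  order = [ "root" "fprintd" "u2f" "early" ];

  hasEdge = a: b: builtins.any (e: e.from == a && e.to == b) edges;

  # Adjacent pairs of the order.
  pairs = builtins.genList
    (i: {
      left = builtins.elemAt order i;
      right = builtins.elemAt order (i + 1);
    })
    (builtins.length order - 1);

  # A pair with no direct constraint could be swapped: the order is ambiguous.
  unpinned = builtins.filter (p: !(hasEdge p.left p.right)) pairs;
in
  map (p: "${p.left} / ${p.right}") unpinned
```

Here the result is [ "fprintd / u2f" ], which names the exact pair a user would have to order explicitly. An empty list would mean the ordering is fully pinned.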

Built-in targets

pam.nix needs to define a useful set of targets for other modules to build upon. What should those targets be?

I can see some patterns in the auth rules in pam.nix. There’s some local policy enforcement, then various passwordless auth providers, then “early auth” steps, then password-based “early auth” providers, the main password auth rules, a few post-password(?) rules, and the catch-all deny rule.

But it’s squishy. Some of the existing rules probably have to be re-ordered for any set of targets to make sense. (For example, why is oslogin_login the very first rule? Can’t it come after faillock?)

Adopting targets can be done progressively, so we can start with targets that are obvious (like the “early auth” steps). I could use advice here!

Option structure

So we currently define rules like this:

{
  security.pam.services.login.rules.auth.securetty = { ... };
  security.pam.services.passwd.rules.password.ldap = { ... };
}

auth and password are the “type” of PAM rule. Each type is an attrset with named rules.

If we define a quality target, it might look like:

{
  security.pam.services.login.rules.auth.securetty = { ... };
  security.pam.services.passwd.rules.password.ldap = { ... };
  security.pam.services.passwd.targets.password.quality = { ... };
}

But that’s kind of backwards, right? The rules and targets are both for the password type. What if we swap them?

{
  security.pam.services.login.auth.rules.securetty = { ... };
  security.pam.services.passwd.password.rules.ldap = { ... };
  security.pam.services.passwd.password.targets.quality = { ... };
}

Looks better! But now password (and auth and account and session) are options at the same level as service options like unixAuth, rootOK, startSession, etc. Maybe nest them under types?

{
  security.pam.services.login.types.auth.rules.securetty = { ... };
  security.pam.services.passwd.types.password.rules.ldap = { ... };
  security.pam.services.passwd.types.password.targets.quality = { ... };
}

Now it’s clean in a way only a programmer could like. But these are power-user options, right? Just like how a user sets services.fprintd.enable and doesn’t worry about how it uses systemd.services, they shouldn’t have to worry about how it uses security.pam.services.

What do you think? Is the more structured approach acceptable?

Rough plan

Not all of this is related to the above, but here are some pam.nix changes I’ve prototyped:

  • Add a useDefaultRules option. This can be set to false to suppress the default rules that pam.nix usually adds. This allows a module to define a blank-slate PAM service using the rules options instead of text.
  • Replace the existing usages of text in nixpkgs with rules. This is a stepping-stone to better integrating modules with the main rule stack.
  • Introduce topological sorting with after/before, replacing the experimental order property.
  • Introduce targets, also using topological ordering with respect to rules and other targets.

What’s next?

  • Some PAM rules are configured per-service, while others are configured globally for all services. I think it would be nice if all rules and settings could be configured per-service (to the extent it makes sense) and global options served only as a convenience mechanism to configure all services.
  • A way to configure the default service rules would be nice. Something like security.pam.defaults, with the same options as a PAM service. These are the settings that would be applied if useDefaultRules is enabled.
  • Once the ordering and targets design is more settled, I want to try extracting a module or two from pam.nix. This is the litmus test for any potential security.pam design changes.

At first read, this strikes me as a bit over-complicated. Targets seem like they would be equivalent to defining well-known order values. So instead of before.target.root = true;, we could document or define the root target to be order 10000, and have order = 9950; (or order = lib.pam.targets.root - 50; or something). It’s so much easier to explain how order works than transitive topological relationships. And other Linux rules systems use numeric order for things like this without running out of space—I’m thinking of udev rules in particular, which restrict themselves to the 1–99 range and still never seem to have practical issues.

numeric order being enough in practice or not aside (it may well be, I don’t actually know!), I think it should be kept in mind that the configuration schemes being used as references here were designed without assuming the affordances of the NixOS module system.


thank you so much for tackling this problem @Majiir! some of the points you mention here sound good, though i agree with @rhendric that specifically the ordering sounds more complicated than i would like it to be. have you ever looked at how debian handles PAM ordering? it is a pretty good system in general… maybe you can draw some inspiration from that?

i want to mention how important your work here splitting up pam.nix is for enabling us to really have a minimal module list so please keep up the great work!


The ordering mechanism for udev is defined by udev itself. Every distro and application providing udev rules is working within that framework. PAM modules tend to just document their behavior and expect the “user” to figure out the appropriate ordering.

It’s true that targets could be defined with an integer order value. But how would that compose? Say fprintd decides to order itself at targets.main - 100 and rssh chooses targets.early + 1000. Which one comes first? Did the authors of those rules make a deliberate decision to put one before the other, or is it an accident? What if they both set the same value? The ordering of rules changes the user experience at login and may have serious implications for security.

If we can’t express ordering richly, then we’re forced to have a monolithic rule stack like we have now. We can define one safe ordering, but it doesn’t satisfy every user’s needs.

Putting targets aside for a moment, consider the case where a user wants to deviate from the default ordering. This is common for u2f, for example. So we set u2f.order = unix.order - 50. What if another rule is ordered at u2f + 100? That’s encoding an intent “run after u2f” but doesn’t allow additionally encoding “run before unix” and the rule ends up after unix.
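With made-up numbers, the arithmetic looks like this. The value 1500 for unix is purely illustrative and not a real default in pam.nix.

```nix
let
  # Hypothetical order values, for illustration only.
  unix = 1500;
  u2f = unix - 50;    # the user moves u2f ahead of unix
  other = u2f + 100;  # a third rule encodes "run after u2f"
in
  {
    inherit unix u2f other;
    # The third rule never said anything about unix, yet it lands after it.
    otherRunsAfterUnix = other > unix;
  }
```

This evaluates to other = 1550 and otherRunsAfterUnix = true: the single offset could express "after u2f" but had no way to also say "before unix".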

So, if rule orderings are relative, then after/before is essential because you can define multiple ordering relationships and better capture intent.

What if we standardize on absolute integer order values, and no relative orderings? I think this has a few problems:

  1. It’s effectively still a monolithic ordering, but now distributed across NixOS modules.
  2. Orderings have to be frozen, i.e. never modified once the first version ships. User configs might place a rule at order 1200. What does that mean? Maybe they wanted to be ahead of unix at 1500, or maybe they wanted to be after u2f at 1100. We don’t know, so we can’t ever move unix or u2f or any other rule because that could quietly break a user’s system.
  3. More generally, absolute integer orderings don’t capture intent. They are at best a snapshot of intent relative to existing rules at the time the config was made.

As far as I can tell, Debian’s approach is a mixture of:

  • Opinionated phases that rules can be added to
  • Integer orderings with a prescribed method to calculate an appropriate ordering value
  • Fall back to completely manual configuration when a user adds any custom rules
  • Some dissatisfaction with the rigidity of this design

I think we can do better in NixOS. Also, our module system is substantially different from Debian/Ubuntu’s package system, and users have different expectations for the configurability of their systems. It’s generally considered a failure of a NixOS module if a user has to manually edit config files on their system.

That said, I think we can take inspiration from the way that other distros organize PAM rulesets (i.e. the “opinionated” bits). The only reason NixOS should deviate from other distros here is (unfortunately) backward compatibility.


We can still express sophisticated relative orderings based on absolute number values. We just need a helper function like:

  between = earlier: later:
    let
      inf = 1.e308 * 2;
      lastEarlier = lib.foldr lib.max 0 earlier;
      firstLater = lib.foldr lib.min inf later;
    in
    if earlier == [ ] then
      if later == [ ] then
        throw "give me something here"
      else
        firstLater - 1
    else
      if later == [ ] then
        lastEarlier + 1
      else
        if lastEarlier < firstLater - 1 then
          (lastEarlier + firstLater) / 2
        else
          throw "can't solve";

Then expressing ‘run after u2f and run before unix’ is just order = between [ u2f.order ] [ unix.order ];. Intent captured, future proofed. And then with targets you can cut through the potential for circular dependencies between order values.

And the advantage over defining those transitive relationships in the data is that debugging why thing 1 comes before thing 2 becomes much simpler. Just look at the numbers in a REPL.

When I say “absolute” integer order values, I mean orderings that are given as literal integers. Order configs that are based on other order configs are (in my lexicon) relative orderings.

You’ve extended relative orderings to be based on two rules rather than one, which is clever. But I think we can find situations where three or more relationships would be ideal.

I think if you know a rule is in the wrong spot, the debugging experience isn’t too bad with after/before: You either add an ordering that fixes it, or you get an error that you’ve created a cycle (and a dump of the relevant rules).

The UX gets trickier if we forbid ambiguous orderings. But if we do, that also means any system that builds has something pinning the ordering. So the debugging step is the same: add the ordering you want, see that it creates a cycle, get a dump of the rules creating the cycle, and choose one to drop.

For debugging in a REPL, we can expose some of the functions that are used internally, e.g. the function that identifies after/before settings contributing to the ordering of a rule.

An interesting nuance with after/before is that we can make the ordering apply only if both rules (or targets) in the relationship are enabled. If a > b and b > c, the ordering has to be [ a b c ] if all three are enabled. But if you disable b, rules a and c are free to use either ordering. I’d have to study the stack for a bit to find a real example where this is useful, but it’s at least conceivable that a user would like to be able to choose which auth steps happen first in their login flow while also enjoying the protection of specific constraints provided by NixOS to guard against misconfiguration. If we applied constraints to all rules (even if not enabled) then after/before would be about as expressive as between.
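A small sketch of that "only if both are enabled" behavior, using plain builtins. The edge list and the enabled set are hypothetical stand-ins for whatever the real implementation tracks.

```nix
let
  # Hypothetical constraints: a before b, b before c.
  edges = [
    { from = "a"; to = "b"; }
    { from = "b"; to = "c"; }
  ];

  # b is disabled in this configuration.
  enabled = { a = true; b = false; c = true; };

  # Keep a constraint only when both of its endpoints are enabled.
  active = builtins.filter (e: enabled.${e.from} && enabled.${e.to}) edges;
in
  active
```

This evaluates to [ ]: with b disabled, nothing constrains a relative to c, so either order is permitted. Flip b back to true and both constraints return, forcing [ a b c ].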

This seems like you’re just kicking the can down the road… In a topological system, you would know the order of things by reading the PAM configuration file. That doesn’t seem particularly useful, ideally you’d want to know why things are ordered the way they are. So you’d take a look at the nix modules to find out the ordering rules.

Now imagine a numeric ordering. If you wanted to know the order of things, you’d read the PAM config file. That’s not very helpful, you want to know why things are ordered the way they are, so you open a repl and get their priorities. IMO, that’s still not very helpful, not until you do the exact same thing with the topological sorting and read nix modules to understand the meaning.

I do think you’re on to something, though, there should be an easy way to view all the rules applied in one place for debugging configurations. I’d rather not track down and read every nixos module that I use which touches PAM just to figure out why something’s ordered the way it is.

IMO, topological sorting is much closer to the actual essence of what we want to achieve, and seems like the right tool for the job. And Nix module authors are already familiar with systemd which does things the same way.


What would be the implications of disallowing ambiguous ordering? The way I’m reading it, it sounds like it would blow up the number of rules you have to write as the number of modules that could possibly exist scales… I agree it would be bad for ordering to change unintentionally across updates, but it also seems like an easy way to pollute PAM rules with relationships that don’t matter, adding a lot of noise to the system and extra work for maintainers. Maybe sorting alphabetically by rule name would be a middle ground? It’s unlikely for a rule name to change, and it would slim down the rules to only those that we know matter, instead of all rules that could maybe, possibly, matter.

Topological ordering with targets definitely makes sense. It sounds like it has a lot of potential to truly modularize PAM config. The relative success of systemd units and targets as a combination mechanism for things coming from unrelated sources also speaks to this being a good idea.

I do have to question the lean towards disallowing ambiguous ordering, however. It’s very realistic for there to be aspects of the ordering that truly don’t matter, and forcing people to fix a specific order in such cases seems like it works against modularity. Ordering relationships should be specified because they matter, not because you were forced to clear up the ambiguity somehow and chose an arbitrary way. Especially when those relationships look identical in the code, and someone who comes along later to modify the module doesn’t know whether they were important or arbitrary.

Avoiding spurious order changes is desirable, but that seems doable without forcing module authors to specify irrelevant orderings. Alphabetically by rule name, as suggested above, would achieve that just fine.
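For illustration, an alphabetical tie-break could look roughly like the sketch below: at each step, collect the rules whose prerequisites have all been emitted and pick the alphabetically first one. The names and edges are hypothetical, and this is one possible approach written with plain builtins, not something the proposed implementation is known to do.

```nix
let
  names = [ "u2f" "early" "fprintd" "root" ];

  # Direct "from must come before to" constraints (hypothetical data).
  edges = [
    { from = "root";    to = "fprintd"; }
    { from = "root";    to = "u2f"; }
    { from = "fprintd"; to = "early"; }
    { from = "u2f";     to = "early"; }
  ];

  # A node is ready when none of the remaining nodes must precede it.
  isReady = remaining: n:
    !(builtins.any (e: e.to == n && builtins.elem e.from remaining) edges);

  go = remaining:
    if remaining == [ ] then [ ]
    else
      let
        # Alphabetical order among the ready nodes breaks ties deterministically.
        ready = builtins.sort (a: b: a < b)
          (builtins.filter (isReady remaining) remaining);
        next =
          if ready == [ ] then throw "ordering cycle" else builtins.head ready;
      in
        [ next ] ++ go (builtins.filter (n: n != next) remaining);
in
  go names
```

This evaluates to [ "root" "fprintd" "u2f" "early" ]. The constraints say nothing about fprintd versus u2f, so the name decides, and the result stays the same however the input list is shuffled.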

Regarding option structure, I feel like introducing types is going too far. Since (as far as I know) there are a small number of rule types, and there are unlikely to ever be more, it’s fine to leave them in the same name space as service options, imo.


For ambiguous orderings, my thinking is that ambiguities would be resolved in the user’s config, not in NixOS modules.

For example, say we have fprintd and u2f, which both declare after.target.root and before.target.early in their respective NixOS modules.

If I have a fresh config and I enable fprintd, there is no ambiguity. If I disable fprintd and enable u2f, still no ambiguity. It’s only when both are enabled that it’s unclear which rule should come first, and the user is prompted to configure that explicitly. The user says “ah, I want the security key prompt before the fingerprint” and sets u2f.before.fprintd.
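Spelled out in the option syntax sketched earlier in the thread, that user-side choice might look like the fragment below. The options are the proposed, unreleased ones, and the login service and auth type are assumed here just to have something concrete.

```nix
{
  # Hypothetical user configuration under the proposed options:
  # prompt for the security key before the fingerprint reader.
  security.pam.services.login.rules.auth.u2f.before.rule.fprintd = true;
}
```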

I’d expect this situation to be rare. For example, it wouldn’t happen with fprintd and fscrypt because they would be ordered by different targets.

I agree with this in principle. We have some orderings configured in NixOS modules, which should be the ones that “matter” to every user. Then, there are orderings in a user’s configuration, which are the ones that matter to that user specifically. The issue seems to be if there are rules where the ordering cannot matter to anyone.

Does it help if the interface includes a “deliberately ambiguous” relationship? Something like a.beforeOrAfter.b? These would not impose any ordering constraint, but would also suppress the ambiguous ordering detection. I could see NixOS modules setting these if it’s well known that either ordering is safe and no user should have any reason to care about the ordering. But then, that’s a lot of cross-module relationships.

We can also put ambiguous ordering detection behind a strict option. But that doesn’t really fix the issues with either the strict or non-strict approach.

This might work. I worry about an innocent change merging that breaks things for users who didn’t know they were relying on the default ordering. Rules might be consolidated or restructured, especially in cases where multiple rules are working together (e.g. fscrypt-skip-systemd and fscrypt).

There’s also the matter of backward compatibility. For example, u2f currently comes before fprintd. If we allow ambiguous orderings, then the default ordering has to encode the existing order of rules (and do that forevermore). So we’ll have after/before but also an internal defaultOrder field. If we don’t allow ambiguous orderings, then existing users who use both u2f and fprintd need to encode their choice in their config. We could phase this in with warnings for a release before removing the default ordering.

We can also phase it: Start with defaultOrder for now, and decide at a later point to obsolete it by requiring users to choose once we’ve settled more of the module design.

This makes sense. My hesitation was mostly about module documentation, since there are many options under each type that would be duplicated. Maybe there’s a clean way to do this anyway?

I see. I didn’t expect that. I worry that any approach that requires this will require too much technical ability and knowledge from the user. Certainly the eval errors need to be very clear about what choice needs to be made, and exactly how to set it.

This seems like a fairly bad UX to me on first look. Maybe I’m wrong. It really depends on how common this situation is, and how often the user really doesn’t care about the order, vs how often they do. We tend to have default values for most options where we can reasonably choose one, and this situation seems analogous. Then again, PAM is security-related, and somewhat fiddly, so it may justify an exception. I’m coming into this with relatively little knowledge of the “lay of the land”, so maybe my concerns are off the mark.

Somewhat. What I really want here is for module authors, and even more importantly users, not to be forced to set a setting to an arbitrary value just to make it build. This at least somewhat achieves that, since at least with this setting, they can set something that says “I don’t care” rather than choosing something arbitrarily, which is much clearer on re-read when they go back to change their configuration. Module authors also at least have some method to mitigate bad UX of this sort. I would add here that if you do this, before/after should probably silently override a “deliberately ambiguous” setting, rather than conflicting. Somebody will always want to fix the order even where it definitely doesn’t matter, and shouldn’t need to work extra hard to do so.

Yes, this makes sense, even just from the “easing introduction” aspect, if you go through a stage where it just gives deprecation warnings. Ideally, these warnings/errors would list the exact settings needed to maintain the current/previous default order, so users who don’t really know what they want can just copy that into their config.

It also allows the “default or not” question to be resolved in a context where the rest of the implementation is in place, and people can see more directly what the issues are.

Well, the systemd module has even more of this kind of thing already, and it’s not a big problem in practice, as far as I can see.
