Trimming XDG_DATA_DIRS to not contain duplicates

Hello everyone :wave:

I have the following question: how can I trim the generated XDG_DATA_DIRS environment variable my DE uses so that it contains only non-duplicated paths?

I noticed KDE apps are insanely slow to start compared to other distributions. I read about XDG_DATA_DIRS duplicates causing the slow startup here:

And in fact, there are tons of duplicates, mostly KDE-related.

 nico@dragons in ~ 
❯ echo $XDG_DATA_DIRS | tr ':' '\n' | wc -l
203

 nico@dragons in ~ took 6ms
❯ echo $XDG_DATA_DIRS | tr ':' '\n' | sort | uniq | wc -l
87
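To see which entries are repeated (rather than just how many), something along these lines should work. This is only a sketch, and `list_duplicate_entries` is a name made up for it:

```shell
# Print each entry that occurs more than once in a colon-separated list.
# `uniq -d` prints one copy of every line that is repeated in sorted input.
list_duplicate_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | sort | uniq -d
}

# Example with a small made-up list; pass "$XDG_DATA_DIRS" for the real one.
list_duplicate_entries '/a:/b:/a:/c:/b'
```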

The full list of entries can be found here.

I have a script to export this variable without the unnecessary duplicates, which I would like to run during login. I tried programs.bash.loginShellInit, as well as home-manager's option for appending content to .bashrc, but both of these fail to reduce the number of paths set in XDG_DATA_DIRS.

So I’m assuming something else exports the variable afterwards, overwriting my changes. Any idea how to proceed here?

My flake can be found here, using latest nixpkgs-unstable with KDE 6.
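For reference, a minimal sketch of order-preserving de-duplication in plain POSIX shell. It is not necessarily the script mentioned above; the function name `dedupe_path_list` is hypothetical, and the sketch assumes entries contain no newlines and drops empty entries:

```shell
# Keep only the first occurrence of each entry in a colon-separated list.
# The body runs in a subshell ( ... ) so the IFS and `set -f` changes
# do not leak into the calling shell.
dedupe_path_list() (
    set -f          # no glob expansion when splitting the list
    IFS=:
    result=
    for entry in $1; do
        [ -n "$entry" ] || continue       # skip empty entries
        case ":$result:" in
            *":$entry:"*) ;;              # already seen, skip it
            *) result="${result:+$result:}$entry" ;;
        esac
    done
    printf '%s\n' "$result"
)

# Example with a small made-up list:
dedupe_path_list '/a:/b:/a:/c:/b'
```

For the real variable this would be used as `XDG_DATA_DIRS=$(dedupe_path_list "$XDG_DATA_DIRS")`. The membership check is quadratic, which should not matter for a couple of hundred entries.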

I am testing this:

File: clean_v2.rs
use std::collections::HashSet;
use std::env;

fn clean_env_var(var_name: &str) {
    if let Some(paths) = env::var_os(var_name) {
        // Convert OsString to a string (lossy: invalid UTF-8 is replaced
        // with U+FFFD, which is fine for typical paths)
        let paths = paths.to_string_lossy();

        // Split the paths by ':' and collect them into a vector
        let paths_vec: Vec<&str> = paths.split(':').collect();

        // Use a HashSet to track seen paths while preserving order
        let mut seen = HashSet::new();
        let cleaned_paths: Vec<&str> = paths_vec
            .into_iter()
            .filter(|&path| seen.insert(path)) // Keep only the first occurrence of each path
            .collect();

        // Join the cleaned paths back into a single string with ':' as the delimiter
        let cleaned_paths = cleaned_paths.join(":");

        // Print the assignment so the calling shell can apply it via eval.
        // (Setting the variable inside this process would not affect the
        // parent shell, and env::set_var is unsafe as of the 2024 edition.)
        println!("{}=\"{}\"", var_name, cleaned_paths);
    }
}

fn main() {
    // List of environment variables to clean
    let env_vars = [
        "GTK2_RC_FILES",
        "GTK_PATH",
        "GTK_RC_FILES",
        "INFOPATH",
        "LIBEXEC_PATH",
        "NIXPKGS_QT6_QML_IMPORT_PATH",
        "NIX_PATH",
        "PATH",
        "QML2_IMPORT_PATH",
        "QTWEBKIT_PLUGIN_PATH",
        "QT_PLUGIN_PATH",
        "TERMINFO_DIRS",
        "XCURSOR_PATH",
        "XDG_CONFIG_DIRS",
        "XDG_DATA_DIRS"
    ];

    // Clean each environment variable
    for var in env_vars.iter() {
        clean_env_var(var);
    }
}

and then put this in ~/.bashrc:

eval $(WHERE-YOU-HAVE-YOUR-BINARY)

It costs me about 120 µs (roughly 0.1 milliseconds) :slight_smile: but it is worth it (for example, SiYuan Notes starts some 4 seconds faster).

PS: I also have a ~/.config/plasma-workspace/env/10-env-trimm.sh with:

#!/run/current-system/sw/bin/bash

set -a
eval $(WHERE-YOU-HAVE-YOUR-BINARY)
set +a
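The `set -a` / `set +a` pair matters because the binary prints plain `NAME="value"` lines without `export`. A small sketch of the difference, using a hypothetical stand-in function `fake_cleaner` and a made-up variable name instead of the real binary:

```shell
# Hypothetical stand-in for the real binary: prints a NAME="value" line.
fake_cleaner() {
    printf '%s="%s"\n' DEMO_NEW_DIRS "/a:/b"
}

# Without set -a: the variable is set, but only in this shell.
eval "$(fake_cleaner)"
without=$(sh -c 'printf %s "${DEMO_NEW_DIRS-unset}"')

# With set -a: every assignment is marked for export automatically.
set -a
eval "$(fake_cleaner)"
set +a
with=$(sh -c 'printf %s "${DEMO_NEW_DIRS-unset}"')

printf 'without set -a: %s\nwith set -a: %s\n' "$without" "$with"
```

Variables that are already exported (such as XDG_DATA_DIRS in a running session) keep their export attribute on reassignment, so the plain `eval` in ~/.bashrc is enough there.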

Hope it helps!