NeoVim Python plugins for basic programming and XML PMC Study parsing/layering

Hello again :slight_smile: Requesting a little help reining in my headspace around NeoVim plugins. I have watched too many videos and asked too many questions of LLMs and would love a little human-powered clarity lol.

I’m working with PubMed Central (PMC) studies in the full XML format to build a tool for home psychobiotic protocol creation and home cultivation. I’m using Python to build the tools, and using .json files for each layer’s particular schema.

I had a few basic tools built using Gemini/Claude/CoPilot and then edited them myself for proper syntax and content. The tools were designed to create layers for proper organism recognition and medical concepts.

What I’m going for now is to manually write each tool myself in NeoVim with a few updates to make the tools more adaptable to whatever layers I want to create, with a fundamental organizational layer that represents which layers are being applied to the data for easy indexing in the future.
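As a rough illustration of that organizational layer (not the actual tool), here is a minimal Python sketch of a per-study manifest plus an index built from it. Everything in it is an assumption made for the example: the names `LayerManifest` and `build_index`, the layer names, and the placeholder study IDs.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class LayerManifest:
    """Records which layers have been applied to one study (hypothetical structure)."""
    study_id: str
    applied_layers: list[str] = field(default_factory=list)

    def apply(self, layer_name: str) -> None:
        # Record each layer once, preserving the order of application.
        if layer_name not in self.applied_layers:
            self.applied_layers.append(layer_name)


def build_index(manifests: list[LayerManifest]) -> dict[str, list[str]]:
    """Invert the manifests: layer name -> IDs of the studies carrying that layer."""
    index: dict[str, list[str]] = {}
    for manifest in manifests:
        for layer in manifest.applied_layers:
            index.setdefault(layer, []).append(manifest.study_id)
    return index


# Placeholder study IDs and layer names, for illustration only.
first = LayerManifest("PMC0000001")
first.apply("organisms")
first.apply("medical_concepts")
second = LayerManifest("PMC0000002")
second.apply("organisms")

index = build_index([first, second])
print(json.dumps(index, indent=2))

# Each manifest can also be written out as its own small JSON document.
manifest_json = json.dumps(asdict(first))
```

Keeping the manifest separate from the layers themselves means a new layer type only needs a new name, and the index can be rebuilt from the manifests at any time.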

I’ve been looking at this code for a while so it actually makes a decent amount of sense, but using Nano as my editor is already tedious.

After going thru the nvim tutor twice I absolutely love the motions and have an understanding of how much more fun it will be to build and edit code using it, but I would love a little help with plugin selection.

I don’t want to just throw a huge all-in-one beast into my system; I want to install only what I need. I am even going to do it manually now that I understand the file system well enough, since this is a private passion project to learn from, not just to get to an MVP.

I have a bunch of packages currently installed in my config. This is my current package list:

  environment.systemPackages = with pkgs; [
    brave
    driversi686Linux.libva-vdpau-driver
    fd
    gcc
    gitFull
    ghostty
    libreoffice-still
    lua
    neovim
    python313Full
    (python313.withPackages (ps: with ps; [
      black
      debugpy
      elementpath
      fuzzyfinder
      libxml2
      lxml
      libxslt
      pandas
      pynvim
      requests
      ruff
      types-lxml
      ]))
    ripgrep
    unzip
    vim
    visidata
    xclip
    zram-generator
    wget
  ];

CoPilot has suggested that I enable nvim in my config.nix like this (I am using its formatting here; it is not in my config.nix at this point). Is this even a valid approach?

{ config, pkgs, ... }:

let
  vimPkgs = pkgs.vimPlugins;
  py313 = pkgs.python313Packages;
in
{
  # 1. System packages ###### This is their version, I left it here unedited#####
  environment.systemPackages = with pkgs; [
    neovim
    xmlstarlet       # xml formatting & query CLI
    libxml2          # XML parsing libs
    libxslt          # XSLT support
    python313Full    # your Python interpreter + pip
    py313.lxml       # XML parsing from Python
    py313.pandas     # DataFrame support
    py313.black      # code formatter
    py313.debugpy    # debug adapter
    visidata         # DataFrame viewer
  ];

  # 2. Enable & extend Neovim
  programs.neovim = {
    enable = true;
    package = pkgs.neovim;
    # drop in your own init.lua
    configure = {
      customRC = ''
        " init.lua is loaded from ~/.config/nvim/init.lua
      '';
    };

    # 3. Neovim plugins via nixpkgs' Vim-plugins
    plugins = with vimPkgs; [
      itchyny-vim-xpath         # :XPath queries
      b0o-schemastore-nvim      # JSON/XML schema support
      gpanders-xsd-nvim         # XSD validation

      neovim-lspconfig          # builtin LSP configs
      hrsh7th-cmp-nvim-lsp      # LSP completion
      echasnovski-mini-completion
      SirVer-UltiSnips          # snippet engine
      ms-Jeddy-black-nvim       # Black formatter integration

      dvassco-nvim-visidata     # pandas DataFrame viewer
      mfussenegger-nvim-dap     # DAP client
      rcarriga-nvim-dap-ui      # DAP user-interface
    ];
  };

Syntax, formatting, and autocomplete are must-haves. I want to spend as little time as possible on typos and missed semicolons.

I am not analyzing the scientific data that is used to write the studies. I am creating a dataset from the studies themselves, some of which do contain a large amount of data, as well as different graphs and charts.

Eventually I want to turn this data into a SQLite or PostgreSQL db and deploy a forum and research website where individuals and researchers can interact as a community and empower this emerging field of study. For now, I am starting by building out a deep dataset that makes all the scientific data accessible for things like ADHD, Autism, Depression, SIBO… while also allowing people to input their environmental factors (water supply, processed food/drinks, medical interventions/medications) and how those are currently affecting their personal biome.

I’m asking here because when I search for help on the web, I get steered to the tools to process the research data itself, which isn’t what I really need. I need to be able to parse and extensively layer XMLs, create XMLs, and I’ll also need to create JSON.
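To make the kind of work concrete, here is a minimal Python sketch of the three tasks: parsing an article, adding a layer as new XML, and emitting JSON. It uses the standard library's `xml.etree.ElementTree` so it runs anywhere; `lxml.etree` offers a largely compatible API. The sample document is a tiny hand-written stand-in rather than a real PMC file, the `<layer>` and `<term>` elements are hypothetical, and the assumption that the ID sits in `article-id` with `pub-id-type="pmc"` should be checked against the real files.

```python
import json
import xml.etree.ElementTree as ET

# A tiny hand-written stand-in for a JATS-style article; real files are far larger.
SAMPLE = """<article>
  <front>
    <article-meta>
      <article-id pub-id-type="pmc">0000001</article-id>
      <title-group>
        <article-title>Example study title</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <p>Example organism was mentioned here.</p>
  </body>
</article>"""


def extract_record(xml_text: str) -> dict:
    """Pull a few fields out of the article into a plain dict."""
    root = ET.fromstring(xml_text)
    return {
        "pmc_id": root.findtext(".//article-id[@pub-id-type='pmc']"),
        "title": root.findtext(".//article-title"),
        "paragraphs": ["".join(p.itertext()).strip() for p in root.iter("p")],
    }


def add_layer(xml_text: str, layer_name: str, terms: list[str]) -> str:
    """Return new XML with a hypothetical <layer> element appended to the root."""
    root = ET.fromstring(xml_text)
    layer = ET.SubElement(root, "layer", {"name": layer_name})
    for term in terms:
        ET.SubElement(layer, "term").text = term
    return ET.tostring(root, encoding="unicode")


record = extract_record(SAMPLE)
print(json.dumps(record, indent=2))  # the JSON side of the pipeline

layered = add_layer(SAMPLE, "organisms", ["Example organism"])
```

Real articles may carry namespaces and a DTD declaration, so the paths above would likely need adjusting.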

And I need my NeoVim set up properly, because once I finish with my test dataset of 206 studies under Psychobiotics and establish my templates, I can move on to the Mind-Gut-Brain-Axis archive with over 14,000 studies and really build something valuable :slight_smile:
I appreciate any guidance you have!