Blake Smith

Running NixOS Tests with Nix Flakes

I’ve been using Nix / NixOS for a couple years now, but have never really bothered to learn the NixOS testing framework. After playing with it a bit, I really like it! It’s like all the best properties of classic Vagrant and Chef Test Kitchen, but with the power and reproducibility of Nix.

Andreas Fuchs blogged about how NixOS tests actually helped them catch a regression in their code, so I felt inspired to give it a try.

I use Nix flakes for all my personal development and machine setup, and the Nix documentation doesn’t go into detail about how to use the NixOS test harness with flakes, so I wrote up my initial investigation here.

This setup will allow us to:

  1. Build a custom package with nix flakes.
  2. Run that package as a NixOS systemd service via a NixOS module.
  3. Boot a QEMU machine and execute automated tests against the package / module configuration.

Because it’s Nix, the entire process of building the package, assembling it into a systemd service, booting a QEMU VM, and executing tests happens with a single command, driven by one declarative build definition.

First, set up a bare-bones flake with a custom package. For this example, we’ll just make a simple netcat server that listens on port 3000.

flake.nix

{
  description = "NixOS tests example";

  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs/nixos-23.11";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = nixpkgs.legacyPackages.${system};
      in
        {
          packages = {
            helloNixosTests = pkgs.writeScriptBin "hello-nixos-tests" ''
            ${pkgs.netcat}/bin/nc -l 3000
            '';
          };
        }
    );
}

Once you save your flake.nix file, you should be able to build the package from within the directory:

$ nix build '.#helloNixosTests'
$ ./result/bin/hello-nixos-tests

This just runs a simple nc server that listens on port 3000.

Next, let’s create a NixOS module that will make a systemd service for our little nc server. This is just a bog-standard NixOS service module that we’re going to eventually write a test for.

hello-module.nix

{ config, lib, pkgs, ... }:

with lib;
let
  cfg = config.services.helloNixosTests;
in {
  options = {
    services.helloNixosTests = {
      enable = mkEnableOption "helloNixosTests";
    };
  };

  #### Implementation

  config = mkIf cfg.enable {
    users.users.hello = {
      createHome = true;
      description = "helloNixosTests user";
      isSystemUser = true;
      group = "hello";
      home = "/srv/helloNixosTests";
    };

    users.groups.hello.gid = 1000;

    systemd.services.helloNixosTests = {
      description = "helloNixosTests server";
      after = [ "network.target" ];
      wantedBy = [ "multi-user.target" ];
      script = ''
      exec ${pkgs.helloNixosTests}/bin/hello-nixos-tests
      '';

      serviceConfig = {
        Type = "simple";
        User = "hello";
        Group = "hello";
        Restart = "on-failure";
        RestartSec = "30s";
      };
    };
  };
}

If you’ve written NixOS modules before, this should look completely familiar to you.

To make this module work, we have to do a few things:

  1. Create a pkgs overlay that includes our new pkgs.helloNixosTests package.
  2. Create a top-level nixosModules entry in our flake.nix file for the module.

Let’s do that now, and extend our flake file some more.

flake.nix

{
  description = "NixOS tests example";

  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs/nixos-23.11";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, flake-utils }:
    {
      nixosModules = {
        helloNixosModule = import ./hello-module.nix;
      };
    } // flake-utils.lib.eachDefaultSystem (system:
      let
        overlay = final: prev: {
          helloNixosTests = self.packages.${system}.helloNixosTests;
        };
        pkgs = nixpkgs.legacyPackages.${system}.extend overlay;
      in
        {
          packages = {
            helloNixosTests = pkgs.writeScriptBin "hello-nixos-tests" ''
            ${pkgs.netcat}/bin/nc -l 3000
            '';
          };
        }
    );
}

Notice we extended pkgs with our overlay definition, which will extend the base nixpkgs with our custom package. We also created a nixosModules.helloNixosModule attribute that imports the module we just wrote.
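For comparison, a roughly equivalent way to get the same package set is to import nixpkgs with the overlay in its overlays list. This is only a sketch of an alternative pkgs binding for the let block above, not something the setup requires. It assumes the same system, nixpkgs and self names that the flake's outputs function already has in scope.

```nix
# Fragment for the `let` block in flake.nix (not a standalone file).
# `system`, `nixpkgs` and `self` come from the surrounding outputs function.
let
  overlay = final: prev: {
    helloNixosTests = self.packages.${system}.helloNixosTests;
  };

  # Alternative to `nixpkgs.legacyPackages.${system}.extend overlay`:
  # import nixpkgs directly and pass the overlay up front.
  pkgs = import nixpkgs {
    inherit system;
    overlays = [ overlay ];
  };
in
  # Either way, the custom package is now reachable through `pkgs`.
  pkgs.helloNixosTests
```

The extend form reuses the package set that nixpkgs has already evaluated for the flake, while the import form evaluates nixpkgs again but also leaves room for other arguments such as config.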

We’ve made a package and a module. Now let’s write the NixOS test to make sure they both work!

hello-boots.nix

{ self, pkgs }:

pkgs.nixosTest {
  name = "hello-boots";
  nodes.machine = { config, pkgs, ... }: {
    imports = [
      self.nixosModules.helloNixosModule
    ];
    services.helloNixosTests = {
      enable = true;
    };

    system.stateVersion = "23.11";
  };

  testScript = ''
    machine.wait_for_unit("helloNixosTests.service")
    machine.wait_for_open_port(3000)
  '';
}

This file evaluates to a normal Nix derivation like any other, but building it assembles a complete QEMU-based test harness that exercises our package and module.

A few things to notice:

  1. The top-level nodes attribute set allows us to define complete node / machine definitions. In this example, we just have one machine called machine.
  2. We configure that machine to run our module. The imports block is critical to bring our service module definition into scope to run on the machine.
  3. The testScript key is where we run our tests. The NixOS test framework provides a Python library to exercise the environment. You can see a list of available commands on the NixOS wiki.
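As a sketch of how the test script could check more than the open port, here is a second test file that also verifies the user, group and home directory declared in hello-module.nix. The file name hello-account.nix and the test name are hypothetical. The checks only use the driver's wait_for_unit, succeed, wait_for_open_port and subtest helpers.

```nix
# hello-account.nix (hypothetical file name, same shape as hello-boots.nix)
{ self, pkgs }:

pkgs.nixosTest {
  name = "hello-account";
  nodes.machine = { ... }: {
    imports = [ self.nixosModules.helloNixosModule ];
    services.helloNixosTests.enable = true;
    system.stateVersion = "23.11";
  };

  testScript = ''
    machine.wait_for_unit("helloNixosTests.service")

    with subtest("service user belongs to the hello group"):
        # `succeed` fails the test on a non-zero exit code and returns stdout.
        group = machine.succeed("id -gn hello").strip()
        assert group == "hello", f"unexpected primary group: {group}"

    with subtest("home directory was created"):
        machine.succeed("test -d /srv/helloNixosTests")

    # Keep the port check last: this nc server appears to exit after
    # handling its first connection, so later checks could not reach it.
    with subtest("server accepts a TCP connection"):
        machine.wait_for_open_port(3000)
  '';
}
```

Each subtest shows up as its own named step in the test log, which makes it easier to see which assertion failed. A file like this can be registered in the flake the same way as hello-boots.nix.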

Now, let’s hook the test into our top-level flake.nix file so we can run it. This is the full, finished flake.nix file:

flake.nix

{
  description = "NixOS tests example";

  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs/nixos-23.11";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, flake-utils }:
    {
      nixosModules = {
        helloNixosModule = import ./hello-module.nix;
      };
    } // flake-utils.lib.eachDefaultSystem (system:
      let
        overlay = final: prev: {
          helloNixosTests = self.packages.${system}.helloNixosTests;
        };
        pkgs = nixpkgs.legacyPackages.${system}.extend overlay;
      in
        {
          checks = {
            helloNixosTest = pkgs.callPackage ./hello-boots.nix { inherit self; };
          };
          packages = {
            helloNixosTests = pkgs.writeScriptBin "hello-nixos-tests" ''
            ${pkgs.netcat}/bin/nc -l 3000
            '';
          };
        }
    );
}

We added a checks output attribute to the flake, which accepts any normal Nix derivation. After hooking it into our flake, we should be able to run:

$ nix flake check

This will build and run our test script inside a complete QEMU VM, and if all goes well, our script will return success. If things don’t go as expected, you can boot an interactive Python shell to debug the tests like so:

[blake@blake-framework:~/code/nixos-test]$ nix run '.#checks.x86_64-linux.helloNixosTest.driverInteractive'
Machine state will be reset. To keep it, pass --keep-vm-state
start all VLans
start vlan
running vlan (pid 267780; ctl /tmp/vde1.ctl)
(finished: start all VLans, in 0.00 seconds)
additionally exposed symbols:
    machine,
    vlan1,
    start_all, test_script, machines, vlans, driver, log, os, create_machine, subtest, run_tests, join_all, retry, serial_stdout_off, serial_stdout_on, polling_condition, Machine
>>> 

From this shell you can run the tests interactively. Boot all the test machines, then run the tests:

>>> start_all()
>>> run_tests()
Test will time out and terminate in 3600 seconds
run the VM test script
additionally exposed symbols:
    machine,
    vlan1,
    start_all, test_script, machines, vlans, driver, log, os, create_machine, subtest, run_tests, join_all, retry, serial_stdout_off, serial_stdout_on, polling_condition, Machine
machine: waiting for unit helloNixosTests.service
machine: waiting for the VM to finish booting
machine: Guest shell says: b'Spawning backdoor root shell...\n'
machine: connected to guest root shell
machine: (connecting took 0.00 seconds)
(finished: waiting for the VM to finish booting, in 0.00 seconds)
(finished: waiting for unit helloNixosTests.service, in 0.04 seconds)
machine: waiting for TCP port 3000 on localhost
machine # Connection to localhost (127.0.0.1) 3000 port [tcp/hbci] succeeded!
machine # [   30.490067] systemd[1]: helloNixosTests.service: Deactivated successfully.
machine # [   30.491654] systemd[1]: helloNixosTests.service: Consumed 3ms CPU time, no IO, received 216B IP traffic, sent 112B IP traffic.
(finished: waiting for TCP port 3000 on localhost, in 0.02 seconds)
(finished: run the VM test script, in 0.06 seconds)

Shut down the machine and quit manually via:

>>> machine.shutdown()
>>> join_all()
>>> quit()

Some other great resources to check out:


about the author

Blake Smith is a Principal Software Engineer at Sprout Social.