Sigil

Aus C3D2
Version vom 24. April 2021, 13:32 Uhr von Ehmry (Diskussion | Beiträge) (Add Sigil post-mortem report)
(Unterschied) ← Nächstältere Version | Aktuelle Version (Unterschied) | Nächstjüngere Version → (Unterschied)
Zur Navigation springen Zur Suche springen

Sigil OS post-mortem report

Sigil was an operating system project canceled before release. Sigil OS was a variant of the NixOS distribution with the Linux kernel and systemd replaced with Genode components, or to put it another way, Sigil OS was a distribution of the Genode OS framework built with Nix. The name Sigil is retrospective, before cancellation the project was publicly known as "Genodepkgs". The project was funded by NLnet from their Privacy and Trust Enhancing technologies (PET) fund, supported by the European Commission's Next Generation Internet program. No payments were made as the project failed to achieve its own goals. NLnet however provided the necessary impetus and support to push to the project through to its final conclusion.

https://nlnet.nl/project/Genodepkgs/ https://nlnet.nl/PET/ https://ngi.eu/


Summary

To put the project into context, I quote the PET fund statement:

The research topic of Privacy and Trust enhancing technologies is aimed at providing people with new instruments that allow them more agency - and assist us with fulfilling the human need of keeping some private and confidential context and information private and confidential.

This was about building a small, reliable, special-purpose OS for people. I have no interest in the datacenter or IOT.

Sigil was intended to be a distro for Nix power-users that combined Genode-native software, a portable subset of Nixpkgs, and NixOS virtual machines. The was to be no installer, just tools for building bootable images. State and persistent user data was to be kept on block partitions separate from boot and package data. The system would boot from a firmware-like image that contained the system kernel and enough drivers to start a GUI server and bootstrap the rest of the system from a block device.

Cultivating the abstractions of configuration and composition took such precedence that no time was spared for self-hosting. A system specification would be written in a mix of the Nix and Dhall languages, and this would be built using a checkout of the Sigil git repository which in turn sourced a checkout of the Nixpkgs/NixOS monorepo.

After a base system was established the plan was to support declarative package management by writing Dhall descriptions of environments. Registries of packages would be published that contained a content-addressed identifiers for package data and configuration types and generators. The hope was that this done entirely with a Genode-native Dhall interpreter, but neither the base system or interpreter were completed.


Why it failed

I would assigned the blame to three categories, Genode, NixOS/Dhall, and myself.

Myself

The project was too ambitious for one person. The goal was to produce an outcome maintainable by a single person, and I was able to "keep the whole project in my head" to a substantial depth, but this was not practical. Localizing the vision, planning, implementation in one head leads to short feedback loops between those concerns, and this led to something incoherent to observers. I was encouraged to embark on the project by many people because of a perception that I had the necessary expertise in both the Nix and Genode. Maybe that was true, unfortunately if it was then I was the only one who could do it. This project was so niche (or foolish) that I received no criticism.


NixOS

https://nixos.org/

I use NixOS exclusively and the NixOS module system is the most reasonable configuration system that I know of. Guix is no doubt better in some regards, but I don't have any experience to make a comparison. Some people complain that the Nix language is too Perl-ish, the tools are unfriendly, and the documentation isn't easy to read. Most people aren't motivated enough to work through those obstacles, but if you want to make a exotic software-distribution, those are minor problems and Nix is probably the best tool to use. Likewise I expect that Guix on Hurd is more viable than Debian on Hurd.

For as powerful as Nix/NixOS is, developing an OS within its confines is unpleasant. To mention a few painful points:

Mass rebuilds

Making any changes to the Libc triggers a mass rebuild. Modifying system headers causes massive rebuilds. Someone is probably working on this, but I don't know of any solutions. Moving drivers to userspace makes development there much more reasonable than it would be for Linux, but I wouldn't consider it an improvement because monolithic kernels are a poor baseline.

Expensive evaluation

Evaluation is slow and consumes massive amounts of memory. I would often restart evaluations because of memory exhaustion. Perhaps this could be fixed by profiling Nix expressions, but I don't know of anyone doing this. "Get better hardware" is a lazy and shortsighted reply because Moore's law ceases but Wirth's law remains.

Expensive test images

Building disk images is slow and rapidly consumes storage. This can be alleviated by loading content otherwise and reducing boot images to a minimum, but this is additional development that compounds the cost of implementing standard boot images. For example NixOS tests systems can boot from a disk image or bootstrap from virtio-9P.

Out-of-the-sandbox testing

I didn't find a solution for testing complex systems on real hardware or with global network connectivity. Cyberus did donate time on their test hardware which I managed to drive with the standard NixOS CI, but I did not manage to find a semi-automated method for testing directly on laptops. Cyberus SoTest

Linux specific

NixOS is well abstracted away from Linux and increasingly so, but the configuration modules do not accommodate other kernels. Much of the modules could be reusable, but in some places there are some assertions that will always fail and it is difficult to extract generated configuration files.

See also: Bridging the stepping stones: using pieces of NixOS without full commitment Nix RFC 78


Dhall

https://dhall-lang.org/

Writing configurations in Dhall gave me a sense of certainty that I haven't found otherwise, which perhaps made it easy to overuse. Schema, transformation, and analysis of configuration is easier to do correctly in Dhall than in other languages, so I found it an ideal modeling language for configuration.

My complaints:

Expensive

Dhall is enjoyable to write, perhaps because it feels like solving puzzles, but it was painfully slow to evaluate. Is this why people tolerate Rust?

I suspect my code was particularly expensive because I was writing nested configurations that were intermingled with abstract XML, and Dhall being a language not supporting recursion, this was represented using a lot of intermediate types and functions.

Nix impedance mismatch

Having a "lib.generators.toDhall" function in Nix is handy, but the stricter type system of Dhall makes translation from Nix awkward at times. For example:

let
  a = [ 1 2 3 ];
  b = [ ];
in ''
  { a = ${toDhall a} -- inferred list annotation.
  , b = ${toDhall b} --  missing list annotation!
  }
''


Genode

https://genode.org/

In my opinion Genode is the most reasonably designed OS that can drive modern hardware. Unfortunately I find it unusable without modernization.

XML

Genode parses all configuration in XML format, which is a minor improvement over the ramshackle Unix formats, but otherwise wrong because there is no overlap between configuration and markup. As a simple example, in a distant past boolean values were represented in Genode XML by the strings "yes" and "no", and boolean values were parsed by comparing attribute values to either "yes" or "no", any other value was not-yes or not-no respectively. I patched this to allow "true", "false", "0", "1", but in retrospect this was a mistake because there is no guarantee that a values other than "yes" and "no" are parsed. This language simply does not support the concept of true and false, nor any category of numbers.

Furthermore, Genode configuration cannot be generated or processed by standard XML tools because the Genode RPC label separator is " -> " but this must be escaped to " -> " and I would be surprised if the Genode parser un-escapes this.

Abstraction is necessary but a poor solution because XML is difficult to formally model. Dhall XML/Type

Just make CBOR the common language for representing configuration and cross-component state. CBOR is safer and more efficient to implement than XML or JSON with a feature superset of both. Translation from JSON is covered by the CBOR specification and is trivial to add to a configuration pipeline.

Poor documentation

Configuration is documented in blog posts and release notes, sometimes in README, but these are usually out of date. If it is necessary to understand how a component is configured then one must to consult the C++ implementation.

Convoluted

A large part of why I gave up on Genode was the complexity of the GUI stack which I fear will increase as Sculpt OS develops. The GUI stack is made of a several mutually dependent components each requiring multiple session routes and configuration files that are too complicated to document. This could be one component that spawns its multiple constitutes under-the-hood. Or just implement Wayland.

The configuration of the init system is also afflicted by denormalization. Both the Routing and resources must be specified in redundant locations which is onerous to abstract and nearly impossible to write by hand.

Toolkit drift

Genode is self-described as an "operating system toolkit", but I fear a shift away from the toolkit in favor of Sculpt OS, which is a desktop OS.

In the original Genode init a routing configuring might look like this:

<routes>
  <service name="Rtc"> <parent/> </service>
  <service name="Timer"> <parent/> </service>
  <service name="Geo"> <parent/> </service>
</routes>
<source>
In Sculpt runtime configuration that would be configured like this:
<source>
<requires>
  <rtc/>
  <timer/>
</requires>

It can be seen in the latter that there are now "timer" and "rtc" sessions, which perhaps correspond with "Timer" and "Rtc" by some hardcoded C++ logic. "Geo" is missing because that's not a session type recognized by Sculpt and it is not possible for Sculpt packages to implement custom session protocols. The recommendation is to overload one of the existing session types or implement a custom Plan9-style virtual file-system. Watch any of the Sculpt demo videos and see the developers struggle to find the right service endpoint for a precariously overloaded session type.

At the moment its not clear how much of Sculpt's problems are leaking into Genode, but I came upon drivers that somehow work in the Sculpt image but are broken in Genode release.

C++

It would be nice to implement native components in better languages, but the RPC mechanisms and interfaces are only implemented in C++, often with arcane templates and idiom. Obviously this should be described with a neutral, high-level language and bindings generated from these. There is no versioning on interfaces.

Implementation

What follows are descriptions of how the distro was implemented for those that might find Nix or Dhall code snippets interesting.

Ported UNIX services

Tor was known to build and with a bit of code it could run as a native service. A test system that did nothing more than host a Tor relay would be declared like this:

sigil/tests/tor.nix

{
  name = "tor";
  machine = { config, lib, pkgs, ... }: {

    genode.core.storeBackend = "fs";
    hardware.genode.usb.storage.enable = true;

    services.tor = {
      enable = true;
      client.enable = false;
      extraConfig = "Log [general,net,config,fs]debug stdout";
      relay = {
        enable = true;
        port = 80;
        role = "relay";
        bridgeTransports = [ ];
      };
    };

    virtualisation.memorySize = 768;
    # This is passed to QEMU during testing

  };
}

There were no complete demos, but the NixOS testing framework was ported and tests were written in the same format. NixOS Tests

The Nix support code for Tor looked like this:

sigil/nixos-modules/services/tor.nix

{ config, lib, pkgs, ... }:

let
  toDhall = lib.generators.toDhall { };

  cfg =
    # The options from the NixOS Tor module were reusable
    config.services.tor;

in {
  config = lib.mkIf config.services.tor.enable {

    genode.init.children.tor =
      # If Tor was enabled then a child named "tor"
      # would be added to the Genode init.
      let
        args = lib.strings.splitString " "
          config.systemd.services.tor.serviceConfig.ExecStart;

        tor' = lib.getEris' "bin" pkgs.tor "tor";
        lwip' = lib.getEris "lib" pkgs.genodePackages.vfs_lwip;
        pipe' = lib.getEris "lib" pkgs.genodePackages.vfs_pipe;
        # getEris produces a content-addressed capability,
        # more on that later…

      in {
        package =
          # The "package" associated with this child of init.
          # This Tor from Nixpkgs after its been processed by
          # the Genode-specific overlay function.
          pkgs.tor;

        binary =
          # A binary is explicitly set because the tor package
          # contains multiple programs at …/bin.
          builtins.head args;

        extraErisInputs =
          # Explicitly add these VFS plugin libraries to the ERIS closure.
          [ lwip' pipe' ];

        configFile =
          # The configuration of the child within the init instance
          # is always a file containing a Dhall configuration.
          # This is awkward but adds some strict type checks.
          # The file at ./tor.dhall is a function that takes
          # the list of shell to pass to the program, and
          # ERIS capabilities to the LwIP and pipe plugins
          # that are loaded by the libc virtual-file-system.
          pkgs.writeText "tor.dhall" "${./tor.dhall} ${toDhall args} ${
            toDhall {
              lwip = lwip'.cap;
              pipe = pipe'.cap;
            }
          }";

        uplinks =
          # Specify one network uplink of this child named
          # "uplink" and use a driver ported from iPXE.
          {
            uplink = { driver = "ipxe"; };
          };

      };

  };
}

The tor.dhall function that generated the Genode-native configuration:

sigil/nixos-modules/services/tor.dhall

let Sigil =
        env:DHALL_SIGIL
      ? https://git.sr.ht/~ehmry/dhall-sigil/blob/trunk/package.dhall

let Init = Sigil.Init

let Libc = Sigil.Libc

let VFS = Sigil.VFS

in  λ(args : List Text) → -- arguments from systemd.services.tor.serviceConfig.ExecStart
    λ(vfs : { lwip : Text, pipe : Text }) → -- ERIS capabilities to plugins to load into the VFS
    λ(binary : Text) → -- ERIS cap of the Tor binary, binaries are injected after package resolution
      Init.Child.flat -- generate a child without its own nested children
        Init.Child.Attributes::{ -- fill out a Child.Attribute.Type record to pass to Child.flat
        , binary -- inherit the binary field from the function scope
        , config =
            Libc.toConfig -- generate a config from a Libc.Type
              Libc::{ -- fill out a Libc.Type record
              , args -- inherit args from the function scope
              , pipe = Some "/dev/pipes"
              , rng = Some "/dev/entropy"
              , socket = Some "/dev/sockets" -- set the location of special files within the VFS
              , vfs = -- specify the VFS configuration, ugly because it's generating XML
                [ VFS.dir
                    "dev"
                    [ VFS.leaf "null"
                    , VFS.leaf "log"
                    , VFS.leaf "rtc"
                    , VFS.leafAttrs
                        "terminal"
                        (toMap { name = "entropy", label = "entropy" })
                    , VFS.dir
                        "pipes"
                        [ VFS.leafAttrs "plugin" (toMap { load = vfs.pipe }) ]
                    , VFS.dir
                        "sockets"
                        [ VFS.leafAttrs
                            "plugin"
                            (toMap { load = vfs.lwip, label = "uplink" })
                        ]
                    ]
                , VFS.dir -- Tor gets a read-only view of /nix/store
                    "nix"
                    [ VFS.dir
                        "store"
                        [ VFS.fs
                            VFS.FS::{ label = "nix-store", writeable = "no" }
                        ]
                    ]
                ]
              }
        , resources = -- set the amount of capability slots and RAM needed by Tor
            Init.Resources::{ caps = 512, ram = Sigil.units.MiB 384 }
        }

Network drivers

The Tor example specified a network uplink and driver, this was done within the NixOS service module like this:

{ genode.init.children.<child-name>.uplinks.<uplink-name> = { … }; }

Which would generate a sibling in the Genode init named "<child-name>-<uplink-name>-driver" with a config like this:

tor-uplink-driver.dhall

let Sigil =
        env:DHALL_SIGIL
      ? https://git.sr.ht/~ehmry/dhall-sigil/blob/trunk/package.dhall

let Init = Sigil.Init

in  λ(binary : Text) →
      Init.Child.flat
        Init.Child.Attributes::{
        , binary -- the binary is injected later
        , resources = Init.Resources::{ caps = 128, ram = Sigil.units.MiB 4 }
        , routes = [ Init.ServiceRoute.parent "IO_MEM" ]
        , config = Init.Config::{
          , attributes = toMap { verbose = "no" }
          , policies =
            [ Init.Config.Policy::{
              , service = "Nic"
              , label = Init.LabelSelector.prefix "tor -> uplink"
              }
            ]
          }
        }

Note the "policies" field in the Init.Config record, these are converted into <policy/> elements that are injected into the XML configuration of the driver. The Genode network drivers don't parse for these elements, but the Dhall function for converting to an init configuration injects routes that correspond to policies. This differs from standard Genode where routes are specified at the client configuration site and the policies at the server site. Reducing the route and policy to a "single source of truth" at the server reduces the effort and chance of misrouting.

Furthermore, creating an uplink driver sibling in turn generates a policy at the Genode platform driver, the gatekeeper to PCI devices.

tor-uplink-driver-policy.dhall

let Sigil =
        env:DHALL_SIGIL
      ? https://git.sr.ht/~ehmry/dhall-sigil/blob/trunk/package.dhall

in  Sigil.Init.Config.Policy::{
    , service = "Platform"
    , label = Sigil.Init.LabelSelector.prefix "nixos -> tor-uplink-driver"
    , content =
      [ Sigil.Prelude.XML.leaf
          { name = "pci", attributes = toMap { class = "ETHERNET" } }
      ]
    }

ERIS capabilities

> ERIS is an encoding of arbitrary content into a set of uniformly sized, encrypted and content-addressed blocks as well as a short identifier (a URN). The content can be reassembled from the encrypted blocks only with this identifier. The encoding is defined independent of any storage and transport layer or any specific application. Encoding for Robust Immutable Storage (ERIS)

As mentioned before, binaries can be referred to by ERIS URN capabilities. These capabilities are URNs to content-addressed data, the term capability is be used here because a valid URN uniquely identifies a single piece of data. An example of an ERIS URN would be:

urn:erisx2:AEA24Z7OYXY4VWA5ZUSRTCC46JXTKNHOGAVPHGH5QASXS74VZYU3UTYOCAEA2AGPASNTJHKWJ6I3LRNI74BP3XAPGY4ZODWORFQXYT74AA

These URNs are base32 encoded. The first two bytes are metadata followed by a thirty-two byte "reference" and a thirty-two byte "key". The key is the hash of original data, and the reference is the hash of the original data after it has been encrypted using the key. This means that knowledge of the URN allows the retrieval of encrypted data using the reference and then local decryption of data using the key. In theory this would allow for data to be loaded from untrusted block devices or network sources without exposing the data to parties not having prior knowledge of that data.

An ERIS URN or capability (I use the terms interchangeably) could be generated from a Nix package using a simple function:

lib.getEris "bin" pkgs.hello

lib.getEris

In this example the first file in the "bin" directory of pkgs.hello is extracted and the following would be returned:

{
  cap =
    "urn:erisx2:AEAYD5LFYUCZOTTW7QDWF7ER4EBVINHNL2RDXESGAXSMINZCU4ONYMZXQAG5VFFFNP4O7QJSP2GE3SIHUFQJ27WKDRX5QW3KWR7CCQGDSM";
  closure = {
    "/nix/store/8ybwzp4r9lz3y0azyqbwim7ks4hy9yya-libc-x86_64-unknown-genode/lib/libc.lib.so" =
      "urn:erisx2:AEARWNZNSNXQPZXSDL5UC3T7MRODFKE2TBAIMOW6VSYVKI7R447YE2OBFWQWNYLJAIIX3XPE57I555E7OOTBMDLXXMSA527LN4QYPPP2SY";
    "/nix/store/b4d5q975dwynaps35fxa8amhywafbnjm-posix-x86_64-unknown-genode/lib/posix.lib.so" =
      "urn:erisx2:AEAU3A4NZUCCCXCUDPPUQFXTVMLZE4WHTHRK5L2O7Q7UUAGDFXQGQ6I2UMRRZLY6NZ25XQRLWIPUHHCGFXIBFKJ4OFMDAQQCHQIRBAGTNE";
    "/nix/store/h0nx14lbx29hikf2yaklwg6dgmlhw11y-vfs-x86_64-unknown-genode/lib/vfs.lib.so" =
      "urn:erisx2:AEAUQXA6Z4FM6B3WHVDBVTWMMFBUQWJKGVLKKKCAWPESEKRSPAZAKXBGWLQ6MP4W2Q7AJRCQWJR5WXWFNPEKAC6WOXETFEUMZYMC7ZYGHY";
  };
  path =
    "/nix/store/jg6b7h3ignng5kh3hrfxnymywzygdpqd-hello-2.10-x86_64-unknown-genode/bin/hello";
}

This is a Nix record with the fields "cap", "closure", and "path". Cap is the ERIS URN, path is a path to the file corresponding to the cap, and closure is a record of caps that this file is known to refer to. This data is generated using postFixupHooks in the Nixpkgs stdenv. eris-patch-hook Package setup hooks

The "erisPatchHook" would run as a final step of a package builder. Any ELF file found in the outputs of a package would have the references to libraries by path replaced with ERIS URNs to libraries already processed by the hook. Replacing the "/nix/store/…" references within binaries would break the runtime dependency detection that is used to build store-closures of packages, but the same can be achieved by listing the original paths in another file in the package output, which is required to track the ERIS closure anyway. A map of the libraries paths to URNs is written to "…/nix-support/eris-manifest.json", which is loaded and translated to Nix by the "lib.getEris" function.

This shows that it is possible to use Nix to build systems not relying on a /nix/store file-system by implementing runtime dependency tracking using the Nix language itself, which is normally hidden by the Nix implementation.

In practice I never implemented the decryption and verification of ERIS data, I merely used the URNs as keys for a key-value store. The initial store was binaries compiled as blobs into the core firmware image and referenced by URN. The second store was a tarball of similar mappings mounted as a file-system, and the third was mappings in an ISO9660 file-system (because the Ext driver broke in this case).

The Genode loader would load libraries by requesting whatever strings were in the ELF DT_NEEDED section using the ROM RPC interface to its parent, and these requests would be intercepted and resolved by reading a file with the same name as the URN.

Thanks go to pukkamustard for developing the ERIS draft, which has also been supported by NLnet.


Core image

The only implemented bootloading was booting from EFI to GRUB, and from GRUB to the NOVA microkernel with a Genode core image passed in memory. NOVA Microhypervisor

NOVA would execute the core image as its user-space task, and the core image would start an instance of the Genode init, which would spawn the children specified within the Nix configuration as

{ genode.core.children.<child-name> = { … }; }

The default core children were the drivers related to hosting the secondary ERIS store and graphics. One of the core children was "nixos", which was a nested secondary init containing similarly declared children. This init differed from the primary init in that binaries were loaded from the secondary ERIS store, which could be backed by a block device.

{ genode.init.children.<child-name> = { … }; }

Configuration of all the primary and secondary init were collapsed into a single configuration document, converted from Dhall to the native Genode-XML format, and embedded within the core image. sigil/lib/compile-boot.dhall


Block-devices and file-systems

The core would start a "device_manager" component that controlled a sibling init containing AHCI and USB drivers. The device_manager would parse the status of these drivers and start components for detecting and serving the partitions of any block device that was found. device_manager.nim

The device manager would also configure routes such that block devices requests labeled with GUIDs would be routed to the appropriate GPT partition (if the partition was missing, the request would stall until it appeared). The file-system server configuration contained a hardcoded partition GUID that would be generated as the configuration was finalized.

sigil/nixos-modules/eris/rom-vfs.dhall sigil/nixos-modules/lib/make-bootable-image.nix lib.uuidFrom


Package overlay

If you want to port Nixpkgs/NixOS to an unsupported system, I recommend starting with an overlay rather than forking the monorepo.

I was taking Packages from the Nixpkgs collection and cross-compiled on Linux to Genode with some patches and build system tweaks. sigil/overlay

Patches were applied to LLVM and Clang and an Clang-base stdenv was used rather than the GCC stdenv. Recompiling a custom LLVM/Clang toolchain as Nixpkgs was updated was the most expensive process in development, and the base Nixpkgs revision was seldom updated to avoid rebuilds.

The upstream Genode build system using GCC, but that version is only available in precompiled form or as a makefile that maybe works with Ubuntu-16.04. Reusing the upstream GCC patches was not feasible because the patches do not properly modify GCC to recognize Genode system-triplets, which is why there is no official means of reusing package build recipes from BSD or Linux distributions.


Nix Flake

The Sigil flake had a slightly non-standard layout because packages were cross-compiled. The "legacyPackages" outputs contained the systems "x86_64-linux", "x86_64-linux-aarch64-genode", and "x86_64-linux-x86_64-genode". https://git.sr.ht/~ehmry/sigil/tree/post-mortem/item/flake.nix#L8

After this experience I would recommend that packages are defined in flakes using an overlay which is applied to the Nixpkgs set. This allows your packages to be cross-compiled in other flakes.

For example:

{ self, nixpkgs }: {
  overlay = final: prev: { myPkgs = { }; };

  legacyPackages = forAllSystems
    (system: nixpkgs.legacyPackages.${system}.extend self.overlay);

  packages = forAllSystems (system: self.legacyPackages.${system}.myPkgs);
}


Conclusion

I would not abandon the idea of building a new NixOS without Linux, but there were stark trade-offs in my experiment. I managed to building systems with sizes measured in MiB rather than GiB but only by exchanging seconds for minutes in evacuation and build time. Security was much stricter as access to system resources was explicitly accounted for and most side channels were removed, but I did not manage to make any demos of interactive or extensible systems.

I had planned to implement some on-target package-management using Dhall to compose and configure pre-built packages. This would bypass the complexity of Nix evaluation, but was not something I managed to implement. Perhaps I may try this anyway with BSD/Linux.

Operating system research should get increasingly easier as tools improve, and Nix was a inflection point of how well our tools can leverage what software we already have. Where some might see NixOS as entrenching the Linux regression, I see the most reasonable way forward. If we build a formal model of the systems we have to today then we have a solid basis to make a counter-argument tomorrow. The Ptolemaic and Copernican models of the universe share a common foundation in geometry, but the correctness and elegance of the later is made apparent by the complexity of the former.