# docker-nfsd

---

### Table of Contents

1. [Overview](#1-overview)
2. [Purpose](#2-purpose)
3. [Architecture](#3-architecture)
4. [How it Works](#4-how-it-works)
5. [Why it Was Created](#5-why-it-was-created)
6. [Limitations and Notes](#6-limitations-and-notes)
7. [Installation](#7-installation)
8. [Usage](#8-usage)
9. [Design Philosophy](#9-design-philosophy)
10. [License](#10-license)

---

## 1. Overview

`docker-nfsd` is a small daemon that allows Docker and Docker Swarm to mount
NFS shares directly through the kernel.
It implements the [Docker VolumeDriver API](https://docs.docker.com/engine/extend/plugins_volume/)
over a UNIX socket and performs real mounts using the standard
`mount(2)` and `umount(2)` syscalls.
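
To make the protocol concrete: Docker sends each request as an HTTP POST whose path names the operation, and the plugin answers with a JSON body. The sketch below maps the endpoint paths listed in the Docker volume plugin documentation to an enum and shows the handshake reply to `/Plugin.Activate`. The enum, the `routes` table, and the functions `endpoint_from_path` and `activate_reply` are hypothetical names chosen for illustration, not code taken from `docker-nfsd`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names: this enum and table are illustrative only. */
enum endpoint {
    EP_UNKNOWN = 0,
    EP_ACTIVATE,
    EP_CREATE,
    EP_REMOVE,
    EP_MOUNT,
    EP_UNMOUNT,
    EP_PATH,
    EP_GET,
    EP_LIST,
    EP_CAPABILITIES
};

struct route {
    const char   *path;
    enum endpoint ep;
};

/* Endpoint paths as listed in the Docker volume plugin protocol. */
static const struct route routes[] = {
    { "/Plugin.Activate",           EP_ACTIVATE     },
    { "/VolumeDriver.Create",       EP_CREATE       },
    { "/VolumeDriver.Remove",       EP_REMOVE       },
    { "/VolumeDriver.Mount",        EP_MOUNT        },
    { "/VolumeDriver.Unmount",      EP_UNMOUNT      },
    { "/VolumeDriver.Path",         EP_PATH         },
    { "/VolumeDriver.Get",          EP_GET          },
    { "/VolumeDriver.List",         EP_LIST         },
    { "/VolumeDriver.Capabilities", EP_CAPABILITIES },
};

/* Map the URL path of a POST request to an endpoint identifier. */
enum endpoint endpoint_from_path(const char *path)
{
    if (path == NULL)
        return EP_UNKNOWN;
    for (size_t i = 0; i < sizeof routes / sizeof routes[0]; i++) {
        if (strcmp(path, routes[i].path) == 0)
            return routes[i].ep;
    }
    return EP_UNKNOWN;
}

/* Reply to /Plugin.Activate: announces which plugin APIs are implemented. */
const char *activate_reply(void)
{
    return "{\"Implements\": [\"VolumeDriver\"]}";
}
```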

The goal is simple: let Docker request a mount, and let the kernel do the work.

---

## 2. Purpose

Docker supports external "volume drivers" to manage persistent storage.
In theory, this should make it possible to mount any remote filesystem.
In practice, on Swarm, **distributed storage that “just works” does not exist**.

Even in 2025, Docker provides **no stable driver for NFS, Ceph, or S3**,
which is absurd, considering NFS has existed longer than Docker itself.

Existing NFS “drivers” are mostly containerized plugins written in Go or Python,
each with its own runtime, namespaces, and orchestration layers.
They fail on Swarm because mounts are host-level operations that cannot be done
inside a container in any reliable or consistent way.

`docker-nfsd` was written to solve that:
a native, host-level daemon that performs kernel mounts directly, without tricks.

---

## 3. Architecture

```
┌──────────────────────────────┐
│        Docker Daemon         │
│     (Engine / SwarmKit)      │
└──────────────┬───────────────┘
               │  HTTP over UNIX socket
               ▼
┌──────────────────────────────┐
│         docker-nfsd          │
│ Implements VolumeDriver API  │
│ - Receives JSON requests     │
│ - Calls mount(2)/umount(2)   │
│ - Exposes mountpoints        │
└──────────────┬───────────────┘
               │  Kernel syscalls
               ▼
┌──────────────────────────────┐
│         Linux Kernel         │
│      NFS client (v4.1)       │
└──────────────────────────────┘
```

---

## 4. How it Works

1. **Socket registration**
   The daemon listens on `/run/docker/plugins/docker-nfsd.sock`.

2. **Docker interaction**
   When Docker needs a volume, it connects to that socket and issues
   JSON-encoded requests (`/Plugin.Activate`, `/VolumeDriver.Mount`, etc.).

3. **Volume creation and exposure**
   `docker-nfsd` creates a dedicated mount directory under
   `/var/lib/docker-volumes-nfsd/<volume-name>/`
   and performs a real kernel mount using:

   ```c
   /* First argument: the NFS export, written as "server:/path". */
   mount("server:/path", target, "nfs4", MS_MGC_VAL, "nfsvers=4.1,rw,noatime,soft");
   ```

   Once mounted, Docker bind-mounts that directory into the container.
   The container never sees NFS directly, only the mounted directory.

4. **Unmount and cleanup**
   When the container stops, Docker calls `/VolumeDriver.Unmount`.
   The daemon executes `umount(2)` and releases the directory.
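
The steps above can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not the daemon's actual code: `struct mount_plan`, `plan_mount`, `do_mount`, and `mount_reply` are hypothetical names, and the volume-name check is an added precaution. The option string also appends `addr=`, on the assumption that `server` is an IP literal, since the kernel NFS client expects the server address in the mount data when no mount helper fills it in. The reply fields (`Mountpoint`, `Err`) follow the Docker volume plugin documentation.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/types.h>

#define VOLUME_ROOT "/var/lib/docker-volumes-nfsd"

/* Hypothetical helper type: everything mount(2) needs for one volume. */
struct mount_plan {
    char source[512];   /* "server:/export/path" */
    char target[512];   /* per-volume directory on the host */
    char options[256];  /* NFS options passed as mount data */
};

/* Fill in a plan; returns 0 on success, -1 on invalid or oversized input. */
int plan_mount(struct mount_plan *p, const char *server,
               const char *path, const char *volume)
{
    int n;

    /* Assumed precaution: keep the target inside VOLUME_ROOT. */
    if (volume[0] == '\0' || volume[0] == '.' || strchr(volume, '/') != NULL)
        return -1;

    n = snprintf(p->source, sizeof p->source, "%s:%s", server, path);
    if (n < 0 || (size_t)n >= sizeof p->source)
        return -1;

    n = snprintf(p->target, sizeof p->target, "%s/%s", VOLUME_ROOT, volume);
    if (n < 0 || (size_t)n >= sizeof p->target)
        return -1;

    /* addr= is an assumption of this sketch; server must be an IP literal. */
    n = snprintf(p->options, sizeof p->options,
                 "nfsvers=4.1,rw,noatime,soft,addr=%s", server);
    if (n < 0 || (size_t)n >= sizeof p->options)
        return -1;

    return 0;
}

/*
 * Create the target directory (VOLUME_ROOT must already exist) and mount.
 * Needs root. Returns 0 on success or a negative errno value.
 */
int do_mount(const struct mount_plan *p)
{
    if (mkdir(p->target, 0755) != 0 && errno != EEXIST)
        return -errno;
    if (mount(p->source, p->target, "nfs4", MS_MGC_VAL, p->options) != 0)
        return -errno;
    return 0;
}

/* Build the JSON body returned for /VolumeDriver.Mount (no escaping done). */
int mount_reply(char *buf, size_t len, const struct mount_plan *p, int err)
{
    if (err != 0)
        return snprintf(buf, len, "{\"Mountpoint\": \"\", \"Err\": \"%s\"}",
                        strerror(-err));
    return snprintf(buf, len, "{\"Mountpoint\": \"%s\", \"Err\": \"\"}",
                    p->target);
}
```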

At no point does this involve FUSE, RPC daemons, or helper binaries.
All logic happens through kernel syscalls.

---

## 5. Why it Was Created

While setting up Swarm clusters with shared storage, we encountered
the following hard limitations:

* **Docker provides no reliable host-level storage driver** for NFS or Ceph.
* The so-called “official” NFS drivers run *inside containers*, which
  cannot perform kernel-level mounts on the host.
* Other solutions rely on heavyweight sidecars or Go daemons
  that introduce complexity without solving the actual problem.

Our requirements were simple:

* Persistent shared volumes across Swarm nodes.
* No extra layers of abstraction.
* A driver that survives restarts and behaves like any normal service.

So we wrote `docker-nfsd`: a 100 KB C daemon that does exactly that.

---

## 6. Limitations and Notes

* **Privileges:** Runs as root, like any normal system daemon, because `mount(2)` requires it.
  (No different from `sshd`, `nfsd`, or `dockerd` itself.)
* **NFS Versions:** Uses NFSv4.1 by default (`nfsvers=4.1`), but older versions are supported
  by the kernel and can be negotiated automatically if desired.
* **Concurrency:** Each mount request is independent.
  The daemon can handle multiple volumes concurrently: one mount per request,
  not one per host. Docker may issue many simultaneous mounts, and it will work.
* **Scope:** Designed for Linux systems using Docker Swarm or standalone Docker.
  (If you’re running Swarm on Windows, you’re on your own, and probably deserve it.)
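
One detail worth sketching for the concurrency point: the Docker volume plugin documentation notes that `/VolumeDriver.Mount` is called once per container start, so a volume shared by several containers receives several mount requests, and the plugin may need to provision on the first request and tear down on the last matching unmount. The reference-counting sketch below illustrates that idea. The table, its fixed size, and the names `ref_acquire` and `ref_release` are hypothetical and say nothing about how `docker-nfsd` tracks this internally; a real daemon serving requests from several threads would also guard the table with a mutex.

```c
#include <stddef.h>
#include <string.h>

#define MAX_VOLUMES 64
#define NAME_LEN    128

/* Hypothetical bookkeeping: one slot per volume that is currently in use. */
struct vol_ref {
    char name[NAME_LEN];
    int  count;          /* containers currently using the volume */
};

static struct vol_ref refs[MAX_VOLUMES];

/* Find the slot of a volume that has at least one user, or NULL. */
static struct vol_ref *find_ref(const char *name)
{
    for (size_t i = 0; i < MAX_VOLUMES; i++)
        if (refs[i].count > 0 && strcmp(refs[i].name, name) == 0)
            return &refs[i];
    return NULL;
}

/*
 * Record a Mount request. Returns 1 if the caller must perform the real
 * mount(2) (first user), 0 if the volume is already mounted, -1 on error.
 */
int ref_acquire(const char *name)
{
    struct vol_ref *r = find_ref(name);

    if (r != NULL) {
        r->count++;
        return 0;
    }
    if (strlen(name) >= NAME_LEN)
        return -1;
    for (size_t i = 0; i < MAX_VOLUMES; i++) {
        if (refs[i].count == 0) {
            strcpy(refs[i].name, name);
            refs[i].count = 1;
            return 1;
        }
    }
    return -1; /* table full */
}

/*
 * Record an Unmount request. Returns 1 if the caller must perform the real
 * umount(2) (last user gone), 0 if others still use it, -1 if unknown.
 */
int ref_release(const char *name)
{
    struct vol_ref *r = find_ref(name);

    if (r == NULL)
        return -1;
    r->count--;
    return r->count == 0 ? 1 : 0;
}
```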

---

## 7. Installation

### Build and install

```bash
sudo apt install build-essential libmicrohttpd-dev
make
sudo make install
sudo systemctl enable --now docker-nfsd
```

This installs the binary as `/usr/local/sbin/docker-nfsd`
and registers a systemd service unit.

---

## 8. Usage

Example:

```bash
docker volume create -d nfsd \
  --opt server=127.0.0.1 \
  --opt path=/exports/data \
  myvolume

docker run --rm -v myvolume:/mnt alpine df -h /mnt
```

`docker-nfsd` will:

* create `/var/lib/docker-volumes-nfsd/myvolume`
* mount `127.0.0.1:/exports/data` there
* return the mountpoint to Docker

From Docker’s perspective, it’s a normal persistent volume.

---

## 9. Design Philosophy

* Written in **pure C** for transparency and performance.
* Uses only **libmicrohttpd** and **syscalls**.
* Does one job, and does it predictably.
* Follows the same principle as every proper Unix daemon:

  > “Start once, listen quietly, do your work, and stay out of the way.”

---

## 10. License

GPLv2 with the Affero clause (free as in freedom, and free as in beer).

Use it, modify it, improve it, or ignore it.
Just don’t rewrite it in Go.

---

### — Sotiris from Greece