diff --git a/README.md b/README.md
index 14c89b2..85b2e90 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,13 @@
+While functional and tested to some extent, this project is still in an early development phase: don't use it in production or critical environments yet.
+
# passt: Plug A Simple Socket Transport
-_passt_ implements a translation layer between a Layer-2 network interface (tap)
-and native Layer-4 sockets (TCP, UDP, ICMP/ICMPv6 echo) on a host. It doesn't
+_passt_ implements a translation layer between a Layer-2 network interface and
+native Layer-4 sockets (TCP, UDP, ICMP/ICMPv6 echo) on a host. It doesn't
require any capabilities or privileges, and it can be used as a simple
replacement for Slirp.
-
+
-- [General idea](#general-idea)
+# pasta: Pack A Subtle Tap Abstraction
+
+_pasta_ (same binary as _passt_, different command) offers equivalent
+functionality, for network namespaces: traffic is forwarded using a tap
+interface inside the namespace, without the need to create further interfaces on
+the host, hence not requiring any capabilities or privileges.
+
+It also implements a tap bypass path for local connections: packets with a local
+destination address are moved directly between Layer-4 sockets, avoiding Layer-2
+translations, using the _splice_(2) and _recvmmsg_(2)/_sendmmsg_(2) system calls
+for TCP and UDP, respectively.
+
+
+
+- [Motivation](#motivation)
- [Non-functional Targets](#non-functional-targets)
- [Interfaces and Environment](#interfaces-and-environment)
- [Services](#services)
- [Addresses](#addresses)
- [Protocols](#protocols)
- [Ports](#ports)
+- [Continuous Integration](#continuous-integration)
+- [Performance](#performance)
- [Try it](#try-it)
- [Contribute](#contribute)
-## General idea
+## Motivation
+
+### passt
When container workloads are moved to virtual machines, the network traffic is
typically forwarded by interfaces operating at data link level. Some components
@@ -110,19 +130,17 @@ in the containers ecosystem (such as _service meshes_), however, expect
applications to run locally, with visible sockets and processes, for the
purposes of socket redirection, monitoring, port mapping.
-To solve this issue, user mode networking as provided e.g. by _Slirp_,
-_libslirp_, _slirp4netns_ can be used. However, these existing solutions
-implement a full TCP/IP stack, replaying traffic on sockets that are local to
-the pod of the service mesh. This creates the illusion of application processes
-running on the same host, eventually separated by user namespaces.
+To solve this issue, user mode networking, as provided e.g. by _libslirp_,
+can be used. Existing solutions implement a full TCP/IP stack, replaying traffic
+on sockets that are local to the pod of the service mesh. This creates the
+illusion of application processes running on the same host, possibly separated
+by user namespaces.
While being almost transparent to the service mesh infrastructure, that kind of
solution comes with a number of downsides:
* three different TCP/IP stacks (guest, adaptation and host) need to be
- traversed for every service request. There are no chances to implement
- zero-copy mechanisms, and the amount of context switches increases
- dramatically
+ traversed for every service request
* addressing needs to be coordinated to create the pretense of consistent
addresses and routes between guest and host environments. This typically needs
a NAT with masquerading, or some form of packet bridging
@@ -135,21 +153,43 @@ solution comes with a number of downsides:
would if deployed with regular containers
_passt_ implements a thinner layer between guest and host, that only implements
-what's strictly needed to pretend processes are running locally. A further, full
-TCP/IP stack is not necessarily needed. Some sort of TCP adaptation is needed,
-however, as this layer runs without the `CAP_NET_RAW` capability: we can't
-create raw IP sockets on the pod, and therefore need to map packets at Layer-2
-to Layer-4 sockets offered by the host kernel.
+what's strictly needed to pretend processes are running locally. The TCP
+adaptation doesn't keep per-connection packet buffers, and reflects observed
+sending windows and acknowledgements between the two sides. This TCP adaptation
+is needed as _passt_ runs without the `CAP_NET_RAW` capability: it can't create
+raw IP sockets on the pod, and therefore needs to map packets at Layer-2 to
+Layer-4 sockets offered by the host kernel.
The problem and this approach are illustrated in more detail, with diagrams,
[here](https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Networking.md).
+### pasta
+
+On Linux, regular users can create network namespaces and run application
+services inside them. However, connecting namespaces to other namespaces and to
+external hosts requires the creation of network interfaces, such as `veth`
+pairs, which in turn requires elevated privileges or the `CAP_NET_ADMIN`
+capability. _pasta_, similarly to _slirp4netns_, solves this problem by creating
+a tap interface available to processes in the namespace, and mapping network
+traffic outside the namespace using native Layer-4 sockets.
+
+Existing approaches typically implement a full, generic TCP/IP stack for this
+translation between data and transport layers, without the possibility of
+speeding up local connections, and usually requiring NAT. _pasta_:
+* avoids the need for a generic, full-fledged TCP/IP stack by coordinating TCP
+connection dynamics between sender and receiver
+* offers a fast bypass path for local connections: if a process connects to
+another process on the same host across namespaces, data is directly forwarded
+using pairs of Layer-4 sockets
+* with default options, maps routing and addressing information to the
+namespace, avoiding any need for NAT
+
## Non-functional Targets
Security and maintainability goals:
* no dynamic memory allocation
-* ~2 000 LoC target
+* ~5 000 LoC target
* no external dependencies
## Interfaces and Environment
@@ -166,83 +206,125 @@ TCP. Two temporary solutions are available:
This approach, compared to using a _tap_ device, doesn't require any security
capabilities, as we don't need to create any interface.
+_pasta_ runs out of the box with any recent (post-3.8) Linux kernel.
+
## Services
-_passt_ provides some minimalistic implementations of networking services that
-can't practically run on the host:
+_passt_ and _pasta_ provide some minimalistic implementations of networking
+services:
* [ARP proxy](https://passt.top/passt/tree/arp.c), that resolves the address of
the host (which is used as gateway) to the original MAC address of the host
* [DHCP server](https://passt.top/passt/tree/dhcp.c), a simple implementation
- handing out one single IPv4 address to the guest, namely, the same address as
- the first one configured for the upstream host interface, and passing the
- nameservers configured on the host
+ handing out one single IPv4 address to the guest or namespace, namely, the
+ same address as the first one configured for the upstream host interface, and
+ passing the nameservers configured on the host
* [NDP proxy](https://passt.top/passt/tree/ndp.c), which can also assign prefix
and nameserver using SLAAC
* [DHCPv6 server](https://passt.top/passt/tree/dhcpv6.c): a simple
- implementation handing out one single IPv6 address to the guest, namely, the
- the same address as the first one configured for the upstream host interface,
- and passing the first nameserver configured on the host
+ implementation handing out one single IPv6 address to the guest or namespace,
+  namely, the same address as the first one configured for the upstream host
+ interface, and passing the nameservers configured on the host
## Addresses
-For IPv4, the guest is assigned, via DHCP, the same address as the upstream
-interface of the host, and the same default gateway as the default gateway of
-the host. Addresses are translated in case the guest is seen using a different
-address from the assigned one.
+For IPv4, the guest or namespace is assigned, via DHCP, the same address as the
+upstream interface of the host, and the same default gateway as the default
+gateway of the host. Addresses are translated in case the guest is seen using a
+different address from the assigned one.
-For IPv6, the guest is assigned, via SLAAC, the same prefix as the upstream
-interface of the host, the same default route as the default route of the
-host, and, if a DHCPv6 client is running on the guest, also the same address as
-the upstream address of the host. This means that, with a DHCPv6 client on the
-guest, addresses don't need to be translated. Should the client use a different
-address, the destination address is translated for packets going to the guest.
+For IPv6, the guest or namespace is assigned, via SLAAC, the same prefix as the
+upstream interface of the host, the same default route as the default route of
+the host, and, if a DHCPv6 client is running in the guest or namespace, also the
+same address as the upstream address of the host. This means that, with a DHCPv6
+client in the guest or namespace, addresses don't need to be translated. Should
+the client use a different address, the destination address is translated for
+packets going to the guest or to the namespace.
-For UDP and TCP, for both IPv4 and IPv6, packets addressed to a loopback address
-are forwarded to the guest with their source address changed to the address of
-the gateway or first hop of the default route. This mapping is reversed as the
-guest replies to those packets (on the same TCP connection, or using destination
-port and address that were used as source for UDP).
+### Local connections with _passt_
+
+For UDP and TCP, for both IPv4 and IPv6, packets from the host addressed to a
+loopback address are forwarded to the guest with their source address changed to
+the address of the gateway or first hop of the default route. This mapping is
+reversed in the other direction.
+
+### Local connections with _pasta_
+
+Packets addressed to a loopback address in either namespace are directly
+forwarded to the corresponding (or configured) port in the other namespace.
+Similarly to _passt_, packets from the non-init namespace addressed to the
+default gateway, which are therefore sent via the tap device, will have their
+destination address translated to the loopback address.
## Protocols
-_passt_ supports TCP, UDP and ICMP/ICMPv6 echo (requests and replies). More
-details about the TCP implementation are available
+_passt_ and _pasta_ support TCP, UDP and ICMP/ICMPv6 echo (requests and
+replies). More details about the TCP implementation are available
[here](https://passt.top/passt/tree/tcp.c), and for the UDP
implementation [here](https://passt.top/passt/tree/udp.c).
-An IGMP proxy is currently work in progress.
+An IGMP/MLD proxy is currently a work in progress.
## Ports
-To avoid the need for explicit port mapping configuration, _passt_ binds to all
-unbound non-ephemeral (0-49152) TCP and UDP ports. Binding to low ports (0-1023)
-will fail without additional capabilities, and ports already bound (service
-proxies, etc.) will also not be used.
+### passt
+
+To avoid the need for explicit port mapping configuration, _passt_ can bind to
+all unbound non-ephemeral (0-49152) TCP and UDP ports. Binding to low ports
+(0-1023) will fail without additional capabilities, and ports already bound
+(service proxies, etc.) will also not be used. Smaller subsets of ports, with
+port translations, are also configurable.
UDP ephemeral ports are bound dynamically, as the guest uses them.
-Service proxies and other services running in the container need to be started
-before _passt_ starts.
+If all ports are forwarded, service proxies and other services running in the
+container need to be started before _passt_ starts.
+
+### pasta
+
+With default options, _pasta_ scans for bound ports on init and non-init
+namespaces, and automatically forwards them from the other side. Port forwarding
+is fully configurable with command line options.
+
+## Continuous Integration
+
+