passt: Relicense to GPL 2.0, or any later version
In practical terms, passt doesn't benefit from the additional
protection offered by the AGPL over the GPL, because it's not
suitable to be executed over a computer network.
Further, restricting the distribution under the version 3 of the GPL
wouldn't provide any practical advantage either, as long as the passt
codebase is concerned, and might cause unnecessary compatibility
dilemmas.
Change licensing terms to the GNU General Public License Version 2,
or any later version, with written permission from all current and
past contributors, namely: myself, David Gibson, Laine Stump, Andrea
Bolognani, Paul Holzinger, Richard W.M. Jones, Chris Kuhn, Florian
Weimer, Giuseppe Scrivano, Stefan Hajnoczi, and Vasiliy Ulyanov.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-04-05 20:11:44 +02:00
|
|
|
// SPDX-License-Identifier: GPL-2.0-or-later
|
passt: New design and implementation with native Layer 4 sockets
This is a reimplementation, partially building on the earlier draft,
that uses L4 sockets (SOCK_DGRAM, SOCK_STREAM) instead of SOCK_RAW,
providing L4-L2 translation functionality without requiring any
security capability.
Conceptually, this follows the design presented at:
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Networking.md
The most significant novelty here comes from TCP and UDP translation
layers. In particular, the TCP state and translation logic follows
the intent of being minimalistic, without reimplementing a full TCP
stack in either direction, and synchronising as much as possible the
TCP dynamic and flows between guest and host kernel.
Another important introduction concerns addressing, port translation
and forwarding. The Layer 4 implementations now attempt to bind on
all unbound ports, in order to forward connections in a transparent
way.
While at it:
- the qemu 'tap' back-end can't be used as-is by qrap anymore,
because of explicit checks now introduced in qemu to ensure that
the corresponding file descriptor is actually a tap device. For
this reason, qrap now operates on a 'socket' back-end type,
accounting for and building the additional header reporting
frame length
- provide a demo script that sets up namespaces, addresses and
routes, and starts the daemon. A virtual machine started in the
network namespace, wrapped by qrap, will now directly interface
with passt and communicate using Layer 4 sockets provided by the
host kernel.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-02-16 07:25:09 +01:00
|
|
|
|
2020-07-21 10:48:24 +02:00
|
|
|
/* PASST - Plug A Simple Socket Transport
|
passt: Add PASTA mode, major rework
PASTA (Pack A Subtle Tap Abstraction) provides quasi-native host
connectivity to an otherwise disconnected, unprivileged network
and user namespace, similarly to slirp4netns. Given that the
implementation is largely overlapping with PASST, no separate binary
is built: 'pasta' (and 'passt4netns' for clarity) both link to
'passt', and the mode of operation is selected depending on how the
binary is invoked. Usage example:
$ unshare -rUn
# echo $$
1871759
$ ./pasta 1871759 # From another terminal
# udhcpc -i pasta0 2>/dev/null
# ping -c1 pasta.pizza
PING pasta.pizza (64.190.62.111) 56(84) bytes of data.
64 bytes from 64.190.62.111 (64.190.62.111): icmp_seq=1 ttl=255 time=34.6 ms
--- pasta.pizza ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 34.575/34.575/34.575/0.000 ms
# ping -c1 spaghetti.pizza
PING spaghetti.pizza(2606:4700:3034::6815:147a (2606:4700:3034::6815:147a)) 56 data bytes
64 bytes from 2606:4700:3034::6815:147a (2606:4700:3034::6815:147a): icmp_seq=1 ttl=255 time=29.0 ms
--- spaghetti.pizza ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 28.967/28.967/28.967/0.000 ms
This entails a major rework, especially with regard to the storage of
tracked connections and to the semantics of epoll(7) references.
Indexing TCP and UDP bindings merely by socket proved to be
inflexible and unsuitable to handle different connection flows: pasta
also provides Layer-2 to Layer-2 socket mapping between init and a
separate namespace for local connections, using a pair of splice()
system calls for TCP, and a recvmmsg()/sendmmsg() pair for UDP local
bindings. For instance, building on the previous example:
# ip link set dev lo up
# iperf3 -s
$ iperf3 -c ::1 -Z -w 32M -l 1024k -P2 | tail -n4
[SUM] 0.00-10.00 sec 52.3 GBytes 44.9 Gbits/sec 283 sender
[SUM] 0.00-10.43 sec 52.3 GBytes 43.1 Gbits/sec receiver
iperf Done.
epoll(7) references now include a generic part in order to
demultiplex data to the relevant protocol handler, using 24
bits for the socket number, and an opaque portion reserved for
usage by the single protocol handlers, in order to track sockets
back to corresponding connections and bindings.
A number of fixes pertaining to TCP state machine and congestion
window handling are also included here.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-07-17 08:34:53 +02:00
|
|
|
* for qemu/UNIX domain socket mode
|
|
|
|
*
|
|
|
|
* PASTA - Pack A Subtle Tap Abstraction
|
|
|
|
* for network namespace/tap device mode
|
2020-07-21 10:48:24 +02:00
|
|
|
*
|
|
|
|
* ndp.c - NDP support for PASST
|
|
|
|
*
|
passt: New design and implementation with native Layer 4 sockets
This is a reimplementation, partially building on the earlier draft,
that uses L4 sockets (SOCK_DGRAM, SOCK_STREAM) instead of SOCK_RAW,
providing L4-L2 translation functionality without requiring any
security capability.
Conceptually, this follows the design presented at:
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Networking.md
The most significant novelty here comes from TCP and UDP translation
layers. In particular, the TCP state and translation logic follows
the intent of being minimalistic, without reimplementing a full TCP
stack in either direction, and synchronising as much as possible the
TCP dynamic and flows between guest and host kernel.
Another important introduction concerns addressing, port translation
and forwarding. The Layer 4 implementations now attempt to bind on
all unbound ports, in order to forward connections in a transparent
way.
While at it:
- the qemu 'tap' back-end can't be used as-is by qrap anymore,
because of explicit checks now introduced in qemu to ensure that
the corresponding file descriptor is actually a tap device. For
this reason, qrap now operates on a 'socket' back-end type,
accounting for and building the additional header reporting
frame length
- provide a demo script that sets up namespaces, addresses and
routes, and starts the daemon. A virtual machine started in the
network namespace, wrapped by qrap, will now directly interface
with passt and communicate using Layer 4 sockets provided by the
host kernel.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-02-16 07:25:09 +01:00
|
|
|
* Copyright (c) 2020-2021 Red Hat GmbH
|
2020-07-21 10:48:24 +02:00
|
|
|
* Author: Stefano Brivio <sbrivio@redhat.com>
|
|
|
|
*
|
|
|
|
*/
|
|
|
|
|
|
|
|
#include <stdio.h>
|
|
|
|
#include <stddef.h>
|
|
|
|
#include <stdint.h>
|
|
|
|
#include <unistd.h>
|
|
|
|
#include <string.h>
|
dhcp, ndp, dhcpv6: Support for multiple DNS servers, search list
Add support for a variable amount of DNS servers, including zero,
from /etc/resolv.conf, in DHCP, NDP and DHCPv6 implementations.
Introduce support for domain search list for DHCP (RFC 3397),
NDP (RFC 8106), and DHCPv6 (RFC 3646), also sourced from
/etc/resolv.conf.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-05-21 11:14:47 +02:00
|
|
|
#include <arpa/inet.h>
|
2021-10-21 04:26:08 +02:00
|
|
|
#include <netinet/ip.h>
|
2020-07-21 10:48:24 +02:00
|
|
|
#include <net/if.h>
|
|
|
|
#include <net/if_arp.h>
|
2021-10-21 04:26:08 +02:00
|
|
|
#include <netinet/if_ether.h>
|
|
|
|
|
|
|
|
#include <linux/icmpv6.h>
|
2020-07-21 10:48:24 +02:00
|
|
|
|
2021-07-26 07:18:50 +02:00
|
|
|
#include "checksum.h"
|
2020-07-21 10:48:24 +02:00
|
|
|
#include "util.h"
|
passt: Add PASTA mode, major rework
PASTA (Pack A Subtle Tap Abstraction) provides quasi-native host
connectivity to an otherwise disconnected, unprivileged network
and user namespace, similarly to slirp4netns. Given that the
implementation is largely overlapping with PASST, no separate binary
is built: 'pasta' (and 'passt4netns' for clarity) both link to
'passt', and the mode of operation is selected depending on how the
binary is invoked. Usage example:
$ unshare -rUn
# echo $$
1871759
$ ./pasta 1871759 # From another terminal
# udhcpc -i pasta0 2>/dev/null
# ping -c1 pasta.pizza
PING pasta.pizza (64.190.62.111) 56(84) bytes of data.
64 bytes from 64.190.62.111 (64.190.62.111): icmp_seq=1 ttl=255 time=34.6 ms
--- pasta.pizza ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 34.575/34.575/34.575/0.000 ms
# ping -c1 spaghetti.pizza
PING spaghetti.pizza(2606:4700:3034::6815:147a (2606:4700:3034::6815:147a)) 56 data bytes
64 bytes from 2606:4700:3034::6815:147a (2606:4700:3034::6815:147a): icmp_seq=1 ttl=255 time=29.0 ms
--- spaghetti.pizza ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 28.967/28.967/28.967/0.000 ms
This entails a major rework, especially with regard to the storage of
tracked connections and to the semantics of epoll(7) references.
Indexing TCP and UDP bindings merely by socket proved to be
inflexible and unsuitable to handle different connection flows: pasta
also provides Layer-2 to Layer-2 socket mapping between init and a
separate namespace for local connections, using a pair of splice()
system calls for TCP, and a recvmmsg()/sendmmsg() pair for UDP local
bindings. For instance, building on the previous example:
# ip link set dev lo up
# iperf3 -s
$ iperf3 -c ::1 -Z -w 32M -l 1024k -P2 | tail -n4
[SUM] 0.00-10.00 sec 52.3 GBytes 44.9 Gbits/sec 283 sender
[SUM] 0.00-10.43 sec 52.3 GBytes 43.1 Gbits/sec receiver
iperf Done.
epoll(7) references now include a generic part in order to
demultiplex data to the relevant protocol handler, using 24
bits for the socket number, and an opaque portion reserved for
usage by the single protocol handlers, in order to track sockets
back to corresponding connections and bindings.
A number of fixes pertaining to TCP state machine and congestion
window handling are also included here.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-07-17 08:34:53 +02:00
|
|
|
#include "passt.h"
|
passt: New design and implementation with native Layer 4 sockets
This is a reimplementation, partially building on the earlier draft,
that uses L4 sockets (SOCK_DGRAM, SOCK_STREAM) instead of SOCK_RAW,
providing L4-L2 translation functionality without requiring any
security capability.
Conceptually, this follows the design presented at:
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Networking.md
The most significant novelty here comes from TCP and UDP translation
layers. In particular, the TCP state and translation logic follows
the intent of being minimalistic, without reimplementing a full TCP
stack in either direction, and synchronising as much as possible the
TCP dynamic and flows between guest and host kernel.
Another important introduction concerns addressing, port translation
and forwarding. The Layer 4 implementations now attempt to bind on
all unbound ports, in order to forward connections in a transparent
way.
While at it:
- the qemu 'tap' back-end can't be used as-is by qrap anymore,
because of explicit checks now introduced in qemu to ensure that
the corresponding file descriptor is actually a tap device. For
this reason, qrap now operates on a 'socket' back-end type,
accounting for and building the additional header reporting
frame length
- provide a demo script that sets up namespaces, addresses and
routes, and starts the daemon. A virtual machine started in the
network namespace, wrapped by qrap, will now directly interface
with passt and communicate using Layer 4 sockets provided by the
host kernel.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-02-16 07:25:09 +01:00
|
|
|
#include "tap.h"
|
2022-09-24 09:53:15 +02:00
|
|
|
#include "log.h"
|
2020-07-21 10:48:24 +02:00
|
|
|
|
|
|
|
#define RS 133
|
|
|
|
#define RA 134
|
|
|
|
#define NS 135
|
|
|
|
#define NA 136
|
|
|
|
|
|
|
|
/**
|
|
|
|
* ndp() - Check for NDP solicitations, reply as needed
|
|
|
|
* @c: Execution context
|
treewide: Packet abstraction with mandatory boundary checks
Implement a packet abstraction providing boundary and size checks
based on packet descriptors: packets stored in a buffer can be queued
into a pool (without storage of its own), and data can be retrieved
referring to an index in the pool, specifying offset and length.
Checks ensure data is not read outside the boundaries of buffer and
descriptors, and that packets added to a pool are within the buffer
range with valid offset and indices.
This implies a wider rework: usage of the "queueing" part of the
abstraction mostly affects tap_handler_{passt,pasta}() functions and
their callees, while the "fetching" part affects all the guest or tap
facing implementations: TCP, UDP, ICMP, ARP, NDP, DHCP and DHCPv6
handlers.
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-25 13:02:47 +01:00
|
|
|
* @ih: ICMPv6 header
|
|
|
|
* @saddr Source IPv6 address
|
2020-07-21 10:48:24 +02:00
|
|
|
*
|
|
|
|
* Return: 0 if not handled here, 1 if handled, -1 on failure
|
|
|
|
*/
|
2022-10-19 11:43:54 +11:00
|
|
|
int ndp(struct ctx *c, const struct icmp6hdr *ih, const struct in6_addr *saddr)
|
2020-07-21 10:48:24 +02:00
|
|
|
{
|
2022-10-19 11:43:55 +11:00
|
|
|
const struct in6_addr *rsaddr; /* src addr for reply */
|
2020-07-21 10:48:24 +02:00
|
|
|
char buf[BUFSIZ] = { 0 };
|
treewide: Packet abstraction with mandatory boundary checks
Implement a packet abstraction providing boundary and size checks
based on packet descriptors: packets stored in a buffer can be queued
into a pool (without storage of its own), and data can be retrieved
referring to an index in the pool, specifying offset and length.
Checks ensure data is not read outside the boundaries of buffer and
descriptors, and that packets added to a pool are within the buffer
range with valid offset and indices.
This implies a wider rework: usage of the "queueing" part of the
abstraction mostly affects tap_handler_{passt,pasta}() functions and
their callees, while the "fetching" part affects all the guest or tap
facing implementations: TCP, UDP, ICMP, ARP, NDP, DHCP and DHCPv6
handlers.
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-25 13:02:47 +01:00
|
|
|
struct ipv6hdr *ip6hr;
|
|
|
|
struct icmp6hdr *ihr;
|
|
|
|
struct ethhdr *ehr;
|
|
|
|
unsigned char *p;
|
|
|
|
size_t len;
|
2020-07-21 10:48:24 +02:00
|
|
|
|
treewide: Packet abstraction with mandatory boundary checks
Implement a packet abstraction providing boundary and size checks
based on packet descriptors: packets stored in a buffer can be queued
into a pool (without storage of its own), and data can be retrieved
referring to an index in the pool, specifying offset and length.
Checks ensure data is not read outside the boundaries of buffer and
descriptors, and that packets added to a pool are within the buffer
range with valid offset and indices.
This implies a wider rework: usage of the "queueing" part of the
abstraction mostly affects tap_handler_{passt,pasta}() functions and
their callees, while the "fetching" part affects all the guest or tap
facing implementations: TCP, UDP, ICMP, ARP, NDP, DHCP and DHCPv6
handlers.
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-25 13:02:47 +01:00
|
|
|
if (ih->icmp6_type < RS || ih->icmp6_type > NA)
|
2020-07-21 10:48:24 +02:00
|
|
|
return 0;
|
|
|
|
|
2021-08-12 15:42:43 +02:00
|
|
|
if (c->no_ndp)
|
|
|
|
return 1;
|
|
|
|
|
2020-07-21 10:48:24 +02:00
|
|
|
ehr = (struct ethhdr *)buf;
|
|
|
|
ip6hr = (struct ipv6hdr *)(ehr + 1);
|
|
|
|
ihr = (struct icmp6hdr *)(ip6hr + 1);
|
|
|
|
|
|
|
|
if (ih->icmp6_type == NS) {
|
treewide: Packet abstraction with mandatory boundary checks
Implement a packet abstraction providing boundary and size checks
based on packet descriptors: packets stored in a buffer can be queued
into a pool (without storage of its own), and data can be retrieved
referring to an index in the pool, specifying offset and length.
Checks ensure data is not read outside the boundaries of buffer and
descriptors, and that packets added to a pool are within the buffer
range with valid offset and indices.
This implies a wider rework: usage of the "queueing" part of the
abstraction mostly affects tap_handler_{passt,pasta}() functions and
their callees, while the "fetching" part affects all the guest or tap
facing implementations: TCP, UDP, ICMP, ARP, NDP, DHCP and DHCPv6
handlers.
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-25 13:02:47 +01:00
|
|
|
if (IN6_IS_ADDR_UNSPECIFIED(saddr))
|
2021-10-21 12:13:44 +02:00
|
|
|
return 1;
|
|
|
|
|
2021-03-18 07:49:08 +01:00
|
|
|
info("NDP: received NS, sending NA");
|
2020-07-21 10:48:24 +02:00
|
|
|
ihr->icmp6_type = NA;
|
|
|
|
ihr->icmp6_code = 0;
|
|
|
|
ihr->icmp6_router = 1;
|
|
|
|
ihr->icmp6_solicited = 1;
|
|
|
|
ihr->icmp6_override = 1;
|
|
|
|
|
|
|
|
p = (unsigned char *)(ihr + 1);
|
2021-05-21 11:14:52 +02:00
|
|
|
memcpy(p, ih + 1, sizeof(struct in6_addr)); /* target address */
|
2020-07-21 10:48:24 +02:00
|
|
|
p += 16;
|
2021-05-21 11:14:52 +02:00
|
|
|
*p++ = 2; /* target ll */
|
|
|
|
*p++ = 1; /* length */
|
2020-07-21 10:48:24 +02:00
|
|
|
memcpy(p, c->mac, ETH_ALEN);
|
|
|
|
p += 6;
|
|
|
|
} else if (ih->icmp6_type == RS) {
|
2021-10-21 09:41:13 +02:00
|
|
|
size_t dns_s_len = 0;
|
dhcp, ndp, dhcpv6: Support for multiple DNS servers, search list
Add support for a variable amount of DNS servers, including zero,
from /etc/resolv.conf, in DHCP, NDP and DHCPv6 implementations.
Introduce support for domain search list for DHCP (RFC 3397),
NDP (RFC 8106), and DHCPv6 (RFC 3646), also sourced from
/etc/resolv.conf.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-05-21 11:14:47 +02:00
|
|
|
int i, n;
|
|
|
|
|
2021-08-12 15:42:43 +02:00
|
|
|
if (c->no_ra)
|
|
|
|
return 1;
|
|
|
|
|
2021-03-18 07:49:08 +01:00
|
|
|
info("NDP: received RS, sending RA");
|
2020-07-21 10:48:24 +02:00
|
|
|
ihr->icmp6_type = RA;
|
|
|
|
ihr->icmp6_code = 0;
|
2021-10-21 12:12:14 +02:00
|
|
|
ihr->icmp6_hop_limit = 255;
|
ndp: Extend lifetime of prefix, router, RDNSS and search list
Currently, we have no mechanism to dynamically update IPv6
addressing, routing or DNS information (which should eventually be
implemented via netlink monitor), so it makes no sense to limit
lifetimes of NDP information to any particular value.
If we do, with common configurations of systemd-networkd in a guest,
we can end up in a situation where we have a /128 address assigned
via DHCPv6, the NDP-assigned prefix expires, and the default route
also expires. However, as there's a valid address, the prefix is
not renewed. As a result, the default route becomes invalid and we
lose it altogether, which implies that the guest loses IPv6
connectivity except for link-local communication.
Set the router lifetime to the maximum allowed by RFC 8319, that is,
65535 seconds (about 18 hours). RFC 4861 limited this value to 9000
seconds, but RFC 8319 later updated this limit.
Set prefix and DNS information lifetime to infinity. This is allowed
by RFC 4861 and RFC 8319.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-12-08 17:43:17 +01:00
|
|
|
ihr->icmp6_rt_lifetime = htons(65535); /* RFC 8319 */
|
2021-04-13 21:59:47 +02:00
|
|
|
ihr->icmp6_addrconf_managed = 1;
|
2020-07-21 10:48:24 +02:00
|
|
|
|
|
|
|
p = (unsigned char *)(ihr + 1);
|
|
|
|
p += 8; /* reachable, retrans time */
|
|
|
|
*p++ = 3; /* prefix */
|
|
|
|
*p++ = 4; /* length */
|
|
|
|
*p++ = 64; /* prefix length */
|
2021-04-13 21:59:47 +02:00
|
|
|
*p++ = 0xc0; /* prefix flags: L, A */
|
ndp: Extend lifetime of prefix, router, RDNSS and search list
Currently, we have no mechanism to dynamically update IPv6
addressing, routing or DNS information (which should eventually be
implemented via netlink monitor), so it makes no sense to limit
lifetimes of NDP information to any particular value.
If we do, with common configurations of systemd-networkd in a guest,
we can end up in a situation where we have a /128 address assigned
via DHCPv6, the NDP-assigned prefix expires, and the default route
also expires. However, as there's a valid address, the prefix is
not renewed. As a result, the default route becomes invalid and we
lose it altogether, which implies that the guest loses IPv6
connectivity except for link-local communication.
Set the router lifetime to the maximum allowed by RFC 8319, that is,
65535 seconds (about 18 hours). RFC 4861 limited this value to 9000
seconds, but RFC 8319 later updated this limit.
Set prefix and DNS information lifetime to infinity. This is allowed
by RFC 4861 and RFC 8319.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-12-08 17:43:17 +01:00
|
|
|
*(uint32_t *)p = (uint32_t)~0U; /* lifetime */
|
2020-07-21 10:48:24 +02:00
|
|
|
p += 4;
|
ndp: Extend lifetime of prefix, router, RDNSS and search list
Currently, we have no mechanism to dynamically update IPv6
addressing, routing or DNS information (which should eventually be
implemented via netlink monitor), so it makes no sense to limit
lifetimes of NDP information to any particular value.
If we do, with common configurations of systemd-networkd in a guest,
we can end up in a situation where we have a /128 address assigned
via DHCPv6, the NDP-assigned prefix expires, and the default route
also expires. However, as there's a valid address, the prefix is
not renewed. As a result, the default route becomes invalid and we
lose it altogether, which implies that the guest loses IPv6
connectivity except for link-local communication.
Set the router lifetime to the maximum allowed by RFC 8319, that is,
65535 seconds (about 18 hours). RFC 4861 limited this value to 9000
seconds, but RFC 8319 later updated this limit.
Set prefix and DNS information lifetime to infinity. This is allowed
by RFC 4861 and RFC 8319.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-12-08 17:43:17 +01:00
|
|
|
*(uint32_t *)p = (uint32_t)~0U; /* preferred lifetime */
|
2020-07-21 10:48:24 +02:00
|
|
|
p += 8;
|
2022-07-22 15:31:18 +10:00
|
|
|
memcpy(p, &c->ip6.addr, 8); /* prefix */
|
2020-07-21 10:48:24 +02:00
|
|
|
p += 16;
|
|
|
|
|
2021-09-07 11:19:57 +02:00
|
|
|
if (c->mtu != -1) {
|
2021-08-12 15:42:43 +02:00
|
|
|
*p++ = 5; /* type */
|
|
|
|
*p++ = 1; /* length */
|
|
|
|
p += 2; /* reserved */
|
|
|
|
*(uint32_t *)p = htonl(c->mtu); /* MTU */
|
|
|
|
p += 4;
|
|
|
|
}
|
|
|
|
|
conf, udp: Introduce basic DNS forwarding
For compatibility with libslirp/slirp4netns users: introduce a
mechanism to map, in the UDP routines, an address facing guest or
namespace to the first IPv4 or IPv6 address resulting from
configuration as resolver. This can be enabled with the new
--dns-forward option.
This implies that sourcing and using DNS addresses and search lists,
passed via command line or read from /etc/resolv.conf, is not bound
anymore to DHCP/DHCPv6/NDP usage: for example, pasta users might just
want to use addresses from /etc/resolv.conf as mapping target, while
not passing DNS options via DHCP.
Reflect this in all the involved code paths by differentiating
DHCP/DHCPv6/NDP usage from DNS configuration per se, and in the new
options --dhcp-dns, --dhcp-search for pasta, and --no-dhcp-dns,
--no-dhcp-search for passt.
This should be the last bit to enable substantial compatibility
between slirp4netns.sh and slirp4netns(1): pass the --dns-forward
option from the script too.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-02-18 04:03:53 +01:00
|
|
|
if (c->no_dhcp_dns)
|
|
|
|
goto dns_done;
|
|
|
|
|
conf, udp: Drop mostly duplicated dns_send arrays, rename related fields
Given that we use just the first valid DNS resolver address
configured, or read from resolv.conf(5) on the host, to forward DNS
queries to, in case --dns-forward is used, we don't need to duplicate
dns[] to dns_send[]:
- rename dns_send[] back to dns[]: those are the resolvers we
advertise to the guest/container
- for forwarding purposes, instead of dns[], use a single field (for
each protocol version): dns_host
- and rename dns_fwd to dns_match, so that it's clear this is the
address we are matching DNS queries against, to decide if they need
to be forwarded
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2022-11-10 20:30:03 +01:00
|
|
|
for (n = 0; !IN6_IS_ADDR_UNSPECIFIED(&c->ip6.dns[n]); n++);
|
dhcp, ndp, dhcpv6: Support for multiple DNS servers, search list
Add support for a variable amount of DNS servers, including zero,
from /etc/resolv.conf, in DHCP, NDP and DHCPv6 implementations.
Introduce support for domain search list for DHCP (RFC 3397),
NDP (RFC 8106), and DHCPv6 (RFC 3646), also sourced from
/etc/resolv.conf.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-05-21 11:14:47 +02:00
|
|
|
if (n) {
|
treewide: Packet abstraction with mandatory boundary checks
Implement a packet abstraction providing boundary and size checks
based on packet descriptors: packets stored in a buffer can be queued
into a pool (without storage of its own), and data can be retrieved
referring to an index in the pool, specifying offset and length.
Checks ensure data is not read outside the boundaries of buffer and
descriptors, and that packets added to a pool are within the buffer
range with valid offset and indices.
This implies a wider rework: usage of the "queueing" part of the
abstraction mostly affects tap_handler_{passt,pasta}() functions and
their callees, while the "fetching" part affects all the guest or tap
facing implementations: TCP, UDP, ICMP, ARP, NDP, DHCP and DHCPv6
handlers.
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-25 13:02:47 +01:00
|
|
|
*p++ = 25; /* RDNSS */
|
|
|
|
*p++ = 1 + 2 * n; /* length */
|
|
|
|
p += 2; /* reserved */
|
ndp: Extend lifetime of prefix, router, RDNSS and search list
Currently, we have no mechanism to dynamically update IPv6
addressing, routing or DNS information (which should eventually be
implemented via netlink monitor), so it makes no sense to limit
lifetimes of NDP information to any particular value.
If we do, with common configurations of systemd-networkd in a guest,
we can end up in a situation where we have a /128 address assigned
via DHCPv6, the NDP-assigned prefix expires, and the default route
also expires. However, as there's a valid address, the prefix is
not renewed. As a result, the default route becomes invalid and we
lose it altogether, which implies that the guest loses IPv6
connectivity except for link-local communication.
Set the router lifetime to the maximum allowed by RFC 8319, that is,
65535 seconds (about 18 hours). RFC 4861 limited this value to 9000
seconds, but RFC 8319 later updated this limit.
Set prefix and DNS information lifetime to infinity. This is allowed
by RFC 4861 and RFC 8319.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-12-08 17:43:17 +01:00
|
|
|
*(uint32_t *)p = (uint32_t)~0U; /* lifetime */
|
dhcp, ndp, dhcpv6: Support for multiple DNS servers, search list
Add support for a variable amount of DNS servers, including zero,
from /etc/resolv.conf, in DHCP, NDP and DHCPv6 implementations.
Introduce support for domain search list for DHCP (RFC 3397),
NDP (RFC 8106), and DHCPv6 (RFC 3646), also sourced from
/etc/resolv.conf.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-05-21 11:14:47 +02:00
|
|
|
p += 4;
|
|
|
|
|
|
|
|
for (i = 0; i < n; i++) {
|
conf, udp: Drop mostly duplicated dns_send arrays, rename related fields
Given that we use just the first valid DNS resolver address
configured, or read from resolv.conf(5) on the host, to forward DNS
queries to, in case --dns-forward is used, we don't need to duplicate
dns[] to dns_send[]:
- rename dns_send[] back to dns[]: those are the resolvers we
advertise to the guest/container
- for forwarding purposes, instead of dns[], use a single field (for
each protocol version): dns_host
- and rename dns_fwd to dns_match, so that it's clear this is the
address we are matching DNS queries against, to decide if they need
to be forwarded
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2022-11-10 20:30:03 +01:00
|
|
|
memcpy(p, &c->ip6.dns[i], 16); /* address */
|
|
|
|
p += 16;
|
dhcp, ndp, dhcpv6: Support for multiple DNS servers, search list
Add support for a variable amount of DNS servers, including zero,
from /etc/resolv.conf, in DHCP, NDP and DHCPv6 implementations.
Introduce support for domain search list for DHCP (RFC 3397),
NDP (RFC 8106), and DHCPv6 (RFC 3646), also sourced from
/etc/resolv.conf.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-05-21 11:14:47 +02:00
|
|
|
}
|
2021-10-21 17:34:42 +02:00
|
|
|
|
|
|
|
for (n = 0; *c->dns_search[n].n; n++)
|
|
|
|
dns_s_len += strlen(c->dns_search[n].n) + 2;
|
dhcp, ndp, dhcpv6: Support for multiple DNS servers, search list
Add support for a variable amount of DNS servers, including zero,
from /etc/resolv.conf, in DHCP, NDP and DHCPv6 implementations.
Introduce support for domain search list for DHCP (RFC 3397),
NDP (RFC 8106), and DHCPv6 (RFC 3646), also sourced from
/etc/resolv.conf.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-05-21 11:14:47 +02:00
|
|
|
}
|
|
|
|
|
conf, udp: Introduce basic DNS forwarding
For compatibility with libslirp/slirp4netns users: introduce a
mechanism to map, in the UDP routines, an address facing guest or
namespace to the first IPv4 or IPv6 address resulting from
configuration as resolver. This can be enabled with the new
--dns-forward option.
This implies that sourcing and using DNS addresses and search lists,
passed via command line or read from /etc/resolv.conf, is not bound
anymore to DHCP/DHCPv6/NDP usage: for example, pasta users might just
want to use addresses from /etc/resolv.conf as mapping target, while
not passing DNS options via DHCP.
Reflect this in all the involved code paths by differentiating
DHCP/DHCPv6/NDP usage from DNS configuration per se, and in the new
options --dhcp-dns, --dhcp-search for pasta, and --no-dhcp-dns,
--no-dhcp-search for passt.
This should be the last bit to enable substantial compatibility
between slirp4netns.sh and slirp4netns(1): pass the --dns-forward
option from the script too.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-02-18 04:03:53 +01:00
|
|
|
if (!c->no_dhcp_dns_search && dns_s_len) {
|
treewide: Packet abstraction with mandatory boundary checks
Implement a packet abstraction providing boundary and size checks
based on packet descriptors: packets stored in a buffer can be queued
into a pool (without storage of its own), and data can be retrieved
referring to an index in the pool, specifying offset and length.
Checks ensure data is not read outside the boundaries of buffer and
descriptors, and that packets added to a pool are within the buffer
range with valid offset and indices.
This implies a wider rework: usage of the "queueing" part of the
abstraction mostly affects tap_handler_{passt,pasta}() functions and
their callees, while the "fetching" part affects all the guest or tap
facing implementations: TCP, UDP, ICMP, ARP, NDP, DHCP and DHCPv6
handlers.
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-25 13:02:47 +01:00
|
|
|
*p++ = 31; /* DNSSL */
|
|
|
|
*p++ = (dns_s_len + 8 - 1) / 8 + 1; /* length */
|
|
|
|
p += 2; /* reserved */
|
ndp: Extend lifetime of prefix, router, RDNSS and search list
Currently, we have no mechanism to dynamically update IPv6
addressing, routing or DNS information (which should eventually be
implemented via netlink monitor), so it makes no sense to limit
lifetimes of NDP information to any particular value.
If we do, with common configurations of systemd-networkd in a guest,
we can end up in a situation where we have a /128 address assigned
via DHCPv6, the NDP-assigned prefix expires, and the default route
also expires. However, as there's a valid address, the prefix is
not renewed. As a result, the default route becomes invalid and we
lose it altogether, which implies that the guest loses IPv6
connectivity except for link-local communication.
Set the router lifetime to the maximum allowed by RFC 8319, that is,
65535 seconds (about 18 hours). RFC 4861 limited this value to 9000
seconds, but RFC 8319 later updated this limit.
Set prefix and DNS information lifetime to infinity. This is allowed
by RFC 4861 and RFC 8319.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-12-08 17:43:17 +01:00
|
|
|
*(uint32_t *)p = (uint32_t)~0U; /* lifetime */
|
dhcp, ndp, dhcpv6: Support for multiple DNS servers, search list
Add support for a variable amount of DNS servers, including zero,
from /etc/resolv.conf, in DHCP, NDP and DHCPv6 implementations.
Introduce support for domain search list for DHCP (RFC 3397),
NDP (RFC 8106), and DHCPv6 (RFC 3646), also sourced from
/etc/resolv.conf.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-05-21 11:14:47 +02:00
|
|
|
p += 4;
|
|
|
|
|
|
|
|
for (i = 0; i < n; i++) {
|
|
|
|
char *dot;
|
|
|
|
|
|
|
|
*(p++) = '.';
|
|
|
|
|
|
|
|
strncpy((char *)p, c->dns_search[i].n,
|
|
|
|
sizeof(buf) -
|
|
|
|
((intptr_t)p - (intptr_t)buf));
|
|
|
|
for (dot = (char *)p - 1; *dot; dot++) {
|
|
|
|
if (*dot == '.')
|
|
|
|
*dot = strcspn(dot + 1, ".");
|
|
|
|
}
|
|
|
|
p += strlen(c->dns_search[i].n);
|
|
|
|
*(p++) = 0;
|
|
|
|
}
|
|
|
|
|
2021-10-21 09:41:13 +02:00
|
|
|
memset(p, 0, 8 - dns_s_len % 8); /* padding */
|
|
|
|
p += 8 - dns_s_len % 8;
|
dhcp, ndp, dhcpv6: Support for multiple DNS servers, search list
Add support for a variable amount of DNS servers, including zero,
from /etc/resolv.conf, in DHCP, NDP and DHCPv6 implementations.
Introduce support for domain search list for DHCP (RFC 3397),
NDP (RFC 8106), and DHCPv6 (RFC 3646), also sourced from
/etc/resolv.conf.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-05-21 11:14:47 +02:00
|
|
|
}
|
2020-07-21 10:48:24 +02:00
|
|
|
|
conf, udp: Introduce basic DNS forwarding
For compatibility with libslirp/slirp4netns users: introduce a
mechanism to map, in the UDP routines, an address facing guest or
namespace to the first IPv4 or IPv6 address resulting from
configuration as resolver. This can be enabled with the new
--dns-forward option.
This implies that sourcing and using DNS addresses and search lists,
passed via command line or read from /etc/resolv.conf, is not bound
anymore to DHCP/DHCPv6/NDP usage: for example, pasta users might just
want to use addresses from /etc/resolv.conf as mapping target, while
not passing DNS options via DHCP.
Reflect this in all the involved code paths by differentiating
DHCP/DHCPv6/NDP usage from DNS configuration per se, and in the new
options --dhcp-dns, --dhcp-search for pasta, and --no-dhcp-dns,
--no-dhcp-search for passt.
This should be the last bit to enable substantial compatibility
between slirp4netns.sh and slirp4netns(1): pass the --dns-forward
option from the script too.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-02-18 04:03:53 +01:00
|
|
|
dns_done:
|
2020-07-21 10:48:24 +02:00
|
|
|
*p++ = 1; /* source ll */
|
|
|
|
*p++ = 1; /* length */
|
|
|
|
memcpy(p, c->mac, ETH_ALEN);
|
|
|
|
p += 6;
|
|
|
|
} else {
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
len = (uintptr_t)p - (uintptr_t)ihr - sizeof(*ihr);
|
|
|
|
|
treewide: Packet abstraction with mandatory boundary checks
Implement a packet abstraction providing boundary and size checks
based on packet descriptors: packets stored in a buffer can be queued
into a pool (without storage of its own), and data can be retrieved
referring to an index in the pool, specifying offset and length.
Checks ensure data is not read outside the boundaries of buffer and
descriptors, and that packets added to a pool are within the buffer
range with valid offset and indices.
This implies a wider rework: usage of the "queueing" part of the
abstraction mostly affects tap_handler_{passt,pasta}() functions and
their callees, while the "fetching" part affects all the guest or tap
facing implementations: TCP, UDP, ICMP, ARP, NDP, DHCP and DHCPv6
handlers.
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-25 13:02:47 +01:00
|
|
|
if (IN6_IS_ADDR_LINKLOCAL(saddr))
|
2022-07-22 15:31:18 +10:00
|
|
|
c->ip6.addr_ll_seen = *saddr;
|
2021-07-21 10:04:17 +02:00
|
|
|
else
|
2022-07-22 15:31:18 +10:00
|
|
|
c->ip6.addr_seen = *saddr;
|
2021-07-21 10:04:17 +02:00
|
|
|
|
2022-07-22 15:31:18 +10:00
|
|
|
if (IN6_IS_ADDR_LINKLOCAL(&c->ip6.gw))
|
2022-10-19 11:43:55 +11:00
|
|
|
rsaddr = &c->ip6.gw;
|
2021-10-20 11:10:23 +02:00
|
|
|
else
|
2022-10-19 11:43:55 +11:00
|
|
|
rsaddr = &c->ip6.addr_ll;
|
2021-10-20 11:10:23 +02:00
|
|
|
|
2022-10-19 11:43:55 +11:00
|
|
|
tap_icmp6_send(c, rsaddr, saddr, ihr, len + sizeof(*ihr));
|
2020-07-21 10:48:24 +02:00
|
|
|
|
|
|
|
return 1;
|
|
|
|
}
|