passt

mirror of https://passt.top/passt synced 2024-12-22 13:45:32 +00:00

Author	SHA1	Message	Date
David Gibson	cbc83e14df	ndp: Split out helpers for sending specific NDP message types Currently the large ndp() function responds to all NDP messages we handle, both parsing the message as necessary and sending the response. Split out the code to construct and send specific message types into ndp_na() (to send NA messages) and ndp_ra() (to send RA messages). As well as breaking up an excessively large function, this is a first step to being able to send unsolicited NDP messages. While we're there, remove a slightly ugly goto. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-11-14 19:00:29 +01:00
David Gibson	4e47167035	ndp: Add ndp_send() helper ndp() has a conditional on message type generating the reply message, then a tiny amount of common code, then another conditional to send the reply with slightly different parameters. We can make this a bit neater by making a helper function for sending the reply, and call it from each of the different message type paths. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-11-14 19:00:28 +01:00
David Gibson	71f228d04b	ndp: Remove redundant update to addr_seen ndp() updates addr_seen or addr_ll_seen based on the source address of the received packet. This is redundant since tap6_handler() has already updated addr_seen for any type of packet, not just NDP. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-11-14 19:00:13 +01:00
David Gibson	d8e05a3fe0	ndp: Use const pointer for ndp_ns packet We don't modify this structure at all. For some reason cppcheck doesn't catch this with our current options, but did when I was experimenting with some different options. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-11-07 12:47:19 +01:00
David Gibson	975cfa5f32	Initialise our_tap_ll to ip6.gw when suitable In every place we use our_tap_ll, we only use it as a fallback if the IPv6 gateway address is not link-local. We can avoid that conditional at use time by doing it at initialisation of our_tap_ll instead. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-08-21 12:00:22 +02:00
David Gibson	a42fb9c000	treewide: Change misleading 'addr_ll' name c->ip6.addr_ll is not like c->ip6.addr. The latter is an address for the guest, but the former is an address for our use on the tap link. Rename it accordingly, to 'our_tap_ll'. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-08-21 12:00:16 +02:00
David Gibson	905ecd2b0b	treewide: Rename MAC address fields for clarity c->mac isn't a great name, because it doesn't say whose mac address it is and it's not necessarily obvious in all the contexts we use it. Since this is specifically the address that we (passt/pasta) use on the tap interface, rename it to "our_tap_mac". Rename the "mac_guest" field to "guest_mac" to be grammatically consistent. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-08-21 11:59:54 +02:00
AbdAlRahman Gad	c16141eda5	ndp.c: Turn NDP responder into more declarative implementation - Add structs for NA, RA, NS, MTU, prefix info, option header, link-layer address, RDNSS, DNSSL and link-layer for RA message. - Turn NA message from purely imperative, going byte by byte, to declarative by filling its struct. - Turn part of RA message into declarative. - Move packet_add() to be before the call of ndp() in tap6_handler() if the protocol of the packet is ICMPv6. - Add a pool of packets as an additional parameter to ndp(). - Check the size of NS packet with packet_get() before sending an NA packet. - Add documentation for the structs. - Add an enum for NDP option types. Link: https://bugs.passt.top/show_bug.cgi?id=21 Signed-off-by: AbdAlRahman Gad <abdobngad@gmail.com> [sbrivio: Minor coding style fixes] Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-08-13 19:46:16 +02:00
David Gibson	5566386f5f	treewide: Standardise variable names for various packet lengths At various points we need to track the lengths of a packet including or excluding various different sets of headers. We don't always use the same variable names for doing so. Worse, in some places we use the same name for different things: e.g. tcp_fill_headers[46]() use ip_len for the length including the IP headers, but then tcp_send_flag() which calls it uses it to mean the IP payload length only. To improve clarity, standardise on these names: dlen: L4 protocol payload length ("data length"); l4len: dlen + length of L4 protocol header; l3len: l4len + length of IPv4/IPv6 header; l2len: l3len + length of L2 (ethernet) header. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-05-02 16:13:23 +02:00
Laurent Vivier	324bd46782	util: move IP stuff from util.[ch] to ip.[ch] Introduce ip.[ch] file to encapsulate IP protocol handling functions and structures. Modify various files to include the new header ip.h when it's needed. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-ID: <20240303135114.1023026-5-lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2024-03-06 08:03:38 +01:00
Stefano Brivio	00358b7828	ndp: Extend lifetime of prefix, router, RDNSS and search list Currently, we have no mechanism to dynamically update IPv6 addressing, routing or DNS information (which should eventually be implemented via netlink monitor), so it makes no sense to limit lifetimes of NDP information to any particular value. If we do, with common configurations of systemd-networkd in a guest, we can end up in a situation where we have a /128 address assigned via DHCPv6, the NDP-assigned prefix expires, and the default route also expires. However, as there's a valid address, the prefix is not renewed. As a result, the default route becomes invalid and we lose it altogether, which implies that the guest loses IPv6 connectivity except for link-local communication. Set the router lifetime to the maximum allowed by RFC 8319, that is, 65535 seconds (about 18 hours). RFC 4861 limited this value to 9000 seconds, but RFC 8319 later updated this limit. Set prefix and DNS information lifetime to infinity. This is allowed by RFC 4861 and RFC 8319. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>	2023-12-27 19:22:29 +01:00
Stefano Brivio	ca2749e1bd	passt: Relicense to GPL 2.0, or any later version In practical terms, passt doesn't benefit from the additional protection offered by the AGPL over the GPL, because it's not suitable to be executed over a computer network. Further, restricting the distribution under the version 3 of the GPL wouldn't provide any practical advantage either, as long as the passt codebase is concerned, and might cause unnecessary compatibility dilemmas. Change licensing terms to the GNU General Public License Version 2, or any later version, with written permission from all current and past contributors, namely: myself, David Gibson, Laine Stump, Andrea Bolognani, Paul Holzinger, Richard W.M. Jones, Chris Kuhn, Florian Weimer, Giuseppe Scrivano, Stefan Hajnoczi, and Vasiliy Ulyanov. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2023-04-06 18:00:33 +02:00
Stefano Brivio	3a2afde87d	conf, udp: Drop mostly duplicated dns_send arrays, rename related fields Given that we use just the first valid DNS resolver address configured, or read from resolv.conf(5) on the host, to forward DNS queries to, in case --dns-forward is used, we don't need to duplicate dns[] to dns_send[]: - rename dns_send[] back to dns[]: those are the resolvers we advertise to the guest/container - for forwarding purposes, instead of dns[], use a single field (for each protocol version): dns_host - and rename dns_fwd to dns_match, so that it's clear this is the address we are matching DNS queries against, to decide if they need to be forwarded Suggested-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>	2022-11-16 15:09:31 +01:00
Stefano Brivio	73f50a76aa	conf: Split the notions of read DNS addresses and offered ones With --dns-forward, if the host has a loopback address configured as DNS server, we should actually use it to forward queries, but, if --no-map-gw is passed, we shouldn't offer the same address via DHCP, NDP and DHCPv6, because it's not going to be reachable. Problematic configuration: * systemd-resolved configuring the usual 127.0.0.53 on the host: we read that from /etc/resolv.conf * --dns-forward specified with an unrelated address, for example 198.51.100.1 We still want to forward queries to 127.0.0.53, if we receive one directed to 198.51.100.1, so we can't drop 127.0.0.53 from our list: we want to use it for forwarding. At the same time, we shouldn't offer 127.0.0.53 to the guest or container either. With this change, I'm only covering the case of automatically configured DNS servers from /etc/resolv.conf. We could extend this to addresses configured with command-line options, but I don't really see a likely use case at this point. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-11-04 12:04:32 +01:00
David Gibson	db07804d26	ndp: Use tap_icmp6_send() helper We send ICMPv6 packets to the guest from both icmp.c and from ndp.c. The case in ndp() manually constructs L2 and IPv6 headers, unlike the version in icmp.c which uses the tap_icmp6_send() helper from tap.c. Now that we've broadened the parameters of tap_icmp6_send() we can use it in ndp() as well, saving some duplicated logic. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-10-19 03:34:53 +02:00
David Gibson	cb1edae3b5	ndp: Remove unneeded eh_source parameter ndp() takes a parameter giving the ethernet source address of the packet it is to respond to, which it uses to determine the destination address to send the reply packet to. This is not necessary, because the address will always be the guest's MAC address. Even if the guest has just changed MAC address, then either tap_handler_passt() or tap_handler_pasta() - which are the only call paths leading to ndp() - will have updated c->mac_guest with the new value. So, remove the parameter, and just use c->mac_guest, making it more consistent with other paths where we construct packets to send inwards. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-10-19 03:34:51 +02:00
David Gibson	fb5d1c5d7d	tap: Remove unhelpful vnet_pre optimization from tap_send() Callers of tap_send() can optionally use a small optimization by adding extra space for the 4 byte length header used on the qemu socket interface. tap_ip_send() is currently the only user of this, but this is used only for "slow path" ICMP and DHCP packets, so there's not a lot of value to the optimization. Worse, having the two paths here complicates the interface and makes future cleanups difficult, so just remove it. I have some plans to bring back the optimization in a more general way in future, but for now it's just in the way. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-10-19 03:34:43 +02:00
David Gibson	7abd2b0d72	Add csum_icmp6() helper for calculating ICMPv6 checksums At least two places in passt calculate ICMPv6 checksums, ndp() and tap_ip_send(). Add a helper to handle this calculation in both places. For future flexibility, the new helper takes parameters for the fields in the IPv6 pseudo-header, so an IPv6 header or pseudo-header doesn't need to be explicitly constructed. It also allows the ICMPv6 header and payload to be in separate buffers, although we don't use this yet. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-10-19 03:34:21 +02:00
Stefano Brivio	da152331cf	Move logging functions to a new file, log.c Logging to file is going to add some further complexity that we don't want to squeeze into util.c. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>	2022-10-14 17:38:25 +02:00
David Gibson	16f5586bb8	Make substructures for IPv4 and IPv6 specific context information The context structure contains a batch of fields specific to IPv4 and to IPv6 connectivity. Split those out into a sub-structure. This allows the conf_ip4() and conf_ip6() functions, which take the entire context but touch very little of it, to be given more specific parameters, making it clearer what it affects without stepping through the code. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2022-07-30 22:14:07 +02:00
Stefano Brivio	48582bf47f	treewide: Mark constant references as const Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-03-29 15:35:38 +02:00
Stefano Brivio	bb70811183	treewide: Packet abstraction with mandatory boundary checks Implement a packet abstraction providing boundary and size checks based on packet descriptors: packets stored in a buffer can be queued into a pool (without storage of its own), and data can be retrieved referring to an index in the pool, specifying offset and length. Checks ensure data is not read outside the boundaries of buffer and descriptors, and that packets added to a pool are within the buffer range with valid offset and indices. This implies a wider rework: usage of the "queueing" part of the abstraction mostly affects tap_handler_{passt,pasta}() functions and their callees, while the "fetching" part affects all the guest or tap facing implementations: TCP, UDP, ICMP, ARP, NDP, DHCP and DHCPv6 handlers. Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-03-29 15:35:38 +02:00
Stefano Brivio	89678c5157	conf, udp: Introduce basic DNS forwarding For compatibility with libslirp/slirp4netns users: introduce a mechanism to map, in the UDP routines, an address facing guest or namespace to the first IPv4 or IPv6 address resulting from configuration as resolver. This can be enabled with the new --dns-forward option. This implies that sourcing and using DNS addresses and search lists, passed via command line or read from /etc/resolv.conf, is not bound anymore to DHCP/DHCPv6/NDP usage: for example, pasta users might just want to use addresses from /etc/resolv.conf as mapping target, while not passing DNS options via DHCP. Reflect this in all the involved code paths by differentiating DHCP/DHCPv6/NDP usage from DNS configuration per se, and in the new options --dhcp-dns, --dhcp-search for pasta, and --no-dhcp-dns, --no-dhcp-search for passt. This should be the last bit to enable substantial compatibility between slirp4netns.sh and slirp4netns(1): pass the --dns-forward option from the script too. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-02-21 13:41:13 +01:00
Stefano Brivio	b93c2c1713	passt: Drop <linux/ipv6.h> include, carry own ipv6hdr and opt_hdr definitions This is the only remaining Linux-specific include -- drop it to avoid clang-tidy warnings and to make code more portable. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2022-01-26 07:57:09 +01:00
Stefano Brivio	73a4a6b7cd	ndp: Don't send a DNS search list if we don't have a list of DNS servers This is not explicitly forbidden, but it confuses the ISC's DHCP client, and doesn't make sense anyway. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-21 17:34:42 +02:00
Stefano Brivio	af55c4e98f	ndp: Don't sabotage DAD by replying to probing neighbour solicitation If the solicitation comes from ::, it's the guest performing duplicate address detection -- don't answer that. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-21 12:16:16 +02:00
Stefano Brivio	bf68270898	ndp: Set (ICMP) hop limit to 255 in router advertisement Found while re-reading this part, zero works as well, but a host might legitimately refuse a value that's below a given threshold. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-21 12:16:16 +02:00
Stefano Brivio	685b50c3ce	Makefile: cppcheck target: Suppress unmatchedSuppression, pass CFLAGS Some of those warnings don't trigger even on systems with very similar toolchains, suppress unmatchedSuppression warnings, they're basically useless. While at it, pass CFLAGS to cppcheck. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-21 12:16:16 +02:00
Stefano Brivio	627e18fa8a	passt: Add cppcheck target, test, and address resulting warnings ...mostly false positives, but a number of very relevant ones too, in tcp_get_sndbuf(), tcp_conn_from_tap(), and siphash PREAMBLE(). Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-21 09:41:13 +02:00
Stefano Brivio	dd942eaa48	passt: Fix build with gcc 7, use std=c99, enable some more Clang checkers Unions and structs, you all have names now. Take the chance to enable bugprone-reserved-identifier, cert-dcl37-c, and cert-dcl51-cpp checkers in clang-tidy. Provide a ffsl() weak declaration using gcc built-in. Start reordering includes, but that's not enough for the llvm-include-order checker yet. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-21 04:26:08 +02:00
Stefano Brivio	9618d24700	ndp, dhcpv6, tcp, udp: Always use link-local as source if gateway isn't This shouldn't happen on any sane configuration, but I just met an example of that: the default IPv6 gateway on the host is configured with a global unicast address, we use that as source for RA, DHCPv6 replies, and the guest ignores it. Same later on if we talk TCP or UDP and the guest has no idea where that address comes from. Use our link-local address in case the gateway address is global. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-20 11:10:23 +02:00
Stefano Brivio	12cfa6444c	passt: Add clang-tidy Makefile target and test, take care of warnings Most are just about style and form, but a few were actually serious mistakes (NDP-related). Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-10-20 08:34:22 +02:00
Stefano Brivio	4b0ccb8323	ndp: Set router lifetime to 9000s instead of 3600s Seen while testing: lifetime expires while we're flooding a tap interface with UDP packets, the router advertisement comes too late, and the kernel drops the default router in the namespace. This should only affect testing, so go for the maximum allowed value, that is, 9000 seconds. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-09-27 01:28:02 +02:00
Stefano Brivio	ec2b58ea4d	conf, dhcp, ndp: Fix message about default MTU, make NDP consistent Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-09-09 15:40:04 +02:00
Stefano Brivio	1e49d194d0	passt, pasta: Introduce command-line options and port re-mapping Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-09-01 17:00:27 +02:00
Stefano Brivio	17765f8de0	checksum: Introduce AVX2 implementation, unify helpers Provide an AVX2-based function using compiler intrinsics for TCP/IP-style checksums. The load/unpack/add idea and implementation is largely based on code from BESS (the Berkeley Extensible Software Switch) licensed as 3-Clause BSD, with a number of modifications to further decrease pipeline stalls and to minimise cache pollution. This speeds up considerably data paths from sockets to tap interfaces, decreasing overhead for checksum computation, with 16-64KiB packet buffers, from approximately 11% to 7%. The rest is just syscalls at this point. While at it, provide convenience targets in the Makefile for avx2, avx2_debug, and debug targets -- these simply add target-specific CFLAGS to the build. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-07-26 07:18:50 +02:00
Stefano Brivio	7fa3e90290	ndp: Store link-local or global address on any NDP message received The guest might not send other types of traffic before we try to communicate to it, so take also this chance to store its configured addresses. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-07-21 10:04:17 +02:00
Stefano Brivio	a9c8d4d924	ndp: Fix calculation of length for DNS Search List option (31) Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-07-17 17:58:03 +02:00
Stefano Brivio	33482d5bf2	passt: Add PASTA mode, major rework PASTA (Pack A Subtle Tap Abstraction) provides quasi-native host connectivity to an otherwise disconnected, unprivileged network and user namespace, similarly to slirp4netns. Given that the implementation is largely overlapping with PASST, no separate binary is built: 'pasta' (and 'passt4netns' for clarity) both link to 'passt', and the mode of operation is selected depending on how the binary is invoked. Usage example: $ unshare -rUn # echo $$ 1871759 $ ./pasta 1871759 # From another terminal # udhcpc -i pasta0 2>/dev/null # ping -c1 pasta.pizza PING pasta.pizza (64.190.62.111) 56(84) bytes of data. 64 bytes from 64.190.62.111 (64.190.62.111): icmp_seq=1 ttl=255 time=34.6 ms --- pasta.pizza ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 34.575/34.575/34.575/0.000 ms # ping -c1 spaghetti.pizza PING spaghetti.pizza(2606:4700:3034::6815:147a (2606:4700:3034::6815:147a)) 56 data bytes 64 bytes from 2606:4700:3034::6815:147a (2606:4700:3034::6815:147a): icmp_seq=1 ttl=255 time=29.0 ms --- spaghetti.pizza ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 28.967/28.967/28.967/0.000 ms This entails a major rework, especially with regard to the storage of tracked connections and to the semantics of epoll(7) references. Indexing TCP and UDP bindings merely by socket proved to be inflexible and unsuitable to handle different connection flows: pasta also provides Layer-2 to Layer-2 socket mapping between init and a separate namespace for local connections, using a pair of splice() system calls for TCP, and a recvmmsg()/sendmmsg() pair for UDP local bindings. For instance, building on the previous example: # ip link set dev lo up # iperf3 -s $ iperf3 -c ::1 -Z -w 32M -l 1024k -P2 | tail -n4 [SUM] 0.00-10.00 sec 52.3 GBytes 44.9 Gbits/sec 283 sender [SUM] 0.00-10.43 sec 52.3 GBytes 43.1 Gbits/sec receiver iperf Done. epoll(7) references now include a generic part in order to demultiplex data to the relevant protocol handler, using 24 bits for the socket number, and an opaque portion reserved for usage by the single protocol handlers, in order to track sockets back to corresponding connections and bindings. A number of fixes pertaining to TCP state machine and congestion window handling are also included here. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-07-17 11:04:22 +02:00
Stefano Brivio	5fd6db7751	ndp: Always answer neighbour solicitations with the requested target address The guest might try to resolve hosts other than the main host namespace (i.e. the gateway) -- just recycle the target address from the request and resolve it to the MAC address of the gateway. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-05-21 11:14:52 +02:00
Stefano Brivio	9010054ea4	dhcp, ndp, dhcpv6: Support for multiple DNS servers, search list Add support for a variable amount of DNS servers, including zero, from /etc/resolv.conf, in DHCP, NDP and DHCPv6 implementations. Introduce support for domain search list for DHCP (RFC 3397), NDP (RFC 8106), and DHCPv6 (RFC 3646), also sourced from /etc/resolv.conf. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-05-21 11:14:47 +02:00
Stefano Brivio	4aa8e54a30	passt: Introduce a DHCPv6 server This implementation, similarly to the IPv4 DHCP one, hands out a single address, which is the same as the upstream address for the host. This avoids the need for address translation as long as the client runs a DHCPv6 client. The NDP "Managed" flag is now set in Router Advertisements. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-04-13 22:37:40 +02:00
Stefano Brivio	48ca38c606	passt: Run in background, add message logging with severities Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-03-18 12:58:07 +01:00
Stefano Brivio	8bca388e8a	passt: Assorted fixes from "fresh eyes" review A bunch of fixes not worth single commits at this stage, notably: - make buffer, length parameter ordering consistent in ARP, DHCP, NDP handlers - strict checking of buffer, message and option length in DHCP handler (a malicious client could have easily crashed it) - set up forwarding for IPv4 and IPv6, and masquerading with nft for IPv4, from demo script - get rid of separate slow and fast timers, we don't save any overhead that way - stricter checking of buffer lengths as passed to tap handlers - proper dequeuing from qemu socket back-end: I accidentally trashed messages that were bundled up together in a single tap read operation -- the length header tells us what's the size of the next frame, but there's no apparent limit to the number of messages we get with one single receive - rework some bits of the TCP state machine, now passive and active connection closes appear to be robust -- introduce a new FIN_WAIT_1_SOCK_FIN state indicating a FIN_WAIT_1 with a FIN flag from socket - streamline TCP option parsing routine - track TCP state changes to stderr (this is temporary, proper debugging and syslogging support pending) - observe that multiplying a number by four might very well change its value, and this happens to be the case for the data offset from the TCP header as we check if it's the same as the total length to find out if it's a duplicated ACK segment - recent estimates suggest that the duration of a millisecond is closer to a million nanoseconds than a thousand of them, this trend is now reflected into the timespec_diff_ms() convenience routine Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-02-21 11:55:49 +01:00
Stefano Brivio	105b916361	passt: New design and implementation with native Layer 4 sockets This is a reimplementation, partially building on the earlier draft, that uses L4 sockets (SOCK_DGRAM, SOCK_STREAM) instead of SOCK_RAW, providing L4-L2 translation functionality without requiring any security capability. Conceptually, this follows the design presented at: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Networking.md The most significant novelty here comes from TCP and UDP translation layers. In particular, the TCP state and translation logic follows the intent of being minimalistic, without reimplementing a full TCP stack in either direction, and synchronising as much as possible the TCP dynamic and flows between guest and host kernel. Another important introduction concerns addressing, port translation and forwarding. The Layer 4 implementations now attempt to bind on all unbound ports, in order to forward connections in a transparent way. While at it: - the qemu 'tap' back-end can't be used as-is by qrap anymore, because of explicit checks now introduced in qemu to ensure that the corresponding file descriptor is actually a tap device. For this reason, qrap now operates on a 'socket' back-end type, accounting for and building the additional header reporting frame length - provide a demo script that sets up namespaces, addresses and routes, and starts the daemon. A virtual machine started in the network namespace, wrapped by qrap, will now directly interface with passt and communicate using Layer 4 sockets provided by the host kernel. Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-02-16 09:28:55 +01:00
Stefano Brivio	d02e059ddc	passt: Add IPv6 and NDP support, further fixes for IPv4 CT Signed-off-by: Stefano Brivio <sbrivio@redhat.com>	2021-02-16 07:58:05 +01:00
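
The length names standardised in commit 5566386f5f (dlen, l4len, l3len, l2len) nest one inside the other, which a short sketch may make easier to see. The struct and function names below (`pkt_lens`, `pkt_lens_tcp6`) and the fixed header sizes are assumptions made for illustration only; they are not taken from the passt sources, and the TCP header is assumed to carry no options.

```c
#include <stddef.h>

/* Header sizes assumed for this sketch: plain Ethernet (no VLAN tag),
 * fixed IPv6 header, TCP header without options.
 */
#define SKETCH_ETH_HLEN  14
#define SKETCH_IP6_HLEN  40
#define SKETCH_TCP_HLEN  20

/* Hypothetical container for the four lengths named in the commit message */
struct pkt_lens {
	size_t dlen;   /* L4 protocol payload length ("data length") */
	size_t l4len;  /* dlen + length of L4 protocol header */
	size_t l3len;  /* l4len + length of IPv4/IPv6 header */
	size_t l2len;  /* l3len + length of L2 (Ethernet) header */
};

/* Derive every length for a TCP segment over IPv6 from its payload size */
static struct pkt_lens pkt_lens_tcp6(size_t dlen)
{
	struct pkt_lens l;

	l.dlen  = dlen;
	l.l4len = l.dlen  + SKETCH_TCP_HLEN;
	l.l3len = l.l4len + SKETCH_IP6_HLEN;
	l.l2len = l.l3len + SKETCH_ETH_HLEN;

	return l;
}
```

Under these assumptions, a 100-byte payload gives l4len = 120, l3len = 160 and l2len = 174. Each name includes exactly one more header than the one before it, so a variable's name states which headers its value covers.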

46 Commits