mirror of https://passt.top/passt synced 2024-12-22 21:55:22 +00:00
238 Commits

Author SHA1 Message Date
Laurent Vivier
28997fcb29 vhost-user: add vhost-user
add virtio and vhost-user functions to connect with QEMU.

  $ ./passt --vhost-user

and

  # qemu-system-x86_64 ... -m 4G \
        -object memory-backend-memfd,id=memfd0,share=on,size=4G \
        -numa node,memdev=memfd0 \
        -chardev socket,id=chr0,path=/tmp/passt_1.socket \
        -netdev vhost-user,id=netdev0,chardev=chr0 \
        -device virtio-net,mac=9a:2b:2c:2d:2e:2f,netdev=netdev0 \
        ...

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[sbrivio: as suggested by lvivier, include <netinet/if_ether.h>
 before including <linux/if_ether.h> as C libraries such as musl
 define __UAPI_DEF_ETHHDR in <netinet/if_ether.h> if they already have
 a definition of struct ethhdr]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-11-27 16:47:32 +01:00
Stefano Brivio
14b84a7f07 treewide: Introduce 'local mode' for disconnected setups
There are setups where no host interface is available or configured
at all, intentionally or not, temporarily or not, but users expect
(Podman) containers to run in any case as they did with slirp4netns,
and we're now getting reports that we broke such setups at a rather
alarming rate.

To this end, if we don't find any usable host interface, instead of
exiting:

- for IPv4, use 169.254.2.1 as guest/container address and 169.254.2.2
  as default gateway

- for IPv6, don't assign any address (forcibly disable DHCPv6), and
  use the *first* link-local address we observe to represent the
  guest/container. Advertise fe80::1 as default gateway

- use 'tap0' as default interface name for pasta

Change ifi4 and ifi6 in struct ctx to int and accept a special -1
value meaning that no host interface was selected, but the IP family
is enabled. The fact that the kernel uses unsigned int values for
those is not an issue as 1. one can't create so many interfaces
anyway and 2. we otherwise handle those values transparently.
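
As a rough sketch of the IPv4 side of this, assuming a reduced stand-in for
the context structure (struct ip4_sketch, IFI_NONE and ip4_local_mode() are
names made up for this example, not taken from the tree):

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>

#define IFI_NONE	(-1)	/* Hypothetical name for the special -1 value */

/* Hypothetical, reduced stand-in for the IPv4 part of the context */
struct ip4_sketch {
	int ifi;		/* Host interface index, or IFI_NONE */
	struct in_addr addr;	/* Guest/container address */
	struct in_addr gw;	/* Default gateway for the guest */
};

/* Fill in 'local mode' defaults: IPv4 stays enabled, with no host interface */
void ip4_local_mode(struct ip4_sketch *ip4)
{
	assert(ip4);

	ip4->ifi = IFI_NONE;
	ip4->addr.s_addr = htonl(0xa9fe0201);	/* 169.254.2.1 */
	ip4->gw.s_addr = htonl(0xa9fe0202);	/* 169.254.2.2 */
}
```

A signed index leaves room for a value that can't collide with a real
interface index, which is always positive.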

Fix a botched conditional in conf_print() to actually skip printing
DHCPv6 information if DHCPv6 is disabled (and skip printing NDP
information if NDP is disabled).

Link: https://github.com/containers/podman/issues/24614
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-11-27 05:16:38 +01:00
Stefano Brivio
6819b2e102 conf, passt.1: Update --mac-addr default in usage() and man page
Fixes: 90e83d50a9 ("Don't take "our" MAC address from the host")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-11-26 08:30:18 +01:00
David Gibson
93bce404c1 Makefile: Move NETNS_RUN_DIR definition to C code
NETNS_RUN_DIR is set in the Makefile, then passed into the C code with
-D.  But NETNS_RUN_DIR is just a fixed string, it doesn't depend on any
make probes or variables, so there's really no reason to handle it via the
Makefile.  Just move it to a plain #define in conf.c.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-11-07 12:46:52 +01:00
Stefano Brivio
59fe34ee36 treewide: Suppress clang-tidy warning if we already use O_CLOEXEC
In pcap_init(), we should always open the packet capture file with
O_CLOEXEC, even if we're not running in foreground: O_CLOEXEC means
close-on-exec, not close-on-fork.

In logfile_init() and pidfile_open(), the fact that we pass a third
'mode' argument to open() seems to confuse the android-cloexec-open
checker in LLVM versions from 16 to 19 (at least).

The checker is suggesting to add O_CLOEXEC to 'mode', and not in
'flags', where we already have it.

Add a suppression for clang-tidy and a comment, and avoid repeating
those three times by adding a new helper, output_file_open().
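
A minimal sketch of what such a helper might look like. The signature, the
0600 mode and the absence of error reporting are assumptions made for this
example, not necessarily what output_file_open() does in the tree:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Sketch only: open an output file, always adding O_CLOEXEC to 'flags' (not
 * to 'mode'), so that the descriptor is closed on exec. Returns the file
 * descriptor, or -1 with errno set.
 */
int output_file_open(const char *path, int flags)
{
	assert(path);

	/* We pass O_CLOEXEC in 'flags' already: the third, 'mode' argument
	 * is what seems to confuse the checker.
	 */
	// NOLINTNEXTLINE(android-cloexec-open)
	return open(path, O_CREAT | O_CLOEXEC | flags, S_IRUSR | S_IWUSR);
}
```

With a single helper, the suppression and its comment live in one place
instead of at every open() call.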

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-10-30 12:37:31 +01:00
Stefano Brivio
744247856d treewide: Silence cert-err33-c clang-tidy warnings for fprintf()
We use fprintf() to print to standard output or standard error
streams. If something gets truncated or there's an output error, we
don't really want to try and report that, and at the same time it's
not abnormal behaviour upon which we should terminate, either.

Just silence the warning with an ugly FPRINTF() variadic macro casting
the fprintf() expressions to void.
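
The idea can be sketched like this. The macro body is an assumption about
how such a cast might be written, and usage_hint() is a hypothetical caller
added for illustration only:

```c
#include <assert.h>
#include <stdio.h>

/* Sketch only: discard the return value of fprintf() explicitly, so that
 * cert-err33-c doesn't flag every call site.
 */
#define FPRINTF(f, ...)	(void)fprintf(f, __VA_ARGS__)

/* Hypothetical caller: nothing useful to do if printing to stderr fails */
void usage_hint(const char *name)
{
	assert(name);

	FPRINTF(stderr, "Usage: %s [OPTION]...\n", name);
}
```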

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-10-30 12:37:31 +01:00
Stefano Brivio
98efe7c2fd treewide: Comply with CERT C rule ERR33-C for snprintf()
clang-tidy, starting from LLVM version 16, up to at least LLVM version
19, now checks that we detect and handle errors for snprintf() as
requested by CERT C rule ERR33-C. These warnings were logged with LLVM
version 19.1.2 (at least Debian and Fedora match):

/home/sbrivio/passt/arch.c:43:3: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
   43 |                 snprintf(new_path, PATH_MAX + sizeof(".avx2"), "%s.avx2", exe);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/arch.c:43:3: note: cast the expression to void to silence this warning
/home/sbrivio/passt/conf.c:577:4: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
  577 |                         snprintf(netns, PATH_MAX, "/proc/%ld/ns/net", pidval);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/conf.c:577:4: note: cast the expression to void to silence this warning
/home/sbrivio/passt/conf.c:579:5: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
  579 |                                 snprintf(userns, PATH_MAX, "/proc/%ld/ns/user",
      |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  580 |                                          pidval);
      |                                          ~~~~~~~
/home/sbrivio/passt/conf.c:579:5: note: cast the expression to void to silence this warning
/home/sbrivio/passt/pasta.c:105:2: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
  105 |         snprintf(ns, PATH_MAX, "/proc/%i/ns/net", pasta_child_pid);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/pasta.c:105:2: note: cast the expression to void to silence this warning
/home/sbrivio/passt/pasta.c:242:2: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
  242 |         snprintf(uidmap, BUFSIZ, "0 %u 1", uid);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/pasta.c:242:2: note: cast the expression to void to silence this warning
/home/sbrivio/passt/pasta.c:243:2: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
  243 |         snprintf(gidmap, BUFSIZ, "0 %u 1", gid);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/pasta.c:243:2: note: cast the expression to void to silence this warning
/home/sbrivio/passt/tap.c:1155:4: error: the value returned by this function should not be disregarded; neglecting it may lead to errors [cert-err33-c,-warnings-as-errors]
 1155 |                         snprintf(path, UNIX_PATH_MAX - 1, UNIX_SOCK_PATH, i);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/tap.c:1155:4: note: cast the expression to void to silence this warning

Don't silence the warnings as they might actually have some merit. Add
an snprintf_check() function, instead, checking that we're not
truncating messages while printing to buffers, and terminate if the
check fails.
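
The check itself might look like the sketch below. The name snprintf_fits()
and the boolean return value are assumptions made to keep the example
testable: the function described above terminates instead of returning.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Sketch only: format into 'str', and return true only if the whole
 * formatted string, including the terminating NUL, fit in 'size' bytes.
 */
bool snprintf_fits(char *str, size_t size, const char *format, ...)
{
	va_list ap;
	int rc;

	assert(str && format);

	va_start(ap, format);
	rc = vsnprintf(str, size, format, ap);
	va_end(ap);

	/* rc < 0: output error. rc >= size: the output was truncated */
	return rc >= 0 && (size_t)rc < size;
}
```

A caller that can't continue with a truncated path would terminate when this
returns false.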

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-10-30 12:37:25 +01:00
David Gibson
b4dace8f46 fwd: Direct inbound spliced forwards to the guest's external address
In pasta mode, where addressing permits we "splice" connections, forwarding
directly from host socket to guest/container socket without any L2 or L3
processing.  This gives us a very large performance improvement when it's
possible.

Since the traffic is from a local socket within the guest, it will go over
the guest's 'lo' interface, and accordingly we set the guest side address
to be the loopback address.  However this has a surprising side effect:
sometimes guests will run services that are only supposed to be used within
the guest and are therefore bound to only 127.0.0.1 and/or ::1.  pasta's
forwarding exposes those services to the host, which isn't generally what
we want.

Correct this by instead forwarding inbound "splice" flows to the guest's
external address.

Link: https://github.com/containers/podman/issues/24045
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-10-18 20:28:03 +02:00
David Gibson
ff63ac922a conf: Add --dns-host option to configure host side nameserver
When redirecting DNS queries with the --dns-forward option, passt/pasta
needs a host side nameserver to redirect the queries to.  This is
controlled by the c->ip[46].dns_host variables.  This is set to the first
nameserver listed in the host's /etc/resolv.conf, and there isn't
currently a way to override it from the command line.

Prior to 0b25cac9 ("conf: Treat --dns addresses as guest visible
addresses") it was possible to alter this with the -D/--dns option.
However, doing so was confusing and had some nonsensical edge cases because
-D generally takes guest side addresses, rather than host side addresses.

Add a new --dns-host option to restore this functionality in a more
sensible way.

Link: https://bugs.passt.top/show_bug.cgi?id=102
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-10-04 19:04:29 +02:00
David Gibson
9d66df9a9a conf: Add command line switch to enable IP_FREEBIND socket option
In a couple of recent reports, we've seen that it can be useful for pasta
to forward ports from addresses which are not currently configured on the
host, but might be in future.  That can be done with the sysctl
net.ipv4.ip_nonlocal_bind, but that does require CAP_NET_ADMIN to set in
the first place.  We can allow the same thing on a per-socket basis with
the IP_FREEBIND (or IPV6_FREEBIND) socket option.

Add a --freebind command line argument to enable this socket option on
all listening sockets.
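
For a single IPv4 socket, enabling the option could be sketched as follows
(sock_set_freebind() is a hypothetical helper name, and the IPv6 counterpart
is left out):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Sketch only: enable IP_FREEBIND on an IPv4 socket, so that bind() may
 * succeed for an address that isn't (yet) configured on the host. Returns 0
 * on success, -1 with errno set on failure.
 */
int sock_set_freebind(int fd)
{
	int one = 1;

	assert(fd >= 0);

	return setsockopt(fd, IPPROTO_IP, IP_FREEBIND, &one, sizeof(one));
}
```

Unlike the sysctl, this needs no capability and affects only the sockets it
is set on.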

Link: https://bugs.passt.top/show_bug.cgi?id=101
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-10-04 19:04:29 +02:00
David Gibson
b55013b1a7 inany: Add inany_pton() helper
We already have an inany_ntop() function to format inany addresses into
text.  Add inany_pton() to parse them from text, and use it in
conf_ports().
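
A sketch of the parsing idea, assuming that IPv4 addresses are stored in
IPv4-mapped form inside a struct in6_addr. The name addr_pton_any() and the
bare struct in6_addr are assumptions: the in-tree inany type and helper may
differ.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Sketch only: parse an IPv4 or IPv6 address from text. IPv4 addresses are
 * stored as IPv4-mapped (::ffff:a.b.c.d). Returns 1 on success, 0 if 'src'
 * is not a valid address of either family.
 */
int addr_pton_any(const char *src, struct in6_addr *dst)
{
	struct in_addr a4;

	assert(src && dst);

	if (inet_pton(AF_INET, src, &a4) == 1) {
		memset(dst, 0, sizeof(*dst));
		dst->s6_addr[10] = dst->s6_addr[11] = 0xff;
		memcpy(&dst->s6_addr[12], &a4, sizeof(a4));
		return 1;
	}

	return inet_pton(AF_INET6, src, dst) == 1;
}
```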

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2024-09-25 19:03:17 +02:00
David Gibson
cbde4192ee tcp, udp: Make {tcp,udp}_sock_init() take an inany address
tcp_sock_init() and udp_sock_init() take an address to bind to as an
address family and void * pair.  Use an inany instead.  Formerly AF_UNSPEC
was used to indicate that we want to listen on both 0.0.0.0 and ::, now use
a NULL inany to indicate that.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2024-09-25 19:03:16 +02:00
David Gibson
eedc81b6ef fwd, conf: Probe host's ephemeral ports
When we forward "all" ports (-t all or -u all), or use an exclude-only
range, we don't actually forward *all* ports - that wouldn't leave local
ports to use for outgoing connections.  Rather we forward all non-ephemeral
ports - those that won't be used for outgoing connections or datagrams.

Currently we assume the range of ephemeral ports is that recommended by
RFC 6335, 49152-65535.  However, that's not the range used by default on
Linux, which is 32768-60999, and configurable with the
net.ipv4.ip_local_port_range sysctl.

We can't really know what range the guest will consider ephemeral, but if
it differs too much from the host it's likely to cause problems we can't
avoid anyway.  So, using the host's ephemeral range is a better guess than
using the RFC 6335 range.

Therefore, add logic to probe the host's ephemeral range, falling back to
the RFC 6335 range if that fails.  This has the bonus advantage of
reducing the number of ports bound by -t all -u all on most Linux machines
thereby reducing kernel memory usage.  Specifically this reduces kernel
memory usage with -t all -u all from ~380MiB to ~289MiB.
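
A sketch of such a probe, reading the sysctl through procfs. Function and
macro names are made up for this example, and the in-tree logic may be
structured differently:

```c
#include <assert.h>
#include <stdio.h>

#define PORT_RANGE_SYSCTL	"/proc/sys/net/ipv4/ip_local_port_range"
#define RFC6335_EPH_MIN		49152
#define RFC6335_EPH_MAX		65535

/* Parse "<min> <max>" as found in the sysctl file. Returns 0 on success */
int eph_range_parse(const char *s, unsigned *min, unsigned *max)
{
	unsigned lo, hi;

	assert(s && min && max);

	if (sscanf(s, "%u %u", &lo, &hi) != 2)
		return -1;
	if (lo < 1 || lo > hi || hi > 65535)
		return -1;

	*min = lo;
	*max = hi;
	return 0;
}

/* Probe the host's range, falling back to the RFC 6335 one on any failure */
void eph_range_probe(unsigned *min, unsigned *max)
{
	char buf[64] = { 0 };
	FILE *f = fopen(PORT_RANGE_SYSCTL, "r");

	if (!f || !fgets(buf, sizeof(buf), f) ||
	    eph_range_parse(buf, min, max)) {
		*min = RFC6335_EPH_MIN;
		*max = RFC6335_EPH_MAX;
	}

	if (f)
		fclose(f);
}
```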

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-29 22:26:08 +02:00
David Gibson
4a41dc58d6 conf, fwd: Don't attempt to forward port 0
When using -t all, -u all or exclude-only ranges, we'll attempt to forward
all non-ephemeral port numbers, including port 0.  However, this won't work
as intended: bind() treats a zero port not as literal port 0, but as
"pick a port for me".  Because of the special meaning of port 0, we mostly
outright exclude it in our handling.

Do the same for setting up forwards, not attempting to forward for port 0.
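
The special meaning of port 0 is easy to observe. This sketch (with a
hypothetical function name) binds to port 0 on the wildcard address and
reports the port the kernel picked instead:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch only: bind a TCP socket to "port 0" and return the port the kernel
 * actually assigned, in host byte order, or -1 on failure.
 */
int bind_port_zero(void)
{
	struct sockaddr_in a = {
		.sin_family = AF_INET,
		.sin_port = htons(0),
		.sin_addr = { .s_addr = htonl(INADDR_ANY) },
	};
	socklen_t len = sizeof(a);
	int fd, port = -1;

	if ((fd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
		return -1;

	if (!bind(fd, (struct sockaddr *)&a, sizeof(a)) &&
	    !getsockname(fd, (struct sockaddr *)&a, &len)) {
		port = ntohs(a.sin_port);
		assert(port != 0);	/* Never bound to a literal port 0 */
	}

	close(fd);
	return port;
}
```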

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-29 22:26:05 +02:00
David Gibson
1daf6f4615 conf, fwd: Make ephemeral port logic more flexible
"Ephemeral" ports are those which the kernel may allocate as local
port numbers for outgoing connections or datagrams.  Because of that,
they're generally not good choices for listening servers to bind to.

Therefore when using -t all, -u all or exclude-only ranges, we map only
non-ephemeral ports.  Our logic for this is a bit rigid though: we
assume the ephemeral ports are always a fixed range at the top of the
port number space.  We also assume PORT_EPHEMERAL_MIN is a multiple of
8, or we won't set the forward bitmap correctly.

Make the logic in conf.c more flexible, using a helper moved into
fwd.[ch], although we don't change which ports we consider ephemeral
(yet).

The new handling is undoubtedly more computationally expensive, but
since it's a once-off operation at start off, I don't think it really
matters.
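
A sketch of the more flexible approach, assuming a one-bit-per-port map. All
names here are hypothetical, and this is not the helper that was moved into
fwd.[ch]:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_PORTS	(1U << 16)

/* One bit per port */
struct port_map {
	uint8_t bits[NUM_PORTS / CHAR_BIT];
};

static void port_set(struct port_map *m, unsigned port)
{
	m->bits[port / CHAR_BIT] |= 1U << (port % CHAR_BIT);
}

bool port_isset(const struct port_map *m, unsigned port)
{
	return m->bits[port / CHAR_BIT] & (1U << (port % CHAR_BIT));
}

/* Map every port outside [eph_min, eph_max]. Going port by port is slower
 * than filling whole bytes, but it doesn't need eph_min to be a multiple of
 * 8, or the range to sit at the top of the port number space.
 */
void port_map_nonephemeral(struct port_map *m,
			   unsigned eph_min, unsigned eph_max)
{
	unsigned port;

	assert(m && eph_min <= eph_max && eph_max < NUM_PORTS);

	memset(m, 0, sizeof(*m));

	for (port = 0; port < NUM_PORTS; port++) {
		if (port >= eph_min && port <= eph_max)
			continue;

		port_set(m, port);
	}
}
```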

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-29 22:25:51 +02:00
Stefano Brivio
f00ebda369 util: Don't stop on unrelated values when looking for --fd in close_open_files()
Seen with krun: we get a file descriptor via --fd, but we close it and
happily use the same number for TCP files.

The issue is that if we also get other options before --fd, with
arguments, getopt_long() stops parsing them because it sees them as
non-option values.

Use the - modifier at the beginning of optstring (before :, which is
needed to avoid printing errors) instead of +, which means we'll
continue parsing after finding unrelated option values, but
getopt_long() won't reorder them anyway: they'll be passed with option
value '1', which we can ignore.

By the way, we also need to add : after F in the optstring, so that
we're able to parse the option when given as short name as well.

Now that we change the parsing mode between close_open_files() and
conf(), we need to reset optind to 0, not to 1, whenever we call
getopt_long() again in conf(), so that the internal initialisation
of getopt_long() evaluating GNU extensions is re-triggered.
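
The resulting scan could be sketched as below, relying on GNU getopt_long()
behaviour. find_fd_option() is a hypothetical name, and the range check is
reduced to what the example needs:

```c
#include <assert.h>
#include <errno.h>
#include <getopt.h>
#include <limits.h>
#include <stdlib.h>

/* Sketch only: look for --fd / -F anywhere among the arguments, ignoring
 * everything else. Returns the descriptor number, or -1 if the option is
 * missing or its value is not valid.
 */
int find_fd_option(int argc, char **argv)
{
	const struct option opts[] = {
		{ "fd",	required_argument,	NULL,	'F' },
		{ 0 },
	};
	int name, fd = -1;

	assert(argc > 0 && argv);

	optind = 0;	/* Not 1: re-evaluate the leading '-' of optstring */

	/* '-': keep going past non-options, returned as 1, without reordering
	 * ':': don't print errors for options we don't know about here
	 */
	while ((name = getopt_long(argc, argv, "-:F:", opts, NULL)) != -1) {
		char *end;
		long val;

		if (name != 'F')
			continue;

		errno = 0;
		val = strtol(optarg, &end, 0);
		if (errno || end == optarg || *end || val < 0 || val > INT_MAX)
			return -1;

		fd = val;
	}

	return fd;
}
```

With arguments such as "-t 80 --fd 5", the unknown -t and its value 80 are
skipped, and 5 is returned.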

Link: https://github.com/slp/krun/issues/17#issuecomment-2294943828
Fixes: baccfb95ce ("conf: Stop parsing options at first non-option argument")
Fixes: 09603cab28 ("passt, util: Close any open file that the parent might have leaked")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-08-21 12:04:53 +02:00
David Gibson
57b7bd2a48 fwd, conf: Allow NAT of the guest's assigned address
The guest is usually assigned one of the host's IP addresses.  That means
it can't access the host itself via its usual address.  The
--map-host-loopback option (enabled by default with the gateway address)
allows the guest to contact the host.  However, connections forwarded this
way appear on the host to have originated from the loopback interface,
which isn't always desirable.

Add a new --map-guest-addr option, which acts similarly but forwarded
connections will go to the host's external address, instead of loopback.

If '-a' is used, so the guest's address is not the same as the host's, this
will instead forward to whatever host-visible site is shadowed by the
guest's assigned address.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:40 +02:00
David Gibson
e813a4df7d conf: Allow address remapped to host to be configured
Because the host and guest share the same IP address with passt/pasta, it's
not possible for the guest to directly address the host.  Therefore we
allow packets from the guest going to a special "NAT to host" address to be
redirected to the host, appearing there as though they have both source and
destination address of loopback.

Currently that special address is always the address of the default
gateway (or none).  That can be a problem if we want that gateway to be
addressable by the guest.  Therefore, allow the special "NAT to host"
address to be overridden on the command line with a new --map-host-loopback
option.

In order to exercise and test it, update the passt_in_ns and perf
tests to use this option and give different mapping addresses for the
two layers of the environment.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:35 +02:00
David Gibson
935bd81936 conf, fwd: Split notion of gateway/router from guest-visible host address
The @gw fields in the ip4_ctx and ip6_ctx give the (host's) default
gateway.  We use this for two quite distinct things: advertising the
gateway that the guest should use (via DHCP, NDP and/or --config-net)
and for a limited form of NAT.  So that the guest can access services
on the host, we map the gateway address within the guest to the
loopback address on the host.

Using the gateway address for this isn't necessarily the best choice
for this purpose, certainly not for all circumstances.  So, start off
by splitting the notion of these into two different values: @guest_gw
which is the gateway address the guest should use and @nat_host_loopback,
which is the guest visible address to remap to the host's loopback.

Usually nat_host_loopback will have the same value as guest_gw.  However
when --no-map-gw is specified we leave them unspecified instead.  This
means when we use nat_host_loopback, we don't need to separately check
c->no_map_gw to see if it's relevant.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:31 +02:00
David Gibson
90e83d50a9 Don't take "our" MAC address from the host
When sending frames to the guest over the tap link, we need a source MAC
address.  Currently we take that from the MAC address of the main interface
on the host, but that doesn't actually make much sense:
 * We can't preserve the real MAC address of packets from anywhere
   external so there's no transparency case here
 * In fact, it's confusingly different from how we handle IP addresses:
   whereas we give the guest the same IP as the host, we're making the
   host's MAC the one MAC that the guest *can't* use for itself.
 * We already need a fallback case if the host doesn't have an Ethernet
   like MAC (e.g. if it's connected via a point to point interface, such
   as a wireguard VPN).

Change to just use an arbitrary fixed MAC address - I've picked
9a:55:9a:55:9a:55.  It's simpler and has the small advantage of making
the fact that passt/pasta is in use typically obvious from guest side
packet dumps.  This can still, of course, be overridden with the -M option.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:28 +02:00
David Gibson
975cfa5f32 Initialise our_tap_ll to ip6.gw when suitable
In every place we use our_tap_ll, we only use it as a fallback if the
IPv6 gateway address is not link-local.  We can avoid that conditional at
use time by doing it at initialisation of our_tap_ll instead.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:22 +02:00
David Gibson
a42fb9c000 treewide: Change misleading 'addr_ll' name
c->ip6.addr_ll is not like c->ip6.addr.  The latter is an address for the
guest, but the former is an address for our use on the tap link.  Rename it
accordingly, to 'our_tap_ll'.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:16 +02:00
David Gibson
57532f1ded conf: Remove incorrect initialisation of addr_ll_seen
Despite the names, addr_ll_seen does not relate to addr_ll the same
way addr_seen relates to addr.  addr_ll_seen is an observed address
from the guest, whereas addr_ll is *our* link-local address for use on
the tap link when we can't use an external endpoint address.  It's
used both for passt provided services (DHCPv6, NDP) and in some cases
for connections from addresses the guest can't access.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:10 +02:00
David Gibson
0b25cac94e conf: Treat --dns addresses as guest visible addresses
Although it's not 100% explicit in the man page, addresses given to
the --dns option are intended to be addresses as seen by the guest.
This differs from addresses taken from the host's /etc/resolv.conf,
which must be translated to guest accessible versions in some cases.

Our implementation is currently inconsistent on this: when using
--dns-forward, you must usually also give --dns with the matching address,
which is meaningful only in the guest's address view.  However if you give
--dns with a loopback address, it will be translated like a host view
address.

Move the remapping logic for DNS addresses out of add_dns4() and add_dns6()
into add_dns_resolv() so that it is only applied for host nameserver
addresses, not for nameservers given explicitly with --dns.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:08 +02:00
David Gibson
a6066f4e27 conf: Correct setting of dns_match address in add_dns6()
add_dns6() (but not add_dns4()) has a bug setting dns_match: it sets it to
the given address, rather than the gateway address.  This is doubly wrong:
 - We've just established the given address is a host loopback address
   the guest can't access
 - We've just set ip6.dns[] to tell the guest to use the gateway address,
   so it won't use the dns_match address we're setting

Correct this to use the gateway address, like IPv4.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:06 +02:00
David Gibson
7c083ee41c conf: Move adding of a nameserver from resolv.conf into subfunction
get_dns() is already quite deeply nested, and future changes I have in
mind will add more complexity.  Prepare for this by splitting out the
adding of a single nameserver to the configuration into its own function.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:04 +02:00
David Gibson
1d10760c9f conf: Move DNS array bounds checks into add_dns[46]
Every time we call add_dns[46] we need to first check if there's space in
the c->ip[46].dns array for the new entry.  We might as well make that
check in add_dns[46]() itself.

In fact it looks like the calls in get_dns() had an off by one error, not
allowing the last entry of the array to be filled.  So, that bug is also
fixed by the change.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:02 +02:00
David Gibson
6852bd07cc conf: More accurately count entries added in get_dns()
get_dns() counts the number of guest DNS servers it adds, and gives an
error if it couldn't add any.  However, this count ignores the fact that
add_dns[46]() may in some cases *not* add an entry.  Use the array indices
we're already tracking to get an accurate count.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:00 +02:00
David Gibson
c679894668 conf: Use array indices rather than pointers for DNS array slots
Currently add_dns[46]() take a somewhat awkward double pointer to the
entry in the c->ip[46].dns array to update.  It turns out to be easier to
work with indices into that array instead.

This diff does add some lines, but it's comments, and will allow some
future code reductions.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 11:59:58 +02:00
David Gibson
ceea52ca93 treewide: Use struct assignment instead of memcpy() for IP addresses
We rely on C11 already, so we can use clearer and more type-checkable
struct assignment instead of memcpy() for copying IP addresses around.

This exposes some "pointer could be const" warnings from cppcheck, so
address those too.
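
The difference can be sketched on a hypothetical, reduced configuration
structure (struct ip4_cfg and ip4_cfg_set_gw() are not names from the tree):

```c
#include <assert.h>
#include <netinet/in.h>

/* Hypothetical context fragment, for illustration only */
struct ip4_cfg {
	struct in_addr addr;
	struct in_addr gw;
};

/* 'gw' can be const: we only read from it */
void ip4_cfg_set_gw(struct ip4_cfg *cfg, const struct in_addr *gw)
{
	assert(cfg && gw);

	/* Before: memcpy(&cfg->gw, gw, sizeof(cfg->gw));
	 * Nothing checks that source type and size match the destination.
	 */
	cfg->gw = *gw;	/* After: types are checked, no size to get wrong */
}
```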

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 11:59:56 +02:00
David Gibson
905ecd2b0b treewide: Rename MAC address fields for clarity
c->mac isn't a great name, because it doesn't say whose mac address it is
and it's not necessarily obvious in all the contexts we use it.  Since this
is specifically the address that we (passt/pasta) use on the tap interface,
rename it to "our_tap_mac".  Rename the "mac_guest" field to "guest_mac"
to be grammatically consistent.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 11:59:54 +02:00
David Gibson
066e69986b util: Helper for formatting MAC addresses
There are a couple of places where we somewhat messily open code formatting
an Ethernet like MAC address for display.  Add an eth_ntop() helper for
this.
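
A sketch of what such a helper might look like. The signature is modelled on
inet_ntop() and is an assumption, as is returning NULL when the buffer is
too small:

```c
#include <assert.h>
#include <stdio.h>

#define ETH_ADDRSTRLEN	(sizeof("00:11:22:33:44:55"))

/* Sketch only: format a 6-byte MAC address as text into 'dst'. Returns
 * 'dst', or NULL if 'size' is too small for the result.
 */
const char *eth_ntop(const unsigned char *mac, char *dst, size_t size)
{
	int rc;

	assert(mac && dst);

	rc = snprintf(dst, size, "%02x:%02x:%02x:%02x:%02x:%02x",
		      mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
	if (rc < 0 || (size_t)rc >= size)
		return NULL;

	return dst;
}
```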

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 11:59:51 +02:00
David Gibson
baba284912 conf: Don't ignore -t and -u options after -D
f6d5a52392 moved handling of -D into a later loop.  However as a side
effect it moved this from a switch block to an if block.  I left a couple
of 'break' statements that don't make sense in the new context.  They
should be 'continue' so that we go onto the next option, rather than
leaving the loop entirely.

Fixes: f6d5a52392 ("conf: Delay handling -D option until after addresses are configured")
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-14 09:14:12 +02:00
David Gibson
f6d5a52392 conf: Delay handling -D option until after addresses are configured
add_dns[46]() rely on the gateway address and c->no_map_gw being already
initialised, in order to properly handle DNS servers which need NAT to be
accessed from the guest.

Usually these are called from get_dns() which is well after the addresses
are configured, so that's fine.  However, they can also be called earlier
if an explicit -D command line option is given.  In this case no_map_gw
and/or c->ip[46].gw may not yet be initialised properly, leading to this
doing the wrong thing.

Luckily we already have a second pass of option parsing for things which
need addresses to already be configured.  Move handling of -D to there.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-12 21:29:36 +02:00
Stefano Brivio
baccfb95ce conf: Stop parsing options at first non-option argument
Given that pasta supports specifying a command to be executed on the
command line, even without the usual -- separator as long as there's
no ambiguity, we shouldn't eat up options that are not meant for us.

Paul reports, for instance, that with:

  pasta --config-net ip -6 route

-6 is taken by pasta to mean --ipv6-only, and we execute 'ip route'.
That's because getopt_long(), by default, shuffles the argument list
to shift non-option arguments at the end.

Avoid that by adding '+' at the beginning of 'optstring'.

Reported-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-08-08 21:34:06 +02:00
Stefano Brivio
09603cab28 passt, util: Close any open file that the parent might have leaked
If a parent accidentally or due to implementation reasons leaks any
open file, we don't want to have access to them, except for the file
passed via --fd, if any.

This is the case for Podman when Podman's parent leaks files into
Podman: it's not practical for Podman to close unrelated files before
starting pasta, as reported by Paul.

Use close_range(2) to close all open files except for standard streams
and the one from --fd.

Given that parts of conf() depend on other files to be already opened,
such as the epoll file descriptor, we can't easily defer this to a
more convenient point, where --fd was already parsed. Introduce a
minimal, duplicate version of --fd parsing to keep this simple.

As we need to check that the passed --fd option doesn't exceed
INT_MAX, because we'll parse it with strtol() but file descriptor
indices are signed ints (regardless of the arguments close_range()
takes), extend the existing check in the actual --fd parsing in conf(),
also rejecting file descriptor numbers that match standard streams,
while at it.

Suggested-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Paul Holzinger <pholzing@redhat.com>
2024-08-08 21:31:25 +02:00
Stefano Brivio
fbb0c9523e conf, pasta: Make -g and -a skip route/addresses copy for matching IP version only
Paul reports that setting IPv4 address and gateway manually, using
--address and --gateway, causes pasta to fail inserting IPv6 routes
in a setup where multiple, inter-dependent IPv6 routes are present
on the host.

That's because, currently, any -g option implies --no-copy-routes
altogether, and any -a implies --no-copy-addrs.

Limit this implication to the matching IP version, instead, by having
two copies of no_copy_routes and no_copy_addrs in the context
structure, separately for IPv4 and IPv6.

While at it, change them to 'bool': we had them as 'int' because
getopt_long() used to set them directly, but it hasn't been the case
for a while already.

Reported-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-08-07 09:15:25 +02:00
Stefano Brivio
4a333c88d7 conf: Accept addresses enclosed by square brackets in port forwarding specifiers
Even though we don't use : as delimiter for the port, making square
brackets unneeded, RFC 3986, section 3.2.2, mandates them for IPv6
literals. We want IPv6 addresses there, but some users might still
specify them out of habit.

Same for IPv4 addresses: RFC 3986 doesn't specify square brackets for
IPv4 literals, but I had reports of users actually trying to use them
(they're accepted by many tools).

Allow square brackets for both IPv4 and IPv6 addresses, correct or
not, they're harmless anyway.
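
For illustration only, one way to accept optional brackets around an
address, regardless of the IP version (strip_brackets() is a
hypothetical helper, not necessarily how conf.c implements this):

```c
#include <string.h>

/* Hypothetical helper: if 's' is enclosed by square brackets, strip
 * them in place and return a pointer to the address inside. Otherwise,
 * return 's' unchanged.
 */
static char *strip_brackets(char *s)
{
	size_t len = strlen(s);

	if (len >= 2 && s[0] == '[' && s[len - 1] == ']') {
		s[len - 1] = '\0';
		return s + 1;
	}

	return s;
}
```

The result can then be passed to the usual address parsing, which
doesn't need to know whether brackets were given.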

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-07-25 12:30:56 +02:00
David Gibson
d29fa0856e udp: Remove rdelta port forwarding maps
In addition to the struct fwd_ports used by both UDP and TCP to track
port forwarding, UDP also included an 'rdelta' field, which contained the
reverse mapping of the main port map.  This was used so that we could
properly direct reply packets to a forwarded packet where we change the
destination port.  This has now been taken over by the flow table: reply
packets will match the flow of the originating packet, and that gives the
correct ports on the originating side.

So, eliminate the rdelta field, and with it struct udp_fwd_ports, which
now has no additional information over struct fwd_ports.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-07-19 18:33:57 +02:00
David Gibson
f79c42317f conf: Don't configure port forwarding for a disabled protocol
UDP and/or TCP can be disabled with the --no-udp and --no-tcp options.
However, when this is specified, it's still possible to configure forwarded
ports for the disabled protocol.  In some cases this will open sockets and
perform other actions, which might not be safe since the entire protocol
won't be initialised.

Check for this case, and explicitly forbid it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-07-17 07:04:55 +02:00
Stefano Brivio
b625ed5fee conf: Use the right maximum buffer size for c->sock_path
UNIX_SOCK_MAX is the maximum number we'll append to the socket path
if we generate it automatically. If it's given on the command line,
it can be up to UNIX_PATH_MAX (including the terminating character)
long.

UNIX_SOCK_MAX happened to kind of fit because it's 100 (instead of
108).
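
A minimal sketch of a bounds check against the size of the buffer
itself, rather than an unrelated constant. set_sock_path() is a
hypothetical helper, and the sketch assumes the destination is sized
like sun_path in struct sockaddr_un (108 bytes on Linux):

```c
#include <stdio.h>
#include <sys/un.h>

/* Size of sun_path, including the terminating character */
#define SOCK_PATH_LEN	sizeof(((struct sockaddr_un *)0)->sun_path)

/* Hypothetical helper: copy a path given on the command line to a
 * buffer of 'dst_len' bytes. Return 0 on success, -1 if the path,
 * including its terminating character, doesn't fit.
 */
static int set_sock_path(char *dst, size_t dst_len, const char *arg)
{
	int ret = snprintf(dst, dst_len, "%s", arg);

	if (ret < 0 || (size_t)ret >= dst_len)
		return -1;

	return 0;
}
```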

Commit ceddcac74a ("conf, tap: False "Buffer not null terminated"
positives, CWE-170") fixed the wrong problem: the right fix for the
problem at hand was actually commit cc287af173 ("conf: Fix
incorrect bounds checking for sock_path parameter").

Fixes: ceddcac74a ("conf, tap: False "Buffer not null terminated" positives, CWE-170")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-07-02 21:34:06 +02:00
Stefano Brivio
21ee1eb2de conf: Copy up to MAXDNSRCH - 1 bytes, not MAXDNSRCH
Spotted by Coverity just recently. Not that it really matters as
MAXDNSRCH always appears to be defined as 1025, while a full domain
name can have up to 253 characters: it would be a bit pointless to
have a longer search domain.
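
The general pattern, sketched with a hypothetical helper (copy_bounded()
is not a function from the passt sources): copy at most size - 1
bytes, and always terminate the destination explicitly.

```c
#include <string.h>

/* Hypothetical helper: copy 'src' to 'dst', a buffer of 'size' bytes,
 * truncating if needed, and always leaving 'dst' null-terminated.
 */
static void copy_bounded(char *dst, size_t size, const char *src)
{
	if (!size)
		return;

	strncpy(dst, src, size - 1);	/* Leave room for the terminator */
	dst[size - 1] = '\0';
}
```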

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-07-02 21:33:29 +02:00
Stefano Brivio
e7323e515a conf, passt: Don't call __openlog() if a log file is used
If a log file is configured, we would otherwise open a connection to
the system logger (if any), print any message that we might have
before we initialise the log file, and then keep that connection
around for no particular reason.

Call __openlog() as an alternative to the log file setup, instead.

This way, we might skip printing some messages during the
initialisation phase, but they're probably not really valuable to
have in a system log, and we're going to print them to standard
error anyway.

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-06-21 15:32:46 +02:00
Stefano Brivio
dba7f0f5ce treewide: Replace strerror() calls
Now that we have logging functions embedding perror() functionality,
we can make _some_ calls more terse by using them. In many places,
the strerror() calls are still more convenient because, for example,
they are used in flow debugging functions, or because the return code
variable of interest is not 'errno'.

While at it, convert a few error messages from a scant perror style
to proper failure descriptions.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-06-21 15:32:44 +02:00
Stefano Brivio
92a22fef93 treewide: Replace perror() calls with calls to logging functions
perror() prints directly to standard error, but in many cases standard
error might be already closed, or we might want to skip logging, based
on configuration. Our logging functions provide all that.

While at it, make errors more descriptive, replacing some of the
existing basic perror-style messages.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-06-21 15:32:43 +02:00
Stefano Brivio
8c2f24a560 conf, log: Instead of abusing log levels, add log_conf_parsed flag
We currently use a LOG_EMERG log mask to represent the fact that we
don't know yet what the mask resulting from configuration should be,
before the command line is parsed.

However, we also need to represent another phase, that is,
configuration is parsed but we haven't daemonised yet, or we're not
ready for operation yet. The next patch will add that notion
explicitly.

Mapping these cases to further log levels isn't really practical.
Introduce boolean log flags to represent them, instead of abusing
log priorities.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-06-21 15:32:31 +02:00
Stefano Brivio
bca0fefa32 conf, passt: Make --stderr do nothing, and deprecate it
The original behaviour of printing messages to standard error by
default when running from a non-interactive terminal was introduced
because the first KubeVirt integration draft used to start passt in
foreground and get messages via standard error.

For development purposes, the system logger was more convenient at
that point, and passt was running from interactive terminals only if
not started by the KubeVirt integration.

This behaviour was introduced by 84a62b79a2 ("passt: Also log to
stderr, don't fork to background if not interactive").

Later, I added command-line options in 1e49d194d0 ("passt, pasta:
Introduce command-line options and port re-mapping") and accidentally
reversed this condition, which wasn't a problem as --stderr could
force printing to standard error anyway (and it was used by KubeVirt).

Nowadays, the KubeVirt integration uses a log file (requested via
libvirt configuration), and the same applies for Podman if one
actually needs to look at runtime logs. There are no use cases left,
as far as I know, where passt runs in foreground in non-interactive
terminals.

Seize the chance to reintroduce some sanity here. If we fork to
background, standard error is closed, so --stderr is useless in that
case.

If we run in foreground, there's no harm in printing messages to
standard error, and that accidentally became the default behaviour
anyway, so --stderr is not needed in that case.

It would be needed for non-interactive terminals, but there are no
use cases, and if there were, let's log to standard error anyway:
the user can always redirect standard error to /dev/null if needed.

Before we're up and running, we need to print to standard error anyway
if something happens, otherwise we can't report failure to start in
any kind of usage, stand-alone or in integrations.

So, make --stderr do nothing, and deprecate it.

While at it, drop a left-over comment about --foreground being the
default only for interactive terminals, because it's not the case
anymore.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-06-21 15:32:28 +02:00
Stefano Brivio
b74801645c conf, passt: Don't try to log to stderr after we close it
If we don't run in foreground, we close standard error as we
daemonise, so it makes no sense to check if the controlling terminal
is an interactive terminal or if --stderr was given, to decide
if we want to log to standard error.

Make --stderr depend on --foreground.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-06-21 15:32:15 +02:00
Stefano Brivio
65923ba798 conf: Accept duplicate and conflicting options, the last one wins
On multiple occasions, especially when passt(1) and pasta(1) are used
in integrations such as the one with Podman, the ability to override
earlier options on the command line with later ones would have been
convenient.

Recently, to debug a number of issues happening with Podman, I would
have liked to ask users to share a debug log by passing --debug as an
additional option, but pasta refuses --quiet (always passed by Podman)
and --debug at the same time.

On top of this, Podman lets users specify other pasta options in its
containers.conf(5) file, as well as on the command line.

The options from the configuration files are appended together with
the ones from the command line, which makes it impossible for users to
override options from the configuration file, if duplicated options
are refused, unless Podman takes care of sorting them, which is
clearly not sustainable.

For --debug and --trace, somebody took care of this on the Podman side at:
  https://github.com/containers/common/pull/2052

but this doesn't fix the issue with other options, and we'll have
older versions of Podman around anyway.

I think there's some value in telling users about duplicated or
conflicting options, because that might reveal issues in integrations
or accidental misconfigurations, but by now I'm fairly convinced that
the downsides outweigh this.

Drop checks about duplicate options and mutually exclusive ones. In
some cases, we need to also undo a couple of initialisations caused
by earlier options, but this looks like a simplification, overall.

Notable exception: --stderr still conflicts with --log-file, because
users might have the expectation that they don't actually conflict.
But they do conflict in the existing implementation, so it's safer
to make sure that the users notice that.
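
A minimal sketch of "the last one wins" for a pair of conflicting
options, using getopt_long(). struct log_opts and parse_log_opts()
are hypothetical, and only --debug and --quiet are handled here:

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the configuration */
struct log_opts {
	bool debug;
	bool quiet;
};

/* Parse --debug / --quiet: instead of refusing the combination, a
 * later option undoes what an earlier, conflicting one has set.
 */
static int parse_log_opts(int argc, char **argv, struct log_opts *o)
{
	static const struct option options[] = {
		{ "debug",	no_argument,	NULL,	'd' },
		{ "quiet",	no_argument,	NULL,	'q' },
		{ NULL,		0,		NULL,	0   },
	};
	int name;

	optind = 1;	/* Start scanning from the first argument */

	while ((name = getopt_long(argc, argv, "dq", options, NULL)) != -1) {
		switch (name) {
		case 'd':
			o->debug = true;
			o->quiet = false;
			break;
		case 'q':
			o->quiet = true;
			o->debug = false;
			break;
		default:
			return -1;
		}
	}

	return 0;
}
```

With this approach, appending --debug to a command line that already
contains --quiet is enough to get a debug log, without the caller
having to sort or filter options.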

Suggested-by: Paul Holzinger <pholzing@redhat.com>
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: Paul Holzinger <pholzing@redhat.com>
2024-06-21 15:31:46 +02:00
Laurent Vivier
0c335d751a vhost-user: compare mode MODE_PASTA and not MODE_PASST
As we are going to introduce MODE_VU, which will act like
MODE_PASST, compare to MODE_PASTA rather than adding a comparison
to MODE_VU wherever we check for MODE_PASST.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-06-13 15:45:38 +02:00