1
0
mirror of https://passt.top/passt synced 2024-12-22 21:55:22 +00:00
Commit Graph

72 Commits

Author SHA1 Message Date
David Gibson
b4dace8f46 fwd: Direct inbound spliced forwards to the guest's external address
In pasta mode, where addressing permits we "splice" connections, forwarding
directly from host socket to guest/container socket without any L2 or L3
processing.  This gives us a very large performance improvement when it's
possible.

Since the traffic is from a local socket within the guest, it will go over
the guest's 'lo' interface, and accordingly we set the guest side address
to be the loopback address.  However this has a surprising side effect:
sometimes guests will run services that are only supposed to be used within
the guest and are therefore bound to only 127.0.0.1 and/or ::1.  pasta's
forwarding exposes those services to the host, which isn't generally what
we want.

Correct this by instead forwarding inbound "splice" flows to the guest's
external address.

Link: https://github.com/containers/podman/issues/24045
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-10-18 20:28:03 +02:00
David Gibson
1fa421192c passt.1: Clarify and update "Handling of local addresses" section
This section didn't mention the effect of the --map-host-loopback option
which now alters this behaviour.  Update it accordingly.

It used "local addresses" to mean specifically 127.0.0.0/8 and ::1.
However, "local" could also refer to link-local addresses or to addresses
of any scope which happen to be configured on the host.  Use "loopback
address" to be more precise about this.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-10-18 20:27:57 +02:00
David Gibson
ef8a5161d0 passt.1: Mark --stderr as deprecated more prominently
The description of this option says that it's deprecated, but unlike
--no-copy-addrs and --no-copy-routes it doesn't have a clear label.  Add
one to make it easier to spot.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-10-18 20:27:54 +02:00
David Gibson
ff63ac922a conf: Add --dns-host option to configure host side nameserver
When redirecting DNS queries with the --dns-forward option, passt/pasta
needs a host side nameserver to redirect the queries to.  This is
controlled by the c->ip[46].dns_host variables.  This is set to the first
first nameserver listed in the host's /etc/resolv.conf, and there isn't
currently a way to override it from the command line.

Prior to 0b25cac9 ("conf: Treat --dns addresses as guest visible
addresses") it was possible to alter this with the -D/--dns option.
However, doing so was confusing and had some nonsensical edge cases because
-D generally takes guest side addresses, rather than host side addresses.

Add a new --dns-host option to restore this functionality in a more
sensible way.

Link: https://bugs.passt.top/show_bug.cgi?id=102
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-10-04 19:04:29 +02:00
David Gibson
9d66df9a9a conf: Add command line switch to enable IP_FREEBIND socket option
In a couple of recent reports, we've seen that it can be useful for pasta
to forward ports from addresses which are not currently configured on the
host, but might be in future.  That can be done with the sysctl
net.ipv4.ip_nonlocal_bind, but that does require CAP_NET_ADMIN to set in
the first place.  We can allow the same thing on a per-socket basis with
the IP_FREEBIND (or IPV6_FREEBIND) socket option.

Add a --freebind command line argument to enable this socket option on
all listening sockets.

Link: https://bugs.passt.top/show_bug.cgi?id=101
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-10-04 19:04:29 +02:00
David Gibson
57b7bd2a48 fwd, conf: Allow NAT of the guest's assigned address
The guest is usually assigned one of the host's IP addresses.  That means
it can't access the host itself via its usual address.  The
--map-host-loopback option (enabled by default with the gateway address)
allows the guest to contact the host.  However, connections forwarded this
way appear on the host to have originated from the loopback interface,
which isn't always desirable.

Add a new --map-guest-addr option, which acts similarly but forwarded
connections will go to the host's external address, instead of loopback.

If '-a' is used, so the guest's address is not the same as the host's, this
will instead forward to whatever host-visible site is shadowed by the
guest's assigned address.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:40 +02:00
David Gibson
e813a4df7d conf: Allow address remapped to host to be configured
Because the host and guest share the same IP address with passt/pasta, it's
not possible for the guest to directly address the host.  Therefore we
allow packets from the guest going to a special "NAT to host" address to be
redirected to the host, appearing there as though they have both source and
destination address of loopback.

Currently that special address is always the address of the default
gateway (or none).  That can be a problem if we want that gateway to be
addressable by the guest.  Therefore, allow the special "NAT to host"
address to be overridden on the command line with a new --map-host-loopback
option.

In order to exercise and test it, update the passt_in_ns and perf
tests to use this option and give different mapping addresses for the
two layers of the environment.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:35 +02:00
David Gibson
0b25cac94e conf: Treat --dns addresses as guest visible addresses
Although it's not 100% explicit in the man page, addresses given to
the --dns option are intended to be addresses as seen by the guest.
This differs from addresses taken from the host's /etc/resolv.conf,
which must be translated to guest accessible versions in some cases.

Our implementation is currently inconsistent on this: when using
--dns-forward, you must usually also give --dns with the matching address,
which is meaningful only in the guest's address view.  However if you give
--dns with a loopback addres, it will be translated like a host view
address.

Move the remapping logic for DNS addresses out of add_dns4() and add_dns6()
into add_dns_resolv() so that it is only applied for host nameserver
addresses, not for nameservers given explicitly with --dns.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-08-21 12:00:08 +02:00
Stefano Brivio
fbb0c9523e conf, pasta: Make -g and -a skip route/addresses copy for matching IP version only
Paul reports that setting IPv4 address and gateway manually, using
--address and --gateway, causes pasta to fail inserting IPv6 routes
in a setup where multiple, inter-dependent IPv6 routes are present
on the host.

That's because, currently, any -g option implies --no-copy-routes
altogether, and any -a implies --no-copy-addrs.

Limit this implication to the matching IP version, instead, by having
two copies of no_copy_routes and no_copy_addrs in the context
structure, separately for IPv4 and IPv6.

While at it, change them to 'bool': we had them as 'int' because
getopt_long() used to set them directly, but it hasn't been the case
for a while already.

Reported-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-08-07 09:15:25 +02:00
David Gibson
becf81ab88 fwd: Broaden what we consider for DNS specific forwarding rules
passt/pasta has options to redirect DNS requests from the guest to a
different server address on the host side.  Currently, however, only UDP
packets to port 53 are considered "DNS requests".  This ignores DNS
requests over TCP - less common, but certainly possible.  It also ignores
encrypted DNS requests on port 853.

Extend the DNS forwarding logic to handle both of those cases.

Link: https://github.com/containers/podman/issues/23239
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-07-25 12:38:11 +02:00
Stefano Brivio
bca0fefa32 conf, passt: Make --stderr do nothing, and deprecate it
The original behaviour of printing messages to standard error by
default when running from a non-interactive terminal was introduced
because the first KubeVirt integration draft used to start passt in
foreground and get messages via standard error.

For development purposes, the system logger was more convenient at
that point, and passt was running from interactive terminals only if
not started by the KubeVirt integration.

This behaviour was introduced by 84a62b79a2 ("passt: Also log to
stderr, don't fork to background if not interactive").

Later, I added command-line options in 1e49d194d0 ("passt, pasta:
Introduce command-line options and port re-mapping") and accidentally
reversed this condition, which wasn't a problem as --stderr could
force printing to standard error anyway (and it was used by KubeVirt).

Nowadays, the KubeVirt integration uses a log file (requested via
libvirt configuration), and the same applies for Podman if one
actually needs to look at runtime logs. There are no use cases left,
as far as I know, where passt runs in foreground in non-interactive
terminals.

Seize the chance to reintroduce some sanity here. If we fork to
background, standard error is closed, so --stderr is useless in that
case.

If we run in foreground, there's no harm in printing messages to
standard error, and that accidentally became the default behaviour
anyway, so --stderr is not needed in that case.

It would be needed for non-interactive terminals, but there are no
use cases, and if there were, let's log to standard error anyway:
the user can always redirect standard error to /dev/null if needed.

Before we're up and running, we need to print to standard error anyway
if something happens, otherwise we can't report failure to start in
any kind of usage, stand-alone or in integrations.

So, make --stderr do nothing, and deprecate it.

While at it, drop a left-over comment about --foreground being the
default only for interactive terminals, because it's not the case
anymore.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-06-21 15:32:28 +02:00
Stefano Brivio
65923ba798 conf: Accept duplicate and conflicting options, the last one wins
In multiple occasions, especially when passt(1) and pasta(1) are used
in integrations such as the one with Podman, the ability to override
earlier options on the command line with later one would have been
convenient.

Recently, to debug a number of issues happening with Podman, I would
have liked to ask users to share a debug log by passing --debug as
additional option, but pasta refuses --quiet (always passed by Podman)
and --debug at the same time.

On top of this, Podman lets users specify other pasta options in its
containers.conf(5) file, as well as on the command line.

The options from the configuration files are appended together with
the ones from the command line, which makes it impossible for users to
override options from the configuration file, if duplicated options
are refused, unless Podman takes care of sorting them, which is
clearly not sustainable.

For --debug and --trace, somebody took care of this on Podman side at:
  https://github.com/containers/common/pull/2052

but this doesn't fix the issue with other options, and we'll have
anyway older versions of Podman around, too.

I think there's some value in telling users about duplicated or
conflicting options, because that might reveal issues in integrations
or accidental misconfigurations, but by now I'm fairly convinced that
the downsides outweigh this.

Drop checks about duplicate options and mutually exclusive ones. In
some cases, we need to also undo a couple of initialisations caused
by earlier options, but this looks like a simplification, overall.

Notable exception: --stderr still conflicts with --log-file, because
users might have the expectation that they don't actually conflict.
But they do conflict in the existing implementation, so it's safer
to make sure that the users notice that.

Suggested-by: Paul Holzinger <pholzing@redhat.com>
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: Paul Holzinger <pholzing@redhat.com>
2024-06-21 15:31:46 +02:00
Danish Prakash
1544a43863 passt.1, qrap.1: align license description with SPDX identifier
The SPDX identifier states GPL-2.0-or-later but the copyright section
mentions GPL-3.0 or later causing a mismatch.

Also, only correctly refers to GPL instead of AGPL.

Signed-off-by: Danish Prakash <contact@danishpraka.sh>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-06-19 15:00:55 +02:00
Stefano Brivio
450a6131be netlink: With no default route, pick the first interface with a route
While commit f919dc7a4b ("conf, netlink: Don't require a default
route to start") sounded reasonable in the assumption that, if we
don't find default routes for a given address family, we can still
proceed by selecting an interface with any route *iff it's the only
one for that protocol family*, Jelle reported a further issue in a
similar setup.

There, multiple interfaces are present, and while remote container
connectivity doesn't matter for the container, local connectivity is
desired. There are no default routes, but those multiple interfaces
all have non-default routes, so we should just pick one and start.

Pick the first interface reported by the kernel with any route, if
there are no default routes. There should be no harm in doing so.

Reported-by: Jelle van der Waa <jvanderwaa@redhat.com>
Reported-by: Martin Pitt <mpitt@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2277954
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Paul Holzinger <pholzing@redhat.com>
2024-06-19 15:00:55 +02:00
Stefano Brivio
f919dc7a4b conf, netlink: Don't require a default route to start
There might be isolated testing environments where default routes and
global connectivity are not needed, a single interface has all
non-loopback addresses and routes, and still passt and pasta are
expected to work.

In this case, it's pretty obvious what our upstream interface should
be, so go ahead and select the only interface with at least one
route, disabling DHCP and implying --no-map-gw as the documentation
already states.

If there are multiple interfaces with routes, though, refuse to start,
because at that point it's really not clear what we should do.

Reported-by: Martin Pitt <mpitt@redhat.com>
Link: https://github.com/containers/podman/issues/21896
Signed-off-by: Stefano brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-03-18 08:57:21 +01:00
Stefano Brivio
0761f29a14 passt.1: --{no-,}dhcp-dns and --{no-,}dhcp-search don't take addresses
...they are simple enable/disable options.

Fixes: 89678c5157 ("conf, udp: Introduce basic DNS forwarding")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-03-14 08:18:09 +01:00
Stefano Brivio
ff22a78d7b pasta: Don't try to watch namespaces in procfs with inotify, use timer instead
We watch network namespace entries to detect when we should quit
(unless --no-netns-quit is passed), and these might stored in a tmpfs
typically mounted at /run/user/UID or /var/run/user/UID, or found in
procfs at /proc/PID/ns/.

Currently, we try to use inotify for any possible location of those
entries, but inotify, of course, doesn't work on pseudo-filesystems
(see inotify(7)).

The man page reflects this: the description of --no-netns-quit
implies that we won't quit anyway if the namespace is not "bound to
the filesystem".

Well, we won't quit, but, since commit 9e0dbc8948 ("More
deterministic detection of whether argument is a PID, PATH or NAME"),
we try. And, indeed, this is harmless, as the caveat from that
commit message states.

Now, it turns out that Buildah, a tool to create container images,
sharing its codebase with Podman, passes a procfs entry to pasta, and
expects pasta to exit once the network namespace is not needed
anymore, that is, once the original container process, also spawned
by Buildah, terminates.

Get this to work by using the timer fallback mechanism if the
namespace name is passed as a path belonging to a pseudo-filesystem.
This is expected to be procfs, but I covered sysfs and devpts
pseudo-filesystems as well, because nothing actually prevents
creating this kind of directory structure and links there.

Note that fstatfs(), according to some versions of man pages, was
apparently "deprecated" by the LSB. My reasoning for using it is
essentially this:
  https://lore.kernel.org/linux-man/f54kudgblgk643u32tb6at4cd3kkzha6hslahv24szs4raroaz@ogivjbfdaqtb/t/#u

...that is, there was no such thing as an LSB deprecation, and
anyway there's no other way to get the filesystem type.

Also note that, while it might sound more obvious to detect the
filesystem type using fstatfs() on the file descriptor itself
(c->pasta_netns_fd), the reported filesystem type for it is nsfs, no
matter what path was given to pasta. If we use the parent directory,
we'll typically have either tmpfs or procfs reported.

If the target namespace is given as a PID, or as a PID-based procfs
entry, we don't risk races if this PID is recycled: our handle on
/proc/PID/ns will always refer to the original namespace associated
with that PID, and we don't re-open this entry from procfs to check
it.

There's, however, a remaining race possibility if the parent process
is not the one associated to the network namespace we operate on: in
that case, the parent might pass a procfs entry associated to a PID
that was recycled by the time we parse it. This can't happen if the
namespace PID matches the one of the parent, because we detach from
the controlling terminal after parsing the namespace reference.

To avoid this type of race, if desired, we could add the option for
the parent to pass a PID file descriptor, that the parent obtained
via pidfd_open(). This is beyond the scope of this change.

Update the man page to reflect that, even if the target network
namespace is passed as a procfs path or a PID, we'll now quit when
the procfs entry is gone.

Reported-by: Paul Holzinger <pholzing@redhat.com>
Link: https://github.com/containers/podman/pull/21563#issuecomment-1948200214
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2024-02-19 19:58:50 +01:00
Stefano Brivio
f57a2fb4d5 conf, passt.1: Exit if we can't bind a forwarded port, except for -[tu] all
...or similar, that is, if only excluded ranges are given (implying
we'll forward any other available port). In that case, we'll usually
forward large sets of ports, and it might be inconvenient for the
user to skip excluding single ports that are already taken.

The existing behaviour, that is, exiting only if we fail to bind all
the ports for one given forwarding option, turns out to be
problematic for several aspects raised by Paul:

- Podman merges ranges anyway, so we might fail to bind all the ports
  from a specific range given by the user, but we'll not fail anyway
  because Podman merges it with another one where we succeed to bind
  at least one port. At the same time, there should be no semantic
  difference between multiple ranges given by a single option and
  multiple ranges given as multiple options: it's unexpected and
  not documented

- the user might actually rely on a given port to be forwarded to a
  given container or a virtual machine, and if connections are
  forwarded to an unrelated process, this might raise security
  concerns

- given that we can try and fail to bind multiple ports before
  exiting (in case we can't bind any), we don't have a specific error
  code we can return to the user, so we don't give the user helpful
  indication as to why we couldn't bind ports.

Exit as soon as we fail to create or bind a socket for a given
forwarded port, and report the actual error.

Keep the current behaviour, however, in case the user wants to
forward all the (available) ports for a given protocol, or all the
ports with excluded ranges only. There, it's more reasonable that
the user is expecting partial failures, and it's probably convenient
that we continue with the ports we could forward.

Update the manual page to reflect the new behaviour, and the old
behaviour too in the cases where we keep it.

Suggested-by: Paul Holzinger <pholzing@redhat.com>
Link: https://github.com/containers/podman/pull/21563#issuecomment-1937024642
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Paul Holzinger <pholzing@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-02-16 08:47:14 +01:00
Stefano Brivio
6c7623d07b netlink: Add support to fetch default gateway from multipath routes
If the default route for a given IP version is a multipath one,
instead of refusing to start because there's no RTA_GATEWAY attribute
in the set returned by the kernel, we can just pick one of the paths.

To make this somewhat less arbitrary, pick the path with the highest
weight, if weights differ.

Reported-by: Ed Santiago <santiago@redhat.com>
Link: https://github.com/containers/podman/issues/20927
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2024-02-09 13:24:33 +01:00
David Gibson
457ff122e3 udp,pasta: Periodically scan for ports to automatically forward
pasta supports automatic port forwarding, where we look for listening
sockets in /proc/net (in both namespace and outside) and establish port
forwarding to match.

For TCP we do this scan both at initial startup, then periodically
thereafter.  For UDP however, we currently only scan at start.  So unlike
TCP we won't update forwarding to handle services that start after pasta
has begun.

There's no particular reason for that, other than that we didn't implement
it.  So, remove that difference, by scanning for new UDP forwards
periodically too.  The logic is basically identical to that for TCP, but it
needs some changes to handle the mildly different data structures in the
UDP case.

Link: https://bugs.passt.top/show_bug.cgi?id=45
Link: https://github.com/rootless-containers/rootlesskit/issues/383
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-11-19 09:08:39 +01:00
Stefano Brivio
cc9d16758b conf, pasta: With --config-net, copy all addresses by default
Use the newly-introduced NL_DUP mode for nl_addr() to copy all the
addresses associated to the template interface in the outer
namespace, unless --no-copy-addrs (also implied by -a) is given.

This option is introduced as deprecated right away: it's not expected
to be of any use, but it's helpful to keep it around for a while to
debug any suspected issue with this change.

This is done mostly for consistency with routes. It might partially
cover the issue at:
  https://bugs.passt.top/show_bug.cgi?id=47
  Support multiple addresses per address family

for some use cases, but not the originally intended one: we'll still
use a single outbound address (unless the routing table specifies
different preferred source addresses depending on the destination),
regardless of the address used in the target namespace.

Link: https://bugs.passt.top/show_bug.cgi?id=47
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-05-23 16:13:28 +02:00
Stefano Brivio
a7359f0948 conf: Don't exit if sourced default route has no gateway
If we use a template interface without a gateway on the default
route, we can still offer almost complete functionality, except that,
of course, we can't map the gateway address to the outer namespace or
host, and that we have no obvious server address or identifier for
use in DHCP's siaddr and option 54 (Server identifier, mandatory).

Continue, if we have a default route but no default gateway, and
imply --no-map-gw and --no-dhcp in that case. NDP responder and
DHCPv6 should be able to work as usual because we require a
link-local address to be present, and we'll fall back to that.

Together with the previous commits implementing an actual copy of
routes from the outer namespace, this should finally fix the
operation of 'pasta --config-net' for cases where we have a default
route on the host, but no default gateway, as it's the case for
tap-style routes, including typical Wireguard endpoints.

Reported-by: me@yawnt.com
Link: https://bugs.passt.top/show_bug.cgi?id=49
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-05-23 16:13:28 +02:00
Stefano Brivio
da54641f14 conf, pasta: With --config-net, copy all routes by default
Use the newly-introduced NL_DUP mode for nl_route() to copy all the
routes associated to the template interface in the outer namespace,
unless --no-copy-routes (also implied by -g) is given.

This option is introduced as deprecated right away: it's not expected
to be of any use, but it's helpful to keep it around for a while to
debug any suspected issue with this change.

Otherwise, we can't use default gateways which are not, address-wise,
on the same subnet as the container, as reported by Callum.

Reported-by: Callum Parsey <callum@neoninteger.au>
Link: https://github.com/containers/podman/issues/18539
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-05-23 16:13:28 +02:00
lemmi
96f8d55c4f correct -6 option in manpage
Signed-off-by: lemmi <lemmi@nerd2nerd.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-05-09 23:29:32 +02:00
Stefano Brivio
ca2749e1bd passt: Relicense to GPL 2.0, or any later version
In practical terms, passt doesn't benefit from the additional
protection offered by the AGPL over the GPL, because it's not
suitable to be executed over a computer network.

Further, restricting the distribution under the version 3 of the GPL
wouldn't provide any practical advantage either, as long as the passt
codebase is concerned, and might cause unnecessary compatibility
dilemmas.

Change licensing terms to the GNU General Public License Version 2,
or any later version, with written permission from all current and
past contributors, namely: myself, David Gibson, Laine Stump, Andrea
Bolognani, Paul Holzinger, Richard W.M. Jones, Chris Kuhn, Florian
Weimer, Giuseppe Scrivano, Stefan Hajnoczi, and Vasiliy Ulyanov.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-04-06 18:00:33 +02:00
Stefano Brivio
98a9a7d9e5 conf: Allow binding to ports on an interface without a specific address
Somebody might want to bind listening sockets to a specific
interface, but not a specific address, and there isn't really a
reason to prevent that. For example:

  -t %eth0/2022

Alternatively, we support options such as -t 0.0.0.0%eth0/2022 and
-t ::%eth0/2022, but not together, for the same port.

Enable this kind of syntax and add examples to the man page.

Reported-by: Paul Holzinger <pholzing@redhat.com>
Link: https://github.com/containers/podman/issues/14425#issuecomment-1485192195
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-03-29 13:48:12 +02:00
Stefano Brivio
7727804658 passt.1: Fix description of --mtu option
By default, 65520 bytes are advertised, and zero disables DHCP and
NDP options.

Fixes: ec2b58ea4d ("conf, dhcp, ndp: Fix message about default MTU, make NDP consistent")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-03-17 08:26:07 +01:00
Stefano Brivio
a9c59dd91b conf, icmp, tcp, udp: Add options to bind to outbound address and interface
I didn't notice earlier: libslirp (and slirp4netns) supports binding
outbound sockets to specific IPv4 and IPv6 addresses, to force the
source addresse selection. If we want to claim feature parity, we
should implement that as well.

Further, Podman supports specifying outbound interfaces as well, but
this is simply done by resolving the primary address for an interface
when the network back-end is started. However, since kernel version
5.7, commit c427bfec18f2 ("net: core: enable SO_BINDTODEVICE for
non-root users"), we can actually bind to a specific interface name,
which doesn't need to be validated in advance.

Implement -o / --outbound ADDR to bind to IPv4 and IPv6 addresses,
and --outbound-if4 and --outbound-if6 to bind IPv4 and IPv6 sockets
to given interfaces.

Given that it probably makes little sense to select addresses and
routes from interfaces different than the ones given for outbound
sockets, also assign those as "template" interfaces, by default,
unless explicitly overridden by '-i'.

For ICMP and UDP, we call sock_l4() to open outbound sockets, as we
already needed to bind to given ports or echo identifiers, and we
can bind() a socket only once: there, pass address (if any) and
interface (if any) for the existing bind() and setsockopt() calls.

For TCP, in general, we wouldn't otherwise bind sockets. Add a
specific helper to do that.

For UDP outbound sockets, we need to know if the final destination
of the socket is a loopback address, before we decide whether it
makes sense to bind the socket at all: move the block mangling the
address destination before the creation of the socket in the IPv4
path. This was already the case for the IPv6 path.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-03-09 03:43:59 +01:00
Stefano Brivio
d9577965a3 passt.1: Fix typo, improve wording in examples of port forwarding specifiers
Based on a patch from Laine, and reports from Laine and Yalan: fix
the "22-80:32-90" example, and improve wording for the other ones:
instead of using "to" to denote the end of a range, use "between ...
and", so that it's clear we're *not* referring to target ports.

Reported-by: Laine Stump <laine@redhat.com>
Reported-by: Yalan Zhang <yalzhang@redhat.com>
Fixes: da20f57f19 ("passt, qrap: Add man pages")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-02-16 17:34:00 +01:00
Richard W.M. Jones
6b4e68383c passt, tap: Add --fd option
This passes a fully connected stream socket to passt.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
[sbrivio: reuse fd_tap instead of adding a new descriptor,
 imply --one-off on --fd, add to optstring and usage()]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-11-25 01:40:47 +01:00
Stefano Brivio
11efaefa1e passt, qrap, README: Update notes and documentation for AF_UNIX support in qemu
We can't get rid of qrap quite yet, but at least we should start
telling users it's not going to be needed anymore starting from qemu
7.2.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-11-04 12:04:32 +01:00
Stefano Brivio
10cabe3dbf passt.1: Fix typo: "addressses", reported by Lintian
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-10-27 14:28:00 +02:00
Stefano Brivio
7402951658 conf, passt.1: Don't imply --foreground with --debug
Having -f implied by -d (and --trace) usually saves some typing, but
debug mode in background (with a log file) is quite useful if pasta
is started by Podman, and is probably going to be handy for passt
with libvirt later, too.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2022-10-27 00:17:56 +02:00
Stefano Brivio
b3f359167b passt.1: Add David to AUTHORS
I just realised while reading the man page.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2022-10-15 02:10:36 +02:00
Stefano Brivio
3e2eb4337b conf: Bind inbound ports with CAP_NET_BIND_SERVICE before isolate_user()
Even if CAP_NET_BIND_SERVICE is granted, we'll lose the capability in
the target user namespace as we isolate the process, which means
we're unable to bind to low ports at that point.

Bind inbound ports, and only those, before isolate_user(). Keep the
handling of outbound ports (for pasta mode only) after the setup of
the namespace, because that's where we'll bind them.

To this end, initialise the netlink socket for the init namespace
before isolate_user() as well, as we actually need to know the
addresses of the upstream interface before binding ports, in case
they're not explicitly passed by the user.

As we now call nl_sock_init() twice, checking its return code from
conf() twice looks a bit heavy: make it exit(), instead, as we
can't do much if we don't have netlink sockets.

While at it:

- move the v4_only && v6_only options check just after the first
  option processing loop, as this is more strictly related to
  option parsing proper

- update the man page, explaining that CAP_NET_BIND_SERVICE is
  *not* the preferred way to bind ports, because passt and pasta
  can be abused to allow other processes to make effective usage
  of it. Add a note about the recommended sysctl instead

- simplify nl_sock_init_do() now that it's called once for each
  case

Reported-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-10-15 02:10:36 +02:00
Stefano Brivio
c1eff9a3c6 conf, tcp, udp: Allow specification of interface to bind to
Since kernel version 5.7, commit c427bfec18f2 ("net: core: enable
SO_BINDTODEVICE for non-root users"), we can bind sockets to
interfaces, if they haven't been bound yet (as in bind()).

Introduce an optional interface specification for forwarded ports,
prefixed by %, that can be passed together with an address.

Reported use case: running local services that use ports we want
to have externally forwarded:
  https://github.com/containers/podman/issues/14425

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2022-10-15 02:10:36 +02:00
Stefano Brivio
a62ed181db conf, tap: Add option to quit once the client closes the connection
This is practical to avoid explicit lifecycle management in users,
e.g. libvirtd, and is trivial to implement.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2022-10-15 02:10:36 +02:00
Stefano Brivio
e23024ccff conf, log, Makefile: Add versioning information
Add a --version option displaying that, and also include this
information in the log files.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-10-15 02:10:28 +02:00
Stefano Brivio
01efc71ddd log, conf: Add support for logging to file
In some environments, such as KubeVirt pods, we might not have a
system logger available. We could choose to run in foreground, but
this takes away the convenient synchronisation mechanism derived from
forking to background when interfaces are ready.

Add optional logging to file with -l/--log-file and --log-size.

Unfortunately, this means we need to duplicate features that are more
appropriately implemented by a system logger, such as rotation. Keep
that reasonably simple, by using fallocate() with range collapsing
where supported (Linux kernel >= 3.15, extent-based ext4 and XFS) and
falling back to an unsophisticated block-by-block moving of entries
toward the beginning of the file once we reach the (mandatory) size
limit.

While at it, clarify the role of LOG_EMERG in passt.c.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2022-10-14 17:38:28 +02:00
David Gibson
ef6da15732 Allow --userns when pasta spawns a command
Currently --userns is only allowed when pasta is attaching to an existing
netns or PID, and is prohibited when creating a new netns by spawning a
command or shell.

With the new handling of userns, this check isn't neccessary.  I'm not sure
if there's any use case for --userns with a spawned command, but it's
strictly more flexible and requires zero extra code, so we might as well.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-09-13 05:31:51 +02:00
David Gibson
10c6347747 Split checking for root from dropping root privilege
check_root() both checks to see if we are root (in the init namespace),
and if we are drops to an unprivileged user.  To make future cleanups
simpler, split the checking for root (now in check_root()) from the actual
dropping of privilege (now in drop_root()).

Note that this does slightly alter semantics.  Previously we would only
setuid() if we were originally root (in the init namespace).  Now we will
always setuid() and setgid(), though it won't actually change anything if
we weren't privileged to begin with.  This also means that we will now
always attempt to switch to the user specified with --runas, even if we
aren't (init namespace) root to begin with.  Obviously this will fail with
an error if we weren't privileged to start with.  --help and the man page
are updated accordingly.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-09-13 05:31:51 +02:00
David Gibson
1392bc5ca0 Allow pasta to take a command to execute
When not given an existing PID or network namspace to attach to, pasta
spawns a shell.  Most commands which can spawn a shell in an altered
environment can also run other commands in that same environment, which can
be useful in automation.

Allow pasta to do the same thing; it can be given an arbitrary command to
run in the network and user namespace which pasta creates.  If neither a
command nor an existing PID or netns to attach to is given, continue to
spawn a default shell, as before.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-08-30 19:43:31 +02:00
David Gibson
c188736cd8 Use explicit --netns option rather than multiplexing with PID
When attaching to an existing namespace, pasta can take a PID or the name
or path of a network namespace as a non-option parameter.  We disambiguate
based on what the parameter looks like.  Make this more explicit by using
a --netns option for explicitly giving the path or name, and treating a
non-option argument always as a PID.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[sbrivio: Fix typo in man page]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-08-30 19:43:31 +02:00
David Gibson
8de488892f Remove --nsrun-dir option
pasta can identify a netns as a "name", which is to say a path relative to
(usually) /run/netns, which is the place that ip(8) creates persistent
network namespaces.  Alternatively a full path to a netns can be given.

The --nsrun-dir option allows the user to change the standard path where
netns names are resolved.  However, there's no real point to this, if the
user wants to override the location of the netns, they can just as easily
use the full path to specify the netns.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-08-30 19:43:31 +02:00
David Gibson
ff1ac78a5e Correct manpage for --userns
The man page states that the --userns option can be given either as a path
or as a name relative to --nsrun-dir.  This is not correct: as the name
suggests --nsrun-dir is (correctly) used only for *netns* resolution, not
*userns* resolution.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-08-30 19:43:31 +02:00
David Gibson
aae2a9bbf7 conf: Use "-D none" and "-S none" instead of missing empty option arguments
Both the -D (--dns) and -S (--search) options take an optional argument.
If the argument is omitted the option is disabled entirely.  However,
handling the optional argument requires some ugly special case handling if
it's the last option on the command line, and has potential ambiguity with
non-option arguments used with pasta.  It can also make it more confusing
to read command lines.

Simplify the logic here by replacing the non-argument versions with an
explicit "-D none" or "-S none".

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[sbrivio: Reworked logic to exclude redundant/conflicting options]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-08-30 19:42:52 +02:00
David Gibson
bf95322fc1 conf: Make the argument to --pcap option mandatory
The --pcap or -p option can be used with or without an argument.  If given,
the argument gives the name of the file to save a packet trace to.  If
omitted, we generate a default name in /tmp.

Generating the default name isn't particularly useful though, since making
a suitable name can easily be done by the caller.  Remove this feature.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-08-30 19:17:57 +02:00
Stefano Brivio
0070e35d2b passt.1: Default host interfaces are now selected based on IP version
Reflect the changes from commit 4b2e018d70 ("Allow different
external interfaces for IPv4 and IPv6 connectivity") into the manual.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-07-30 22:17:07 +02:00
Stefano Brivio
220759efb8 conf: Allow to specify ranges and ports excluded from given ranges
This is useful in environments where we want to forward a large
number of ports, or all non-ephemeral ones, and some other service
running on the host needs a few selected ports.

I'm using ~ as prefix for the specification of excluded ranges and
ports to avoid the need for explicit command line quoting.

Ranges and ports can be excluded from given ranges by adding them
in the comma-separated list, prefixed by ~. Some quick examples:

  -t 5000-6000,~5555: forward ports 5000 to 6000, but not 5555

  -t ~20000-20010: forward all non-ephemeral, allowed ports, except
     for ports 20000 to 20010

...more details in usage message and man page.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-07-14 01:36:05 +02:00
David Gibson
7bcc5930a6 Invoke specific qemu-system-* binaries
A lot of tests and examples invoke qemu with the command "kvm".  However,
as far as I can tell, "kvm" being aliased to the appropriate qemu system
binary is Debian specific.  The binary names from qemu upstream -
qemu-system-$ARCH - also aren't universal, but they are more common (they
should be good for both Debian and Fedora at least).

In order to still get KVM acceleration when available, we use the option
"-M accel=kvm:tcg" to tell qemu to try using either KVM or TCG in that
order

A number of the places we invoked "kvm" are expecting specifically an x86
guest, and so it's also safer to explicitly invoke qemu-system-x86_64.

Some others appear to be independent of the target arch (just wanting the
same arch as the host to allow KVM acceleration).  Although I suspect there
may be more subtle x86 specific options in the qemu command lines, attempt
to preserve arch independence by using $(uname -m).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-07-14 01:32:42 +02:00