1
0
mirror of https://passt.top/passt synced 2025-01-21 19:55:17 +00:00
Commit Graph

58 Commits

Author SHA1 Message Date
Stefano Brivio
f091893c1f netlink: Fetch most specific (longest prefix) address in nl_addr_get()
This happened in most cases implicitly before commit eff3bcb24547
("netlink: Split nl_addr() into separate operation functions"): while
going through results from netlink, we would only copy an address
into the provided return buffer if no address had been picked yet.

Because of the insertion logic in the kernel (ipv6_link_dev_addr()),
the first returned address would also be the one added last, and, in
case of a Linux guest using a DHCPv6 client as well as SLAAC, that
would be the address assigned via DHCPv6, because SLAAC happens
before the DHCPv6 exchange.

The effect of, instead, picking the last returned address (first
assigned) is visible when passt or pasta runs nested, given that, by
default, they advertise a prefix for SLAAC usage, plus an address via
DHCPv6.

The first level (L1 guest) would get a /64 address by means of SLAAC,
and a /128 address via DHCPv6, the latter matching the address on the
host.

The second level (L2 guest) would also get two addresses: a /64 via
SLAAC (same prefix as the host), and a /128 via DHCPv6, matching the
the L1 SLAAC-assigned address, not the one obtained via DHCPv6. That
is, none of the L2 addresses would match the address on the host. The
whole point of having a DHCPv6 server is to avoid (implicit) NAT when
possible, though.

Fix this in a more explicit way than the behaviour we initially had:
pick the first address among the set of most specific ones, by
comparing prefix lengths. Do this for IPv4 and for link-local
addresses, too, to match in any case the implementation of the
default source address selection.

Reported-by: Yalan Zhang <yalzhang@redhat.com>
Fixes: eff3bcb24547 ("netlink: Split nl_addr() into separate operation functions")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-12-30 11:45:27 +01:00
Stefano Brivio
06559048e7 treewide: Use 'z' length modifier for size_t/ssize_t conversions
Types size_t and ssize_t are not necessarily long, it depends on the
architecture.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-12-02 03:54:42 +01:00
Stefano Brivio
b944622969 netlink: Sequence numbers are actually 32 bits wide
Harmless, as we use sequence numbers monotonically anyway, but now
clang-tidy reports:

/home/sbrivio/passt/netlink.c:155:7: error: format specifies type 'unsigned short' but the argument has type '__u32' (aka 'unsigned int') [clang-diagnostic-format,-warnings-as-errors]
                    nh->nlmsg_seq, seq);
                    ^
/home/sbrivio/passt/log.h:26:7: note: expanded from macro 'die'
                err(__VA_ARGS__);                                       \
                    ^~~~~~~~~~~
/home/sbrivio/passt/log.h:19:34: note: expanded from macro 'err'
                                        ^~~~~~~~~~~
Suppressed 222820 warnings (222816 in non-user code, 4 NOLINT).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
1 warning treated as error
make: *** [Makefile:255: clang-tidy] Error 1

Fixes: 9d4ab98d538f ("netlink: Add nl_do() helper for simple operations with error checking")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-11-07 12:22:13 +01:00
David Gibson
6471c7d01b cppcheck: Make many pointers const
Newer versions of cppcheck (as of 2.12.0, at least) added a warning for
pointers which could be declared to point at const data, but aren't.
Based on that, make many pointers throughout the codebase const.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-10-04 23:23:35 +02:00
David Gibson
a7e4bfb857 pasta: Strip RTA_PREFSRC when copying routes to the namespace
Host routes can include a preferred source address (RTA_PREFSRC), which
must be one of the host's addresses.  However when using pasta with -a the
namespace might be given a different address, not on the host.  This seems
to occur pretty routinely depending on the network configuration systems
in place on the host.

With --config-net we will try to copy host routes to the namespace.  If
one of those includes an RTA_PREFSRC, but the namespace doesn't have the
host address, this will fail with -EINVAL, causing pasta to fail.

Fix this by stripping off RTA_PREFSRC attributes from routes as we copy
them to the namespace.  This is by no means infallible, bit it should at
least handle common cases for the time being.

Link: https://bugs.passt.top/show_bug.cgi?id=71
Link: https://github.com/containers/podman/pull/19699#issuecomment-1688769287
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-23 15:52:31 +02:00
Stefano Brivio
5e4f7b92b0 netlink: Set IFA_ADDRESS, not just IFA_LOCAL, while adding IPv4 addresses
Otherwise, we actually configure the address, but it's not usable
because no local route is added by the kernel.

Link: https://github.com/containers/podman/pull/19699
Fixes: cfe7509e5c16 ("netlink: Use struct in_addr for IPv4 addresses, not bare uint32_t")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-23 09:34:44 +02:00
David Gibson
da0aeb9080 netlink: Don't propagate host address expiry to the container
When we copy addresses from the host to the container in nl_addr_dup(), we
copy all the address's attributes, including IFA_CACHEINFO, which controls
the address's lifetime.  If the host address is managed by, for example,
DHCP, it will typically have a finite lifetime.

When we copy that lifetime to the pasta container, that lifetime will
remain, meaning the kernel will eventually remove the address, typically
some hours later.  The container, however, won't have the DHCP client or
whatever was managing and maintaining the address in the host, so it will
just lose connectivity.

Long term, we may want to monitor host address changes and reflect them to
the guest.  But for now, we just want to take a snapshot of the host's
address and set those in the container permanently.  We can accomplish that
by stripping off the IFA_CACHEINFO attribute as we copy addresses.

Link: https://github.com/containers/podman/issues/19405
Link: https://bugs.passt.top/show_bug.cgi?id=70
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-16 08:10:04 +02:00
David Gibson
b4f8ffd5c4 netlink: Correctly calculate attribute length for address messages
In nl_addr_get() and nl_addr_dup() we step the attributes attached to each
RTM_NEWADDR message with a loop initialised with IFA_RTA() and
RTM_PAYLOAD() macros.  RTM_PAYLOAD(), however is for RTM_NEWROUTE messages
(struct rtmsg), not RTM_NEWADDR messages (struct ifaddrmsg).  Consequently
it miscalculates the size and means we can skip some attributes.  Switch
to IFA_PAYLOAD() which we should be using here.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-16 08:10:02 +02:00
David Gibson
4b9f4c2513 netlink: Remove redundant check on nlmsg_type
In the loop within nl_addr_dup() we check and skip for any messages that
aren't of type RTM_NEWADDR.  This is a leftover that was missed in the
recent big netlink cleanup.  In fact we already check for the message type
in the nl_foreach_oftype() macro, so the explicit test is redudant.
Remove it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-16 08:09:49 +02:00
David Gibson
02b30e7871 netlink: Propagate errors for "dup" operations
We now detect errors on netlink "set" operations while configuring the
pasta namespace with --config-net.  However in many cases rather than
a simple "set" we use a more complex "dup" function to copy
configuration from the host to the namespace.  We're not yet properly
detecting and reporting netlink errors for that case.

Change the "dup" operations to propagate netlink errors to their
caller, pasta_ns_conf() and report them there.

Link: https://bugs.passt.top/show_bug.cgi?id=60
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[sbrivio: Minor formatting changes in pasta_ns_conf()]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:32:32 +02:00
David Gibson
5103811e2d netlink: Propagate errors for "dump" operations
Currently if we receive any netlink errors while discovering network
configuration from the host, we'll just ignore it and carry on.  This
might lead to cryptic error messages later on, or even silent
misconfiguration.

We now have the mechanisms to detect errors from get/dump netlink
operations.  Propgate these errors up to the callers and report them usefully.

Link: https://bugs.passt.top/show_bug.cgi?id=60
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:30:41 +02:00
David Gibson
4d6e9d0816 netlink: Always process all responses to a netlink request
A single netlink request can result in multiple response datagrams.  We
process multiple response datagrams in some circumstances, but there are
cases where we exit early and will leave remaining datagrams in the queue.
These will be flushed in nl_send() before we send another request.

This is confusing, and not what we need to reliably check for errors from
netlink operations.  So, instead, make sure we always process all the
response datagrams whenever we send a request (excepting fatal errors).

In most cases this is just a matter of avoiding early exits from nl_foreach
loops.  nl_route_dup() is a bit trickier, because we need to retain all the
routes we're going to try to copy in a single buffer.  Here we instead use
a secondary buffer to flush any remaining datagrams, and report an error
if there are any additional routes in those datagrams .

Link: https://bugs.passt.top/show_bug.cgi?id=67
Link: https://bugs.passt.top/show_bug.cgi?id=60
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:30:37 +02:00
David Gibson
8de9805224 netlink: Propagate errors for "set" operations
Currently if anything goes wrong while we're configuring the namespace
network with --config-net, we'll just ignore it and carry on.  This might
lead to a silently unconfigured or misconfigured namespace environment.

For simple "set" operations based on nl_do() we can now detect failures
reported via netlink.  Propagate those errors up to pasta_ns_conf() and
report them usefully.

Link: https://bugs.passt.top/show_bug.cgi?id=60
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[sbrivio: Minor formatting changes in pasta_ns_conf()]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:30:22 +02:00
David Gibson
a309318275 netlink: Add nl_foreach_oftype to filter response message types
In most cases where processing response messages, we expect only one type
of message (excepting NLMSG_DONE or NLMSG_ERROR), and so we need a test
and continue to skip anything else.  Add a helper macro to do this.
This also fixes a bug in nl_get_ext_if() where we didn't have such a test
and if we got a message other than RTM_NEWROUTE we would have parsed
its contents as nonsense.

Also add a warning message if we get such an unexpected message type, which
could be useful for debugging if we ever hit it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:28:29 +02:00
David Gibson
99ddd7ce83 netlink: Split nl_req() to allow processing multiple response datagrams
Currently nl_req() sends the request, and receives a single response
datagram which we then process.  However, a single request can result in
multiple response datagrams.  That happens nearly all the time for DUMP
requests, where the 'DONE' message usually comes in a second datagram after
the NEW{LINK|ADDR|ROUTE} messages.  It can also happen if there are just
too many objects to dump in a single datagram.

Allow our netlink code to process multiple response datagrams by splitting
nl_req() into three different helpers: nl_send() just sends a request,
without getting a response.  nl_status() checks a single message to see if
it indicates the end of the reponses for our request.  nl_next() moves onto
the next response message, whether it's in a datagram we already received
or we need to recv() a new one.  We also add a 'for'-style macro to use
these to step through every response message to a request across multiple
datagrams.

While we're at it, be more thourough with checking that our sequence
numbers are in sync.

Link: https://bugs.passt.top/show_bug.cgi?id=67
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:28:26 +02:00
David Gibson
8ec757d003 netlink: Clearer reasoning about the netlink response buffer size
Currently we set NLBUFSIZ large enough for 8192 netlink headers (128kiB in
total), and reference netlink(7).  However netlink(7) says nothing about
reponse buffer sizes, and the documents which do reference 8192 *bytes* not
8192 headers.

Update NLBUFSIZ to 64kiB with a more detailed rationale.

Link: https://bugs.passt.top/show_bug.cgi?id=67
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:28:22 +02:00
David Gibson
9d4ab98d53 netlink: Add nl_do() helper for simple operations with error checking
So far we never checked for errors reported on netlink operations via
NLMSG_ERROR messages.  This has led to several subtle and tricky to debug
situations which would have been obvious if we knew that certain netlink
operations had failed.

Introduce a nl_do() helper that performs netlink "do" operations (that is
making a single change without retreiving complex information) with much
more thorough error checking.  As well as returning an error code if we
get an NLMSG_ERROR message, we also check for unexpected behaviour in
several places.  That way if we've made a mistake in our assumptions about
how netlink works it should result in a clear error rather than some subtle
misbehaviour.

We update those calls to nl_req() that can use the new wrapper to do so.
We will extend those to better handle errors in future.  We don't touch
non-"do" operations for now, those are a bit trickier.

Link: https://bugs.passt.top/show_bug.cgi?id=60
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:28:19 +02:00
David Gibson
282581ba84 netlink: Fill in netlink header fields from nl_req()
Currently netlink functions need to fill in a full netlink header, as well
as a payload then call nl_req() to submit that to the kernel.  It makes
things a bit terser if we just give the relevant header fields as
parameters to nl_req() and have it complete the header.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:28:16 +02:00
David Gibson
f62600b2df netlink: Treat send() or recv() errors as fatal
Errors on send() or recv() calls on a netlink socket don't indicate errors
with the netlink operations we're attempting, but rather that something's
gone wrong with the mechanics of netlink itself.  We don't really expect
this to ever happen, and if it does, it's not clear what we could to to
recover.

So, treat errors from these calls as fatal, rather than returning the error
up the stack.  This makes handling failures in the callers of nl_req()
simpler.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:28:14 +02:00
David Gibson
0a568c847d netlink: Start sequence number from 1 instead of 0
Netlink messages have a sequence number that's used to match requests to
responses.  It mostly doesn't matter what it is as long as it monotonically
increases, so we just use a global counter which we advance with each
request.

However, we start this counter at 0, so our very first request has sequence
number 0, which is usually reserved for asynchronous messages from the
kernel which aren't in response to a specific request. Since we don't (for
now) use such async messages, this doesn't really matter, but it's not
good practce.  So start the sequence at 1 instead.

Link: https://bugs.passt.top/show_bug.cgi?id=67
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:28:09 +02:00
David Gibson
dee7594180 netlink: Make nl_*_dup() use a separate datagram for each request
nl_req() is designed to handle a single netlink request message: it only
receives a single reply datagram for the request, and only waits for a
single NLMSG_DONE or NLMSG_ERROR message at the beginning to clear out
things from previous requests.

However, in both nl_addr_dup() and nl_route_dup() we can send multiple
request messages as a single datagram, with a single nl_req() call.
This can easily mean that the replies nl_req() collects get out of
sync with requests.  We only get away with this because after we call
these functions we don't make any netlink calls where we need to parse
the replies.

This is fragile, so alter nl_*_dup() to make an nl_req() call for each
address it is adding in the target namespace.

For nl_route_dup() this fixes an additional minor problem: because
routes can have dependencies, some of the route add requests might
fail on the first attempt, so we need to repeat the requests a number
of times.  When we did that, we weren't updating the sequence number
on each new attempt.  This works, but not updating the sequence number
for each new request isn't ideal.  Now that we're making the requests
one at a time, it's easier to make sure we update the sequence number
each time.

Link: https://bugs.passt.top/show_bug.cgi?id=67
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:28:00 +02:00
David Gibson
576df71e8b netlink: Explicitly pass netlink sockets to operations
All the netlink operations currently implicitly use one of the two global
netlink sockets, sometimes depending on an 'ns' parameter.  Change them
all to explicitly take the socket to use (or two sockets to use in the case
of the *_dup() functions).  As well as making these functions strictly more
general, it makes the callers easier to follow because we're passing a
socket variable with a name rather than an unexplained '0' or '1' for the
ns parameter.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[sbrivio: Minor formatting changes in pasta_ns_conf()]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:27:42 +02:00
David Gibson
cfe7509e5c netlink: Use struct in_addr for IPv4 addresses, not bare uint32_t
This improves consistency with IPv6 and makes it harder to misuse these as
some other sort of value.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:25:23 +02:00
David Gibson
257a6b0b7e netlink: Split nl_route() into separate operation functions
nl_route() can perform 3 quite different operations based on the 'op'
parameter.  Split this into separate functions for each one.  This requires
more lines of code, but makes the internal logic of each operation much
easier to follow.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:25:20 +02:00
David Gibson
eff3bcb245 netlink: Split nl_addr() into separate operation functions
nl_addr() can perform three quite different operations based on the 'op'
parameter, each of which uses a different subset of the parameters.  Split
them up into a function for each operation.  This does use more lines of
code, but the overlap wasn't that great, and the separated logic is much
easier to follow.

It's also clearer in the callers what we expect the netlink operations to
do, and what information it uses.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[sbrivio: Minor formatting fixes in pasta_ns_conf()]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:24:52 +02:00
David Gibson
e96182e9c2 netlink: Split up functionality of nl_link()
nl_link() performs a number of functions: it can bring links up, set MAC
address and MTU and also retrieve the existing MAC.  This makes for a small
number of lines of code, but high conceptual complexity: it's quite hard
to follow what's going on both in nl_link() itself and it's also not very
obvious which function its callers are intending to use.

Clarify this, by splitting nl_link() into nl_link_up(), nl_link_set_mac(),
and nl_link_get_mac().  The first brings up a link, optionally setting the
MTU, the others get or set the MAC address.

This fixes an arguable bug in pasta_ns_conf(): it looks as though that was
intended to retrieve the guest MAC whether or not c->pasta_conf_ns is set.
However, it only actually does so in the !c->pasta_conf_ns case: the fact
that we set up==1 means we would only ever set, never get, the MAC in the
nl_link() call in the other path.  We get away with this because the MAC
will quickly be discovered once we receive packets on the tap interface.
Still, it's neater to always get the MAC address here.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-08-04 01:18:14 +02:00
David Gibson
289301b39c netlink: Use correct interface index in NL_SET mode
nl_addr() and nl_route() take an 'op' selector which affects a number of
parameters to the netlink call.  Unfortunately when we introduced this
option a bug was introduced so that we always use the interface index for
the host side, rather than the one for the pasta namespace.

Really, the entire interface to nl_addr() and nl_route() is pretty bad:
it's tightly coupled with the use cases of its callers.  This is a minimal
fix which doesn't address that, but also doesn't make it significantly
worse.

Link: https://bugs.passt.top/show_bug.cgi?id=59
Fixes: 2fe046185634 ("netlink: Add functionality to copy routes from outer namespace")
Fixes: e89da3cf03b2 ("netlink: Add functionality to copy addresses from outer namespace")
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-06-27 17:52:30 +02:00
Stefano Brivio
e89da3cf03 netlink: Add functionality to copy addresses from outer namespace
Similarly to what we've just done with routes, support NL_DUP for
addresses (currently not exposed): nl_addr() can optionally copy
mulitple addresses to the target namespace, by fixing up data from
the dump with appropriate flags and interface index, and repeating
it back to the kernel on the socket opened in the target namespace.

Link-local addresses are not copied: the family is set to AF_UNSPEC,
which means the kernel will ignore them. Same for addresses from a
mismatching address (pre-4.19 kernels without support for
NETLINK_GET_STRICT_CHK).

Ignore IFA_LABEL attributes by changing their type to IFA_UNSPEC,
because in general they will report mismatching names, and we don't
really need to use labels as we already know the interface index.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-05-23 16:13:28 +02:00
Stefano Brivio
2fe0461856 netlink: Add functionality to copy routes from outer namespace
Instead of just fetching the default gateway and configuring a single
equivalent route in the target namespace, on 'pasta --config-net', it
might be desirable in some cases to copy the whole set of routes
corresponding to a given output interface.

For instance, in:
  https://github.com/containers/podman/issues/18539
  IPv4 Default Route Does Not Propagate to Pasta Containers on Hetzner VPSes

configuring the default gateway won't work without a gateway-less
route (specifying the output interface only), because the default
gateway is, somewhat dubiously, not on the same subnet as the
container.

This is a similar case to the one covered by commit 7656a6f88882
("conf: Adjust netmask on mismatch between IPv4 address/netmask and
gateway"), and I'm not exactly proud of that workaround.

We also have:
  https://bugs.passt.top/show_bug.cgi?id=49
  pasta does not work with tap-style interface

for which, eventually, we should be able to configure a gateway-less
route in the target namespace.

Introduce different operation modes for nl_route(), including a new
NL_DUP one, not exposed yet, which simply parrots back to the kernel
the route dump for a given interface from the outer namespace, fixing
up flags and interface indices on the way, and requesting to add the
same routes in the target namespace, on the interface we manage.

For n routes we want to duplicate, send n identical netlink requests
including the full dump: routes might depend on each other and the
kernel processes RTM_NEWROUTE messages sequentially, not atomically,
and repeating the full dump naturally resolves dependencies without
the need to actually calculate them.

I'm not kidding, it actually works pretty well.

Link: https://github.com/containers/podman/issues/18539
Link: https://bugs.passt.top/show_bug.cgi?id=49
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-05-23 16:13:28 +02:00
Stefano Brivio
1c3c68970e netlink: Fix comment about response buffer size for nl_req()
Fixes: fde8004ab0b4 ("netlink: Use 8 KiB * netlink message header size as response buffer")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-05-23 16:13:28 +02:00
Stefano Brivio
ca2749e1bd passt: Relicense to GPL 2.0, or any later version
In practical terms, passt doesn't benefit from the additional
protection offered by the AGPL over the GPL, because it's not
suitable to be executed over a computer network.

Further, restricting the distribution under the version 3 of the GPL
wouldn't provide any practical advantage either, as long as the passt
codebase is concerned, and might cause unnecessary compatibility
dilemmas.

Change licensing terms to the GNU General Public License Version 2,
or any later version, with written permission from all current and
past contributors, namely: myself, David Gibson, Laine Stump, Andrea
Bolognani, Paul Holzinger, Richard W.M. Jones, Chris Kuhn, Florian
Weimer, Giuseppe Scrivano, Stefan Hajnoczi, and Vasiliy Ulyanov.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-04-06 18:00:33 +02:00
David Gibson
34ade90957 Work around weird false positives with cppcheck-2.9.1
Commit 89e38f55 "treewide: Fix header includes to build with musl" added
extra #includes to work with musl.  Unfortunately with the cppcheck version
I'm using (cppcheck-2.9-1.fc37.x86_64 in Fedora 37) this causes weird false
positives: specifically cppcheck seems to hit a #error in <bits/unistd.h>
complaining about including it directly instead of via <unistd.h> (which is
not something we're doing).

I have no idea why that would be happening; but I'm guessing it has to be
a bug in the cpp implementation in that cppcheck version.  In any case,
it's possible to work around this by moving the include of <unistd.h>
before the include of <signal.h>.  So, do that.

Fixes: 89e38f55405d ("treewide: Fix header includes to build with musl")
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-03-21 16:38:06 +01:00
Chris Kuhn
89e38f5540 treewide: Fix header includes to build with musl
Roughly inspired from a patch by Chris Kuhn: fix up includes so that
we can build against musl: glibc is more lenient as headers generally
include a larger amount of other headers.

Compared to the original patch, I only included what was needed
directly in C files, instead of adding blanket includes in local
header files. It's a bit more involved, but more consistent with the
current (not ideal) situation.

Reported-by: Chris Kuhn <kuhnchris+github@kuhnchris.eu>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-03-09 03:44:21 +01:00
Stefano Brivio
fde8004ab0 netlink: Use 8 KiB * netlink message header size as response buffer
...instead of BUFSIZ. On musl, BUFSIZ is 1024, so we'll typically
truncate the response to the request we send in nl_link(). It's
usually 8192 or more with glibc.

There doesn't seem to be any macro defining the rtnetlink maximum
message size, and iproute2 just hardcodes 1024 * 1024 for the receive
buffer, but the example in netlink(7) makes somewhat sense, looking
at the kernel implementation.

It's not very clean, but we're very unlikely to hit that limit,
and if we do, we'll find out painlessly, because NLA_OK() will tell
us right away.

Reported-by: Chris Kuhn <kuhnchris+passt@kuhnchris.eu>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2023-03-09 03:44:21 +01:00
Laine Stump
c9af6f92db convert all remaining err() followed by exit() to die()
This actually leaves us with 0 uses of err(), but someone could want
to use it in the future, so we may as well leave it around.

Signed-off-by: Laine Stump <laine@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-02-16 17:32:27 +01:00
Paul Holzinger
3487b5fc93 pasta: do not leak netlink sock into child
When spawning a child command with pasta command... pasta should not
leak fds that it opened. Only the fds that were already open should be
given to the child.

Run `pasta --config-net -- ls -l /proc/self/fd` from a terminal where
only stdin/out/err are open. The fd 3 was opend by ls to read the
/proc/self/fd dir. But fd 5 is the netlink socket that was opend in
pasta. To prevent such a leak we will open the socket with SOCK_CLOEXEC.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2023-02-12 23:42:34 +01:00
Stefano Brivio
3e2eb4337b conf: Bind inbound ports with CAP_NET_BIND_SERVICE before isolate_user()
Even if CAP_NET_BIND_SERVICE is granted, we'll lose the capability in
the target user namespace as we isolate the process, which means
we're unable to bind to low ports at that point.

Bind inbound ports, and only those, before isolate_user(). Keep the
handling of outbound ports (for pasta mode only) after the setup of
the namespace, because that's where we'll bind them.

To this end, initialise the netlink socket for the init namespace
before isolate_user() as well, as we actually need to know the
addresses of the upstream interface before binding ports, in case
they're not explicitly passed by the user.

As we now call nl_sock_init() twice, checking its return code from
conf() twice looks a bit heavy: make it exit(), instead, as we
can't do much if we don't have netlink sockets.

While at it:

- move the v4_only && v6_only options check just after the first
  option processing loop, as this is more strictly related to
  option parsing proper

- update the man page, explaining that CAP_NET_BIND_SERVICE is
  *not* the preferred way to bind ports, because passt and pasta
  can be abused to allow other processes to make effective usage
  of it. Add a note about the recommended sysctl instead

- simplify nl_sock_init_do() now that it's called once for each
  case

Reported-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-10-15 02:10:36 +02:00
Stefano Brivio
6acf89638b netlink: Disable duplicate address detection for configured IPv6 address
With default options, when we pass --config-net, the IPv6 address is
actually going to be recycled from the init namespace, so it is in
fact duplicated, but duplicate address detection has no way to find
out.

With a different configured address, that's not the case, but anyway
duplicate address detection will be unable to see this.

In both cases, we're wasting time for nothing.

Pass the IFA_F_NODAD flag as we configure globally scoped IPv6
addresses via netlink.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2022-10-15 02:10:36 +02:00
Stefano Brivio
da152331cf Move logging functions to a new file, log.c
Logging to file is going to add some further complexity that we don't
want to squeeze into util.c.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2022-10-14 17:38:25 +02:00
David Gibson
656acdfc26 Avoid ugly 'end' members in netlink structures
We use a number of complex structures to format messages to send to
netlink.  In some cases we add imaginary 'end' members not because they
actually mean something on the wire, but so that we can use offsetof() on
the member to determine the relevant size.

Adding extra things to the structures for this is kinda nasty.  We can use
a different construct with offsetof and sizeof to avoid them.  As a bonus
this removes some cppcheck warnings about unused struct members.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-09-29 12:22:21 +02:00
David Gibson
eb5e123038 cppcheck: Reduce scope of some variables
Minor style improvement suggested by cppcheck.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-09-29 12:22:01 +02:00
David Gibson
06abfcf6d9 Separately locate external interfaces for IPv4 and IPv6
Now that the back end allows passt/pasta to use different external
interfaces for IPv4 and IPv6, use that to do the right thing in the case
that the host has IPv4 and IPv6 connectivity via different interfaces.
If the user hasn't explicitly chosen an interface, separately search for
a suitable external interface for each protocol.

As a bonus, this substantially simplifies the external interface probe.  It
also eliminates a subtle confusing case where in some circumstances we
would pick the first interface in interface index order, and sometimes in
order of routes returned from netlink.  On some network configurations that
could cause tests to fail, because the logic in the tests was subtly
different (it always used route order).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-07-30 21:57:50 +02:00
Stefano Brivio
3f2e7098ac netlink: In nl_addr() and nl_route(), don't return before set request
Fixes: 22ed4467a413 ("treewide: Unchecked return value from library, CWE-252")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-05-01 07:19:05 +02:00
Stefano Brivio
22ed4467a4 treewide: Unchecked return value from library, CWE-252
All instances were harmless, but it might be useful to have some
debug messages here and there. Reported by Coverity.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-04-07 11:44:35 +02:00
Stefano Brivio
48582bf47f treewide: Mark constant references as const
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-03-29 15:35:38 +02:00
Stefano Brivio
ed58ad1a59 netlink: Avoid left-over bytes in request on MTU configuration
When nl_link() configures the MTU, it shouldn't send extra bytes,
otherwise we'll get a kernel warning:

  netlink: 4 bytes leftover after parsing attributes in process `pasta'.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-02-23 13:21:52 +01:00
Stefano Brivio
64f7d81d9a netlink: Fix swapped v4/v6-only flags in external interface detection
The effect of this typo became visible in an IPv6-only environment,
where passt wouldn't work at all.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 16:30:59 +01:00
Stefano Brivio
1776de0140 tcp, netlink, HAS{BYTES_ACKED,MIN_RTT,GETRANDOM} and NETLINK_GET_STRICT_CHK
tcpi_bytes_acked and tcpi_min_rtt are only available on recent
kernel versions: provide fall-back paths (incurring some grade of
performance penalty).

Support for getrandom() was introduced in Linux 3.17 and glibc 2.25:
provide an alternate mechanism for that as well, reading from
/dev/random.

Also check if NETLINK_GET_STRICT_CHK is defined before using it:
it's not strictly needed, we'll filter out irrelevant results from
netlink anyway.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 16:30:59 +01:00
Stefano Brivio
b93c2c1713 passt: Drop <linux/ipv6.h> include, carry own ipv6hdr and opt_hdr definitions
This is the only remaining Linux-specific include -- drop it to avoid
clang-tidy warnings and to make code more portable.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2022-01-26 07:57:09 +01:00
Stefano Brivio
d36e429bc6 netlink: Fix length of address attribute
...I broke this while playing with clang-tidy, and didn't add
tests for pasta's --config-net yet.

Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
2021-10-21 20:14:52 +02:00