Roughly inspired from a patch by Chris Kuhn: fix up includes so that
we can build against musl: glibc is more lenient as headers generally
include a larger amount of other headers.
Compared to the original patch, I only included what was needed
directly in C files, instead of adding blanket includes in local
header files. It's a bit more involved, but more consistent with the
current (not ideal) situation.
Reported-by: Chris Kuhn <kuhnchris+github@kuhnchris.eu>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
...instead of BUFSIZ. On musl, BUFSIZ is 1024, so we'll typically
truncate the response to the request we send in nl_link(). It's
usually 8192 or more with glibc.
There doesn't seem to be any macro defining the rtnetlink maximum
message size, and iproute2 just hardcodes 1024 * 1024 for the receive
buffer, but the example in netlink(7) makes somewhat sense, looking
at the kernel implementation.
It's not very clean, but we're very unlikely to hit that limit,
and if we do, we'll find out painlessly, because NLA_OK() will tell
us right away.
Reported-by: Chris Kuhn <kuhnchris+passt@kuhnchris.eu>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
This actually leaves us with 0 uses of err(), but someone could want
to use it in the future, so we may as well leave it around.
Signed-off-by: Laine Stump <laine@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
When spawning a child command with pasta command... pasta should not
leak fds that it opened. Only the fds that were already open should be
given to the child.
Run `pasta --config-net -- ls -l /proc/self/fd` from a terminal where
only stdin/out/err are open. The fd 3 was opend by ls to read the
/proc/self/fd dir. But fd 5 is the netlink socket that was opend in
pasta. To prevent such a leak we will open the socket with SOCK_CLOEXEC.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Even if CAP_NET_BIND_SERVICE is granted, we'll lose the capability in
the target user namespace as we isolate the process, which means
we're unable to bind to low ports at that point.
Bind inbound ports, and only those, before isolate_user(). Keep the
handling of outbound ports (for pasta mode only) after the setup of
the namespace, because that's where we'll bind them.
To this end, initialise the netlink socket for the init namespace
before isolate_user() as well, as we actually need to know the
addresses of the upstream interface before binding ports, in case
they're not explicitly passed by the user.
As we now call nl_sock_init() twice, checking its return code from
conf() twice looks a bit heavy: make it exit(), instead, as we
can't do much if we don't have netlink sockets.
While at it:
- move the v4_only && v6_only options check just after the first
option processing loop, as this is more strictly related to
option parsing proper
- update the man page, explaining that CAP_NET_BIND_SERVICE is
*not* the preferred way to bind ports, because passt and pasta
can be abused to allow other processes to make effective usage
of it. Add a note about the recommended sysctl instead
- simplify nl_sock_init_do() now that it's called once for each
case
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
With default options, when we pass --config-net, the IPv6 address is
actually going to be recycled from the init namespace, so it is in
fact duplicated, but duplicate address detection has no way to find
out.
With a different configured address, that's not the case, but anyway
duplicate address detection will be unable to see this.
In both cases, we're wasting time for nothing.
Pass the IFA_F_NODAD flag as we configure globally scoped IPv6
addresses via netlink.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Logging to file is going to add some further complexity that we don't
want to squeeze into util.c.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
We use a number of complex structures to format messages to send to
netlink. In some cases we add imaginary 'end' members not because they
actually mean something on the wire, but so that we can use offsetof() on
the member to determine the relevant size.
Adding extra things to the structures for this is kinda nasty. We can use
a different construct with offsetof and sizeof to avoid them. As a bonus
this removes some cppcheck warnings about unused struct members.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Minor style improvement suggested by cppcheck.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Now that the back end allows passt/pasta to use different external
interfaces for IPv4 and IPv6, use that to do the right thing in the case
that the host has IPv4 and IPv6 connectivity via different interfaces.
If the user hasn't explicitly chosen an interface, separately search for
a suitable external interface for each protocol.
As a bonus, this substantially simplifies the external interface probe. It
also eliminates a subtle confusing case where in some circumstances we
would pick the first interface in interface index order, and sometimes in
order of routes returned from netlink. On some network configurations that
could cause tests to fail, because the logic in the tests was subtly
different (it always used route order).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All instances were harmless, but it might be useful to have some
debug messages here and there. Reported by Coverity.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
When nl_link() configures the MTU, it shouldn't send extra bytes,
otherwise we'll get a kernel warning:
netlink: 4 bytes leftover after parsing attributes in process `pasta'.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The effect of this typo became visible in an IPv6-only environment,
where passt wouldn't work at all.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
tcpi_bytes_acked and tcpi_min_rtt are only available on recent
kernel versions: provide fall-back paths (incurring some grade of
performance penalty).
Support for getrandom() was introduced in Linux 3.17 and glibc 2.25:
provide an alternate mechanism for that as well, reading from
/dev/random.
Also check if NETLINK_GET_STRICT_CHK is defined before using it:
it's not strictly needed, we'll filter out irrelevant results from
netlink anyway.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This is the only remaining Linux-specific include -- drop it to avoid
clang-tidy warnings and to make code more portable.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
...I broke this while playing with clang-tidy, and didn't add
tests for pasta's --config-net yet.
Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
...mostly false positives, but a number of very relevant ones too,
in tcp_get_sndbuf(), tcp_conn_from_tap(), and siphash PREAMBLE().
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Unions and structs, you all have names now.
Take the chance to enable bugprone-reserved-identifier,
cert-dcl37-c, and cert-dcl51-cpp checkers in clang-tidy.
Provide a ffsl() weak declaration using gcc built-in.
Start reordering includes, but that's not enough for the
llvm-include-order checker yet.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Move netlink routines to their own file, and use netlink to configure
or fetch all the information we need, except for the TUNSETIFF ioctl.
Move pasta-specific functions to their own file as well, add
parameters and calls to configure the tap interface in the namespace.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>