RFC 9293, 3.8.4 says:
Implementers MAY include "keep-alives" in their TCP implementations
(MAY-5), although this practice is not universally accepted. Some
TCP implementations, however, have included a keep-alive mechanism.
To confirm that an idle connection is still active, these
implementations send a probe segment designed to elicit a response
from the TCP peer. Such a segment generally contains SEG.SEQ =
SND.NXT-1 and may or may not contain one garbage octet of data. If
keep-alives are included, the application MUST be able to turn them
on or off for each TCP connection (MUST-24), and they MUST default to
off (MUST-25).
but currently, tcp_data_from_tap() is not aware of this and will
schedule a fast re-transmit on the second keep-alive (because it's
also a duplicate ACK), ignoring the fact that the sequence number was
rewinded to SND.NXT-1.
Send ACK segments when we receive those segments, reset the activity
timeout, and ignore them for the rest. We can't affect the outbound
keep-alive behaviour, other than enabling or disabling keep-alives
with SO_KEEPALIVE, because it's controlled by sysctls.
Link: https://github.com/containers/podman/discussions/24572
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We enter the timer handler with the ACK_TO_TAP_DUE flag, call
tcp_prepare_flags() with ACK_IF_NEEDED, and realise that we
acknowledged everything meanwhile, so we return early, but we also
need to reset that flag to avoid unnecessarily scheduling the timer
over and over again until more pending data appears.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently, our NDP implementation only sends Router Advertisements (RA)
when it receives a Router Solicitation (RS) from the guest. However,
RFC 4861 requires that we periodically send unsolicited RAs.
Linux as a guest also requires this: it will send an RS when a link first
comes up, but the route it gets from this will have a finite lifetime (we
set this to 65535s, the maximum allowed, around 18 hours). When that
expires the guest will not send a new RS, but instead expects the route to
have been renewed (if still valid) by an unsolicited RA.
Implement sending unsolicited RAs on a partially randomised timer, as
required by RFC 4861. The RFC also specifies that solicited RAs should
also be delayed, or even omitted, if the next unsolicited RA is soon
enough. For now we don't do that, always sending an immediate RA in
response to an RS. We can get away with this because in our use cases
we expect to just have passt itself and the guest on the link, rather than
a large broadcast domain.
Link: https://github.com/kubevirt/kubevirt/issues/13191
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We have an upcoming case where we need pseudo-random numbers to scatter
timings, but we don't need cryptographically strong random numbers. libc's
built in random() is fine for this purpose, but we should seed it. Extend
secret_init() - the only current user of random numbers - to do this as
well as generating the SipHash secret. Using /dev/random for a PRNG seed
is probably overkill, but it's simple and we only do it once, so we might
as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently secret_init() open codes getting good quality random bytes from
the OS, either via getrandom(2) or reading /dev/random. We're going to
add at least one more place that needs random data in future, so make a
general helper for getting random bytes. While we're there, fix a number
of minor bugs:
- getrandom() can theoretically return a "short read", so handle that case
- getrandom() as well as read can return a transient EINTR
- We would attempt to read data from /dev/random if we failed to open it
(open() returns -1), but not if we opened it as fd 0 (unlikely, but ok)
- More specific error reporting
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently we open-code the lifetime of the route we advertise via NDP to be
65535s (the maximum). Change it to a #define.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
There are a number of places we can simply assign IPv6 addresses about,
rather than the current mildly ugly memcpy().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently the large ndp() function responds to all NDP messages we handle,
both parsing the message as necessary and sending the response. Split out
the code to construct and send specific message types into ndp_na() (to
send NA messages) and ndp_ra() (to send RA messages).
As well as breaking up an excessively large function, this is a first step
to being able to send unsolicited NDP messages.
While we're there, remove a slighty ugly goto.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
ndp() has a conditional on message type generating the reply message, then
a tiny amount of common code, then another conditional to send the reply
with slightly different parameters. We can make this a bit neater by
making a helper function for sending the reply, and call it from each of
the different message type paths.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
ndp() updates addr_seen or addr_ll_seen based on the source address of the
received packet. This is redundant since tap6_handler() has already
updated addr_seen for any type of packet, not just NDP.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We pass -I options to cppcheck so that it will find the system headers.
Then we need to pass a bunch more options to suppress the zillions of
cppcheck errors found in those headers.
It turns out, however, that it's not recommended to give the system headers
to cppcheck anyway. Instead it has built-in knowledge of the ANSI libc and
uses that as the basis of its checks. We do need to suppress
missingIncludeSystem warnings instead though.
Not bothering with the system headers makes the cppcheck runtime go from
~37s to ~14s on my machine, which is a pretty nice win.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
If CLOSE_RANGE_UNSHARE isn't defined, we define a fallback version of
close_range() which is a (successful) no-op. This is broken in several
ways:
* It doesn't actually fix compile if using old kernel headers, because
the caller of close_range() still directly uses CLOSE_RANGE_UNSHARE
unprotected by ifdefs
* Even if it did fix the compile, it means inconsistent behaviour between
a compile time failure to find the value (we silently don't close files)
and a runtime failure (we die with an error from close_range())
* Silently not closing the files we intend to close for security reasons
is probably not a good idea in any case
We don't want to simply error if close_range() or CLOSE_RANGE_UNSHARE isn't
available, because that would require running on kernel >= 5.9. On the
other hand there's not really any other way to flush all possible fds
leaked by the parent (close() in a loop takes over a minute). So in this
case print a warning and carry on.
As bonus this fixes a cppcheck error I see with some different options I'm
looking to apply in future.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
util.h has some #ifdefs and weak definitions to handle compatibility with
various kernel versions. Move this to linux_dep.h which handles several
other similar cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
log.c has several #ifdefs on FALLOC_FL_COLLAPSE_RANGE that won't attempt
to use it if not defined. But even if the value is defined at compile
time, it might not be available in the runtime kernel, so we need to check
for errors from a fallocate() call and fall back to other methods.
Simplify this to only need the runtime check by using linux_dep.h to define
FALLOC_FL_COLLAPSE_RANGE if it's not in the kernel headers.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
I have no idea why, but these are reported by clang-tidy (19.2.1) on
Alpine (x86) only:
/home/sbrivio/passt/tap.c:1139:38: error: 'socket' should use SOCK_CLOEXEC where possible [android-cloexec-socket,-warnings-as-errors]
1139 | int fd = socket(AF_UNIX, SOCK_STREAM, 0);
| ^
| | SOCK_CLOEXEC
/home/sbrivio/passt/tap.c:1158:51: error: 'socket' should use SOCK_CLOEXEC where possible [android-cloexec-socket,-warnings-as-errors]
1158 | ex = socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK, 0);
| ^
| | SOCK_CLOEXEC
/home/sbrivio/passt/tcp.c:1413:44: error: 'socket' should use SOCK_CLOEXEC where possible [android-cloexec-socket,-warnings-as-errors]
1413 | s = socket(af, SOCK_STREAM | SOCK_NONBLOCK, IPPROTO_TCP);
| ^
| | SOCK_CLOEXEC
/home/sbrivio/passt/util.c:188:38: error: 'socket' should use SOCK_CLOEXEC where possible [android-cloexec-socket,-warnings-as-errors]
188 | if ((s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0) {
| ^
| | SOCK_CLOEXEC
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
For some reason, this is only reported by clang-tidy 19.1.2 on
Alpine:
/home/sbrivio/passt/passt.c:314:53: error: conditional operator with identical true and false expressions [bugprone-branch-clone,-warnings-as-errors]
314 | nfds = epoll_wait(c.epollfd, events, EPOLL_EVENTS, TIMER_INTERVAL);
| ^
We do have a suppression, but not on the line preceding it, because
we also need a cppcheck suppression there. Use NOLINTBEGIN/NOLINTEND
for the clang-tidy suppression.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
On 32-bit architectures, clang-tidy reports:
/home/pi/passt/tcp.c:728:11: error: performing an implicit widening conversion to type 'uint64_t' (aka 'unsigned long long') of a multiplication performed in type 'unsigned long' [bugprone-implicit-widening-of-multiplication-result,-warnings-as-errors]
728 | if (v >= SNDBUF_BIG)
| ^
/home/pi/passt/util.h:158:22: note: expanded from macro 'SNDBUF_BIG'
158 | #define SNDBUF_BIG (4UL * 1024 * 1024)
| ^
/home/pi/passt/tcp.c:728:11: note: make conversion explicit to silence this warning
728 | if (v >= SNDBUF_BIG)
| ^
/home/pi/passt/util.h:158:22: note: expanded from macro 'SNDBUF_BIG'
158 | #define SNDBUF_BIG (4UL * 1024 * 1024)
| ^~~~~~~~~~~~~~~~~
/home/pi/passt/tcp.c:728:11: note: perform multiplication in a wider type
728 | if (v >= SNDBUF_BIG)
| ^
/home/pi/passt/util.h:158:22: note: expanded from macro 'SNDBUF_BIG'
158 | #define SNDBUF_BIG (4UL * 1024 * 1024)
| ^~~~~~~~~~
/home/pi/passt/tcp.c:730:15: error: performing an implicit widening conversion to type 'uint64_t' (aka 'unsigned long long') of a multiplication performed in type 'unsigned long' [bugprone-implicit-widening-of-multiplication-result,-warnings-as-errors]
730 | else if (v > SNDBUF_SMALL)
| ^
/home/pi/passt/util.h:159:24: note: expanded from macro 'SNDBUF_SMALL'
159 | #define SNDBUF_SMALL (128UL * 1024)
| ^
/home/pi/passt/tcp.c:730:15: note: make conversion explicit to silence this warning
730 | else if (v > SNDBUF_SMALL)
| ^
/home/pi/passt/util.h:159:24: note: expanded from macro 'SNDBUF_SMALL'
159 | #define SNDBUF_SMALL (128UL * 1024)
| ^~~~~~~~~~~~
/home/pi/passt/tcp.c:730:15: note: perform multiplication in a wider type
730 | else if (v > SNDBUF_SMALL)
| ^
/home/pi/passt/util.h:159:24: note: expanded from macro 'SNDBUF_SMALL'
159 | #define SNDBUF_SMALL (128UL * 1024)
| ^~~~~
/home/pi/passt/tcp.c:731:17: error: performing an implicit widening conversion to type 'uint64_t' (aka 'unsigned long long') of a multiplication performed in type 'unsigned long' [bugprone-implicit-widening-of-multiplication-result,-warnings-as-errors]
731 | v -= v * (v - SNDBUF_SMALL) / (SNDBUF_BIG - SNDBUF_SMALL) / 2;
| ^
/home/pi/passt/util.h:159:24: note: expanded from macro 'SNDBUF_SMALL'
159 | #define SNDBUF_SMALL (128UL * 1024)
| ^
/home/pi/passt/tcp.c:731:17: note: make conversion explicit to silence this warning
731 | v -= v * (v - SNDBUF_SMALL) / (SNDBUF_BIG - SNDBUF_SMALL) / 2;
| ^
/home/pi/passt/util.h:159:24: note: expanded from macro 'SNDBUF_SMALL'
159 | #define SNDBUF_SMALL (128UL * 1024)
| ^~~~~~~~~~~~
/home/pi/passt/tcp.c:731:17: note: perform multiplication in a wider type
731 | v -= v * (v - SNDBUF_SMALL) / (SNDBUF_BIG - SNDBUF_SMALL) / 2;
| ^
/home/pi/passt/util.h:159:24: note: expanded from macro 'SNDBUF_SMALL'
159 | #define SNDBUF_SMALL (128UL * 1024)
| ^~~~~
because, wherever we use those thresholds, we define the other term
of comparison as uint64_t. Define the thresholds as unsigned long long
as well, to make sure we match types.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Given that we're comparing against 'n', which is signed, we cast
TAP_BUF_BYTES to ssize_t so that the maximum buffer usage, calculated
as the difference between TAP_BUF_BYTES and ETH_MAX_MTU, will also be
signed.
This doesn't necessarily happen on 32-bit architectures, though. On
armhf and i686, clang-tidy 18.1.8 and 19.1.2 report:
/home/pi/passt/tap.c:1087:16: error: comparison of integers of different signs: 'ssize_t' (aka 'int') and 'unsigned int' [clang-diagnostic-sign-compare,-warnings-as-errors]
1087 | for (n = 0; n <= (ssize_t)TAP_BUF_BYTES - ETH_MAX_MTU; n += len) {
| ~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cast the whole difference to ssize_t, as we know it's going to be
positive anyway, instead of relying on that side effect.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
cppcheck 2.14.2 on Alpine reports:
dhcpv6.c:431:32: style: Variable 'client_id' can be declared as pointer to const [constVariablePointer]
struct opt_hdr *ia, *bad_ia, *client_id;
^
It's not only 'client_id': we can declare 'ia' as const pointer too.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
cppcheck 2.16.0 reports:
dhcpv6.c:334:14: style: The comparison 'ia_type == 3' is always true. [knownConditionTrueFalse]
if (ia_type == OPT_IA_NA) {
^
dhcpv6.c:306:12: note: 'ia_type' is assigned value '3' here.
ia_type = OPT_IA_NA;
^
dhcpv6.c:334:14: note: The comparison 'ia_type == 3' is always true.
if (ia_type == OPT_IA_NA) {
^
this is not really the case as we set ia_type to OPT_IA_TA and then
jump back.
Anyway, there's no particular reason to use a goto here: add a trivial
foreach() macro to go through elements of an array and use it instead.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
In the NDP tests we search explicitly for a guest address with prefix
length 64. AFAICT this is an attempt to specifically find the SLAAC
assigned address, rather than something assigned by other means. We can do
that more explicitly by checking for .protocol == "kernel_ra". however.
The SLAAC prefixes we assigned *will* always be 64-bit, that's hard-coded
into our NDP implementation. RFC4862 doesn't really allow anything else
since the interface identifiers for an Ethernet-like link are 64-bits.
Let's actually verify that, rather than just assuming it, by extracting the
prefix length assigned in the guest and checking it as well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
When determining the namespace's IPv6 address in the perf test setup, we
explicitly filter for addresses with a 64-bit prefix length. There's no
real reason we need that - as long as it's a global address we can use it.
I suspect this was copied without thinking from a similar example in the
NDP tests, where the 64-bit prefix length _is_ meaningful (though it's not
entirely clear if the handling is correct there either).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently nstool die()s on essentially any error. In most cases that's
fine for our purposes. However, it's a problem when in "hold" mode and
getting an IO error on an accept()ed socket. This could just indicate that
the control client aborted prematurely, in which case we don't want to
kill of the namespace we're holding.
Adjust these to print an error, close() the control client socket and
carry on. In addition, we need to explicitly ignore SIGPIPE in order not
to be killed by an abruptly closed client connection.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
nstool in "exec" mode will propagate some signals (specifically SIGTERM) to
the process in the namespace it executes. The signal handler which
accomplishes this is called simply sig_handler(). However, it turns out
we're going to need some other signal handlers, so rename this to the more
specific sig_propagate().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
While experimenting with cppcheck options, I hit several false positives
caused by this bug: https://trac.cppcheck.net/ticket/13227
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We have an ASSERT() verifying that we're able to look up the flow in
udp_reply_sock_handler(). However, we dereference uflow before that in
an initializer, rather defeating the point. Rearrange to avoid that.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We don't modify this structure at all. For some reason cppcheck doesn't
catch this with our current options, but did when I was experimenting with
some different options.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
tcp_info.h exists just to contain a modern enough version of struct
tcp_info for our needs, removing compile time dependency on the version of
kernel headers. There are several other cases where we can remove similar
compile time dependencies on kernel version. Prepare for that by renaming
tcp_info.h to linux_dep.h.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
On certain architectures we get a warning about comparison between
different signedness integers in fwd_probe_ephemeral(). This is because
NUM_PORTS evaluates to an unsigned integer. It's a fixed value, though
and we know it will fit in a signed long on anything reasonable, so add
a cast to suppress the warning.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We supply a weak alias for ffsl() in case it's not defined in our libc.
Except.. we don't have any users for it any more, so remove it.
make cppcheck doesn't spot this at present for complicated reasons, but it
might with tweaks to the options I'm experimenting with.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
clangd's default configuration seems to try to treat .h files as C++ not
C. There are many more spurious warnings generated at present, but this
removes some of the most egregious ones.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We probe the available stack limit in the Makefile using rlimit, then use
that to set the size of the stack when we clone() extra threads. But
the rlimit at compile time need not be the same as the rlimit at runtime,
so that's not particularly sensible.
Ideally, we'd set the stack size based on an estimate of the actual
maximum stack usage of all our clone()ed functions. We don't have that
at the moment, but to keep things simple just set it to 1MiB - that's what
the current probe will set things to on my default configuration Fedora 40,
so it's likely to be fine in most cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We insert -DARCH for all compiles, based on TARGET_ARCH determined in the
Makefile. However, this is only used in qrap.c, not anywhere else in
passt or pasta. Only supply this -D when compiling qrap specifically.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently we construct the AUDIT_ARCH variable in the Makefile, then pass
it into the C code with -D. The only place that uses it, though is the
BPF filter generated by seccomp.sh. seccomp.sh already needs to do things
differently depending on the arch, so it might as well just insert the
expanded AUDIT_ARCH directly into the generated code, rather than using
a #define. Arguably this is better, even, since it ensures more locally
that the arch the BPF checks for matches the arch seccomp.sh built the
filter for.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
NETNS_RUN_DIR is set in the Makefile, then passed into the C code with
-D. But NETNS_RUN_DIR is just a fixed string, it doesn't depend on any
make probes or variables, so there's really no reason to handle it via the
Makefile. Just move it to a plain #define in conf.c.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Since it's the size of a chunk of memory it would seem logical that
RTA_PAYLOAD() returns size_t. However, it doesn't - it explicitly casts
its result to an int. RTNH_OK(), which often takes the result of
RTA_PAYLOAD() as a parameter compares it to an int, so using size_t can
result in comparison of different-signed integer warnings from clang.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Due to a copy-pasta error, this returns 'PIF_NONE' instead of NULL on the
failure case. PIF_NONE expands to 0, which turns into NULL, but it's
still confusing, so fix it. This removes a clang warning.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We pass 'environ' to execve() in arch_avc2_exec(), so that we retain the
environment in the current process. But the declaration of 'environ' is
a bit weird - it doesn't seem to be in a standard header, requiring a
manual explicit declaration. But, we can avoid needing to reference it
explicitly by using execv() instead of execve(). This removes a clang
warning.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently we configure clang-tidy with a very long command line spelled out
in the Makefile (mostly a big list of lints to disable). Move it from here
into a .clang-tidy configuration file, so that the config is accessible if
clang-tidy is invoked in other ways (e.g. via clangd) as well. As a bonus
this also means that we can move the bulky comments about why we're
suppressing various tests inline with the relevant config lines.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
There are things in qrap.c that clang-tidy complains about that aren't
worth fixing. So, we currently exclude it using $(filter-out). However,
we already have a make variable which has just the passt sources, excluding
qrap, so we can use that instead of the awkward filter-out expression.
Currently, we still include qrap.c for cppcheck, but there's not much
point doing so: it's, well, qrap, so we don't care that much about lints.
Exclude it from cppcheck as well, for consistency.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
I've been experimenting with clangd, but its default format style is
horrid. Since our style is basically that of the Linux kernel, copy the
.clang-format from the kernel, minus reference to a bunch of kernel
specific macros.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Most of our transfer tests using socat use 'sleep' waaiting for the server
side to be ready before starting the client. However in two_guests/basic
the sleep is in the wrong place: rather than being between starting the
server and starting the client, it's after waiting for the server to
complete. This causes occasional hangs when the client runs before the
server is ready - in that case the receiving guest sends an RST, which we
don't (currently) propagate back to the sender.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
On ppc64le, TUNSETIFF happens to be 2147767498, which is bigger than
INT_MAX (2^31 - 1), and musl declares the second argument of ioctl()
as 'int', not 'unsigned long' like glibc does, probably because of how
POSIX specifies the equivalent argument, int dcmd, in posix_devctl(),
so gcc reports a warning:
tap.c: In function 'tap_ns_tun':
tap.c:1291:24: warning: overflow in conversion from 'long unsigned int' to 'int' changes value from '2147767498' to '-2147199798' [-Woverflow]
1291 | rc = ioctl(fd, TUNSETIFF, &ifr);
| ^~~~~~~~~
We don't care about that overflow, so explicitly cast TUNSETIFF to
int.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Use a plain uint16_t instead and avoid including one extra header:
the 'bitwise' attribute of __sum16 is just used by sparse(1).
Reported-by: omni <omni+alpine@hack.org>
Fixes: 3d484aa370 ("tcp: Update TCP checksum using an iovec array")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
I thought we could just set errno to 0, do a bunch of stuff, and check
that errno didn't change to infer we succeeded. But clang-tidy,
starting with LLVM 19, reports:
/home/sbrivio/passt/util.c:465:6: error: An undefined value may be read from 'errno' [clang-analyzer-unix.Errno,-warnings-as-errors]
465 | if (errno)
| ^
/usr/include/errno.h:38:16: note: expanded from macro 'errno'
38 | # define errno (*__errno_location ())
| ^~~~~~~~~~~~~~~~~~~~~~
/home/sbrivio/passt/util.c:446:6: note: Assuming the condition is false
446 | if (pid == -1) {
| ^~~~~~~~~
/home/sbrivio/passt/util.c:446:2: note: Taking false branch
446 | if (pid == -1) {
| ^
/home/sbrivio/passt/util.c:451:6: note: Assuming 'pid' is 0
451 | if (pid) {
| ^~~
/home/sbrivio/passt/util.c:451:2: note: Taking false branch
451 | if (pid) {
| ^
/home/sbrivio/passt/util.c:463:2: note: Assuming that 'close' is successful; 'errno' becomes undefined after the call
463 | close(devnull_fd);
| ^~~~~~~~~~~~~~~~~
/home/sbrivio/passt/util.c:465:6: note: An undefined value may be read from 'errno'
465 | if (errno)
| ^
/usr/include/errno.h:38:16: note: expanded from macro 'errno'
38 | # define errno (*__errno_location ())
| ^~~~~~~~~~~~~~~~~~~~~~
And the LLVM documentation for the unix.Errno checker, 1.1.8.3
unix.Errno (C), mentions, at:
https://clang.llvm.org/docs/analyzer/checkers.html#unix-errno
that:
The C and POSIX standards often do not define if a standard library
function may change value of errno if the call does not fail.
Therefore, errno should only be used if it is known from the return
value of a function that the call has failed.
which is, somewhat surprisingly, the case for close().
Instead of using errno, check the actual return values of the calls
we issue here.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
For clock_gettime(), we shouldn't ignore errors if they happen at
initialisation phase, because something is seriously wrong and it's
not helpful if we proceed as if nothing happened.
As we're up and running, though, it's probably better to report the
error and use a stale value than to terminate altogether. Make sure
we use a zero value if we don't have a stale one somewhere.
For timerfd_gettime() and timerfd_settime() failures, just report an
error, there isn't much else we can do.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
In pcap_init(), we should always open the packet capture file with
O_CLOEXEC, even if we're not running in foreground: O_CLOEXEC means
close-on-exec, not close-on-fork.
In logfile_init() and pidfile_open(), the fact that we pass a third
'mode' argument to open() seems to confuse the android-cloexec-open
checker in LLVM versions from 16 to 19 (at least).
The checker is suggesting to add O_CLOEXEC to 'mode', and not in
'flags', where we already have it.
Add a suppression for clang-tidy and a comment, and avoid repeating
those three times by adding a new helper, output_file_open().
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>