Creation of a symbolic link from /sbin to /usr/sbin fails if /sbin
exists and is non-empty. This is the case on Ubuntu-23.04.
We fix this by removing /sbin before creating the link.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
If we run passt nested (a guest connected via passt to a guest
connected via passt to the host), the first guest (L1) typically has
two IPv6 addresses on the same interface: one formed from the prefix
assigned via SLAAC, and another one assigned via DHCPv6 (to match the
address on the host).
When we select addresses for comparison, in this case, we have
multiple global unicast addresses -- again, on the same interface.
Selecting the first reported one on both host and guest is not
entirely correct (in theory, the order might differ), but works
reasonably well.
Use the trick from 5beef085978e ("test: Only select a single
interface or gateway in tests") to ask jq(1) for the first address
returned by the query.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
When using the old-style "pane" methods of executing commands during the
tests, we need to scan the shell output for prompts in order to tell when
commands have finished. This is inherently unreliable because commands
could output things that look like prompts, and prompts might not look like
we expect them to. The only way to really fix this is to use a better way
of dispatching commands, like the newer "context" system.
However, it's awkward to convert everything to "context" right at the
moment, so we're still relying on some tests that do work most of the time.
It is, however, particularly sensitive to fancy coloured prompts using
escape sequences. Currently we try to handle this by stripping actual
ESC characters with tr, then looking for some common variants.
We can do a bit better: instead strip all escape sequences using sed before
looking for our prompt. Or, at least, any one using [a-zA-Z] as the
terminating character. Strictly speaking ANSI escapes can be terminated by
any character in 0x40..0x7e, which isn't easily expressed in a regexp.
This should capture all common ones, though.
With this transformation we can simplify the list of patterns we then look
for as a prompt, removing some redundant variants.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
In test/prepare-distro-img.sh we use guestfish to tweak our distro guest
images to be suitable. Part of this is using a 'copy-in' directive to copy
in the source files for passt itself. Currently we copy in all the files
with a single 'copy-in', since it allows listing multiple files. However
it turns out that the number of arguments it can accept is fairly limited
and our current list of files is already very close to that limit.
Instead, expand our list of files to one copy-in per file, avoiding that
limitation. This isn't much slower, because all the commands still run in
a single invocation of guestfish itself.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
valgrind complains if we pass a NULL buffer to recv(), even if we use
MSG_TRUNC, in which case it's actually safe. For a long time we've had
a valgrind suppression for this. It singles out the recv() in
tcp_sock_consume(), the only place we use MSG_TRUNC.
However, tcp_sock_consume() only has a single caller, which makes it a
prime candidate for inlining. If inlined, it won't appear on the stack and
valgrind won't match the correct suppression.
It appears that certain compiler versions (for example gcc-13.2.1 in
Fedora 39) will inline this function even with the -O0 we use for valgrind
builds. This breaks the suppression leading to a spurious failure in the
tests.
There's not really any way to adjust the suppression itself without making
it overly broad (we don't want to match other recv() calls). So, as a hack
explicitly prevent inlining of this function when we're making a valgrind
build. To accomplish this add an explicit -DVALGRIND when making such a
build.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
As commit 29269705239f ("test/perf: Small MTUs for spliced TCP
aren't interesting") drops all columns for TCP test MTUs except for
one, in throughput test for pasta's local flows, the first column we
need to highlight in that table is now the second one.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
For the TCP throughput tests, we use iperf3's -O "omit" option which
ignores results for the given time at the beginning of the test. Currently
we calculate this as 1/6th of the test measurement time. The purpose of
-O, however, is to skip over the TCP slow start period, which in no way
depends on the overall length of the test.
The slow start time is roughly speaking
log_2 ( max_window_size / MSS ) * round_trip_time
These factors all vary between tests and machines we're running on, but we
can estimate some reasonable bounds for them:
* The maximum window size is bounded by the buffer sizes at each end,
which shouldn't exceed 16MiB
* The mss varies with the MTU we use, but the smallest we use in tests is
~256 bytes
* Round trip time will vary with the system, but with these essentially
local transfers it will typically be well under 1ms (on my laptop it is
closer to 0.03ms)
That gives a worst case slow start time of about 16ms. Setting an omit
time of 0.1s uniformly is therefore more than enough, and substantially
smaller than what we calculate now for the default case (10s / 6 ~= 1.7s).
This reduces total time for the standard benchmark run by around 30s.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We always set --pacing-timer when invoking iperf3. However, the iperf3
man page implies this is only relevant for the -b option. We only use the
-b option for the UDP tests, not TCP, so remove --pacing-timer from the TCP
cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The TCP packet size used on the passt L2 link (qemu socket) makes a huge
difference to passt/pasta throughput; many of passt's overheads (chiefly
syscalls) are per-packet.
That packet size is largely determined by the MTU on the L2 link, so we
benchmark for a number of different MTUs. That works well for the guest to
host transfers. For the host to guest transfers, we purport to test for
different MTUs, but we're not actually adjusting anything interesting.
The host to guest transfers adjust the MTU on the "host's" (actually ns)
loopback interface. However, that only affects the packet size for the
socket going to passt, not the packet size for the L2 link that passt
manages - passt can and will repack the stream into packets of its own
size. Since the depacketization on that socket is handled by the kernel it
doesn't have a lot of bearing on passt's performance.
We can't fix this by changing the L2 link MTU from the guest side (as we do
for guest to host), because that would only change the guest's view of the
MTU, passt would still think it has the large MTU. We could test this by
using the --mtu option to passt, but that would require restarting passt
for each run, which is awkward in the current setup. So, for now, drop all
the "small MTU" tests for host to guest.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Packet size can make a big difference to UDP throughput, so it makes sense
to measure it for a variety of different sizes. Currently we do this by
adjusting the MTU on the relevant interface before running iperf3.
However, the UDP packet size has no inherent connection to the MTU - it's
controlled by the sender, and the MTU just affects whether the packet will
make it through or be fragmented. The only reason adjusting the MTU works
is because iperf3 bases its default packet size on the (path) MTU.
We can test this more simply by using the -l option to the iperf3 client
to directly control the packet size, instead of adjusting the MTU.
As well as simplifying this lets us test different packet sizes for host to
ns traffic. We couldn't do that previously because we don't have
permission to change the MTU on the host.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently we make TCP throughput measurements for spliced connections with
a number of different MTU values. However, the results from this aren't
really interesting.
Unlike with tap connections, spliced connections only involve the loopback
interface on host and container, not a "real" external interface. lo
typically has an MTU of 65535 and there is very little reason to ever
change that. So, the measurements for smaller MTUs are rarely going to be
relevant.
In addition, the fact that we can offload all the {de,}packetization to the
kernel with splice(2) means that the throughput difference between these
MTUs isn't very great anyway.
Remove the short MTUs and only show spliced throughput for the normal
65535 byte loopback MTU. This reduces runtime of the performance tests on
my laptop by about 1 minute (out of ~24 minutes).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Currently we start both the iperf3 server(s) and client(s) afresh each time
we want to make a bandwidth measurement. That's not really necessary as
usually a whole batch of bandwidth measurements can use the same server.
Split up the iperf3 directive into 3 directives: iperf3s to start the
server, iperf3 to make a measurement and iperf3k to kill the server, so
that we can start the server less often. This - and more importantly, the
reduced number of waits for the server to be ready - reduces runtime of the
performance tests on my laptop by about 4m (out of ~28minutes).
For now we still restart the server between IPv4 and IPv6 tests. That's
because in some cases the latency measurements we make in between use the
same ports.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
iperf3 generates statistics about its run on both the client and server
sides. They don't have exactly the same information, but both have the
pieces we need (AFAICT the server communicates some nformation to the
client over the control socket, so the most important information is in the
client side output, even if measured by the server).
Currently we use the server side information for our measurements. Using
the client side information has several advantages though:
* We can directly wait for the client to complete and we know we'll have
the output we want. We don't need to sleep to give the server time to
write out the results.
* That in turn means we can wrap up as soon as the client is done, we
don't need to wait overlong to make sure everything is finished.
* The slightly different organisation of the data in the client output
means that we always want the same json value, rather than requiring
slightly different onces for UDP and TCP.
The fact that we avoid some extra delays speeds up the overal run of the
perf tests by around 7 minutes (out of around 35 minutes) on my laptop.
The fact that we no longer unconditionally kill client and server after
a certain time means that the client could run indefinitely if the server
doesn't respond. We mitigate that by setting 1s connect timeout on the
client. This isn't foolproof - if we get an initial response, but then
lose connectivity this could still run indefinitely, however it does cover
by far the most likely failure cases. --snd-timeout would provide more
robustness, but I've hit odd failures when trying to use it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Some older revisions used separate iperf3c and iperf3s test directives to
invoke the iperf3 client and server. Those were combined into a single
iperf3 directive some time ago, but a couple of places still have the old
syntax.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Ugly as hell, but we keep breaking things otherwise, and I keep
forgetting to run this manually (as long as it's based on my local
Podman setup, that's the only alternative).
We need to clone the Podman repository as distribution packages don't
contain test scripts, typically. While at it, build the latest
version which is what really matters.
As we're planning anyway to revamp the test framework, I'd be
inclined to just add this without too many thoughts, and have it as
a nice-to-have requirement reminder for the new framework.
Link: https://github.com/containers/podman/pull/19699
Suggested-by: Paul Holzinger <pholzing@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
nstool loops on accept(), but failed to close the accepted socket fds
before continuing on. So, with repeated commands it would eventually die
with an EMFILE.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Normal filesystem paths can be very long (PATH_MAX is around 8k), however
Unix domain sockets can only use relatively short paths (UNIX_PATH_MAX is
108 on Linux). Currently nstool will simply truncate paths that are too
long, leading to difficult to understand failures.
Make such failures clearer, with an explicit error message if given a path
that's too long.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
If we enter a mount namespace with nstool exec our working directory will
be changed to / in the new mount ns. This is surprising if we haven't
actually altered any mounts yet in the new ns. Instead, change the working
directory to match that of the holder process in this situation.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This is possible useful in nstool info and has further uses for nstool
exec.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Using this, rather than using "nstool info" to get the pid then manually
connecting with nsenter makes things a little simpler.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Unlike ${DEBUG} we don't initialize ${TRACE} to 0 if not set, which cases
failures when testing it later. That failure acts as though it is false,
however it emits spurious errors in script.log, which can make it harder to
spot real errors.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This allows you to run commands within a user namespace with the
privilege that comes from owning that userns.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This combines nstool info -pw <sock> with nsenter with various options for
a more convenient and less verbose of entering existing nstool managed
namespaces.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Will make things a bit less verbose in future.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
So that we'll probably give a better error if you point it at something
that's not an nstool hold control socket.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Give nstool the ability to detect what namespaces the target process is in,
relative to where it's called. That is, those namespace types for which
the target is not in the same namespace as the caller. For now, just
print this information with "info", which can be useful for debugging.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The new subcommand gives more information about the holder process and its
namespace, and may be further extended in future. Add some options which
give the old behaviour for existing scripts.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This will make it easier to differentiate the options to those commands
further in future.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Having the "subcommand" first is more conventional and will make it more
natural for future extensions I have planned.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
In preparation for extending what it does.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
context_run() has a race condition if two commands are run in close
proximity (generally involving at least one in the background). Because we
always use the same name for the temporary fifo files, if another command
is issued while the fifos for the first still exist, mkfifo will fail,
typically causing the entire test script to jam.
Create unique names for the temporary fifos to avoid this problem.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
In practical terms, passt doesn't benefit from the additional
protection offered by the AGPL over the GPL, because it's not
suitable to be executed over a computer network.
Further, restricting the distribution under the version 3 of the GPL
wouldn't provide any practical advantage either, as long as the passt
codebase is concerned, and might cause unnecessary compatibility
dilemmas.
Change licensing terms to the GNU General Public License Version 2,
or any later version, with written permission from all current and
past contributors, namely: myself, David Gibson, Laine Stump, Andrea
Bolognani, Paul Holzinger, Richard W.M. Jones, Chris Kuhn, Florian
Weimer, Giuseppe Scrivano, Stefan Hajnoczi, and Vasiliy Ulyanov.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Fedora 32-35 are now old enough that they're not on all mirrors. Fetch
them from the archive server instead.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The current debian cloud images no longer include ppc64. Change to using
the latest snapshot which does include ppc64.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
On shell 'exit' commands, running shells from pasta, we might get:
Cannot set tty process group (No such process)
as some TTY devices might be unaccessible. This is harmless, but
after commit "pasta: propagate exit code from child command", we'll
get test failures there, at least with dash.
Ignore those explicitly with a ugly workaround: we can't simply do
something like:
exit || :
because the failure is reported by the shell itself once it exits,
regardless of the command evaluation.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Similarly to UDP cases, these were missing as it wasn't clear, when
the other tests were introduced, if using the global address of a
namespace, from the host, should have resulted in connections being
routed via the tap interface.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
These were missing as it wasn't clear, when the other tests were
introduced, if using the global address of a namespace, from the
host, should have resulted in traffic being routed via the tap
interface (as opposed to the loopback interface). We now clarified
that's actually the case.
Use same values and thresholds as the tests for loopback traffic, as
throughput figures currently indicate there isn't much difference.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
...instead of doing it after the test. Now that we have pre-built
guest images, we might also have old JSON files from previous,
interrupted test runs.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Now that we install the binary in /bin, and we have a link from
/usr/bin, change the path in the test itself as well. Otherwise
it works with bash but not with dash for some reason.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Now that we require 13c6be96618c ("net: stream: add unix socket")
in qemu to run the tests, we can also assume that commit df8d07081718
("virtio-net: fix bottom-half packet TX on asynchronous completion")
is present, as it was merged before that one.
This fixes the issue we attempted to work around in passt TCP and
UDP performance tests: finally drop that stuff.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
qemu commit 13c6be96618c ("net: stream: add unix socket") introduces
support for native AF_UNIX support, finally making qrap useless.
We can't quite drop that yet until a qemu release includes it, and
then we'll need to wait a while for users to switch anyway, but at
least for tests, we can use that support.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
As pasta now configures that target network namespace with
--config-net, we need to wait for addresses and routes to be actually
present. Just sending netlink messages doesn't mean this is done
synchronously.
A more elegant alternative, which probably makes sense regardless of
this test setup, would be to query, from pasta, addresses and routes
we added, and wait until they're there, before proceeding.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
These show a summary of memory usage in kernel and userspace with
different port forwarding configurations, details of userspace usage
using 'nm' (passt only uses statically allocated memory), and details
of kernel memory from slab reporting facilities.
This adds a new test image, mbuto.mem.img, with harcoded IPv4 and
IPv6 addresses and routes, and just the tools we need to start and
stop passt, to report from /proc/slabinfo, /proc/meminfo, and to
print and parse symbol sizes using nm(1).
passt can't pivot_root() for sandboxing purposes on ramfs, so we need
to create another filesystem and chroot into it, first.
We don't want to use pane context functions, as we're checking memory
usage for sockets: resort to screen-scraping.
Configure a dummy interface to provide passt with an appearance of
working IPv4 and IPv6 connectivity, contributed by David.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
This can be used for generic cell values with an arbitrary scale.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Instead of just disabling performance reports if running in demo
mode. This allows us to use table functions outside of performance
reports.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
I'm going to add yet another one of those, for which I have no quick
solution. It's a regression in some sense, but at least if we make
this regression more observable and defined, it should be easier to
find a comprehensive solution later, within this or another testing
framework.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
They're too slow to cope with current release cycles, and they
haven't found bugs in months, also because clang-tidy and cppcheck
would find most of them earlier.
Disable them for the moment. We should pre-install gcc and make in
non-x86 images, as those run on my test machine with qemu TCG, and
that's the real slow-down here. Then we can re-enable them.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
To test log files on a tmpfs mount, we need to unshare the mount
namespace, which means using a context for the passt pane is not
really practical at the moment, as we can't open a shell there, so
we would have to encapsulate all the commands under 'unshare -rUm',
plus the "inner" pasta command, running in turn a tcp_rr server.
It might be worth fixing this by e.g. detecting we are trying to
spawn an interactive shell and adding a special path in the context
setup with some form of stdin redirection -- I'm not sure it's doable
though.
For this reason, add a new layout, using a context only for the host
pane, while keeping the old command dispatch mechanism for the passt
pane.
We also need a new setup function that doesn't start pasta: we want
to start and restart it with different options.
Further, we need a 'pint' directive, to send an interrupt to the
passt pane: add that in lib/test.
All the tests before the one involving tmpfs and a detached mount
namespace were also tested with the context mechanism. To make an
eventual conversion easier, pass tcp_crr directly as a command on
pasta's command line where feasible.
While at it, fix the comment to the teardown_pasta() function.
The new test set can be semi-conveniently run as:
./run pasta_options/log_to_file
and it checks basic log creation, size of the log file after flooding
it with debug entries, rotations, and basic consistency after
rotations, on both an existing filesystem and a tmpfs, chosen as
it doesn't support collapsing data ranges via fallocate(), hence
triggering the fall-back mechanism for logging rotation.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The distro and performance tests are by far the slowest part of the passt
testsuite. Move them to the end of the testsuite run, so that it's easier
to do a quick test during development by letting the other tests run then
interrupting the test runner.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>