Introduce ip.[ch] file to encapsulate IP protocol handling functions and
structures. Modify various files to include the new header ip.h when
it's needed.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-ID: <20240303135114.1023026-5-lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Sufficiently recent cppcheck (I'm using 2.13.0) seems to have added another
warning for pointer variables which could be pointer to const but aren't.
Use this to make a bunch of variables const pointers where they previously
weren't for no particular reason.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
In practical terms, passt doesn't benefit from the additional
protection offered by the AGPL over the GPL, because it's not
suitable to be executed over a computer network.
Further, restricting the distribution under the version 3 of the GPL
wouldn't provide any practical advantage either, as long as the passt
codebase is concerned, and might cause unnecessary compatibility
dilemmas.
Change licensing terms to the GNU General Public License Version 2,
or any later version, with written permission from all current and
past contributors, namely: myself, David Gibson, Laine Stump, Andrea
Bolognani, Paul Holzinger, Richard W.M. Jones, Chris Kuhn, Florian
Weimer, Giuseppe Scrivano, Stefan Hajnoczi, and Vasiliy Ulyanov.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
While generating -device as JSON when JSON is in use is
mandatory, because not doing so can often prevent the VM from
starting up, using JSON for -netdev simply makes things a bit
nicer. No reason not to do it, though.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
For pc machines, devices are placed directly on pci.0 with
addresses like
bus=pci.0,addr=0xa
and in this case the existing code works correctly.
For q35 machines, however, a separate PCI bus is created for
each devices using a pcie-root-port, and the resulting
addresses look like
bus=pci.9,addr=0x0
In this case, we need to treat PCI addresses as decimal, not
hexadecimal, both when parsing and generating them.
This issue has gone unnoticed for a long time because it only
shows up when enough PCI devices are present: for small
numbers, decimal and hexadecimal overlap, masking the issue.
Reported-by: Alona Paz <alkaplan@redhat.com>
Fixes: 5307faa05997 ("qrap: Strip network devices from command line, set them up according to machine")
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
When JSON support was introduced, the drop_args array has
not been updated accordingly.
Fixes: b944ca185587 ("qrap: Support JSON syntax for -device")
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
The JSON version of the template is incorrect, making qrap
completely unusable when JSON arguments are used together
with the pc machine type.
Fixes: b944ca185587 ("qrap: Support JSON syntax for -device")
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
The pci.0 bus on a pc machine has 32 slots.
For q35 machines, we don't expect devices to be plugged into
pcie.0 directly, so technically we could have a very large
number of slots by adding many pcie-root-ports, but even in
this scenario 32 is a reasonable number.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
We can't get rid of qrap quite yet, but at least we should start
telling users it's not going to be needed anymore starting from qemu
7.2.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Starting with version 8.1.0, libvirt uses JSON syntax when
generating the arguments to -device, so they will now look like
{"driver":"virtio-scsi-pci","bus":"pci.3","addr":"0x0"}
instead of
virtio-scsi-pci,bus=pci.3,addr=0x0
qrap needs to parse these arguments and extract the bus number
in order to figure out what address to use for the virtio-net
device it adds, and the libvirt change described above has
broken this parsing logic.
Tweak the code so that both styles are accepted and handled
correctly.
Note that, when JSON is in use, qrap needs to generate its own
command line options in that format as well or things will not
work as expected.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
clang-tidy warns that in passing getenv("PATH") to strncpy() we could be
passing a NULL pointer. While it's unusual for PATH to be unset, it's not
impossible and this would indeed cause getenv() to return NULL.
Handle this case by never recognizing argv[2] as a qemu binary name if
PATH is not set. This is... no flakier than the detection of whether
it's a binary name already is.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
qrap.c uses several old-fashioned functions that cppcheck complains about.
Since it's headed for obselesence anyway, just suppress these rather than
attempting to modernize the code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
cppcheck points out that qrap's main shadows the global err() function with
a local. Rename it to rc to avoid the clash.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
For a while now, passt disables ARP functionality completely if IPv4
is disabled. If qrap sends an ARP request as a probe in that case, it
will receive no answer and move on, trying to find another instance.
Add a second probe frame, a hardcoded neighbour solicitation, so that
we get a neighbour advertisement if IPv6 is enabled.
Without this change, IPv6-only operation is completely broken.
Reported-by: Wenli Quan <wquan@redhat.com>
Reported-by: Alona Paz <alkaplan@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2106257
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
In commit fca5e11773d0 ("qrap: Add probe retry on connection reset
from passt for KubeVirt integration") I just used errno to check if
the connection was reset on recv(), but perror() might set it to
EINVAL if e.g. an underlying logging mechanism fails, so we won't
actually catch the connection reset.
And in case recv() returns 0, errno won't be set, but we're still
using it without resetting it first, which leads to unpredictable
results in that case.
Reset errno before probing with connect(), send() and recv(), and
save it for later checks before calling perror().
Fixes: fca5e11773d0 ("qrap: Add probe retry on connection reset from passt for KubeVirt integration")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
...if there are two or more instances of libvirt in a KubeVirt
scenario using a number of instances of passt, the overlap period
with probing instances of qemu becomes much longer. Switch to 50
retries instead of 5.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
One retry after 100ms was enough for static builds, where qrap
takes a while to start, but it's sometimes not enough with
regular builds. Switch that to five retries with 50ms delay.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
KubeVirt uses libvirt to start qrap in its current draft integration
(https://github.com/kubevirt/kubevirt/pull/7849/), and libvirtd
starts qrap three times every time a new virtual machine is created:
once on domain creation, and twice on domain start (for "probing")
and to finally start it for real.
Very often, a subsequent invocation of qrap happen before the
previously running instance of qemu terminates, which means that
passt will refuse the new connection as the old one is still active.
Introduce a single retry with a 100ms delay to work around this. This
should be checked again once native libvirt support is there, and
that point qrap will have no reason to exist anymore.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
All instances were harmless, but it might be useful to have some
debug messages here and there. Reported by Coverity.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
clang-tidy from LLVM 13.0.1 reports some new warnings from these
checkers:
- altera-unroll-loops, altera-id-dependent-backward-branch: ignore
for the moment being, add a TODO item
- bugprone-easily-swappable-parameters: ignore, nothing to do about
those
- readability-function-cognitive-complexity: ignore for the moment
being, add a TODO item
- altera-struct-pack-align: ignore, alignment is forced in protocol
headers
- concurrency-mt-unsafe: ignore for the moment being, add a TODO
item
Fix bugprone-implicit-widening-of-multiplication-result warnings,
though, that's doable and they seem to make sense.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This is the only remaining Linux-specific include -- drop it to avoid
clang-tidy warnings and to make code more portable.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
...mostly false positives, but a number of very relevant ones too,
in tcp_get_sndbuf(), tcp_conn_from_tap(), and siphash PREAMBLE().
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Unions and structs, you all have names now.
Take the chance to enable bugprone-reserved-identifier,
cert-dcl37-c, and cert-dcl51-cpp checkers in clang-tidy.
Provide a ffsl() weak declaration using gcc built-in.
Start reordering includes, but that's not enough for the
llvm-include-order checker yet.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Flooding a virtio-net interface connected to a socket back-end results
in a TX stall I'm still debugging. The stall goes away by setting a
higher value for x-txburst (256 by default).
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
On pc-q35, pci.2 is usually configured by libvirt as a hotplug bus,
so we can't use address 0x0 there. Look for free busses starting from
pci.3 instead.
While at it, add proper error reporting for passt probing, and add
some comments to structs that were previously missing.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
As libvirt can pass e1000e (not just e1000) devices as well,
make sure we also drop those network devices from the command
line before adding the parameters we need.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The previous approach wasn't really robust: adding a -netdev option
without libvirt knowing could result in clashes with other devices.
Drop network devices from command line, check the available busses
and addresses from all -device options according to the -machine
parameter, and add a virtio-net device using an available address
or bus. Then, add a corresponding -netdev socket option.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
If a socket netdev parameter is already passed, don't touch the command
line. If it's not, add it, taking the id= reference from a netdev=
parameter, if any.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
The name-guessing loop should iterate over names, not single
characters. Also add /usr/libexec/qemu-kvm as full path for
execvp(): execvp() won't find it if it's not in $PATH, which
is the reason why it shouldn't be under /usr/libexec/, but this
seems to be the case for some current version of Fedora.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
It might be impractical to pass options to qrap when using libvirt,
because the <emulator/> tag expects a path to an executable, without
further arguments.
If the first argument is not a plausible socket number, and the
second argument is not a valid executable, look up a qemu command
from a list of possible names, then start it patching the command line
to include the -netdev fd= parameter corresponding to the AF_UNIX
domain socket we just opened.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
This is a reimplementation, partially building on the earlier draft,
that uses L4 sockets (SOCK_DGRAM, SOCK_STREAM) instead of SOCK_RAW,
providing L4-L2 translation functionality without requiring any
security capability.
Conceptually, this follows the design presented at:
https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Networking.md
The most significant novelty here comes from TCP and UDP translation
layers. In particular, the TCP state and translation logic follows
the intent of being minimalistic, without reimplementing a full TCP
stack in either direction, and synchronising as much as possible the
TCP dynamic and flows between guest and host kernel.
Another important introduction concerns addressing, port translation
and forwarding. The Layer 4 implementations now attempt to bind on
all unbound ports, in order to forward connections in a transparent
way.
While at it:
- the qemu 'tap' back-end can't be used as-is by qrap anymore,
because of explicit checks now introduced in qemu to ensure that
the corresponding file descriptor is actually a tap device. For
this reason, qrap now operates on a 'socket' back-end type,
accounting for and building the additional header reporting
frame length
- provide a demo script that sets up namespaces, addresses and
routes, and starts the daemon. A virtual machine started in the
network namespace, wrapped by qrap, will now directly interface
with passt and communicate using Layer 4 sockets provided by the
host kernel.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
We can bypass a full-fledged network interface between qemu and merd by
connecting the qemu tap file descriptor to a provided UNIX domain
socket: this could be implemented in qemu eventually, qrap covers this
meanwhile.
This also avoids the need for the AF_PACKET socket towards the guest.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>