These 3 functions are easier to understand, and more efficient, when
the IPv4 address is viewed as a uint32 rather than an array of bytes.
virSocketAddrGetIPv4Addr() has bothered me for a long time - it was
doing ntohl of the address into a temporary uint32, and then a loop
one-by-one swapping the order of all the bytes back to network
order. Of course this only works as described on little-endian
architectures - on big-endian architectures the first assignment won't
swap the bytes' ordering, but the loop assumes the bytes are now in
little-endian order and "swaps them back", so the result will be
incorrect. (Do we not support any big-endian targets that would have
exposed this bug long before now??)
virSocketAddrCheckNetmask() was checking each byte of the two
addresses individually, when it could instead just do the operation
once on the full 32 bit values.
virSocketAddrGetRange() was checking for "range > 65535" by seeing if the
first 2 bytes of the start and end were different, and then doing
arithmetic combining the lower two bytes (along with necessary bit
shifting to account for network byte order) to determine the exact
size of the range. Instead we can just get the ntohl of start & end,
and do the math directly.
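As a rough illustration of the uint32 approach (a standalone sketch, not
the code from this patch; the helper names ipv4_range_size() and
ipv4_same_subnet() are made up for the example, and the exact limit
semantics are an assumption):

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical helper: number of addresses in [start, end], both given
 * in network byte order. Returns -1 if end < start or if the range
 * would hold more than 65535 addresses. */
static int
ipv4_range_size(uint32_t start_net, uint32_t end_net)
{
    uint32_t start = ntohl(start_net);
    uint32_t end = ntohl(end_net);

    if (end < start || end - start >= 65535)
        return -1;

    return (int)(end - start + 1);
}

/* Hypothetical helper: non-zero if both addresses are in the same
 * subnet. Bitwise AND does not care about byte order, so the
 * network-order values can be compared directly. */
static int
ipv4_same_subnet(uint32_t a_net, uint32_t b_net, uint32_t mask_net)
{
    return (a_net & mask_net) == (b_net & mask_net);
}
```

Both helpers work on whole 32 bit values, so no per-byte loop or manual
bit shifting is needed.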
Signed-off-by: Laine Stump <laine@redhat.com>
virSocketAddrIPv4 is a type used only internally by
virsocketaddr.c. It is defined to be a character array, which leads to
multiple occurrences of extra bit fiddling and byte swapping for no
good reason (except to confuse).
An IPv4 address is really just a uint32_t with the bytes in network
order, which is exactly the type of the s_addr member of the
sockaddr_in that is a part of the publicly consumed struct
virSocketAddr, and that we are copying in and out of a
virSocketAddrIPv4. Sometimes it's simpler to just treat it as a
network-order uint32_t, so let's make our virSocketAddrIPv4 a union
that has both an unsigned char bytes[4] (for the times when we need to
look one byte at a time) and a uint32_t val (for the times when it's
simpler to treat it as a single value).
For now we just change all the uses from, e.g. x[i] to x.bytes[i];
an upcoming patch will simplify some of the code to remove loops by
using x.val instead of x.bytes when appropriate.
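A minimal sketch of such a union, for illustration only (the type name
exampleIPv4 and both helpers are hypothetical; the member names follow
the description above):

```c
#include <arpa/inet.h>
#include <stdint.h>

/* The same four bytes, viewed either one at a time or as a single
 * value in network byte order. */
typedef union {
    unsigned char bytes[4];
    uint32_t val;
} exampleIPv4;

/* Hypothetical helper: first octet of the dotted-quad form. Because
 * val is in network order, bytes[0] is always the most significant
 * octet, regardless of host endianness. */
static unsigned char
example_first_octet(exampleIPv4 ip)
{
    return ip.bytes[0];
}

/* Hypothetical helper: the address as a host-order integer, suitable
 * for arithmetic and comparisons. */
static uint32_t
example_host_order(exampleIPv4 ip)
{
    return ntohl(ip.val);
}
```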
Signed-off-by: Laine Stump <laine@redhat.com>
Many years ago (2011), virSocketAddrMask() had caused a bug by failing
to initialize an IPv6-specific field in the result virSocketAddr. This
was fixed by memset(0)ing the entire result (*network) at the
beginning of the function (thus making sure anything and everything
was initialized).
The problem is that virSocketAddrMask() has a comment above it that
says that the source (addr) and destination (network) arguments can
point to the same virSocketAddr. But in that case, the
memset(*network, 0) at the top of the function is actually doing a
memset(*addr, 0), and so there is nothing left for all the assignments
to copy except a giant field of 0's.
Fortunately in the 13 years since the memset was added, nobody has
ever called virSocketAddrMask() with addr and network being the same.
This patch makes the code agree with the comment by copying/masking
into a local virSocketAddr (which is initialized to all 0) and then
copying that to *network after it's finished assigning things from
addr.
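The pattern, reduced to a standalone sketch (ExampleAddr and
example_mask() are hypothetical stand-ins, not the real virSocketAddr
or virSocketAddrMask()):

```c
#include <stdint.h>

typedef struct {
    uint32_t addr;   /* IPv4 address */
    uint32_t extra;  /* stands in for fields that must end up zeroed */
} ExampleAddr;

/* Mask addr with netmask and store the result in *network. Safe even
 * when addr and network point to the same object, because nothing is
 * written to *network until all reads from *addr are done. */
static void
example_mask(const ExampleAddr *addr,
             const ExampleAddr *netmask,
             ExampleAddr *network)
{
    ExampleAddr tmp = { 0 };  /* every field starts out initialized */

    tmp.addr = addr->addr & netmask->addr;

    *network = tmp;
}
```

Had the function zeroed *network first, a call with addr == network
would have masked an address that was already all 0.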
Fixes: ba08c5932e
Signed-off-by: Laine Stump <laine@redhat.com>
This patch simplifies (?) the code of qemuDomainChangeNet() while
fixing some incorrect decisions about exactly when it's necessary to
re-attach an interface's bridge device, or to fail the device update
(needReconnect[*]) because the type of connection has changed (or
within bridge and direct (macvtap) type because some attribute of the
connection has changed that can't actually be modified after the
tap/macvtap device of the interface is created).
Example 1: it's pointless to require the bridge device to be
reattached just because the interface has been switched to a different
network (i.e. the name of the network is different), since the new
network could be using the same bridge as the old network (very
uncommon, but technically possible). Instead we should only care if
the name of the *bridge device* changes (or if something in
<virtualport> changes - see Example 3).
Example 2: wrt changing the "type" of the interface, a change should
be allowed if old and new type both used a bridge device (whether or
not the name of the bridge changes), or if old and new type are both
"direct" *and* the device being linked and macvtap mode remain the
same. Any other change in interface type cannot be accommodated and
should be a failure (i.e. needReconnect).
Example 3: there is no valid reason to fail just because the interface
has a <virtualport> element - the <virtualport> could just say
"type='openvswitch'" in both the before and after cases (in which case
it isn't a change by itself, and so is completely acceptable), and
even if the interfaceid changes, or the <virtualport> disappears
completely, that can still be reconciled by simply re-attaching the
bridge device. (If, on the other hand, the modified <virtualport> is
for a type='direct' interface, we can't modify that, and so must
fail (needReconnect).)
(I tried splitting this into multiple patches, but they were so
intertwined that the intermediate patches made no sense.)
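The rules from the three examples can be summarized in a simplified
sketch. This is not the code in qemuDomainChangeNet(); every type and
name below is hypothetical, and <virtualport> is flattened to a single
string purely for the sake of comparison:

```c
#include <stdbool.h>
#include <string.h>

typedef enum { CONN_BRIDGE, CONN_DIRECT, CONN_OTHER } ExampleConnKind;
typedef enum { ACTION_NONE, ACTION_REATTACH_BRIDGE, ACTION_FAIL } ExampleAction;

typedef struct {
    ExampleConnKind kind;
    const char *dev;          /* bridge name, or linked device for direct */
    int macvtapMode;          /* only meaningful for CONN_DIRECT */
    const char *virtualport;  /* flattened <virtualport>, or NULL */
} ExampleNet;

/* NULL-safe string comparison */
static bool
example_streq(const char *a, const char *b)
{
    if (!a || !b)
        return a == b;
    return strcmp(a, b) == 0;
}

static ExampleAction
example_decide(const ExampleNet *oldnet, const ExampleNet *newnet)
{
    bool sameDev = example_streq(oldnet->dev, newnet->dev);
    bool sameVport = example_streq(oldnet->virtualport, newnet->virtualport);

    if (oldnet->kind == CONN_BRIDGE && newnet->kind == CONN_BRIDGE) {
        /* only the bridge name or <virtualport> matters, not the
         * name of the network */
        return (sameDev && sameVport) ? ACTION_NONE : ACTION_REATTACH_BRIDGE;
    }

    if (oldnet->kind == CONN_DIRECT && newnet->kind == CONN_DIRECT) {
        /* nothing about a macvtap connection can change afterwards */
        if (sameDev && sameVport && oldnet->macvtapMode == newnet->macvtapMode)
            return ACTION_NONE;
        return ACTION_FAIL;
    }

    /* changes within CONN_OTHER are not modeled by this sketch */
    return oldnet->kind == newnet->kind ? ACTION_NONE : ACTION_FAIL;
}
```

Here ACTION_FAIL corresponds to needReconnect.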
[*] "needReconnect" was a flag added to this function way back in
2012, when I still believed that QEMU might someday support connecting
a new & different device backend (the way the virtual device connects
to the host) to an already existing guest netdev (the virtual device
as it appears to the guest). Sadly that has never happened, so for the
purposes of qemuDomainChangeNet() "needReconnect" is equivalent to
"fail".
Resolves: https://issues.redhat.com/browse/RHEL-7036
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The new function does what the old qemuDomainChangeNetBridge() did
manually, except that:
1) the new function supports changing from a bridge of one type to
another, e.g. from a Linux host bridge to an OVS
bridge. (previously that wasn't handled)
2) the new function doesn't emit audit log messages. This is actually
a good thing, because the old code would just log a "detach"
followed immediately by "attach" for the same MAC address, so it's
essentially a NOP. (the audit logs don't have any more detailed
info about the connection - just the VM name and MAC address, so it
makes no sense to log the detach/attach pair as it's not providing
any information).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It can be useful to force an interface to be detached/reattached from
its bridge even if it's the same bridge - possibly something like the
virtualport profileID has changed, and a detach/attach cycle will get
it connected with the new profileID.
The one and only current use of virNetDevTapReattachBridge() sets
force to false, to preserve current behavior. An upcoming patch will
use it with force set to true.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Attempts to use update-device to modify just the link state of a guest
interface were failing due to a supposed attempt to modify something
in the interface that can't be modified live (even though the only
thing that was changing was the link state, which *can* be modified
live).
It turned out that this failure happened because the guest interface
in question was type='network', and the network in question was a
'direct' network that provides each guest interface with one device
from a pool of network devices. As a part of qemuDomainChangeNet() we
would always allocate a new port from the network driver for the
updated interface definition (by way of calling
virDomainNetAllocateActualDevice(newdev)), and this new port (ie the
ActualNetDef in newdev) would of course be allocated a new host device
from the pool (which would of course be different from the one
currently in use by the guest interface (in olddev)). Because direct
interfaces don't support changing the host device in a live update,
this would cause the update to fail.
The solution to this is to realize that as long as the interface
doesn't get switched to a different network as a part of the update,
the network port information (ie the ActualNetDef) will not change as
a part of updating the guest interface itself. So for sake of
comparison we can just point the newdev at the ActualNetDef of olddev,
and then clear out one or the other when we're done (to avoid a double
free or, more likely, attempt to reference freed memory).
(If, on the other hand, the name of the network has changed, or if the
interface type has changed to type='network' from something else, then
we *do* need to allocate a new port (actual device) from the network
driver (as we used to do in all cases when the new type was
'network'), and also indicate that we'll need to replace olddev in the
domain with newdev (because either of these changes is major enough
that we shouldn't just try to fix up olddev).)
Partially-Resolves: https://issues.redhat.com/browse/RHEL-7036
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
'charstr' is unused since 36d06a5637, breaking the build on some
platforms. Remove it.
Fixes: 36d06a5637
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
QEMU supports only 'raw' and 'telnet' in the
<protocol type='telnets'/>
element. Reject 'telnets' and 'tls'. TLS transport for qemu chardevs is
configured via "tls='yes'" attribute added to the "<source>" element
instead, so this prevents potential misconfig as the value would be
silently accepted.
Closes: https://gitlab.com/libvirt/libvirt/-/issues/412
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use virDomainChrTcpProtocol as type, convert the parser to use
virXMLPropEnum and fix one switch statement in the VMX driver.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that we have a unified generator of chardev backend which is also
validated against the QMP schema we can replace the old generator with
it.
This patch modifies the monitor code to take virJSONValue 'props'
instead of the chardev definition and adds the conversion from the
chardev definition to JSON on higher levels.
The monitor code now also attempts to extract the returned 'pty' if
returned from qemu, so higher level code needs to report the error if
the path is needed and missing.
The current monitor generator is for now abandoned in place and will be
removed later.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The upcoming refactor of the monitor code will make the hotplug code
paths use the same generator we have for commandline -chardev backends
which doesn't refuse to format certain backends which can't be
hotplugged.
To prepare for this we add a check to qemuHotplugChardevAttach()
refusing such hotplug and remove 'qemumonitorjsontest' test cases which
will not make sense any more.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use the 'chardev-backends' test data as symlink to invoke the test case
again asserting QEMU_CAPS_CHARDEV_JSON which will make the commandline
generator use the JSON representation of the -chardev backend instead
allowing us to validate it against the QMP schema.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
While qemu doesn't yet support JSON args for chardev, we can at least
for test purposes of schema validation plumb it to the '-chardev'
command as it's easier to create test cases via XML than to write them
into code in 'qemuhotplugtest'.
Additionally once this becomes available and if e.g. the syntax is fixed
we'll be able to also catch the differences early.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Similarly to how we approach the generators for
-device/-object/-blockdev/-netdev rewrite the generator of -chardev to
be unified with the generator for the monitor.
Unfortunately with -chardev it will be a bit more quirky when compared
to the others as the generator itself will need to know whether it
generates command line output or not as a few field names change and data
is nested differently.
This first step adds the generator and uses it only for command line
generation. This was possible to achieve without changing any of the
output in tests.
In further patches the same generator will then be used also in the
monitor code replacing both.
As basis for the generator I took the monitor code but modified it to
have the same field order as the commandline code and extended it
further to support all backend types, even those which are not
hotpluggable.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The test case attempts to test as many of the chardev backends as
possible by adding channels with various configs. The idea is to have a
representative sample which will later be used also for QMP schema
testing.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
I added that capability a long time ago when I was converting various
stuff to use JSON but the support in '-chardev' didn't yet materialize.
Fix the comment to make that clear and also that it'll be used in tests
for the upcoming refactor of the chardev code (so that we can validate
generator against the schema even if that doesn't yet work).
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The riscv64 architecture is not yet fully integrated into
Fedora, but KVM support is already implemented across the stack
and the Fedora package for QEMU is already set up to generate
the qemu-kvm binary package when targeting it.
Thanks: David Abdurachmanov <davidlt@rivosinc.com>
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When a bridge device for a virtual network had been placed in a
firewalld zone while starting the network, then even after the network
is shut down and the bridge device is deleted, its name will still
show up in the list of interfaces for whichever zone it had been in,
and this setting will persist through the next time a device with the
same name is created (until a zone is once again explicitly set, or
the device is removed via a firewalld API call).
Usually this isn't a problem, but in the case of forward mode='open',
someone might start the network once with a zone specified, then
shut down the network, remove the zone from its config, and start it
again; in this case the bridge device would come up using the zone
from the previous time it was started.
The solution to this is to remove the interface from whatever zone it
is in as the network is being shut down. There is no downside to doing
this, since the device is going to be deleted anyway. Note that
forward mode='bridge' uses a bridge device that was created outside of
libvirt, and libvirt won't be deleting that bridge, so we take care to
not unset the zone in that case.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
At the time the version check in this function was written, there were
still several supported versions of some distros that were using a
version of firewalld too old to support the "rich rule priorities"
used by the 'libvirt' zone that we installed for firewalld. Today the
newest distro that has a version of firewalld < 0.7.0 is
RHEL7/CentOS7, so we can remove the complexity and if the libvirt zone
is missing simply say "the libvirt zone is missing".
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The bit of code that sets the firewalld zone was previously a part of
the function networkAddFirewallRules(), which is not called for
networks with <forward mode='open'/>.
Setting the 'libvirt' zone for the bridge device of virtual networks
that also add firewall rules is usually necessary in order to get the
expected traffic through without modifying firewalld's default zone
(which would be a bad idea, because that would affect all the other
host interfaces set to the default zone), but in general we would
*not* want the bridge device for a mode='open' virtual network to be
automatically placed in the "libvirt" zone. However, a user might want
to *explicitly* set some other firewalld zone for mode='open'
networks, and libvirt's network config is a convenient place to do
that.
We enable this by moving the code that sets the firewalld zone into a
separate function that is called for all forward modes that use a
bridge device created/managed by libvirt (nat, route, isolated,
open). If no zone is specified, then the bridge device will be in
whatever zone interfaces are put in by default, but if the <bridge>
element has a "zone" attribute, then the new bridge device will be
placed in the specified zone.
NB: This function is only called when the network is started, and
*not* when the firewall rules of an active network are reloaded at
virtnetworkd restart time, because the firewalld zone of an interface
isn't something that gets inadvertently changed as a part of some
other unrelated action. For example all iptables rules are cleared by a
firewalld restart, including those rules added by libvirt, but there
is no blanket action that changes the zone of all interfaces, so it's
useful for libvirt to reload its rules when restarting virtnetworkd,
but pointless to re-add the interface to its preferred zone.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/215
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The 'open' forward type probably hadn't yet been added when this
message was written.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The whole point of <forward mode='open'/> is to suppress libvirt from
adding any firewall rules for a network, and someone might want to
create a network with no IP address (i.e. they don't want the guests
to have connectivity to the host via this interface) and no firewall
rules (they don't want any, or they want to add their own). So there's
no reason to fail when a network has <forward mode='open'/> and also
has no IP address.
Kind-of-Resolves: https://gitlab.com/libvirt/libvirt/-/issues/588
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
If a network disappeared the daemon should not only remove it from the
list of networks, but also do a proper cleanup.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
The new function (networkCleanupInactive) can be called from an iterator
over the list of networks without the risk of deadlock.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Just in case one needs a clean up.
Resolves: https://issues.redhat.com/browse/RHEL-50968
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Once networkUpdateState() identifies a dead network it should clean up
after it as well.
Resolves: https://issues.redhat.com/browse/RHEL-50968
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
It skips the cleanup from networkStartNetwork and the only other path
already checks if the network is active or not.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
It will be more useful in there when calling from new places.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
The function networkShutdownNetwork already does that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
The semantics do not change, since networkUpdatePort() (well,
networkNotifyPort(), for which the former is a wrapper) already exits for
inactive networks, but with an error that we can easily avoid with this
patch.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Currently, if either template is missing AppArmor support is
completely disabled. This means that uninstalling the LXC
driver from a system results in QEMU domains being started
without AppArmor confinement, which obviously doesn't make any
sense.
The problematic scenario was impossible to hit in Debian until
very recently, because all AppArmor files were shipped as part
of the same package; now that the Debian package is much closer
to the Fedora one, and specifically ships the AppArmor files
together with the corresponding driver, it becomes trivial to
trigger it.
Drop the checks entirely. virt-aa-helper, which is responsible
for creating the per-domain profiles starting from the
driver-specific template, already fails if the latter is not
present, so they were always redundant.
https://bugs.debian.org/1081396
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The code did it "just in case" the allocation was not reset for new
subdirectories. That might've happened in the past with CAT settings,
but checking it now it is properly reset to its maximum values for each
new CLOSID (Class of Service ID).
The advantage of this is that we do not rewrite the value with itself
which causes an issue with the current linux kernel and mba_MBps option
where the default is UINT_MAX (or (uint32_t) -1), but gets rounded up to
bandwidth granularity (10), overflows, and a small number (4) is set
instead.
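The arithmetic can be reproduced with a small sketch. It assumes the
rounding is done in a wider type and the result is then stored in a 32
bit field; that is a guess at the mechanism, not a copy of the kernel
code, and example_round_up_to_u32() is a made-up name:

```c
#include <stdint.h>

/* Round v up to the next multiple of gran using 64-bit arithmetic,
 * then truncate the result to 32 bits (assumed to be where the
 * wrap-around happens). */
static uint32_t
example_round_up_to_u32(uint32_t v, uint32_t gran)
{
    uint64_t rounded = ((uint64_t)v + gran - 1) / gran * gran;

    return (uint32_t)rounded;
}
```

With v = UINT32_MAX (4294967295) and gran = 10 the rounded value is
4294967300, which is 2^32 + 4, so only 4 survives the truncation.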
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Unfortunately, devfs on FreeBSD (accessible via /dev/fd) exposes
only those FDs which can be represented as a file. To cite
manpage [1]:
The files /dev/fd/0 through /dev/fd/# refer to file descriptors
which can be accessed through the file system.
This means FDs representing pipes and/or unnamed sockets are not
visible by default. To expose all FDs a slightly different
filesystem must be mounted [2]:
mount -t fdescfs none /dev/fd
Apparently, on my test machine fdescfs is mounted by default and
thus I haven't seen any problem. Only after aforementioned patch
was merged our CI started reporting problems. While we could try
to figure out whether correct FS is mounted, it's a needless
micro optimization. Just revert the code to the state it was
before I touched it.
1: https://man.freebsd.org/cgi/man.cgi?query=fd&sektion=4&manpath=freebsd-release-ports
2: https://man.freebsd.org/cgi/man.cgi?query=fdescfs&sektion=5&n=1
This reverts commit 308ec0fb2c.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
On BSD-like systems "/dev/fd" serves the same purpose as
"/proc/self/fd". And since procfs is usually not mounted, on such
systems we can use "/dev/fd" instead.
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/518
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The point of calling sysconf(_SC_OPEN_MAX) is to allocate big
enough bitmap so that subsequent call to
virCommandMassCloseGetFDsDir() can just set the bit instead of
expanding memory (this code runs in a forked off child and thus
using async-signal-unsafe functions like malloc() is a bit
tricky).
But on some systems the limit for opened FDs is virtually
non-existent (typically macOS Ventura started reporting EINVAL).
But with both glibc and musl using malloc() after fork() is safe.
And with sufficiently new glib too, as it's using malloc() with
newer releases instead of their own allocator.
Therefore, pick a sufficiently large value (glibc falls back to
256, [1], Darwin to 10240 [2] so 10240 should be good enough) to
fall back to and make the error non-fatal.
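A minimal sketch of that fallback (the function and macro names are
hypothetical; only sysconf() and the value 10240 come from the text
above):

```c
#include <unistd.h>

#define EXAMPLE_FALLBACK_OPEN_MAX 10240

/* Return the limit on open FDs, or a fixed fallback when sysconf()
 * reports an error or an indeterminate limit (both return -1). */
static long
example_open_max(void)
{
    long openmax = sysconf(_SC_OPEN_MAX);

    if (openmax < 0)
        openmax = EXAMPLE_FALLBACK_OPEN_MAX;

    return openmax;
}
```

A bitmap sized this way may still need to grow if an FD above the
fallback value shows up, which is why it matters that malloc() after
fork() is safe.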
1: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/getdtsz.c;h=4c5a6208067d2f9eaaac6dba652702fb4af9b7e3;hb=HEAD
2: https://github.com/apple/darwin-xnu/blob/main/bsd/sys/syslimits.h#L104
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>