The define is not used since virFileIsAbsPath() has been dropped.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The function is no longer used since commit faf2d811f3.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The function is no longer used since commit faf2d811f3.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Let's just use the plain g_get_home_dir(), from GLib, instead of
maintaining a code adapted from the GLib's one.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Assuming that the backing image format is raw is wrong when doing image
detection:
1) In -drive mode qemu will still probe the image format of the backing
image. This means it will try to open a backing file of the image
which will fail if a more advanced security model is in use.
2) In blockdev mode the image will be opened as raw actually which is
wrong since it might be qcow. Not opening the backing images will
also end up in the guest seeing corrupted data.
Rather than attempt to solve various corner cases when us assuming the
storage file being raw and actually being right forbid startup when the
guest image doesn't have the format specified in the metadata.
https://bugzilla.redhat.com/show_bug.cgi?id=1588373
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have virNVMeDevice module (introduced in previous
commit), let's use it int virHostdev to track which NVMe devices
are free to be used by a domain and which are taken.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This module will be used by virHostdevManager and it's inspired
by virPCIDevice module. They are very similar except instead of
what makes a NVMe device: PCI address AND namespace ID. This
means that a NVMe device can appear in a domain multiple times,
each time with a different namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This function will return true if there's a storage source of
type VIR_STORAGE_TYPE_NVME, or false otherwise.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
To simplify implementation, some restrictions are added. For
instance, an NVMe disk can't go to any bus but virtio and has to
be type of 'disk' and can't have startupPolicy set.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This helper is cleaner than plain memcpy() because one doesn't
have to look into virPCIDeviceAddress struct to see if it
contains any strings / pointers.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
In near future we will have a list of PCI devices we want to
re-attach to the host (held in virPCIDeviceListPtr) but we don't
have virDomainHostdevDefPtr. That's okay because
virHostdevReAttachPCIDevices() works with virPCIDeviceListPtr
mostly anyway. And in very few places where it needs
virDomainHostdevDefPtr are not interesting for our case.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
In near future we will have a list of PCI devices we want to
detach (held in virPCIDeviceListPtr) but we don't have
virDomainHostdevDefPtr. That's okay because
virHostdevPreparePCIDevices() works with virPCIDeviceListPtr
mostly anyway. And in very few places where it needs
virDomainHostdevDefPtr are not interesting for our case.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Sometimes, we have a PCI address and not fully allocated
virPCIDevice and yet we still want to know its /dev/vfio/N path.
Introduce virPCIDeviceAddressGetIOMMUGroupDev() function exactly
for that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Replace all the uses passing a single parameter as the length.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This reverts commit 7be5fe66cd.
This commit broke resctrl, because it missed the fact that the
virResctrlInfoGetCache() has side-effects causing it to actually
change the virResctrlInfo parameter, not merely get data from
it.
This code will need some refactoring before we can try separating
it from virCapabilities again.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The keycodemap tool is told to generate docs in rst format now
instead of pod.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Pull in changes which support use of RST for docs output format
instead of POD.
The generator tool has changed its command line arg handling
so all args must be after the command name. The docs title and
subtitle must be specified separately too.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Clang complains about condition being always true:
src/util/virkeyfile.c:113:23: error: result of comparison of constant 128 with expression of type 'const char' is always true [-Werror,-Wtautological-constant-out-of-range-compare]
while (!IS_EOF && IS_ASCII(CUR) && CUR != ']')
^~~~~~~~~~~~~
src/util/virkeyfile.c:80:26: note: expanded from macro 'IS_ASCII'
~~~ ^ ~~~
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
GLib doesn't provide alternative to c_isascii and this is the only usage
of that macro so define a replacement ourselves.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
The same way how we have IS_EOL in two files where we actually need it
defince IS_BLANK so we can drop usage of c_isblank.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
We always refresh the capabilities object when using virResctrlInfo
during process startup. This is undesirable overhead, because we can
just directly create a virResctrlInfo instead.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The virTypedParamsFilter function doesn't mind params == NULL if nparams
is zero. And there's no need to check for params == NULL && nparams > 0
because this is checked higher in the stack.
In fact all the virCheckNonNull* checks in virTypedParamsFilter are
useless.
https://bugzilla.redhat.com/show_bug.cgi?id=1777094
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
The magic number is taken from the coreutils stat.c file since
there is no constant for it in normal system headers.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This also isn't required (due to the vportprofile being stored in the
NetDef as a pointer rather than being directly contained), but it
seemed dishonest to not mark it as const (and thus permit users to
modify its contents)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
In this case, the virNetDevBandwidthPtr that is returned is not to a
region within the virDomainNetDef arg, but points elsewhere (the
NetDef has the pointer, not the entire object), so technically it's
not necessary to make the return value a const, but it's a bit
disingenuous to *not* do it.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This is needed if we want to call the function when the
virDomainNetDef* we have is a const.
Since virDomainNetGetActualVlan returns a pointer to memory that is
within the virDomainNetDefPtr arg, the returned pointer must also be
made const. This leads to a cascade of other virNetDevVlanPtr's that
must be changed to "const virNetDevVlan *".
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This previous commit introduced a simpler free callback for
hash data with only 1 arg, the value to free:
commit 49288fac96
Author: Peter Krempa <pkrempa@redhat.com>
Date: Wed Oct 9 15:26:37 2019 +0200
util: hash: Add possibility to use simpler data free function in virHash
It missed two functions in the hash table code which need
to call the alternate data free function, virHashRemoveEntry
and virHashRemoveSet.
After the previous patch though, there is no code that
makes functional use of the 2nd key arg in the data
free function. There is merely one log message that can
be dropped.
We can thus purge the current virHashDataFree callback
entirely, and rename virHashDataFreeSimple to replace
it.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Since g_strdup_printf will abort, we know @newfile won't be NULL.
Found by Coverity
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This flag is not implied by g_mkstemp_full, only by g_mkstemp.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reported-by: Bjoern Walk <bwalk@linux.ibm.com>
Fixes: 4ac4773040
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
In functions implemented here we fill this attr union (type of
bpf_attr) and just pass it to syscall(2). Thing is that some of
the union members are type of __aligned_u64. This is not regular
uint64_t. This one is explicitly aligned to 8 bytes, while
uint64_t can be aligned to 4 bytes (on 32 bits). We've used
explicit typecast to uint64_t to shut compiler which would
otherwise complain of assigning a pointer into an integer. Well,
we have uintptr_t just for that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In virCgroupV2DevicesReallocMap() we are debug printing both
arguments passed to the function. However, the @size argument is
type of size_t but '%lu' is used to format it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There are some OSes which don't have syscall() nor
<sys/syscall.h>. We already check for the header file in
configure phase, so we just need to add check for
HAVE_SYS_SYSCALL_H to HAVE_DECL_BPF_PROG_QUERY.
While I'm at it, some header files we are including are not
needed, so their includes can be safely dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Call first virCgroupNew on the parent group virCgroupNewPartition if
it is available on before the creation of the child group. This
ensures that the creation of a first level group on the unified
architecture, as the check at virCgroupV2ParseControllersFile as the
parent file is there.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1760233
Signed-off-by: Miguel Ángel Arruga Vivas <rosen644835@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Glib implementation follows the ISO C99 standard so it's safe to replace
the gnulib implementation.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We need to mock virCgroupV2DevicesAvailable() in order to remove any
dependency on kernel as BPF devices might not be available.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
So the issue here is that you can end up with configuration where
you have cgroup v1 and v2 enabled at the same time and the devices
controllers is enabled for cgroup v1.
In cgroup v2 there is no devices controller, the device access is
controlled using BPF and since it is not a cgroup controller both
of them can exists at the same time and both of them are applied while
resolving access to devices.
In order to avoid configuring both BPF and cgroup v1 devices we will
use BPF if possible and otherwise fallback to cgroup v1 devices.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If we want to deny all devices we just need to replace any existing
program with new program with empty map.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If we want to allow all devices with all permissions we need to replace
any existing program that has any rule configured, otherwise we just
need to add new rule which will for example allow read access to all
devices.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In order to deny device we need to check if there is any entry in BPF
map and we need to load the current value from map if there is already
entry for that device. If both values are same we can remove that entry
but if they are different we need to update the entry because we don't
have to deny all access, but for example only write access.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In order to allow device we need to create key and value which will be
used to update BPF map. virBPFUpdateElem() can override existing
entries in BPF map so we need to check if that entry exists in order to
track number of entries in our map.
This can add rule for specific device but major and minor can be both
-1 which follows the same behavior as in cgroup v1.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Device rules are stored in BPF map that is a hash type, this function
will create a key based on major and minor id of device.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We need to close our FD that we have for BPF program and map in order
to let kernel remove all resources once the cgroup is removed as well.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function will be called for every virCgroup(Allow|Deny)* API in
order to prepare BPF program for guest. Since libvirtd can be restarted
at any point we will first try to detect existing progam, if there is
none we will create a new empty BPF program and lastly if we don't have
any space left in the existing BPF map we will create a new copy of the
BPF map with more space and attach a new program with that map into the
guest cgroup.
This solution allows us to start with reasonably small BPF map consuming
only small amount of memory and if needed we can easily extend the BPF
map if there is a lot of host devices used in guest or if user wants to
hot-plug a lot of devices once the guest is running.
Since there is no way how to reallocate existing BPF map we need to
create a new copy if we run out of space in current BPF map.
This overcomes all the limitations in BPF:
- map used in program has to be created before the program is loaded
into kernel
- once map is created you cannot change its size
- you cannot replace map in existing program
- you cannot use an array of maps because it can store FD to maps
of one specific size so we would not be able to use it to overcome
the second issue
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function creates new BPF program with new empty BPF map with the
default size and attaches it to the guest cgroup.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function will be called if libvirtd was restarted while some
domains were running. It will try to detect existing programs attached
to the guest cgroup.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function loads the BPF prog with prepared map into kernel and
attaches it into guest cgroup. It can be also used to replace existing
program in the cgroup if we need to resize BPF map to store more rules
for devices. The old program will be closed and removed from kernel.
There are two possible ways how to create BPF program:
- One way is to write simple C-like code which can by compiled into
BPF object file which can be loaded into kernel using elfutils.
- The second way is to define macros which look like assembler
instructions and can be used directly to create BPF program that
can be directly loaded into kernel.
Since the program is not too complex we can use the second option.
If there is no program, all devices are allowed, if there is some
program it is executed and based on the exit status the access is
denied for 0 and allowed for 1.
Our program will follow these rules:
- first it will try to look for the specific key using major and
minor to see if there is any rule for that specific device
- if there is no specific rule it will try to look for any rule that
matches only major of the device
- if there is no match with major it will try the same but with
minor of the device
- as the last attempt it will try to look for rule for all devices
and if there is no match it will return 0 to deny that access
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There is no exact way how to figure out whether BPF devices support is
compiled into kernel. One way is to check kernel configure options but
this is not reliable as it may not be available. Let's try to do
syscall to which will list BPF cgroup device programs.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In order to implement devices controller with cgroup v2 we need to
add support for BPF programs, cgroup v2 doesn't have devices controller.
This introduces required helpers wrapping linux syscalls.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
With g_mkstemp_full, there is no need to distinguish between
mkostemp and mkostemps (no suffix vs. a suffix of a fixed length),
because the GLib function looks for the XXXXXX pattern everywhere
in the string.
Use S_IRUSR | S_IWUSR for the permissions and do not pass O_RDWR
in flags since it's implied.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This saves us from allocating vars upfront, since GLib deals with
that for us.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The virHostGetBootTimeProcfs() function is defined only for Linux
and therefore it's only call should also be done if we're on
Linux.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Use our helper instead of the gnulib one.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Make it more obvious that the function will return NULL if the file is
not executable and stop reusing variables.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Simplify the final lookup loop by freeing memory automatically and thus
being able to directly return the result.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When we want to know the boot timestamp of the host, we can call
virHostGetBootTime(). Under the hood, it uses getutxid() which is
defined by POSIX and properly check for in configure. However,
musl took a path where it declares the function but instead of
providing any useful implementation it returns NULL meaning "no
record found". If that's the case, use our second best option -
/proc/uptime and a bit of maths.
https://bugzilla.redhat.com/show_bug.cgi?id=1760885
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Instead of vsnprintf from gnulib, use g_vsnprintf from GLib.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
The callers don't actually use the returned errno for reporting errors.
Additionally virFileResolveAllLinks returns -1 rather than -errno on
error thus you'd get a spurious EPERM even on other errors.
Don't try to return errno in this case.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Return -1 on failure rather than -errno since none of the callers
actually cares about the return value. This specifically fixes returns
of -ENOMEM in cases of bad usage, which would report wrong error
anyways.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use g_strndup in all the cases where we check upfront whether a pointer
is non-NULL and then use it to calculate the copied length.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Promote usage of separate buffers for separate formatting passes by
removing the now unused virBufferSetChildIndent.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use the new helper to initialize child XML element buffers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add a new macro which initializes a virBuffer on the stack and also sets
the indent level to be used for child XML element formatting.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that function is no longer used, it can be dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Now that function is no longer used, it can be dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Commit d19c21429f modified the condition so that it checks whether the
value is more than 0xFFFFFFFF. Since addr->domain is an unsigned int, it
will never be more than that.
Remove the whole check
src/util/virpci.c:1291:22: error: result of comparison 'unsigned int' > 4294967295 is always false [-Werror,-Wtautological-type-limit-compare]
if (addr->domain > 0xFFFFFFFF) {
~~~~~~~~~~~~ ^ ~~~~~~~~~~
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Libvirtd has set SIGPIPE to ignored, and virFork resets all signal
handlers to the defaults. But child process may write logs to
stderr/stdout, that may generate SIGPIPE if journald has stopped.
So set SIGPIPE to a dummy no-op handler before unmask signals in
virFork(), and the handler will get reset to SIG_DFL when execve()
runs. Now we can delete sigaction() call entirely in virExec().
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn>
When libvirt first implemented a stable and configurable MAC address
for the bridges created for libvirt virtual networks (commit
5754dbd56d, in libvirt v0.8.8) most distro stable releases didn't
support explicitly setting the MAC address of a bridge; the bridge
just always assumed the lowest numbered MAC of all attached
interfaces. Because of this, we stabilized the bridge MAC address by
creating a "dummy" tap interface with a MAC address guaranteed to be
lower than any of the guest tap devices' MACs (which all started with
0xFE, so it's not difficult to do) and attached it to the bridge -
this was the inception of the "virbr0-nic" device that has confused so
many people over the years.
Even though the linux kernel had recently gained support for
explicitly setting a bridge MAC, we deemed it unnecessary to set the
MAC that way, because the other (indirect) method worked everywhere.
But recently there have been reports that the bridge MAC address was
not following the setting in the network config, and mismatched the
MAC of the dummy tap device (which was still correct). It turns out
that this is due to a change in systemd-242 that persists whatever MAC
address is set for a bridge when it's initially started. According to
the systemd NEWS file entry for version 242
(https://github.com/systemd/systemd/blob/master/NEWS):
"if a bridge interface is created without any slaves, and gains
a slave later, then now the bridge does not inherit slave's MAC."
This change was the result of:
https://github.com/systemd/systemd/issues/3374
(apparently if there is no MAC saved for a bridge by the name of a
bridge being created, the random MAC generated during creation is
saved, and then that same MAC is used to explicitly set the MAC each
time it is created). Once a bridge has an explicitly set MAC, the "use
the lowest numbered MAC of attached devices" rule is ignored, so our
dummy tap device is like the goggles - it does nothing! (well, almost).
We could whine about changes in default behavior, etc. etc., but
because the change was in response to actual user problems, that seems
likely a fruitless task. Fortunately, time has marched on, and even
distro releases that are old enough that they are no longer supported
by upstream libvirt (e.g. RHEL6) have support for explicitly setting a
bridge device MAC address, either during creation or with a separate
ioctl after creation, so we can now do that.
To enable explicitly setting the mac during bridge creation, we add a
mac arg to virNetDevBridgeCreate(). In the case of platforms where
the bridge is created with a netlink RTM_NEWLINK message, we just add
that mac to the message. For platforms that still use an ioctl (either
SIOCBRADDBR or SIOCIFCREATE2), we make a separate call to
virNetDevSetMAC() after creating the bridge.
(NB: I was unable to test the calling of virNetDevSetMAC() from the
SIOCIFCREATE2 (BSD) version of virNetDevBridgeCreate(); even though I
managed to get a FreeBSD system setup and libvirt built there, when I
tried to start the default network the SIOCIFCREATE2 ioctl itself
failed, so it never even got to the virNetDevSetMAC(). That leaves the
FreeBSD implementation untested.)
This makes the dummy tap pointless for purposes of setting the MAC
address, but it is still useful for IPv6 DAD initialization (which
apparently requires at least one interface to be attached to the
bridge and online), as well as for setting an initial MTU for the
bridge, so it hasn't been removed.
(NB: we can safely *always* call virNetDevBridgeCreate() with
&def->mac from the network driver because, in spite of the existence
of a "mac_specified" bool in the config suggesting that it may not
always be present, in reality a mac address will always be added to
any network that doesn't have one - this is guaranteed in all cases by
commit a47ae7c004)
https://bugzilla.redhat.com/show_bug.cgi?id=1760851
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Although until now, any use of the extra_args argument (a pointer to a
struct containing extra attributes to add the the RTM_NEWLINK message)
would always have the ifindex and mac set, so the code could assume it
was safe to add both to the message if extra_args != NULL. There is
now a use for setting a MAC address in the RTM_NEWLINK without setting
the ifindex, so we should check each of these separately.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Now that we don't have to deal with errors of virBuffer we can also make
this function void.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function now does not return an error so we can drop it fully.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>