Make the header easier to read and let the compiler inline
what it wants.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Just like virhostdev, this depends on domain_conf and
it's shared by multiple hypervisor drivers.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This module depends on domain_conf and is used directly by various
hypervisor drivers.
Move it to src/hypervisor.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently they live in util/virhostdev.
However the virhostdev module is wrongly placed
in util, which is below conf/ in our hierarchy.
Move the functions that are actually used in conf/
to conf/ and remove the include of virhostdev.h
from domain_conf.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Another vircgroup helper to avoid code repetition between
the LXC and QEMU driver.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
lxcDomainSetMemoryParameters() and qemuDomainSetMemoryParameters()
has duplicated chunks of code that can be put in a new
helper.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This new helper avoids more code repetition inside
lxcDomainSetBlkioParameters() and qemuDomainSetBlkioParameters().
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
After the introduction of virDomainDriverMergeBlkioDevice() in a
previous patch, it is now clear that lxcDomainSetBlkioParameters() and
qemuDomainSetBlkioParameters() uses the same loop to set cgroup
blkio parameter of a domain.
Avoid the repetition by adding a new helper called
virDomainCgroupSetupDomainBlkioParameters().
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
lxcDomainParseBlkioDeviceStr() and qemuDomainParseBlkioDeviceStr()
are the same function. Avoid code repetition by putting the code
in a new helper.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
lxcDomainMergeBlkioDevice() and qemuDomainMergeBlkioDevice()
are the same functions. This duplicated code can't be put in
the existing domain_cgroup.c since it's not cgroup related.
This patch introduces a new src/hypervisor/domain_driver.c to
host this more generic code that can be shared between virt
drivers. This new file is then used to create a new helper
called virDomainDeivceMergeBlkioDevice() to eliminate the code
repetition mentioned above. Callers in LXC and QEMU files
were updated.
This change is a preliminary step for more code reduction of
cgroup related code inside lxcDomainSetBlkioParameters() and
qemuDomainSetBlkioParameters().
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuSetupCgroupVcpuBW() and lxcSetVcpuBWLive() shares the
same code to set CPU CFS period and quota. This code can be
moved to a new virCgroupSetupCpuPeriodQuota() helper to
avoid code repetition.
A similar code is also executed in virLXCCgroupSetupCpuTune(),
but without the rollback on error. Use the new helper in this
function as well since the 'period' rollback, if not a
straight improvement for virLXCCgroupSetupCpuTune(), is
benign. And we end up cutting more code repetition.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The code that calls virCgroupSetCpuShares() and virCgroupGetCpuShares()
is repeated in 4 different places. Let's put it in a new
virCgroupSetupCpuShares() to avoid code repetition.
There's a reason of why we execute a Get in the same value we
just executed Set, explained in detail by commit 97814d8ab3.
Let's add a gist of the reasoning behind it as a comment in
this new function as well.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The code from qemuSetupCgroupCpusetCpus() and virLXCCgroupSetupCpusetTune()
can be centralized in a new helper called virCgroupSetupCpusetCpus().
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
virLXCCgroupSetupMemTune() and qemuSetupMemoryCgroup() shares
duplicated code that can be put in a new helper to avoid
code repetition.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There is duplicated code between virt drivers that needs to
be moved to avoid code repetition. In the case of duplicated
code between lxc_cgroup.c and qemu_cgroup.c a common place
would be utils/vircgroup.c. The problem is that this would
introduce /conf related definitions that shouldn't be imported
to vircgroup.c, which is supposed to be a place for utilitary
cgroups functions only. And syntax-check would forbid it anyway
due to cross-directory includes being used.
An alternative would be to overload domain_conf.c, which already
contains all the definitions required. But that file is already
crowded with XML handling code and we wouldn't do any favors to
it by putting more utilitary, non-XML parsing/formatting code
there.
In [1], Cole suggested a 'domain_cgroup' file to host common code
between lxc_cgroup and qemu_cgroup, and Daniel suggested a
'src/hypervisor' dir to host these type of files. This patch
introduces src/hypervisor/domain_cgroup.c and, to get started,
introduces a new virDomainCgroupSetupBlkio() function to host shared
code between virLXCCgroupSetupBlkioTune() and qemuSetupBlkioCgroup().
[1] https://www.redhat.com/archives/libvir-list/2019-December/msg00817.html
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Previous patch moved all duplicated code that were setting
and getting BlkioDevice parameters to vircgroup.c. We can
turn them into static and spare a few symbols in
libvirt_private.syms.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The current use of the functions that set and get
BlkioDevice attributes is doing a set(), followed by
a get() of the same parameter right after. This is done
because there is no guarantee that the kernel will accept
the desired value given by the set() call, thus we need to
execute a get() right after to get the actual value.
This patch adds helpers inside vircgroup.c to execute these
operations. Next patch will use these helpers to reduce
code repetition in LXC and QEMU files.
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is a very simple thing to parse and format, but needs to be done
in 4 places, so two trivial utility functions have been made that can
be called from all the higher level parser/formatters:
<domain><interface>
<domain><interface><actual> (only in domain status)
<network>
<networkport>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When this flag is set for an interface attached to a bridge, traffic
to/from the specified interface can only enter/exit the bridge via
another attached interface that *doesn't* have the BR_ISOLATED flag
set. This can be used to permit guests to communicate with the rest of
the network, but not with each other.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Even if an interface of type 'network', setting 'floor' is only supported
if the network's forward type is nat, route, open or none.
Signed-off-by: Pavel Mores <pmores@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This compound condition will be useful in several places so it
makes sense to give it a name for better readability.
Signed-off-by: Pavel Mores <pmores@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
It is no longer require since switching to the GLib based
event loop impl.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The libvirt-glib project has provided a GMainContext based
event loop impl for applications. This imports it and sets
it up for use by libvirt as the primary event loop. This
remains a private impl detail of libvirt.
IOW, applications must *NOT* assume that a call to
"virEventRegisterDefaultImpl" results in a GLib based
event loop. They should continue to use the libvirt-glib
API gvir_event_register() if they explicitly want to
guarantee a GLib event loop.
This follows the general principle that the libvirt public
API should not expose the fact that GLib is being used
internally.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The virFilePrintf function was a wrapper for fprintf() to provide
Windows portability, since gnulib's fprintf() replacement was
license restricted. This is no longer needed now we have the
g_fprintf function available.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This hides the differences between Windows and UNIX,
and adds standard error reporting.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Add a helper that concatenates the second array into the first.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Just like the existing virBufferTrim, but only
does one thing at a time.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
This is a simplified variant of gnulib's passfd module
without the portability code that we do not require.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This imports a simpler version of GNULIB's getpass() function
impl for Windows. Note that GNULIB's impl was buggy as it
returned a static string on UNIX, and a heap allocated string
on Windows. This new impl always heap allocates.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Currently it is possible to start a domain which have disks
in same iotune group and at the same time having different iotune
params. Both params set are passed to qemu in command line and the one
that is passed later down command line is get actually set.
Let's prohibit such configurations.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Currently, if only iotune group name is given for some disk and
no any params then later start of domain will fail. I guess it
will be convenient to allow such configuration if there is
another disk in the same iotune group with iotune params set. The
meaning is that the first disk have same iotunes and the latter.
Thus one can easily add a disk to iotune group - just add group
name parameter and no need to copy all the params.
Also let's expand iotunes params in the described case so we don't
need to refer to another disk to know iotunes and this will make
logic in many places simple.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
And introduce virDomainBlockIoTuneInfoHasAny.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
QEMU driver has two functions: qemuGetDHCPInterfaces() and
qemuARPGetInterfaces() that are being used inside only one single
function. They can be turned into generic functions that other drivers
can use. This commit move both from QEMU driver tree to domain conf
tree.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The function virSecretGetSecretString calls into secret driver and is
used from other hypervisors drivers and as such makes more sense in
util.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
A new helper for trimming combinations of specified characters from
the tail of the buffer.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
The idea is to offer callers an init function that they can call
independently to ensure that the global variables get
initialized.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
These functions are meant to replace verbose check for the old
style of specifying UEFI with a simple function call.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
g_canonicalize_filename was not introduced until glib 2.58
so we need a temporary backport of its impl.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This function is not called outside of the source file where it's
defined. There's no need to export it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The underlying resctrl monitoring is actually using 64 bit counters,
not the 32bit one. Correct this by using 64bit data type for reading
hardware value.
To keep the interface consistent, the result of CPU last level cache
that occupied by vcpu processors of specific restrl monitor group is
still reported with a truncated 32bit data type. because, in silicon
world, CPU cache size will never exceed 4GB.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Wang Huaqiang <huaqiang.wang@intel.com>
A wrapper that calls g_fsync on Win32/macOS and fdatasync
elsewhere. g_fsync is a stronger flush than we need but it
satisfies the caller's requirements & matches the approach
gnulib takes.
Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
g_fsync isn't available until 2.63 so we need a compat
wrapper temporarily.
Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The function is no longer used since commit faf2d811f3.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The function is no longer used since commit faf2d811f3.
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Now that we have virNVMeDevice module (introduced in previous
commit), let's use it int virHostdev to track which NVMe devices
are free to be used by a domain and which are taken.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This module will be used by virHostdevManager and it's inspired
by virPCIDevice module. They are very similar except instead of
what makes a NVMe device: PCI address AND namespace ID. This
means that a NVMe device can appear in a domain multiple times,
each time with a different namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This function will return true if any of disks (or their backing
chain) for given domain contains an NVMe disk.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This function will return true if there's a storage source of
type VIR_STORAGE_TYPE_NVME, or false otherwise.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
To simplify implementation, some restrictions are added. For
instance, an NVMe disk can't go to any bus but virtio and has to
be type of 'disk' and can't have startupPolicy set.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This helper is cleaner than plain memcpy() because one doesn't
have to look into virPCIDeviceAddress struct to see if it
contains any strings / pointers.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Sometimes, we have a PCI address and not fully allocated
virPCIDevice and yet we still want to know its /dev/vfio/N path.
Introduce virPCIDeviceAddressGetIOMMUGroupDev() function exactly
for that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Next patch will validate QEMU_CAPS_NUMA_DIST in a new qemu_domain.c
function. The code to verify if a NUMA node distance is being
set will still be needed in qemuBuildNumaArgStr() though.
To avoid code repetition, let's put this logic in a helper to be
used in qemuBuildNumaArgStr() and in the new function.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Accept XML describing a generic block job, and output it again as
needed. This may still need a few tweaks to match the documented XML
and RNG schema.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Annoyingly there was no existing constructor, and identifying all the
places which do a VIR_ALLOC(cpu) is a bit error prone. Hopefully this
has found & converted them all.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The NUMA cells are stored directly in the virCapsHostPtr
struct. This moves them into their own struct allowing
them to be stored independantly of the rest of the host
capabilities. The change is used as an excuse to switch
the representation to use a GPtrArray too.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The XML parser currently calls virCapabilitiesDomainDataLookup during
parsing to find the domain capabilities matching the triple
(virt type, os type, arch)
This is, however, bogus with the QEMU driver as it assumes that there
is an emulator known to the default driver capabilities that matches
this triple. It is entirely possible for the driver to be parsing an
XML file with a custom emulator path specified pointing to a binary
that doesn't exist in the default driver capabilities. This will,
for example be the case on a RHEL host which only installs the host
native emulator to /usr/bin. The user can have built a custom QEMU
for non-native arches into $HOME and wish to use that.
Aside from validation, this call is also used to fill in a machine type
for the guest if not otherwise specified. Again, this data may be
incorrect for the QEMU driver because it is not taking account of
the emulator binary that is referenced.
To start fixing this, move the validation to the post-parse callbacks
where more intelligent driver specific logic can be applied.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Our normal practice is for the object type to be the name prefix, and
the object instance be the first parameter passed in.
Rename these to virDomainObjSave and virDomainDefSave moving their
primary parameter to be the first one. Ensure that the xml options
are passed into both functions in prep for future work.
Finally enforce checking of the return type and mark all parameters
as non-NULL.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
With this patch users can cold unplug some sound devices.
use "virsh detach-device vm sound.xml --config" command.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jidong Xia <xiajidong@cmss.chinamobile.com>
If we static link to libvirt_util.la then we can't override functions in
this file by simply implementing them in the test code. Any tests should
dynamic link to the main libvirt.la and ensure symbols are exported.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
<interface> devices (virDomainNetDef) are a bit different from other
types of devices in that their actual type may come from a network (in
the form of a port connection), and that doesn't happen until the
domain is started. This means that any validation of an <interface> at
parse time needs to be a bit liberal in what it accepts - when
type='network', you could think that something is/isn't allowed, but
once the domain is started and a port is created by the configured
network, the opposite might be true.
To solve this problem hypervisor drivers need to do an extra
validation step when the domain is being started. I recently (commit
3cff23f7, libvirt 5.7.0) added a function to peform such validation
for all interfaces to the QEMU driver -
qemuDomainValidateActualNetDef() - but while that function is a good
single point to call for the multiple places that need to "start" an
interface (domain startup, device hotplug, device update), it can't be
called by the other hypervisor drivers, since 1) it's in the QEMU
driver, and 2) it contains some checks specific to QEMU. For
validation that applies to network devices on *all* hypervisors, we
need yet another interface validation function that can be called by
any hypervisor driver (not just QEMU) right after its network port has
been created during domain startup or hotplug. This patch adds that
function - virDomainActualNetDefValidate(), in the conf directory,
and calls it in appropriate places in the QEMU, lxc, and libxl
drivers.
This new function is the place to put all network device validation
that 1) is hypervisor agnostic, and 2) can't be done until we know the
"actual type" of an interface.
There is no framework for validation at domain startup as there is for
post-parse validation, but I don't want to create a whole elaborate
system that will only be used by one type of device. For that reason,
I just made a single function that should be called directly from the
hypervisors, when they are initializing interfaces to start a domain,
right after conditionally allocating the network port (and regardless
of whether or not that was actually needed). In the case of the QEMU
driver, qemuDomainValidateActualNetDef() is already called in all the
appropriate places, so we can just call the new function from
there. In the case of the other hypervisors, we search for
virDomainNetAllocateActualDevice() (which is the hypervisor-agnostic
function that calls virNetworkPortCreateXML()), and add the call to our
new function right after that.
The new function itself could be plunked down into many places in the
code, but we already have 3 validation functions for network devices
in 2 different places (not counting any basic validation done in
virDomainNetDefParseXML() itself):
1) post-parse hypervisor-agnostic
(virDomainNetDefValidate() - domain_conf.c:6145)
2) post-parse hypervisor-specific
(qemuDomainDeviceDefValidateNetwork() - qemu_domain.c:5498)
3) domain-start hypervisor-specific
(qemuDomainValidateActualNetDef() - qemu_domain.c:5390)
I placed (3) right next to (2) when I added it, specifically to avoid
spreading validation all over the code. For the same reason, I decided
to put this new function right next to (1) - this way if someone needs
to add validation specific to qemu, they go to one location, and if
they need to add validation applying to everyone, they go to the
other. It looks a bit strange to have a public function in between a
bunch of statics, but I think it's better than the alternative of
further fragmentation. (I'm open to other ideas though, of course.)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Commit 5751a0b6b1 added a helper function
called virDomainCapsFeaturesInitUnsupported which initialized all domain
capability features as unsupported.
When adding a new feature this would initialize it as unsupported also
for hypervisor drivers which the original author possibly didn't intend
to modify. To prevent accidental wrong value being reported in such case
revert back to initializing individual features in the hypervisor
drivers themselves.
This is not a straight revert as additonal patches modified how we store
the capabilities.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Both virDomainCapsCPUModelsAdd and virDomainCapsCPUModelsAddSteal are so
simple we can just squash the code in a single function.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Device rules are stored in BPF map that is a hash type, this function
will create a key based on major and minor id of device.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We need to close our FD that we have for BPF program and map in order
to let kernel remove all resources once the cgroup is removed as well.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function will be called for every virCgroup(Allow|Deny)* API in
order to prepare BPF program for guest. Since libvirtd can be restarted
at any point we will first try to detect existing progam, if there is
none we will create a new empty BPF program and lastly if we don't have
any space left in the existing BPF map we will create a new copy of the
BPF map with more space and attach a new program with that map into the
guest cgroup.
This solution allows us to start with reasonably small BPF map consuming
only small amount of memory and if needed we can easily extend the BPF
map if there is a lot of host devices used in guest or if user wants to
hot-plug a lot of devices once the guest is running.
Since there is no way how to reallocate existing BPF map we need to
create a new copy if we run out of space in current BPF map.
This overcomes all the limitations in BPF:
- map used in program has to be created before the program is loaded
into kernel
- once map is created you cannot change its size
- you cannot replace map in existing program
- you cannot use an array of maps because it can store FD to maps
of one specific size so we would not be able to use it to overcome
the second issue
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function creates new BPF program with new empty BPF map with the
default size and attaches it to the guest cgroup.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function will be called if libvirtd was restarted while some
domains were running. It will try to detect existing programs attached
to the guest cgroup.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function loads the BPF prog with prepared map into kernel and
attaches it into guest cgroup. It can be also used to replace existing
program in the cgroup if we need to resize BPF map to store more rules
for devices. The old program will be closed and removed from kernel.
There are two possible ways how to create BPF program:
- One way is to write simple C-like code which can by compiled into
BPF object file which can be loaded into kernel using elfutils.
- The second way is to define macros which look like assembler
instructions and can be used directly to create BPF program that
can be directly loaded into kernel.
Since the program is not too complex we can use the second option.
If there is no program, all devices are allowed, if there is some
program it is executed and based on the exit status the access is
denied for 0 and allowed for 1.
Our program will follow these rules:
- first it will try to look for the specific key using major and
minor to see if there is any rule for that specific device
- if there is no specific rule it will try to look for any rule that
matches only major of the device
- if there is no match with major it will try the same but with
minor of the device
- as the last attempt it will try to look for rule for all devices
and if there is no match it will return 0 to deny that access
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There is no exact way how to figure out whether BPF devices support is
compiled into kernel. One way is to check kernel configure options but
this is not reliable as it may not be available. Let's try to do
syscall to which will list BPF cgroup device programs.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In order to implement devices controller with cgroup v2 we need to
add support for BPF programs, cgroup v2 doesn't have devices controller.
This introduces required helpers wrapping linux syscalls.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
For future extensions of the domain caps it's useful to have a single
point that initializes all capabilities as unsupported by a driver. The
driver then can enable specific ones.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Remove the need to pass around strings and switch to the enum values
instead.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that function is no longer used, it can be dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Now that function is no longer used, it can be dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The enum name sounds too generic. It in fact describes the capabilities
of the process, thus add 'Process' to the name.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function now does not return an error so we can drop it fully.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function now does not return an error so we can drop it fully.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function basically does two very distinct things depending on a
bool. As a first step of conversion split out the case when @dynamic is
true and implement it as a new function and convert all callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add a helper that checks whether an entry with given name exists but
does not touch the userdata.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
ACKed-by: Eric Blake <eblake@redhat.com>
Add a simpler constructor for hash tables which specifically does not
require specifying the initial hash size and uses simpler freeing
function.
The initial hash table size usually is not important as the hash table
is growing when it reaches certain number of entries in one bucket.
Additionally many callers pass in a random small number for ad-hoc table
use so using a central one will simplify things.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
ACKed-by: Eric Blake <eblake@redhat.com>
Previous commit removed last use of this function so we can get rid of
it.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Introduce a simpler replacement for virDomainDiskByName when looking up
by disk target.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
In some places we need to check if a hostdev has VFIO backend.
Because of how complicated virDomainHostdevDef structure is, the
check consists of three lines. Move them to a function and
replace all checks with the function call.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
While the default iptables setup used by Fedora/RHEL distros
only restricts traffic on the INPUT and/or FORWARD rules,
some users might have custom firewalls that restrict the
OUTPUT rules too.
These can prevent DHCP/DNS/TFTP responses from dnsmasq
from reaching the guest VMs. We should thus whitelist
these protocols in the OUTPUT chain, as well as the
INPUT chain.
Signed-off-by: Malina Salina <malina.salina@protonmail.com>
Initial patch then modified to add unit tests and IPv6
support
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
These functions don't really abort() on OOM. The fix was merged
upstream, but not in the minimal version we require. Provide our
own implementation which can be removed once we bump the minimal
version.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Replace use of the gnulib base64 module with glib's own base64 API family.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
It is only used in virstoragefile.c
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
It is the only user. Rename it to match the local style
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Some of objects we manage can be autostarted on libvirtd startup
(e.g. domains, network, storage pools). The idea was that when
the host is started up these objects are started too without need
of user intervention. However, with the latest daemon split and
switch to socket activated, short lived daemons (we put --timeout
120 onto each daemon's command line) this doesn't do what we want
it to. The problem is not new though, we already had the session
daemon come and go and we circumvented this problem by
documenting it (see v4.10.0-92-g61b4e8aaf1). But now that we meet
the same problem at all fronts it's time to deal with it.
The solution implemented in this commit is to have a file (one
per each driver) that:
1) if doesn't exist, is created and autostart is allowed for
given driver,
2) if it does exist, then autostart is suppressed for given
driver.
All the files live in a location that doesn't survive host
reboots (/var/run/ for instance) and thus the file is
automatically not there on fresh host boot.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>