Format the mediated devices on the qemu command line as
-device vfio-pci,sysfsdev='/path/to/device/in/sysfs'.
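As a rough sketch of the formatting step (formatMdevDeviceArg is a hypothetical helper, not the function added by this patch, and the sysfs path is supplied by the caller):

```c
#include <stdio.h>

/* Hypothetical helper: writes the value of the -device argument for a
 * mediated device into buf. Returns the number of characters written,
 * or -1 if the result would not fit into buf. */
static int
formatMdevDeviceArg(char *buf, size_t buflen, const char *sysfsPath)
{
    int n = snprintf(buf, buflen, "vfio-pci,sysfsdev=%s", sysfsPath);

    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```

The caller would then pass "-device" and the formatted string as two separate arguments to qemu.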
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since mdevs are just another type of VFIO device, we should increase
the memory locking limit the same way we do for VFIO PCI devices.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
As with all the other hostdev device types, grant the qemu process
access to /dev/vfio/<mediated_device_iommu_group>.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Keep track of the assigned mediated devices the same way we do it for
the rest of hostdevs. Methods like 'Prepare', 'Update', and 'ReAttach'
are introduced by this patch.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
So far, the official support is for x86_64 arch guests so unless a
different device API than vfio-pci is available let's only turn on
support for PCI address assignment. Once a different device API is
introduced, we can enable another address type easily.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This merely introduces the virDomainHostdevMatchSubsysMediatedDev method,
which checks whether the device being cold-plugged already exists in the
domain configuration.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This patch updates all of our security drivers to start labeling the
VFIO IOMMU devices under /dev/vfio/ as well.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
A mediated device will be identified by the UUID of the user's
pre-created mediated device, with 'model' now being a mandatory
<hostdev> attribute that represents the mediated device API. We also
need to make sure that if the user explicitly provides a guest address
for an mdev device, the address type matches the device API supported by
that specific mediated device; otherwise we error out with an invalid
XML error.
The resulting device XML:
<devices>
  <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
    <source>
      <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
    </source>
  </hostdev>
</devices>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Besides creation, disposal, getter, and setter methods, the module exports
methods to work with lists of mediated devices.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Just to make the code a bit cleaner, move hostdev specific post parse
code to its own function just in case it grows in the future.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Just a tiny wrapper over the SCSI def clearing logic to drop some
if-else branches from a switch, mainly because extending the switch in
the future would render the current code with branching less readable.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Enforce virDomainHostdevSubsysType checking during compilation. Again,
one of a few spots in our code where we should enforce the typecast to
the enum type, thus not forgetting to update *all* switch occurrences
dealing with the given enum.
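A minimal sketch of the pattern, using made-up names rather than the real virDomainHostdevSubsysType values: the value is stored as an int, so it is cast to the enum type in the switch, and there is no default label, which lets the compiler's switch warning point at any enum value that is not handled.

```c
#include <stddef.h>

/* Made-up enum for illustration only. */
typedef enum {
    DEMO_HOSTDEV_SUBSYS_USB,
    DEMO_HOSTDEV_SUBSYS_PCI,
    DEMO_HOSTDEV_SUBSYS_SCSI,
    DEMO_HOSTDEV_SUBSYS_SCSI_HOST,
    DEMO_HOSTDEV_SUBSYS_MDEV,
    DEMO_HOSTDEV_SUBSYS_LAST
} DemoHostdevSubsysType;

static const char *
demoHostdevSubsysName(int type)
{
    /* The cast makes the compiler check the cases against the enum.
     * Leaving out 'default' means a newly added enum value triggers a
     * warning here until it gets its own case. */
    switch ((DemoHostdevSubsysType) type) {
    case DEMO_HOSTDEV_SUBSYS_USB:
        return "usb";
    case DEMO_HOSTDEV_SUBSYS_PCI:
        return "pci";
    case DEMO_HOSTDEV_SUBSYS_SCSI:
        return "scsi";
    case DEMO_HOSTDEV_SUBSYS_SCSI_HOST:
        return "scsi_host";
    case DEMO_HOSTDEV_SUBSYS_MDEV:
        return "mdev";
    case DEMO_HOSTDEV_SUBSYS_LAST:
        break;
    }

    /* Out-of-range values and the LAST sentinel end up here. */
    return NULL;
}
```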
Signed-off-by: Erik Skultety <eskultet@redhat.com>
We keep forgetting that older setups don't like 'index':
CC util/libvirt_util_la-virsysinfo.lo
cc1: warnings being treated as errors
util/virstoragefile.c: In function 'virStorageSourceFindByNodeName':
util/virstoragefile.c:3804: error: declaration of 'index' shadows a global declaration [-Wshadow]
/usr/include/string.h:489: error: shadowed declaration is here [-Wshadow]
Signed-off-by: Eric Blake <eblake@redhat.com>
Instead of generating all of the capabilities, let's test more of our
code by probing sysfs data. This test needs quite some mocking for
now, but it paves the road for more future enhancements (hugepages
probing, for example).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
All mocked functions are related to numactl/virNuma and rely only on
virsysfs, so the paths they touch can be nicely controlled. And
because it is so nicely self-contained NUMA mock, it is named
numamock (instead of naming it after the test that will use it first).
We need top level API mock because some APIs might call libnuma
directly, e.g. virNumaIsAvailable(), virNumaGetMaxNode().
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
A bit more test data, this time with complete info copied, mainly with
cache information, so we can easily add tests for it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This way more drivers can utilize the functionality without copying
the code. And we can therefore test it in one place for all of them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
That file has only two exported functions and each one of them has
different naming. virNode is what all the other files use, so let's
use it. It wasn't used before because of the clash with public API
naming, so let's fix that by shortening the name (there is no other
private variant of it anyway).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is no "node driver" as there was before, drivers have to do
their own ACL checking anyway, so they all specify their functions and
nodeinfo is basically just extending conf/capabilities. Hence moving
the code to src/conf/ is the right way to go.
Also that way we can de-duplicate some code that is in virsysfs and/or
virhostcpu that got duplicated during the virhostcpu.c split. Some
cleanup is also done throughout the changes, like adding the vir*
prefix etc.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is no reason for it not to be in the utils, all global symbols
under that file already have prefix vir* and there is no reason for it
to be part of DRIVER_SOURCES because that is just a leftover from
older days (pre-driver modules era, I believe).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
While on that, drop support for kernels from RHEL-5 era (missing
cpu/present file). Also add some useful functions and export them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
By using this we are able to easily switch the sysfs path being
used (fake it). This will not only help tests in the future but can
be also used from files where the code is duplicated currently.
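Roughly, the idea looks like the sketch below. The names (demoSysfsSetSystemPath, demoSysfsBuildCpuPath) are illustrative assumptions and do not claim to match the functions this patch exports:

```c
#include <stdio.h>

#define DEMO_SYSFS_SYSTEM_PATH "/sys/devices/system"

/* Prefix used by every path builder; tests can point it elsewhere. */
static const char *demoSysfsSystemPath = DEMO_SYSFS_SYSTEM_PATH;

/* Passing NULL restores the real sysfs location. */
static void
demoSysfsSetSystemPath(const char *path)
{
    demoSysfsSystemPath = path ? path : DEMO_SYSFS_SYSTEM_PATH;
}

/* Builds "<prefix>/cpu/cpu<N>/<file>". Returns the length of the path,
 * or -1 if it does not fit into buf. */
static int
demoSysfsBuildCpuPath(char *buf, size_t buflen,
                      unsigned int cpu, const char *file)
{
    int n = snprintf(buf, buflen, "%s/cpu/cpu%u/%s",
                     demoSysfsSystemPath, cpu, file);

    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```

A test would call the setter with a directory of canned data before exercising the code, and reset it with NULL afterwards.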
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The functionality these tests partially relied on (scanning the cpu
directory for cpu[0-9]+ subdirectories) is going to be removed, so we
need additional files that are present on all non-medieval systems.
Removing all these tests would be an option but we would lose the
ability to test the topologies, even though we just extract the number
of sockets/cores/threads from all these directory trees.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
These helpers just do a read and convert the value, but they
properly size the read limit, handle additional whitespace characters,
and unify error reporting.
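A simplified sketch of what such a helper can look like. The function names and the fixed 32-byte buffer are assumptions made for this example, not the helpers added here:

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Converts a decimal string to int, tolerating surrounding whitespace
 * such as the trailing newline found in sysfs value files. */
static int
demoParseSysfsInt(const char *str, int *value)
{
    char *end = NULL;
    long val;

    errno = 0;
    val = strtol(str, &end, 10);
    if (errno != 0 || end == str || val < INT_MIN || val > INT_MAX)
        return -1;

    while (isspace((unsigned char) *end))
        end++;
    if (*end != '\0')
        return -1;

    *value = (int) val;
    return 0;
}

/* Reads a single small value file. The buffer is sized so that any int
 * plus a newline fits; nothing more is ever read. */
static int
demoReadSysfsInt(const char *path, int *value)
{
    char buf[32];
    FILE *fp = fopen(path, "r");
    int ret = -1;

    if (!fp)
        return -1;
    if (fgets(buf, sizeof(buf), fp))
        ret = demoParseSysfsInt(buf, value);
    fclose(fp);
    return ret;
}
```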
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commits eaf18f4c2b and 86dd9fac0f separated util/host{cpu,mem}
stuff from nodeinfo, but did not adjust the syms file.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
It is everywhere else. I even remember one of our scripts failing if
the newline is missing, but it doesn't happen currently.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Don't leak guest if adding it to virCapabilities fails. Also return
NULL and not pointer to free'd object with zero references in such
case.
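The shape of the fix, sketched with made-up types (DemoCaps and DemoGuest are stand-ins, and the fixed-size array only exists to simulate a failing add):

```c
#include <stdlib.h>

#define DEMO_MAX_GUESTS 1

typedef struct {
    int ostype;
} DemoGuest;

typedef struct {
    DemoGuest *guests[DEMO_MAX_GUESTS];
    size_t nguests;
} DemoCaps;

static DemoGuest *
demoCapsAddGuest(DemoCaps *caps, int ostype)
{
    DemoGuest *guest = calloc(1, sizeof(*guest));

    if (!guest)
        return NULL;
    guest->ostype = ostype;

    if (caps->nguests == DEMO_MAX_GUESTS) {
        /* Adding failed: the caller never owned the object, so free it
         * here and return NULL instead of the dangling pointer. */
        free(guest);
        return NULL;
    }

    caps->guests[caps->nguests++] = guest;
    return guest;
}
```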
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Guests are handled in callers, but if something goes wrong (when it
cannot be added to virCapabilities, for example), there's no way for
them to free it properly.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Both QEMU and bhyve are using the same function for setting up the CPU
in virCapabilities, so de-duplicate it, save code and time, and help
other drivers adopt it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
STREQ_NULLABLE returns true if both parameters are NULL. And that's
not what we want here. We just want to skip comparing source nodes
that don't have that info set. The function wouldn't make much sense
with nodeName == NULL, so we don't need to check that. Moreover, the
function's declaration uses ATTRIBUTE_NONNULL for nodeName, which not
only means that function expects the parameter not to be NULL, but
actually tells the compiler that it can optimize out the NULL checks.
That way it could end up calling strcmp on NULL (either nodeformat or
nodebacking). GCC figures this out if libvirt is compiled with
lv_cv_static_analysis=yes, unfortunately not everyone uses that.
Caused by cbc6d53513.
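To illustrate the difference, here is a sketch with simplified stand-ins for the macros and a made-up DemoSource struct; it is not the actual code of virStorageSourceFindByNodeName:

```c
#include <stdbool.h>
#include <string.h>

/* Simplified stand-ins for the libvirt macros. */
#define STREQ(a, b) (strcmp(a, b) == 0)
#define STREQ_NULLABLE(a, b) \
    (((a) == NULL && (b) == NULL) || ((a) && (b) && STREQ(a, b)))

typedef struct {
    const char *nodeformat;   /* may be NULL when not detected */
    const char *nodebacking;  /* may be NULL when not detected */
} DemoSource;

/* nodeName is assumed to be non-NULL, so only the fields that may be
 * unset are checked before calling strcmp. */
static bool
demoSourceMatchesNode(const DemoSource *src, const char *nodeName)
{
    return (src->nodeformat && STREQ(src->nodeformat, nodeName)) ||
           (src->nodebacking && STREQ(src->nodebacking, nodeName));
}
```

With STREQ_NULLABLE, a source with both fields unset would compare equal to a NULL name; with the explicit checks such a source is simply skipped.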
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Management tools may want to check whether the threshold is still set if
they missed an event. Add the data to the bulk stats API where they can
also query the current backing size at the same time.
To allow updating stats based on the node name, add a helper function
that will fetch the required data from 'query-named-block-nodes' and
return it in hash table for easy lookup.
Detect the node names when setting a block threshold, when reconnecting,
and after they are cleared when a block job finishes. This operation will
become a no-op once we fully support node names.
To allow matching the node names gathered via 'query-named-block-nodes'
we need to query and then use the top level nodes from 'query-block'.
Add the data to the structure returned by qemuMonitorGetBlockInfo.
oVirt uses relative names with directories in them. Test such a
configuration. Also test a snapshot done with _REUSE_EXTERNAL and a
relative backing file pre-specified in the qcow2 metadata.
Since we have to match the images by filename, a common backing image
will break the detection process. Add a test case to verify that the
code correctly stops the detection process in that case.
For some time now, qemu has set node names automatically for the block
nodes. This patch adds code that attempts a best-effort detection of the
node names for the backing chain from the output of
'query-named-block-nodes'. The only drawback is that the data provided
by qemu needs to be matched by the filename as seen by qemu, and thus
if two disks share a single backing store file the detection won't work.
This will allow us to use qemu commands such as
'block-set-write-threshold' which only accepts node names.
This patch adds only the detection code; it will be used later.
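The filename-matching limitation can be pictured with the heavily simplified sketch below. DemoNamedNode and demoFindNodeNameByFilename are hypothetical and much simpler than the real detection code, which walks the whole backing chain:

```c
#include <stddef.h>
#include <string.h>

/* One entry as it might be extracted from 'query-named-block-nodes'. */
typedef struct {
    const char *nodeName;
    const char *filename;
} DemoNamedNode;

/* Returns the node name whose filename matches, or NULL when there is
 * no match or when the match is ambiguous because the same file shows
 * up more than once (e.g. two disks sharing a backing file). */
static const char *
demoFindNodeNameByFilename(const DemoNamedNode *nodes, size_t nnodes,
                           const char *filename)
{
    const char *found = NULL;
    size_t i;

    for (i = 0; i < nnodes; i++) {
        if (strcmp(nodes[i].filename, filename) != 0)
            continue;
        if (found)
            return NULL;  /* second hit: give up rather than guess */
        found = nodes[i].nodeName;
    }

    return found;
}
```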
Add monitor tooling for calling query-named-block-nodes. The monitor
code returns the data as the raw JSON array received from qemu.
Unfortunately the logic to extract the node names for a complete backing
chain will be so complex that I won't be able to extract any meaningful
subset of the data in the monitor code.