Up until now we only had virObjectLockable which uses mutexes for
mutually excluding each other in critical section. Well, this is
not enough. Future work will require RW locks so we might as well
have virObjectRWLockable which is introduced here.
Moreover, polymorphism is introduced to our code for the first
time. Yay! More specifically, virObjectLock will grab a write
lock, virObjectLockRead will grab a read lock then (what a
surprise right?). This has great advantage that an object can be
made derived from virObjectRWLockable in a single line and still
continue functioning properly (mutexes can be viewed as grabbing
write locks only). Then just those critical sections that can
grab a read lock need fixing. Therefore the resulting change is
going to be way smaller.
In order to avoid writer starvation, the object initializes RW
lock that prefers writers.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
We already have virRWLockInit. But this uses pthread defaults
which prefer reader to initialize the RW lock. This may lead to
writer starvation. Therefore we need to have the counterpart that
prefers writers. Now, according to the
pthread_rwlockattr_setkind_np() man page setting
PTHREAD_RWLOCK_PREFER_WRITER_NP attribute is no-op. Therefore we
need to use PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP
attribute. So much for good enum value names.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
It was observed while adding new property that there should be a space
before closing a curly brace in intel-iommu object property definition.
Fixing it as a separate patch.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Currently, @port is type of string. Well, that's overkill and
waste of memory. Port is always an integer. Use it as such.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Split out parsing of one host into a separate function and add a new
function to loop through all the host XML nodes.
This change removes multiple levels of nesting due to the old XML
parsing approach used.
So the way we format this huge virQEMUCaps enum is we group the
values in groups of five. And then at the beginning of each group
we have a small comment that says what's the number of the first
item in the group. Well, the last commit of 11b2ebf3e1 does not
follow this formatting.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Alter the virStoragePoolObjNumOfVolumes, virStoragePoolObjVolumeGetNames,
and virStoragePoolObjVolumeListExport APIs to take a virStoragePoolObjPtr
instead of the &obj->volumes and obj->def.
Signed-off-by: John Ferlan <jferlan@redhat.com>
A virStoragePoolObjPtr will be an 'obj'.
A virStoragePoolPtr will be a 'pool'.
A virStorageVolPtr will be a 'vol'.
A virStorageVolDefPtr will be a 'voldef'.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than have repetitive code - create/use a couple of helpers:
testStoragePoolObjFindActiveByName
testStoragePoolObjFindInactiveByName
This will also allow for the reduction of some cleanup path logic.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rework some of the test driver API's to remove the need to return
failure when testStoragePoolObjFindByName returns NULL rather than
going to cleanup. This removes the need for check for "if (obj)" and in
some instances the need to for a cleanup label and a local ret variable.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a path for UEFI VMs for AArch32 VMs, based on the path Debian is using.
libvirt is the de facto canonical location for defining where distros
should place these firmware images, so let's define this path here to try
and minimize distro fragmentation.
The call to qemuBuildDeviceAddressStr() happens no matter
what, so we can move it to the outer possible scope inside
the function.
We can also move the call to virBufferAsprintf() after all
the checks have been performed, where it makes more sense.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
The virDomainDeviceInfo struct is defined in device_conf,
so generic functions that operate on it should also be
defined there rather than in domain_conf.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
This patch addresses the same aspects on PPC the bug 1103314 addressed
on x86.
PCI expander bus creates multiple primary PCI busses, where each of these
busses can be assigned a specific NUMA affinity, which, on x86 is
advertised through ACPI on a per-bus basis.
For SPAPR, a PHB's NUMA affinities are assigned on a per-PHB basis, and
there is no mechanism for advertising NUMA affinities to a guest on a
per-bus basis. So, even if qemu-ppc manages to get some sort of multi-bus
topology working using PXB, there is no way to expose the affinities
of these busses to the guest. It can only be exposed on a per-PHB/per-domain
basis.
So patch enables NUMA node tag in pci-root controller on PPC.
The way to set the NUMA node is through the numa_node option of
spapr-pci-host-bridge device. However for the implicit PHB, the only way
to set the numa_node is from the -global option. The -global option applies
to all the PHBs unless explicitly specified with the option on the
respective PHB of CLI. The default PHB has the emulated devices only, so
the patch prevents setting the NUMA node for the default PHB.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The patch adds a capability for spapr-pci-host-bridge.numa_node.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Instead of going through two completely different code paths,
one of which repeats the same hardcoded bit of information
three times in rapid succession, depending on whether or not
a firmware list has been provided at configure time, just
provide a reasonable default value and remove the extra code.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
'numad' may return a nodeset which contains NUMA nodes without memory
for certain configurations. Since cgroups code will not be happy using
nodes without memory we need to store only numa nodes with memory in
autoNodeset.
On the other hand autoCpuset should contain cpus also for nodes which
do not have any memory.
A new function virNetDevOpenvswitchUpdateVlan has been created to instruct
OVS of the changes. qemuDomainChangeNet has been modified to handle the
update of the VLAN configuration for a running guest and rely on
virNetDevOpenvswitchUpdateVlan to do the actual update if needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Preparation for switching to virFileCache where there are two callbacks,
one to get a new data and second one to load a cached data.
This also removes virQEMUCapsReset which is no longer required.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
It's not required and following patches will change the code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Cleanups the code a little bit and reduces amount of arguments passed
throughout the functions.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
While searching for an element using a function it may be
desirable to know the element key for future operation.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
At present shared disks can be migrated with either readonly or cache=none. But
cache=directsync should be safe for migration, because both cache=directsync and cache=none
don't use the host page cache, and cache=direct write through qemu block layer cache.
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Reviewed-by: Wang Yechao <wang.yechao255@zte.com.cn>
Use virStorageSource accessors to check the file and call
virStorageFileAccess before even attempting to stat the target. This
will be helpful once we try to add network destinations for block copy,
since there will be no need to stat them.
When copying to a block device, the block device will already exist. To
allow users using a block device without any preparation, they need to
use the block copy without VIR_DOMAIN_BLOCK_COPY_REUSE_EXT.
This means that if the target is an existing block device we don't need
to prepare it, but we can't reject it as being existing.
To avoid breaking this feature, explicitly assume that existing block
devices will be reused even without that flag explicitly specified,
while skipping attempts to create it.
qemuMonitorDriveMirror still needs to honor the flag as specified by the
user, since qemu overwrites the metadata otherwise.
The refactor to split up storage driver into modules broke the apparmor
helper program, since that did not initialize the storage driver
properly and thus detection of the backing chain could not work.
Register the storage driver backends explicitly. Unfortunately it's now
necessary to link with the full storage driver to satisfy dependencies
of the loadable modules.
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reported-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
The purpose of this function is to tell if the current position
in given FD is in data section or a hole and how much bytes there
is remaining until the end of the section. This is achieved by
couple of lseeks(). The most important part is that we reposition
the FD back, so that the position is unchanged from the caller
POV. And until now the final lseek() back to the original
position was done with no check for errors. And I was convinced
that that's okay since nothing can go wrong. However, review
feedback from a related series persuaded me, that it's better to
be safe than sorry. Therefore, lets check if the final lseek()
succeeded and if it doesn't report an error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
All the pieces are now in place, so we can finally start
using isolation groups to achieve our initial goal, which is
separating hostdevs from emulated PCI devices while keeping
hostdevs that belong to the same host IOMMU group together.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1280542
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
These rules will make it possible for libvirt to
automatically assign PCI addresses in a way that
respects any isolation constraints devices might
have.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Isolation groups will eventually allow us to make sure certain
devices, eg. PCI hostdevs, are assigned to guest PCI buses in
a way that guarantees improved isolation, error detection and
recovery for machine types and hypervisors that support it,
eg. pSeries guest on QEMU.
This patch merely defines storage for the new information
we're going to need later on and makes sure it is passed from
the hypervisor driver (QEMU / bhyve) down to the generic PCI
address allocation code.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Now that we have a bit more control, let's convert our object into
a lockable object and let that magic handle the create and lock/unlock.
This also involves creating a virNodeDeviceEndAPI in order to handle
the object cleanup for API's that use the Add or Find API's in order
to get a locked/reffed object. The EndAPI will unlock and unref the
object returning NULL to indicate to the caller to not use the obj.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the structures to withing virnodedeviceobj.c
Move the typedefs from node_device_conf to virnodedeviceobj.h
Signed-off-by: John Ferlan <jferlan@redhat.com>
In an overall effort to privatize access to virNodeDeviceObj and
virNodeDeviceObjList into the virnodedeviceobj module, move the
object list parsing from node_device_driver and replace with a
call to a virnodedeviceobj helper. This follows other similar
APIs/helpers which peruse the object list looking for some specific
data in order to get/return an @device (virNodeDevice) object to
the caller.
Signed-off-by: John Ferlan <jferlan@redhat.com>
We're about to move the call to nodeDeviceSysfsGetSCSIHostCaps from
node_device_driver into virnodedeviceobj, so move the guts of the code
from the driver specific node_device_linux_sysfs into its own API
since virnodedeviceobj cannot callback into the driver.
Nothing in the code deals with sysfs anyway, as that's hidden by the
various virSCSIHost* and virVHBA* utility function calls.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create local @obj and @def for the API's rather than referencing the
devs->objs[i][->def->]. It'll make future patches easier to read.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Ensure that any function that walks the node device object list is prefixed
by virNodeDeviceObjList.
Also, modify the @filter param name for virNodeDeviceObjListExport to
be @aclfilter.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation to make things private, make the ->devs be pointers to a
virNodeDeviceObjList and then manage everything inside virnodedeviceobj
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create an allocator for the virNodeDeviceObjPtr - include setting up
the mutex, saving the virNodeDeviceDefPtr, and locking the return object.
Signed-off-by: John Ferlan <jferlan@redhat.com>
- In testDestroyVport rather than use a cleanup label, just return -1
immediately since nothing else is needed.
- In testStoragePoolDestroy, if !privpool, then just return -1 since
nothing else will happen anyway.
- Rather than "goto cleanup;" on failure to virNodeDeviceObjFindByName
an @obj, just return directly. This then allows the cleanup: label code
to not have to check "if (obj)" before calling virNodeDeviceObjUnlock.
This also simplifies some exit logic...
- In testNodeDeviceObjFindByName use an error: label to handle the failure
and don't do the ncaps++ within the VIR_STRDUP() source target index.
Only increment ncaps after success. Easier on eyes at error label too.
- In testNodeDeviceDestroy use "cleanup" rather than "out" for the goto
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than passing the object to be removed by reference, pass by value
and then let the caller decide whether or not the object should be free'd
and how to handle the logic afterwards. This includes free'ing the object
and/or setting the local variable to NULL to prevent subsequent unexpected
usage (via something like virNodeDeviceObjRemove in testNodeDeviceDestroy).
For now this function will just handle the remove of the object from the
list for which it was placed during virNodeDeviceObjAssignDef.
This essentially reverts logic from commit id '61148074' that free'd the
device entry on list, set *dev = NULL and returned. Thus fixing a bug in
node_device_hal.c/dev_refresh() which would never call dev_create(udi)
since @dev would have been set to NULL.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Recent refactors made it so that the function may use uninitialized
pointer, but it actually wanted to use a different variable and value
at all.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This reverts commit b3e71a8830.
As it turns out this ends up very badly as the @def could be Free'd
even though it's owned by @obj as a result of the AssignDef.
New API will be virNWFilterInstantiateFilterInternal as it's called from
the virNWFilterInstantiateFilter and virNWFilterUpdateInstantiateFilter.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rename to virNWFilterDoInstantiate to better describe the action.
Also fix the @vmuuid parameter to not have the ATTRIBUTE_UNUSED since it
is used in the call to virNWFilterDHCPSnoopReq.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When looking for slots suitable for a PCI device, libvirt
might need to add an extra PCI controller: for pSeries guests,
we want that extra controller to be a PHB (pci-root) rather
than a PCI bridge.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
PCI bus has to be numbered sequentially, and no index can be
missing, so libvirt will fill in the blanks automatically for
the user.
Up until now, it has done so using either pci-bridge, for machine
types based on legacy PCI, or pcie-root-port, for machine types
based on PCI Express. Neither choice is good for pSeries guests,
where PHBs (pci-root) should be used instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Now that the multi-phb support series is in, work on the TODO at
qemuDomainGetMemLockLimitBytes() to arrive at the correct memlock limit
value.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Additional PHBs (pci-root controllers) will be created for
the guest using the spapr-pci-host-bridge QEMU device, if
available; the implicit default PHB, while present in the
guest configuration, will be skipped.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1431193
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Usually, a controller with alias 'x' will create a bus with the
same name; however, the bus created by a PHBs with alias 'x' will
be named 'x.0' instead, so we need to account for that.
As an exception to the exception, the implicit PHB that's added
automatically to every pSeries guest creates the 'pci.0' bus.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
This new capability can be used to detect whether a QEMU
binary supports the spapr-pci-host-bridge controller.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
pSeries guests will soon need the new information; luckily,
we can figure it out automatically most of the time, so
users won't have to worry about it.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Adding it to the virDomainControllerPCIModelName enumeration
is enough for existing code to handle it, so parsing and
formatting will work without further tweaking.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
pSeries guests will soon be allowed to have multiple
PHBs (pci-root controllers), meaning the current check
on the controller index no longer applies to them.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
pSeries guests will soon be allowed to have multiple
PHBs (pci-root controllers), which of course means that
all but one of them will have a non-zero index; hence,
we'll need to relax the current check.
However, right now the check is performed in the conf
module, which is generic rather than tied to the QEMU
driver, and where we don't have information such as the
guest machine type available.
To make this change of behavior possible down the line,
we need to move the check from the XML parser to the
drivers. Luckily, only QEMU and bhyve are using PCI
controllers, so this doesn't result in much duplication.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Moving the check and rewriting it this way doesn't alter
the current behavior, but will allow us to special-case
pci-root down the line.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
We will soon need to be able to return a NULL pointer
without the caller considering that an error: to make
it possible, change the return type to int and use
an out parameter for the string instead.
Add some documentation for the function as well.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
The current algorithm for slot allocation tries to be clever
and avoid looking at buses / slots more than once unless it's
necessary. Unfortunately that makes the code more complex,
and it will cause problem later on in some situations unless
even more complex code is added.
Since the performance gains are going to be pretty modest
anyway, we can just get rid of the extra complexity and use a
completely straighforward implementation instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
We use hostdev->info frequently enough that having
a shorter name for it makes the code more readable.
We will also be adding even more uses later on.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
Move @function after @flags to match other functions in the
same module like virDomainPCIAddressReserveNextAddr().
Also move virDomainPCIAddressReserveNextAddr() closer to
virDomainPCIAddressReserveAddr() in the header file.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
This function was private to the QEMU driver and was,
accordingly, called qemuDomainPCIBusFullyReserved().
However the function is really not QEMU-specific at
all, so it makes sense to move it closer to the
virDomainPCIAddressBus struct it operates on.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>
The virDomainDeviceInfoIsSet() function no longer exists.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Laine Stump <laine@laine.org>