After v6.5.0-rc1~148 we started to rectify vCPU to guest NUMA
assignment - if there is a vCPU not assigned to any guest NUMA
node it is automatically assigned to node #0.
A month later I've made it possible to define guest NUMA nodes
without vCPUs (v6.6.0-rc1~250) - this is needed because of HMAT.
As a part of that I fixed all callers of
virDomainNumaGetNodeCpumask() (which returns a bitmap of vCPUs for
given node) to handle case when NULL is returned (i.e. no vCPUs
assigned to given node). But of course my patch was written
before aforementioned vCPU rectify patch but merged afterwards
and hence I missed the virDomainNumaFillCPUsInNode() caller.
And because we are dealing with a NULL pointer, of course this
leads to a crash. Just try to define a domain with at least two
NUMA nodes and no vCPU assignment to any of the nodes.
Fixes: a26f61ee0c
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1880289
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Allow using the function for creating temporary snapshot disk
definitions for creating <transient/> disk overlays.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Tested-by: Ján Tomko <jtomko@redhat.com>
Allow to match with CCW addresses in addition to PCI addresses
(and MAC addresses).
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Sebastian Mitterle <smitterl@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
It's no longer needed and is valid only after virDomainSnapshotAlignDisks
is called while holding the lock.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Last step of the algorithm in virDomainSnapshotAlignDisks is to extend
the array of disks to all VM's disk and provide defaults. This was done
by extending the array, adding defaults at the end and then sorting it.
This requires the 'idx' variable and also a separate sorting function.
If we store the pointer to existing snapshot disk definitions in a hash
table and create a new array of snapshot disk definitions, we can fill
the new array directly by either copying the definition from the old
array or adding the default.
This avoids the sorting step and thus even the need to store the index
of the domain disk altogether.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Remove the use of the 'disk_snapshot' temporary variable since accessing
the disk definition now isn't that much longer to write and use explicit
value checks instead of the (non-)zero check to make it more obvious
what the code is doing.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The converted string is used exactly once so we can call the conversion
without storing the result in a variable.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Extract the disk def to a local variable so that it's more obvious
what's happening and it will also allow further simplification.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There are multiple places accessing the domain definition. Extract it to
a local variable so that it's more clear what's happening.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'disk' variable usually refers to a definition of a disk from the
domain definition. Rename it to 'snapdisk' to be clear that we are
talking about the snapshot disk definition especially since this
function also accesses the domain disk definition.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
While this function resides in the snapshot config module, the 'def'
variable is referencing the VM definition in most places. Change the
name to 'snapdef' to avoid ambiguity especially since we are also
dealing with the domain definition in this function.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use automatic pointer for the bitmap and get rid of the 'cleanup' label
and 'ret' variable.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add an abort() on the class/object allocation failures so that
virStorageSourceNew() always returns a virStorageSource and remove
checks from all callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The alignment for the pSeries NVDIMM does not depend on runtime
constraints. This means that it can be done in device parse
time, instead of runtime, allowing the domain XML to reflect
what the auto-alignment would do when the domain starts.
This brings consistency between the NVDIMM size reported by the
domain XML and what the guest sees, without impacting existing
guests that are using an unaligned size - they'll work as usual,
but the domain XML will be updated with the actual size of the
NVDIMM.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
We'll use the auto-alignment function during parse time, in
domain_conf.c. Let's move the function to that file, renaming
it to virDomainNVDimmAlignSizePseries(). This will also make it
clearer that, although QEMU is the only driver that currently
supports it, pSeries NVDIMM restrictions aren't tied to QEMU.
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Introduce 'isa' controller type. In domain XML it looks this way:
...
<controller type='isa' index='0'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01'
function='0x0'/>
</controller>
...
Currently, this is needed for the bhyve driver to allow choosing a
specific PCI address for that. In bhyve, this controller is used to
attach serial ports and a boot ROM.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The backend for the SCSI host device is a storage source. While the
definition doesn't look like that it's converted to a storage source
when the VM is running.
Add the storage source to the definition object and also parse/format
its private data which will be used for internal state storage while
the VM is running.
Note that the virStorageSourcePtr may not be allocated all the time so
the private data parser allocates it if there is any private data
present.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use XPath instead of iterating through the nodes.
Few error messages were modified so that the parser can be written in a
simpler way.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use XPath to get the host list instead of iterating through the nodes.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There's just one caller for the function. Move the code into the caller.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Allow vfio-ccw mdev devices to be created besides vfio-pci mdev devices
as well.
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Bjoern Walk <bwalk@linux.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Allow to filter for CSS devices.
Reviewed-by: Bjoern Walk <bwalk@linux.ibm.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Make channel subsystem (CSS) devices available in the node_device driver.
The CCS devices reside in the computer system and provide CCW devices, e.g.:
+- css_0_0_003a
|
+- ccw_0_0_1a2b
|
+- scsi_host0
|
+- scsi_target0_0_0
|
+- scsi_0_0_0_0
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Bjoern Walk <bwalk@linux.ibm.com>
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
g_regex_unref reports an error if called with a NULL argument.
We have two cases in the code where we (possibly) call it on a NULL
argument. The interesting one is in virDomainQemuMonitorEventCleanup.
Based on VIR_CONNECT_DOMAIN_QEMU_MONITOR_EVENT_REGISTER_REGEX, we unref
data->regex, which has two problems:
* On the client side, flags is -1 so the comparison is true even if no
regex was used, reproducible by:
$ virsh qemu-monitor-event --timeout 1
which results in an ugly error:
(process:1289846): GLib-CRITICAL **: 14:58:42.631: g_regex_unref: assertion 'regex != NULL' failed
* On the server side, we only create the regex if both the flag and the
string are present, so it's possible to trigger this message by:
$ virsh qemu-monitor-event --regex --timeout 1
Use a non-NULL comparison instead of the flag to decide whether we need
to unref the regex. And add a non-NULL check to the unref in the
VirtualBox test too.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: 71efb59a4dhttps://bugzilla.redhat.com/show_bug.cgi?id=1876907
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Currently, we are mixing: #if HAVE_BLAH with #if WITH_BLAH.
Things got way better with Pavel's work on meson, but apparently,
mixing these two lead to confusing and easy to miss bugs (see
31fb929eca for instance). While we were forced to use HAVE_
prefix with autotools, we are free to chose our own prefix with
meson and since WITH_ prefix appears to be more popular let's use
it everywhere.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
By default Xen only allows guests to write "known safe" values into PCI
configuration space, yet many devices require writes to other areas of
the configuration space in order to operate properly. To allow writing
any values Xen supports the 'permissive' setting, see xl.cfg(5) man page.
This change models Xen's permissive setting by adding a writeFiltering
attribute on the <source> element of a PCI hostdev. When writeFiltering
is set to 'no', the Xen permissive setting will be enabled and guests
will be able to write any values into the device's configuration space.
The permissive setting remains disabled in the absense of the
writeFiltering attribute, of if it is explicitly set to 'yes'.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Use https: links for websites that support them.
The URIs which are used as namespace identifiers
are left alone.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
Since the macro no longer includes the 'ignore_value'
statement, stop putting another empty statement after it.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Split those initializations that depend on a statement
above them.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Allow to map sound playback and recording devices to host devices
using "<audio type='oss'/>" OSS audio backend.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Introduce a new device element "<audio>" which allows
to map guest sound device specified using the "<sound>"
element to specific audio backend.
Example:
<sound model='ich7'>
<audio id='1'/>
</sound>
<audio id='1' type='oss'>
<input dev='/dev/dsp0'/>
<output dev='/dev/dsp0'/>
</audio>
This block maps to OSS audio backend on the host using
/dev/dsp0 device for both input (recording)
and output (playback).
OSS is the only backend supported so far.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add 'ich7' sound model. This is a preparation for sound support in
bhyve, as 'ich7' is the only model it supports.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Back when macvtap support was added in commit 315baab944 in Feb. 2010
(libvirt-0.7.7), it was setup to autogenerate a name for the device if
one wasn't supplied, in the pattern "macvtap%d" (or "macvlan%d"),
similar to the way an unspecified standard tap device name will lead
to an autogenerated "vnet%d".
As a matter of fact, in commit ca1b7cc8e4 added in May 2010, the code
was changed to *always* ignore a supplied device name for macvtap
interfaces by deleting *any* name immediately during the <interface>
parsing (this was intended to prevent one domain which had failed to
completely start from deleting the macvtap device of another domain
which had subsequently been provided the same device name (this will
seem mildly ironic later). This was later fixed to only clear the
device name when inactive XML was being parsed. HOWEVER - this was
only done if the xml was <interface type='direct'> - autogenerated
names were not cleared for <interface type='network'> (which could
also result in a macvtap device).
Although the names of "vnetX" tap devices had always been
automatically cleared when parsing <interface> (see commit d1304583d
from July 2008 (!)), at the time macvtap support was added, both vnetX
and macvtapX device names were always included when formatting the
XML.
Then in commit a8be259d0c (July 2011, libvirt-0.9.4), <interface>
formatting was changed to also clear out "vnetX" device names during
XML formatting as well. However the same treatment wasn't given to
"macvtapX".
Now in 2020, there has been a report that a failed migration leads to
the macvtap device of some other unrelated guest on the destination
host losing its network connectivity. It was determined that this was
due to the domain XML in the migration containing a macvtap device
name, e.g. "macvtap0", that was already in use by the other guest on
the destination. Normally this wouldn't be a problem, because libvirt
would see that the device was already in use, and then find a
different unused name. But in this case, other external problems were
causing the migration to fail prior to selecting a macvtap device and
successfully opening it, and during error recovery, qemuProcessStop()
was called, which went through all def->nets objects and (if they were
macvtap) deleted the device specified in net->ifname; since libvirt
hadn't gotten to the point of replacing the incoming "macvtap0" with
the name of a device it actually created for this guest, that meant
that "macvtap0" was deleted, *even though it was currently in use by a
different guest*!
Whew!
So, it turns out that when formatting "migratable" XML, "vnetX"
devices are omitted, just as when formatting "inactive" XML. By making
the code in both interface parsing and formatting consistent for
"vnetX", "macvtapX", and "macvlanX", we can thus make sure that the
autogenerated (and unneeded / completely *not* wanted) macvtap device
name will not be sent with the migration XML. This way when a
migration fails, net->ifname will be NULL, and libvirt won't have any
device to try and (erroneously) delete.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
When adding support for HMAT, in f0611fe883 I've introduced a
check which aims to validate /domain/cpu/numa/interconnects. As a
part of that, there is a loop which checks whether all <latency/>
with @cache attribute refer to an existing cache level. For
instance:
<cpu mode='host-model' check='partial'>
<numa>
<cell id='0' cpus='0-5' memory='512000' unit='KiB' discard='yes'>
<cache level='1' associativity='direct' policy='writeback'>
<size value='8' unit='KiB'/>
<line value='5' unit='B'/>
</cache>
</cell>
<interconnects>
<latency initiator='0' target='0' cache='1' type='access' value='5'/>
<bandwidth initiator='0' target='0' type='access' value='204800' unit='KiB'/>
</interconnects>
</numa>
</cpu>
This XML defines that accessing L1 cache of node #0 from node #0
has latency of 5ns.
However, the loop was not written properly. Well, the check in
it, as it was always checking for the first cache in the target
node and not the rest. Therefore, the following example errors
out:
<cpu mode='host-model' check='partial'>
<numa>
<cell id='0' cpus='0-5' memory='512000' unit='KiB' discard='yes'>
<cache level='3' associativity='direct' policy='writeback'>
<size value='10' unit='KiB'/>
<line value='8' unit='B'/>
</cache>
<cache level='1' associativity='direct' policy='writeback'>
<size value='8' unit='KiB'/>
<line value='5' unit='B'/>
</cache>
</cell>
<interconnects>
<latency initiator='0' target='0' cache='1' type='access' value='5'/>
<bandwidth initiator='0' target='0' type='access' value='204800' unit='KiB'/>
</interconnects>
</numa>
</cpu>
This errors out even though it is a valid configuration. The L1
cache under node #0 is still present.
Fixes: f0611fe883
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Commit <2020c6af8a8e4bb04acb629d089142be984484c8> fixed an issue with
QEMU driver by reporting offline CPUs as well. However, doing so it
introduced a regression into libxl and test drivers by completely
ignoring the passed `hostcpus` variable.
Move the virHostCPUGetAvailableCPUsBitmap() out of the helper into QEMU
driver so it will not affect other drivers which gets the number of host
CPUs differently.
This was uncovered by running libvirt-dbus test suite which counts on
the fact that test driver has hard-coded host definition and must not
depend on the host at all.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
and stop erroneously equating NULL with "". The latter means that the
element has empty content, while the former means there was an error
during parsing (either internal with the parser, or the content of the
XML was bad).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virDomainBlkioDeviceParseXML() calls xmlNodeGetContent() multiple
times in a loop, but can easily be refactored to call it once for all
element nodes, and then use the result of that one call in each of the
(mutually exclusive) blocks that previously each had their own call to
xmlNodeGetContent.
This is being done in order to reduce the number of changes needed in
an upcoming patch that will eliminate the lack of checking for NULL on
return from xmlNodeGetContent().
As part of the simplification, the while() loop has been changed into
a for() so that we can use "continue" without bypassing the
"node = node->next".
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We already allow controlling the initiator IQN for iSCSI based disks.
Add the same for host devices.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
These variables are only used for assignment and have
no other effect.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>