Back when macvtap support was added in commit 315baab944 in Feb. 2010
(libvirt-0.7.7), it was setup to autogenerate a name for the device if
one wasn't supplied, in the pattern "macvtap%d" (or "macvlan%d"),
similar to the way an unspecified standard tap device name will lead
to an autogenerated "vnet%d".
As a matter of fact, in commit ca1b7cc8e4 added in May 2010, the code
was changed to *always* ignore a supplied device name for macvtap
interfaces by deleting *any* name immediately during the <interface>
parsing (this was intended to prevent one domain which had failed to
completely start from deleting the macvtap device of another domain
which had subsequently been provided the same device name (this will
seem mildly ironic later). This was later fixed to only clear the
device name when inactive XML was being parsed. HOWEVER - this was
only done if the xml was <interface type='direct'> - autogenerated
names were not cleared for <interface type='network'> (which could
also result in a macvtap device).
Although the names of "vnetX" tap devices had always been
automatically cleared when parsing <interface> (see commit d1304583d
from July 2008 (!)), at the time macvtap support was added, both vnetX
and macvtapX device names were always included when formatting the
XML.
Then in commit a8be259d0c (July 2011, libvirt-0.9.4), <interface>
formatting was changed to also clear out "vnetX" device names during
XML formatting as well. However the same treatment wasn't given to
"macvtapX".
Now in 2020, there has been a report that a failed migration leads to
the macvtap device of some other unrelated guest on the destination
host losing its network connectivity. It was determined that this was
due to the domain XML in the migration containing a macvtap device
name, e.g. "macvtap0", that was already in use by the other guest on
the destination. Normally this wouldn't be a problem, because libvirt
would see that the device was already in use, and then find a
different unused name. But in this case, other external problems were
causing the migration to fail prior to selecting a macvtap device and
successfully opening it, and during error recovery, qemuProcessStop()
was called, which went through all def->nets objects and (if they were
macvtap) deleted the device specified in net->ifname; since libvirt
hadn't gotten to the point of replacing the incoming "macvtap0" with
the name of a device it actually created for this guest, that meant
that "macvtap0" was deleted, *even though it was currently in use by a
different guest*!
Whew!
So, it turns out that when formatting "migratable" XML, "vnetX"
devices are omitted, just as when formatting "inactive" XML. By making
the code in both interface parsing and formatting consistent for
"vnetX", "macvtapX", and "macvlanX", we can thus make sure that the
autogenerated (and unneeded / completely *not* wanted) macvtap device
name will not be sent with the migration XML. This way when a
migration fails, net->ifname will be NULL, and libvirt won't have any
device to try and (erroneously) delete.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
When adding support for HMAT, in f0611fe883 I've introduced a
check which aims to validate /domain/cpu/numa/interconnects. As a
part of that, there is a loop which checks whether all <latency/>
with @cache attribute refer to an existing cache level. For
instance:
<cpu mode='host-model' check='partial'>
<numa>
<cell id='0' cpus='0-5' memory='512000' unit='KiB' discard='yes'>
<cache level='1' associativity='direct' policy='writeback'>
<size value='8' unit='KiB'/>
<line value='5' unit='B'/>
</cache>
</cell>
<interconnects>
<latency initiator='0' target='0' cache='1' type='access' value='5'/>
<bandwidth initiator='0' target='0' type='access' value='204800' unit='KiB'/>
</interconnects>
</numa>
</cpu>
This XML defines that accessing L1 cache of node #0 from node #0
has latency of 5ns.
However, the loop was not written properly. Well, the check in
it, as it was always checking for the first cache in the target
node and not the rest. Therefore, the following example errors
out:
<cpu mode='host-model' check='partial'>
<numa>
<cell id='0' cpus='0-5' memory='512000' unit='KiB' discard='yes'>
<cache level='3' associativity='direct' policy='writeback'>
<size value='10' unit='KiB'/>
<line value='8' unit='B'/>
</cache>
<cache level='1' associativity='direct' policy='writeback'>
<size value='8' unit='KiB'/>
<line value='5' unit='B'/>
</cache>
</cell>
<interconnects>
<latency initiator='0' target='0' cache='1' type='access' value='5'/>
<bandwidth initiator='0' target='0' type='access' value='204800' unit='KiB'/>
</interconnects>
</numa>
</cpu>
This errors out even though it is a valid configuration. The L1
cache under node #0 is still present.
Fixes: f0611fe883
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Commit <2020c6af8a8e4bb04acb629d089142be984484c8> fixed an issue with
QEMU driver by reporting offline CPUs as well. However, doing so it
introduced a regression into libxl and test drivers by completely
ignoring the passed `hostcpus` variable.
Move the virHostCPUGetAvailableCPUsBitmap() out of the helper into QEMU
driver so it will not affect other drivers which gets the number of host
CPUs differently.
This was uncovered by running libvirt-dbus test suite which counts on
the fact that test driver has hard-coded host definition and must not
depend on the host at all.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
and stop erroneously equating NULL with "". The latter means that the
element has empty content, while the former means there was an error
during parsing (either internal with the parser, or the content of the
XML was bad).
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virDomainBlkioDeviceParseXML() calls xmlNodeGetContent() multiple
times in a loop, but can easily be refactored to call it once for all
element nodes, and then use the result of that one call in each of the
(mutually exclusive) blocks that previously each had their own call to
xmlNodeGetContent.
This is being done in order to reduce the number of changes needed in
an upcoming patch that will eliminate the lack of checking for NULL on
return from xmlNodeGetContent().
As part of the simplification, the while() loop has been changed into
a for() so that we can use "continue" without bypassing the
"node = node->next".
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We already allow controlling the initiator IQN for iSCSI based disks.
Add the same for host devices.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
These variables are only used for assignment and have
no other effect.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Both accept a NULL value gracefully and virStringFreeList
does not zero the pointer afterwards, so a straight replace
is safe.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Also remove the temporary variable - even virStringListCopy
aborts on OOM now.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The shmem 'name' specifies the shared memory path in '/dev/shm/',
however, we may need to change it to avoid filename conflict
when VM migrate to other host. This patch remove shmem name
consistency check.
Signed-off-by: Wang Xin <wangxinxin.wang@huawei.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Role(master or peer) controls how the domain behaves on migration.
For more details about migration with ivshmem, see
https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/system/ivshmem.rst;hb=HEAD
It's a optional attribute in libvirt, and qemu will choose default
role for ivshmem device if the user is not specified.
With device property 'role', the value can be 'master' or 'peer'.
- 'master' (means 'master=on' in qemu), the guest will copy
the shared memory on migration to the destination host.
- 'peer' (means 'master=off' in qemu), the migration is disabled.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Yang Hang <yanghang44@huawei.com>
Signed-off-by: Wang Xin <wangxinxin.wang@huawei.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
When trying to parse an XML with overlapping iothread scheduler
settings, the error message was rather confusing:
error: iothreadssched attributes 'vcpus' must not overlap
Pass the correct element name.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Pass the scheduler element name instead of trying to reconstructing
it from the attribute name.
This has the benefit of not mixing '%s' with regular text in
translatable strings as well as preventing the confusion when
the 's' marking the plural in the element name ('vcpus') is taken
as a first letter of the 'sched' suffix.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Fixes: 7ea55a481d
Fixes: 99c5fe0e7c
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
virDomainThreadSchedParseHelper is used for parsing both iothread
and vcpu scheduling settings. Rename its 'name' attribute to
make it obvious this refers to the attribute name, not the name of
the element (which is currently constructed from the attribute name).
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
virPCIDeviceAddressPtr 'addr' is forgotten to be freed in the branch
'VIR_APPEND_ELEMENT() < 0'. Use g_autoptr instead.
Signed-off-by: Hao Wang <wanghao232@huawei.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The storage pool code now attempts to disable COW by default on btrfs,
but management applications may wish to override this behaviour. Thus we
introduce a concept of storage pool features:
<features>
<cow state='yes|no'/>
</features>
If the <cow> feature policy is set, it will be enforced. It will always
return an hard error if COW cannot be explicitly set or unset.
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This is only used in the ESX driver where, when set to "no", it will
ignore all the checks libvirt does about the origin of the MAC address
(whether or not it's in a VMWare OUI) and forward the original one to
the ESX server telling it not to check it either.
This allows keeping a deterministic MAC address which can be useful for
licensed software which might dislike changes.
Signed-off-by: Bastien Orivel <bastien.orivel@diateam.net>
VMX conversion parts rewritten to apply on top of previously merged
support for type='generated|static'
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Remove superfluous breaks, as there is a "return" before them.
Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn>
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Conver the code to the new approach which uses XPath to fetch known
elements rather than looping through all XML children.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Use a switch statement which will not be omitted when adding potential
new types.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Similarly to previous commit split out formatting of the mdev subsystem
related stuff.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Similarly to previous commit split out formatting of the vHBA subsystem
related stuff.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Split up formatting of the '<address>' element rather that trying to
optimize it with formatting string hacks.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Similarly to previous commit split out formatting of the SCSI subsystem
related stuff.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Similarly to previous commit split out formatting of the PCI subsystem
related stuff.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Separate out bits related to USB so that the logic isn't entangled in
multiple conditional statements.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Refactor the formatter to the new multiple buffer based approach so that
we can easily separate it into formatters per subsys type.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We store the config of an iSCSI hostdev in a virStorageSource structure.
Parse the private data portion.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
iSCSI subsystem hostdevs store the data as a virStorageSource. Format
the private data part of the virStorageSource in the appropriate place.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is only used in the ESX driver where, when set to "static", it will
ignore all the checks libvirt does about the origin of the MAC address
(whether or not it's in a VMWare OUI) and forward the original one to
the ESX server telling it not to check it either.
This allows keeping a deterministic MAC address which can be useful for
licensed software which might dislike changes.
Signed-off-by: Bastien Orivel <bastien.orivel@diateam.net>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
All modified functions are similar, in all cases "cleanup" label is removed,
along with all the "goto" calls.
Signed-off-by: Nicolas Brignone <nmbrignone@gmail.com>
Reviewed-by: Laine Stump <laine@redhat.com>
There are several calls to virBufferFreeAndReset() when functions
encounter an error, but the caller never uses the virBuffer once an
error has been encountered (all callers detect error by looking at the
function return value, not the contents of the virBuffer being
operated on), and now that all virBuffers are auto-freed there is no
reason for the lower level functions like these to spend time freeing
a buffer that is guaranteed to be freed momentarily anyway.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Every other caller of this function checks for an error return and
ends their formatting early if there is an error. This function
happily continues on its way.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The output of vcpupin and emulatorpin for a domain with vcpu
placement='static' is based on a default bitmap that contains
all possible CPUs in the host, regardless of the CPUs being offline
or not. E.g. for a Linux host with this CPU setup (from lscpu):
On-line CPU(s) list: 0,8,16,24,32,40,(...),184
Off-line CPU(s) list: 1-7,9-15,17-23,25-31,(...),185-191
And a domain with this configuration:
<vcpu placement='static'>1</vcpu>
'virsh vcpupin' will return the following:
$ sudo ./run tools/virsh vcpupin vcpupin_test
VCPU CPU Affinity
----------------------
0 0-191
This is benign by its own, but can make the user believe that all
CPUs from the 0-191 range are eligible for pinning. Which can lead
to situations like this:
$ sudo ./run tools/virsh vcpupin vcpupin_test 0 1
error: Invalid value '1' for 'cpuset.cpus': Invalid argument
This is exarcebated by the fact that 'virsh vcpuinfo' considers only
available host CPUs in the 'CPU Affinity' field:
$ sudo ./run tools/virsh vcpuinfo vcpupin_test
(...)
CPU Affinity: y-------y-------y-------(...)
This patch changes the default bitmap of vcpupin and emulatorpin, in
the case of domains with static vcpu placement, to all available CPUs
instead of all possible CPUs. Aside from making it consistent with
the behavior of 'vcpuinfo', users will now have one less incentive to
try to pin a vcpu in an offline CPU.
https://bugzilla.redhat.com/show_bug.cgi?id=1434276
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If the name is 'vcpus', we will get 'vcpussched' instead of 'vcpusched'
in the error message as following:
... 19155 : vcpussched attributes 'vcpus' must not overlap
So we use 'vcpu' to replace 'vcpus'.
Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
These APIs will be used by QEMU driver when building the command
line.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
There are several restrictions, for instance @initiator and
@target have to refer to existing NUMA nodes (daa), @cache has to
refer to a defined cache level and so on.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
To cite ACPI specification:
Heterogeneous Memory Attribute Table describes the memory
attributes, such as memory side cache attributes and bandwidth
and latency details, related to the System Physical Address
(SPA) Memory Ranges. The software is expected to use this
information as hint for optimization.
According to our upstream discussion [1] this is exposed under
<numa/> as <cache/> under NUMA <cell/> and <latency> or
<bandwidth/> under numa/latencies.
1: https://www.redhat.com/archives/libvir-list/2020-January/msg00422.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
QEMU allows creating NUMA nodes that have memory only.
These are somehow important for HMAT.
With check done in qemuValidateDomainDef() for QEMU 2.7 or newer
(checked via QEMU_CAPS_NUMA), we can be sure that the vCPUs are
fully assigned to NUMA nodes in domain XML.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
There is only one caller of virDomainNumaSetNodeCpumask() which
checks for the return value but because the function will return
NULL iff the @cpumask was NULL in the first place. But in that
place @cpumask can't be NULL because it was just allocated by
virBitmapParse().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
There are two functions virDomainNumaDefCPUFormatXML() and
virDomainNumaDefCPUParseXML() which format and parse domain's
<numa/>. There is nothing CPU specific about them. Drop the
infix.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
There is nothing domain specific about the function, thus it
should not have virDomain prefix. Also, the fact that it is a
static function makes it impossible to use from other files.
Move the function to virxml.c and drop the 'Domain' infix.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The semantics of the backup operation don't strictly require that all
disks being backed up are part of the same incremental part (when a disk
was checkpointed/backed up separately or in a different VM), or even
they may not have a previous checkpoint at all (e.g. when the disk
was freshly hotplugged to the vm).
In such cases we can still create a common checkpoint for all of them
and backup differences according to configuration.
This patch adds a per-disk configuration of the checkpoint to do the
incremental backup from via the 'incremental' attribute and allows
perform full backups via the 'backupmode' attribute.
Note that no changes to the qemu driver are necessary to take advantage
of this as we already obey the per-disk 'incremental' field.
https://bugzilla.redhat.com/show_bug.cgi?id=1829829
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>