Since the APIs support just one element per namespace and while
modifying an element all duplicates would be removed, let's do this
right away in the post parse callback.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1190590
Since adding the support for scheduler policy settings in commit
8680ea97, there are two enums with the same information. That was
caused by rewriting the patch since the first draft.
Found thanks to clang, but there was no impact whatsoever.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1142631
This patch resolves a situation where the same "<target dev='$name'...>"
can be used for multiple disks in the domain.
While the $name is "mostly" advisory regarding the expected order that
the disk is added to the domain and not guaranteed to map to the device
name in the guest OS, it still should be unique enough such that other
domblk* type operations can be performed.
Without the patch, the domblklist will list the same Target twice:
$ virsh domblklist $dom
Target Source
------------------------------------------------
sda /var/lib/libvirt/images/file.qcow2
sda /var/lib/libvirt/images/file.img
Additionally, getting domblkstat, domblkerror, domblkinfo, and other block*
type calls will not be able to reference the second target.
Fortunately, hotplug disallows adding a "third" sda value:
$ qemu-img create -f raw /var/lib/libvirt/images/file2.img 10M
$ virsh attach-disk $dom /var/lib/libvirt/images/file2.img sda
error: Failed to attach disk
error: operation failed: target sda already exists
$
BUT, since 'sdb' doesn't exist, one would get the following on the same
hotplug attempt when using 'sdb' instead of 'sda':
$ virsh attach-disk $dom /var/lib/libvirt/images/file2.img sdb
error: Failed to attach disk
error: internal error: unable to execute QEMU command 'device_add': Duplicate ID 'scsi0-0-1' for device
$
Since we cannot fix this issue at parsing time, the best that can be done so
as to not "lose" a domain is to make the check prior to starting the guest
with the results as follows:
$ virsh start $dom
error: Failed to start domain $dom
error: XML error: target 'sda' duplicated for disk sources '/var/lib/libvirt/images/file.qcow2' and '/var/lib/libvirt/images/file.img'
$
Running 'make check' found a few more instances in the tests where this
duplicated target dev value was being used. These also exhibited some
duplicated 'id=' values (negating the uniqueness argument of aliases) in
the corresponding .args file and of course the *xmlout version of a few
input XML files.
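A minimal sketch of the kind of pre-start uniqueness check described above. The struct and function names are hypothetical stand-ins, not libvirt's actual types; the real check works on the full domain definition.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a disk definition. */
struct disk {
    const char *dst;   /* value of <target dev='...'> */
    const char *src;   /* source path, used only for the error message */
};

/*
 * Return the index of the first disk whose target duplicates an earlier
 * disk's target, or -1 if all targets are unique. If first is non-NULL,
 * it receives the index of the earlier disk with the same target.
 */
static int
find_duplicate_target(const struct disk *disks, size_t ndisks, size_t *first)
{
    size_t i, j;

    for (i = 1; i < ndisks; i++) {
        for (j = 0; j < i; j++) {
            if (strcmp(disks[i].dst, disks[j].dst) == 0) {
                if (first)
                    *first = j;
                return (int)i;
            }
        }
    }
    return -1;
}
```

A caller would refuse to start the domain when the result is not -1 and report both source paths, as in the error shown above.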
Commit 6992994 started filling the listen attribute
of the parent <graphics> elements from type='network' listens.
When this XML is passed to UpdateDevice, parsing fails:
XML error: graphics listen attribute 10.20.30.40 must match
address attribute of first listen element (found none)
Ignore the address in the parent <graphics> attribute
when no type='address' listens are found,
the same way we ignore the address for the <listen> subelements
when parsing inactive XML.
At least Xen supports backend drivers in another domain (aka "driver
domain"). This patch introduces an XML config option for specifying the
backend domain name for <disk> and <interface> devices. E.g.
<disk>
<backenddomain name='diskvm'/>
...
</disk>
<interface type='bridge'>
<backenddomain name='netvm'/>
...
</interface>
In the future, the same option will be needed for USB devices (hostdev
objects), but for now libxl doesn't have support for PVUSB.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Later patches will need to access the full definition to check the
memory size and thus the checking needs to be done after the whole
definition including devices is known.
For historical reasons data regarding NUMA configuration were split
between the CPU definition and numatune. We cannot do anything about the
XML still being split, but we certainly can at least store the relevant
data in one place.
This patch moves the NUMA stuff to the right place.
Since our formatter now handles well if the config is allocated and not
filled we can safely always-allocate the NUMA config and remove the
ad-hoc allocation code.
This will help in later patches as the parser will be refactored to just
fill the data.
Move the existing virDomainDefNew to virDomainDefNewFull as it's setting
a few things in the conf and re-introduce virDomainDefNew as a function
without parameters for common use.
It's easier to recalculate the number in the one place it's used than
to have a separate variable to track it. It will also help with moving
the NUMA code to the separate module.
For weird historical reasons NUMA cells are added as a subelement of
<cpu> while the actual configuration is done in <numatune>.
This patch splits out the cell parser code from cpu config to NUMA
config. Note that the changes to the code are minimal just to make it
work and the function will be refactored in the next patch.
Not all files we want to find using virFileFindResource{,Full} are
generated when libvirt is built; some of them (such as RNG schemas) are
distributed with sources. The current API was not able to find source
files if libvirt was built in VPATH.
Both RNG schemas and cpu_map.xml are distributed in source tarball.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Add an XML attribute to allow disabling merge of rx buffers
on the host:
<interface ...>
...
<model type='virtio'/>
<driver ...>
<host mrg_rxbuf='off'/>
</driver>
</interface>
https://bugzilla.redhat.com/show_bug.cgi?id=1186886
In order for QEMU vCPU (and other) threads to run with RT scheduler,
libvirt needs to take care of that so QEMU doesn't have to run privileged.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1178986
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Prior to commit 7d5bf484747 (first appearing in libvirt 1.2.2), the
status XML of a domain's interface was missing a lot of important
information; mainly it just output the config of the interface, plus
the name of the tap device and qemu device alias. Commit 7d5bf484747
changed the status XML to include many important bits of information
that were required to make network "hook" scripts useful - bandwidth
information, vlan tag, the name of the bridge (or physical device in
the case of macvtap) that the tap/macvtap device was attached to - the
commit log for 7d5bf484747 has a very detailed explanation of the
change. For quick reference - in the example given there, prior to the
change, status XML looked like figure [C]:
<interface type='network'>
<source network='testnet' portgroup='admin'/>
<target dev='macvtap0'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
and after the change, it looked like figure [E]:
<interface type='direct'>
<source dev='p4p1_0' mode='bridge'/>
<bandwidth>
<inbound average='1000' peak='5000' burst='1024'/>
<outbound average='128' peak='256' burst='256'/>
</bandwidth>
<target dev='macvtap0'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
You'll notice that bandwidth info, physdev, and macvtap mode have been
added, but the network and portgroup names are now missing - I didn't
think that this information was of any use once the needed
bandwidth/vlan/etc config had been pulled from the network/portgroup.
I was wrong.
A few months after that change a user on IRC asked what happened to
portgroup in the status XML and described how he used it (more or less
as a tag to decide what external information to use in a hook script
that was run at startup/migration time - see
http://wiki.libvirt.org/page/OVS_and_PVLANS ). At that time I planned
to make a patch to re-add portgroup, but life intervened as that was
just prior to a transatlantic move involving several weeks of
"vacation". During this time I somehow forgot to make the patch, and
also mistakenly remembered that I *had* made it.
Subsequent to this, as a part of mprivozn's work to add support for
network-specific hooks, I did re-add the output of the network name in
status XML, but once again completely forgot about portgroup. This was
in commit a3609121 (first appearing in libvirt 1.2.11). This made the
status XML from the above example look like this:
<interface type='direct'>
<source network='testnet' dev='p4p1_0' mode='bridge'/>
<bandwidth>
<inbound average='1000' peak='5000' burst='1024'/>
<outbound average='128' peak='256' burst='256'/>
</bandwidth>
<target dev='macvtap0'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
*This* patch just adds the portgroup back to the status XML, so the
same example interface will look like this:
<interface type='direct'>
<source network='testnet' portgroup='admin'
dev='p4p1_0' mode='bridge'/>
<bandwidth>
<inbound average='1000' peak='5000' burst='1024'/>
<outbound average='128' peak='256' burst='256'/>
</bandwidth>
<target dev='macvtap0'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
The result is that the status XML now contains all information about
how the interface is setup (bandwidth, physical device, tap device,
etc), in addition to pointers to its origin (the network and
portgroup).
virDomainGraphicsListenSetAddress() and
virDomainGraphicsListenSetNetwork() both set their respective char* to
NULL directly when asked to set it to NULL, which is okay as long as
it's already set to NULL. If these functions are ever called to clear
a listen object that has a valid string in address or network, it will
end up leaking the old value. Currently that doesn't happen, so this
is just a preemptive strike.
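A minimal sketch of a setter that is safe for both setting and clearing, assuming a hypothetical simplified listen struct. It uses plain malloc()/free() rather than libvirt's allocation helpers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified listen object. */
struct listen_def {
    char *address;
};

/*
 * Set or clear the address. Passing NULL clears it. The old value is
 * always freed, so clearing a populated object does not leak.
 * Returns 0 on success, -1 on allocation failure (old value is kept).
 */
static int
listen_set_address(struct listen_def *l, const char *address)
{
    char *copy = NULL;

    if (address) {
        size_t len = strlen(address) + 1;

        copy = malloc(len);
        if (!copy)
            return -1;
        memcpy(copy, address, len);
    }

    free(l->address);   /* free(NULL) is a no-op */
    l->address = copy;
    return 0;
}
```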
Prior to 0.9.4, libvirt only supported a single listen, and it had to
be an IP address:
<graphics listen='1.2.3.4' ..../>
Starting with 0.9.4, a graphics element could have a <listen>
subelement (actually the grammar supports multiples, but all of the
drivers only support a single <listen> per <graphics>), and that
listen element can be of type='address' or type='network'. For
type='address', <listen> also has an attribute called 'address' which
contains the IP address for listening:
<graphics ....>
<listen type='address' address='1.2.3.4' .../>
</graphics>
type can also be "network", and in that case listen will have a
"network" attribute which will contain the name of a libvirt
network:
<graphics ....>
<listen type='network' network='testnet' .../>
</graphics>
At domain start (or migrate) time, libvirt will attempt to
find an IP address associated with that network (e.g. the IP address
of the bridge device used by the network, or the physical device
listed in <forward dev='physdev'/>) and fill in that address in the
status XML:
<graphics ....>
<listen type='network' network='testnet' address='1.2.3.4' .../>
</graphics>
In the case that a <graphics> element has a <listen> subelement of
type='address', that listen subelement's "address" attribute is
backfilled into the parent graphics element's "listen" *attribute* for
backward compatibility (so that a management application unaware of
the separate <listen> element can still learn the listen
address). This backfill should be done with the IP learned from
type='network' as well, and that's what this patch does:
<graphics listen='1.2.3.4' ....>
<listen type='network' network='testnet' address='1.2.3.4' .../>
</graphics>
This is a continuation of the fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=1191016
The function virDomainVcpuPinDel() used vcpupin_list as an alias for
def->cputune.vcpupin, which made the code more readable.
However, the function later reallocs vcpupin_list.
By the definition of realloc(), it may free the old block and return a
newly allocated address. vcpupin_list is updated to the new address, but
def->cputune.vcpupin still points to the old block (which has been freed).
Thus,
1) When we refer to def->cputune.vcpupin afterwards, which was freed
by realloc(), an INVALID READ occurs, and libvirtd may crash.
2) As no one uses vcpupin_list any more, and no one frees it (it was just
allocated by realloc()), a memory leak occurs.
Part of the valgrind log is shown below:
==1837== Thread 15:
==1837== Invalid read of size 8
==1837== at 0x5367337: virDomainDefFormatInternal (domain_conf.c:18392)
which is : virBufferAsprintf(buf, "<vcpupin vcpu='%u' ",
def->cputune.vcpupin[i]->vcpuid);
==1837== by 0x536966C: virDomainObjFormat (domain_conf.c:18970)
==1837== by 0x5369743: virDomainSaveStatus (domain_conf.c:19166)
==1837== by 0x117B26DC: qemuDomainPinVcpuFlags (qemu_driver.c:4586)
==1837== by 0x53EA313: virDomainPinVcpuFlags (libvirt.c:9803)
==1837== by 0x14CB7D: remoteDispatchDomainPinVcpuFlags (remote_dispatch.h:6762)
==1837== by 0x14CC81: remoteDispatchDomainPinVcpuFlagsHelper (remote_dispatch.h:6740)
==1837== by 0x5464C30: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==1837== by 0x546507A: virNetServerProgramDispatch (virnetserverprogram.c:307)
==1837== by 0x171B83: virNetServerProcessMsg (virnetserver.c:172)
==1837== by 0x171E6E: virNetServerHandleJob (virnetserver.c:193)
==1837== by 0x5318E78: virThreadPoolWorker (virthreadpool.c:145)
==1837== Address 0x12ea2870 is 0 bytes inside a block of size 16 free'd
==1837== at 0x4C291AC: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==1837== by 0x52A3D14: virReallocN (viralloc.c:245)
==1837== by 0x52A3DFB: virShrinkN (viralloc.c:372)
==1837== by 0x52A3F57: virDeleteElementsN (viralloc.c:503)
==1837== by 0x533939E: virDomainVcpuPinDel (domain_conf.c:15405) // only reached when doReset is true
==1837== by 0x117B2642: qemuDomainPinVcpuFlags (qemu_driver.c:4573)
==1837== by 0x53EA313: virDomainPinVcpuFlags (libvirt.c:9803)
==1837== by 0x14CB7D: remoteDispatchDomainPinVcpuFlags (remote_dispatch.h:6762)
==1837== by 0x14CC81: remoteDispatchDomainPinVcpuFlagsHelper (remote_dispatch.h:6740)
==1837== by 0x5464C30: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==1837== by 0x546507A: virNetServerProgramDispatch (virnetserverprogram.c:307)
==1837== by 0x171B83: virNetServerProcessMsg (virnetserver.c:172)
Steps to reproduce the problem:
1) use virDomainPinVcpuFlags() to pin a guest's vcpu to all the pcpus
of the host.
This patch performs the realloc() on def->cputune.vcpupin directly instead
of on vcpupin_list, avoiding both the invalid read and the memory leak.
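A minimal sketch of the safe pattern, assuming a hypothetical simplified cputune struct holding plain ints. The point is that the result of realloc() is stored back into the owning field, never only into a local alias.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for def->cputune. */
struct cputune {
    int *vcpupin;      /* array of pinned vcpu ids */
    size_t nvcpupin;
};

/* Delete the element at idx and shrink the array through the owner. */
static int
pin_delete(struct cputune *tune, size_t idx)
{
    int *tmp;

    if (idx >= tune->nvcpupin)
        return -1;

    memmove(tune->vcpupin + idx, tune->vcpupin + idx + 1,
            (tune->nvcpupin - idx - 1) * sizeof(*tune->vcpupin));
    tune->nvcpupin--;

    if (tune->nvcpupin == 0) {
        free(tune->vcpupin);
        tune->vcpupin = NULL;
        return 0;
    }

    tmp = realloc(tune->vcpupin, tune->nvcpupin * sizeof(*tune->vcpupin));
    if (tmp)
        tune->vcpupin = tmp;   /* the owner always tracks the live block */
    /* If shrinking failed, the old (larger) block is still valid. */
    return 0;
}
```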
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Yue Wenyuan <yuewenyuan@huawei.com@huawei.com>
The helpers will be useful when implementing hotplug and coldplug of
random number generator devices.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
When adding devices to the definition it's useful to check that the
new device doesn't reside on a conflicting address. This patch adds a
helper that iterates over all device infos and compares their addresses
with the given info.
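A minimal sketch of such a conflict check, restricted to PCI addresses for brevity. The struct and function names are hypothetical and are not libvirt's actual device info API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified PCI address; not libvirt's real device info. */
struct pci_addr {
    unsigned int domain;
    unsigned int bus;
    unsigned int slot;
    unsigned int function;
};

static bool
pci_addr_equal(const struct pci_addr *a, const struct pci_addr *b)
{
    return a->domain == b->domain && a->bus == b->bus &&
           a->slot == b->slot && a->function == b->function;
}

/* Return true if candidate collides with any address already in use. */
static bool
pci_addr_in_use(const struct pci_addr *used, size_t nused,
                const struct pci_addr *candidate)
{
    size_t i;

    for (i = 0; i < nused; i++) {
        if (pci_addr_equal(&used[i], candidate))
            return true;
    }
    return false;
}
```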
It is only supported for virtio adapters.
Silently drop it if it was specified for other models,
as is done for other virtio attributes.
Also mention this in the documentation.
https://bugzilla.redhat.com/show_bug.cgi?id=1147195
Add the missing jump to the error label. The error message shouldn't
ever be triggered though as it's called only on pre-selected nodes.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1170492
In one of our previous commits (dc8b7ce7) we made a functional
change even though it was intended as a pure refactor. The problem is
that the following XML:
<vcpu placement='static' current='2'>6</vcpu>
<cputune>
<emulatorpin cpuset='1-3'/>
</cputune>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
gets translated into this one:
<vcpu placement='auto' current='2'>6</vcpu>
<cputune>
<emulatorpin cpuset='1-3'/>
</cputune>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
We should not change the vcpu placement mode. Moreover, we were doing
something similar in the case of emulatorpin and iothreadpin. If they
were set, but vcpu placement was auto, we mistakenly removed them from
the domain XML even though we are able to set them independently of
vcpus.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Do the allocation first, then add the actual device.
The second part should never fail. This is good
for live hotplug where we don't want to remove the device
on OOM after the monitor command succeeded.
The only change in behavior is that on failure, the
vmdef->consoles array is freed, not just the first console.
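A minimal sketch of the "allocate first, insert later" split, using a hypothetical list of ints in place of character devices. The names are illustrative and not the functions this patch adds.

```c
#include <stdlib.h>

/* Hypothetical, simplified device list. */
struct chr_list {
    int *items;
    size_t n;     /* elements in use */
    size_t cap;   /* elements allocated */
};

/* Step 1: make room for one more element. This is the only step that can fail. */
static int
chr_prealloc(struct chr_list *l)
{
    int *tmp;

    if (l->n < l->cap)
        return 0;

    tmp = realloc(l->items, (l->cap + 1) * sizeof(*tmp));
    if (!tmp)
        return -1;

    l->items = tmp;
    l->cap++;
    return 0;
}

/* Step 2: store the element. Cannot fail after a successful chr_prealloc(). */
static void
chr_insert_prealloced(struct chr_list *l, int item)
{
    l->items[l->n++] = item;
}
```

In a hotplug path, step 1 would run before the monitor command and step 2 after it succeeds, so an OOM can never force the removal of a device the hypervisor already accepted.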
Currently when launching the LXC controller we first write out
the plain, inactive XML configuration, then launch the controller,
then replace the file with the live status XML configuration.
By good fortune this hasn't caused any problems other than some
misleading error messages during failure scenarios.
This simplifies the code so it only writes out the XML once and
always writes the live status XML. To do this we need to handshake
with the child process, to make execution pause just before exec()
so we can write the XML status with the child PID present.
Previously the function returned either -1 in case of an error or 0 on
success. However, we should also distinguish between the case where we
successfully added a controller and the case where there was no need to
add any controller.
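A minimal sketch of a three-way return convention that makes this distinction, assuming -1 for error, 0 for "nothing to add" and 1 for "added". The concrete values and the fixed-size list are assumptions for illustration only.

```c
#include <stddef.h>

#define MAX_CTRLS 8

/* Hypothetical, simplified controller list keyed by index. */
struct ctrl_list {
    int indexes[MAX_CTRLS];
    size_t n;
};

/*
 * Returns -1 on error, 0 if a controller with this index already exists
 * (nothing was added), 1 if a new controller was added.
 */
static int
maybe_add_controller(struct ctrl_list *list, int idx)
{
    size_t i;

    if (idx < 0)
        return -1;

    for (i = 0; i < list->n; i++) {
        if (list->indexes[i] == idx)
            return 0;
    }

    if (list->n == MAX_CTRLS)
        return -1;

    list->indexes[list->n++] = idx;
    return 1;
}
```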
Ploop is a pseudo device which makes it possible to access
an image in a file as a block device. It is like a loop device,
but with additional features such as snapshots and a write tracker,
and without double-caching.
It is used in PCS for containers and in OpenVZ. You can manage
ploop devices and images with the ploop utility
(http://git.openvz.org/?p=ploop).
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Commit id 'aa2cc721' added calls to virSocketAddrFormat but did not
check for a NULL (error) return, which could lead to bad output
in the XML file. Check for a NULL return and fail in that case.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The virDomainDefineXMLFlags and virDomainCreateXML APIs both
gain new flags allowing them to be told to validate XML.
This updates all the drivers to turn on validation in the
XML parser when the flags are set.
There's this function virNetDevBandwidthParse which parses the
bandwidth XML snippet, but it's not very clever. For the
following XML it allocates the virNetDevBandwidth structure even
though it's completely empty:
<bandwidth>
</bandwidth>
Later in the code there are some places where we check whether
bandwidth was set. Since we obtained a non-NULL pointer from the
parsing function, we think that it was set when in fact it wasn't.
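A minimal sketch of the intended behavior, assuming a hypothetical constructor that receives the already-parsed inbound and outbound rates. When both are absent it succeeds but hands back NULL, so later "is bandwidth set?" checks give the right answer.

```c
#include <stdlib.h>

/* Hypothetical, simplified types; not the real virNetDevBandwidth layout. */
struct rate {
    unsigned long long average;
};

struct bandwidth {
    struct rate *in;
    struct rate *out;
};

/*
 * Build a bandwidth object from optional inbound/outbound rates.
 * Returns 0 on success, -1 on allocation failure. On success *bw is
 * NULL when neither rate is present (an empty <bandwidth/> element).
 */
static int
bandwidth_new(struct rate *in, struct rate *out, struct bandwidth **bw)
{
    *bw = NULL;

    if (!in && !out)
        return 0;   /* nothing configured: do not allocate */

    *bw = calloc(1, sizeof(**bw));
    if (!*bw)
        return -1;

    (*bw)->in = in;
    (*bw)->out = out;
    return 0;
}
```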
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The virDomainDefParse* and virDomainDefFormat* methods both
accept the VIR_DOMAIN_XML_* flags defined in the public API,
along with a set of other VIR_DOMAIN_XML_INTERNAL_* flags
defined in domain_conf.c.
This is seriously confusing & error prone for a number of
reasons:
- VIR_DOMAIN_XML_SECURE, VIR_DOMAIN_XML_MIGRATABLE and
VIR_DOMAIN_XML_UPDATE_CPU are only relevant for the
formatting operation
- Some of the VIR_DOMAIN_XML_INTERNAL_* flags only apply
to parse or to format, but not both.
This patch cleanly separates out the flags. There are now two
distinct sets of flags, VIR_DOMAIN_DEF_PARSE_* and VIR_DOMAIN_DEF_FORMAT_*,
that are used by the corresponding methods. The
VIR_DOMAIN_XML_* flags received via public API calls must
be converted to the VIR_DOMAIN_DEF_FORMAT_* flags where
needed.
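A minimal sketch of such a conversion step. The enum names below are hypothetical stand-ins for the public VIR_DOMAIN_XML_* and internal VIR_DOMAIN_DEF_FORMAT_* flags, and the internal bit values are chosen arbitrarily to show that the two sets need not share a numbering.

```c
/* Stand-ins for the public API flags. */
enum pub_xml_flags {
    PUB_XML_SECURE     = 1 << 0,
    PUB_XML_INACTIVE   = 1 << 1,
    PUB_XML_UPDATE_CPU = 1 << 2,
    PUB_XML_MIGRATABLE = 1 << 3,
};

/* Stand-ins for the internal format flags; values are illustrative only. */
enum def_format_flags {
    DEF_FORMAT_STATUS     = 1 << 0,   /* internal only, no public equivalent */
    DEF_FORMAT_SECURE     = 1 << 1,
    DEF_FORMAT_INACTIVE   = 1 << 2,
    DEF_FORMAT_UPDATE_CPU = 1 << 3,
    DEF_FORMAT_MIGRATABLE = 1 << 4,
};

/* Translate public flags into internal format flags, one bit at a time. */
static unsigned int
format_flags_from_public(unsigned int pub)
{
    unsigned int fmt = 0;

    if (pub & PUB_XML_SECURE)
        fmt |= DEF_FORMAT_SECURE;
    if (pub & PUB_XML_INACTIVE)
        fmt |= DEF_FORMAT_INACTIVE;
    if (pub & PUB_XML_UPDATE_CPU)
        fmt |= DEF_FORMAT_UPDATE_CPU;
    if (pub & PUB_XML_MIGRATABLE)
        fmt |= DEF_FORMAT_MIGRATABLE;

    return fmt;
}
```

Because internal-only bits such as the hypothetical DEF_FORMAT_STATUS have no public counterpart, they can never be set by a caller of the public API.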
The various calls to virDomainDefParse which hardcoded the
use of the VIR_DOMAIN_XML_INACTIVE flag change to use the
VIR_DOMAIN_DEF_PARSE_INACTIVE flag.
The virCPUDefFormat* methods were relying on the VIR_DOMAIN_XML_*
flag definitions. It is not desirable for low level internal
functions to be coupled to flags for the public API, since they
may need to be called from several different contexts where the
flags would not be appropriate.
https://bugzilla.redhat.com/show_bug.cgi?id=1181408
When we try to hotplug a channel chr device with no target,
virDomainChrDefParseXML succeeds (although it should fail),
because it uses goto cleanup at this point and returns an incomplete
definition (with no target). In qemuDomainAttachChrDevice,
we add it to the domain definition, but fail to remove it from
there when chardev-add fails, because virDomainChrRemove
matches chardevices according to the target name.
The device definition is then freed in qemuDomainAttachDeviceFlags,
leaving a stale pointer in the domain definition.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1179684
The way that we currently generate the <driver/> for <controller/> is
just madness:
<controller type='scsi' index='0' model='virtio-scsi'>
<driver queues='12'/>
<driver cmd_per_lun='123'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</controller>
It's obvious that we should be aiming at the following:
<controller type='scsi' index='0' model='virtio-scsi'>
<driver queues='12' cmd_per_lun='123'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</controller>
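A minimal sketch of collecting all driver attributes first and emitting a single <driver/> element. It formats into a caller-supplied buffer with snprintf(), not libvirt's buffer API; the function name is hypothetical and a value of 0 is assumed to mean "not set".

```c
#include <stdio.h>
#include <string.h>

/*
 * Format one <driver .../> element holding every attribute that is set.
 * Writes an empty string when no attribute is set.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
format_driver(char *buf, size_t len,
              unsigned int queues, unsigned int cmd_per_lun)
{
    char attrs[64] = "";
    int n = 0;

    if (len == 0)
        return -1;

    /* Collect the attributes; 64 bytes is enough for two 32-bit values. */
    if (queues)
        n += snprintf(attrs + n, sizeof(attrs) - n,
                      " queues='%u'", queues);
    if (cmd_per_lun)
        n += snprintf(attrs + n, sizeof(attrs) - n,
                      " cmd_per_lun='%u'", cmd_per_lun);

    if (n == 0) {
        buf[0] = '\0';
        return 0;
    }

    n = snprintf(buf, len, "<driver%s/>", attrs);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```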
Signed-off-by: Luyao Huang <lhuang@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1177194
When migrating a VM, we generate XML via qemuDomainDefFormatLive and
pass it to the target libvirtd. Libvirt uses the current network
state in def->data.network.actual to generate the XML. This makes
migration fail when a guest interface of type network uses a macvtap
network as its source and the VM is migrated to another host which has
different macvtap network settings (different interface name, bridge name...).
Add a flag check in virDomainNetDefFormat: if the VIR_DOMAIN_XML_MIGRATABLE
flag is set when calling virDomainNetDefFormat, the current interface
state is not included in the output.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>