Commit 8ba0a58 introduced a compiler warning that I hit during
a run of ./autobuild.sh:
../../src/nodeinfo.c: In function 'nodeCapsInitNUMA':
../../src/nodeinfo.c:1853:43: error: 'nsiblings' may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (virCapabilitiesAddHostNUMACell(caps, n, memory,
^
Sure enough, nsiblings starts uninitialized, and is set by a call
to virNodeCapsGetSiblingInfo, but that function fails to assign
through the pointer if virNumaGetDistances fails.
* src/nodeinfo.c (nodeCapsInitNUMA): Initialize nsiblings.
Signed-off-by: Eric Blake <eblake@redhat.com>
Jim Fehlig reported a regression found by libvirt-TCK tests:
> ~ # perl /usr/share/libvirt-tck/tests/qemu/100-disk-encryption.t
...
> ok 4 - defined persistent domain config
> # Starting inactive domain config
> libvirt error code: 1, message: internal error: unable to execute QEMU command
> 'cont': 'drive-ide0-0-1'
> (/var/cache/libvirt-tck/300-disk-encryption/demo.qcow2) is encrypted
Commit 2279d560 converted a boolean into a pointer with the intent of
transferring that pointer out of a temporary object into the caller's
data structure. The temporary structure meant that meta->encryption
was always NULL on entry, so we could get away with blindly allocating
the pointer when the header said so. But later, commit 8823272d
tweaked things to do backing chain detection in-place, rather than via
a temporary object; this has the net result that meta->encryption can
be non-NULL on entry. Not only did this turn the latent behavior into
a memory leak, it is also a behavior regression: blindly allocating a
new pointer wipes out what secrets we already knew about the chain,
making it impossible to restart the domain.
Of course, no one in their right mind should be relying on qcow2
encryption - it is fundamentally flawed. And sadly, the TCK tests
don't get run often enough, and this shows that our virstoragetest
does not exercise encrypted images at all. Otherwise, we could
have avoided a release containing this regression.
* src/util/virstoragefile.c (virStorageFileGetMetadataInternal):
Don't nuke an already-existing encryption.
Signed-off-by: Eric Blake <eblake@redhat.com>
clang complains about possibly uninitialized variable:
vbox/vbox_snapshot_conf.c:1355:9: error: variable 'ret' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
if (!(xPathContext = xmlXPathNewContext(xml))) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
So init 'ret' with NULL.
Now that qemu 2.0 allows commit of the active layer, people are
attempting to use virsh blockcommit and getting into a stuck
state, because libvirt is unprepared to handle the two-phase
commit required by qemu.
Stepping back a bit, there are two valid semantics for a
commit operation:
1. Maintain a 'golden' base, and a transient overlay. Make
changes in the overlay, and if everything appears to work,
commit those changes into the base, but still keep the overlay
for the next round of changes; repeat the cycle as desired.
2. Create an external snapshot, then back up the stable state
in the backing file. Once the backup is complete, commit the
overlay back into the base, and delete the temporary snapshot.
Since qemu doesn't know up front which of the two styles is
preferred, a block commit of the active layer merely gets
the job into a synchronized state, and sends an event; then
the user must either cancel (case 1) or complete (case 2),
where qemu then sends a second event that actually ends the
job. However, until commit e6bcbcd, libvirt was blindly
assuming the semantics that apply to a commit of an
intermediate image, where there is only one sane conclusion
(the job automatically ends with fewer elements in the chain);
and getting stuck because it wasn't prepared for qemu to enter
a second phase of the job.
This patch adds a flag to the libvirt API that a user MUST
supply in order to acknowledge that they will be using two-phase
semantics. It might be possible to have a mode where if the
flag is omitted, we automatically do the case 2 semantics on
the user's behalf; but before that happens, I must do additional
patches to track the fact that we are doing an active commit
in the domain XML. Later patches will add support of the flag,
and once 2-phase semantics are working, we can then decide
whether to relax things to allow an omitted flag to cause an
automatic pivot.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_COMMIT_ACTIVE)
(VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT): New enums.
* src/libvirt.c (virDomainBlockCommit): Document two-phase job
when committing active layer, through new flag.
(virDomainBlockJobAbort): Document that pivot also occurs after
active commit.
* tools/virsh-domain.c (vshDomainBlockJob): Cover new job.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Explicitly
reject active copy; later patches will add it in.
Signed-off-by: Eric Blake <eblake@redhat.com>
The machine is unregistered and its vbox XML file is changed in order to
add snapshot information. The machine is then registered with the
snapshot to redefine.
This structure contains the data to be saved in the VirtualBox XML file
and can be manipulated with severals exposed functions.
The structure is created by vboxSnapshotLoadVboxFile taking the
machine XML file.
It also can rewrite the XML by using vboxSnapshotSaveVboxFile.
The qemu driver always adds these options to the qemu commandlines,
but the commandline parser didn't recognize them, so sending a
libvirt-generated qemu commandline to its own argvtoxml would always
result in a warning message and a qemu namespace added to the
xml. Since the options don't add any functionality to the domain, they
should just be ignored (similar to -S).
Note that we can't yet add a test for this to qemuargv2xmltest,
because we would have to add QEMU_CAPS_NODEFCONFIG and
QEMU_CAPS_DEVICE to the capabilities for any corresponding
xml2argvtest, and QEMU_CAPS_DEVICE would necessitate having support
for parsing a memballoon device in order for qemuargv2xmltest to
pass. So we wait to add a test for -nodefconfig and -nodefaults until
after adding support for parsing -device virtio-balloon-*.
4d06af97d38c3648937eb8f732704379b3cd9e59 introduced a possible memory
leak of the memory allocated into the "cpu" pointer in
parallelsBuildCapabilities in the case "nodeGetInfo()" would fail right
after the allocation. Rearrange the code to avoid the possibility of the
leak.
Found by Coverity.
The original implementation of the VMX config parser assumed that the
virtualHW.version would have more influence on the content of the VMX
file than it actually seems to have. It started with accepting only
version 4. Additonal versions were added later without any additional
changes in the parser itself. This suggests that the influence of the
virtualHW.version on the content and format of the VMX file is small
or non-existent.
The parser worked without any changes across several virtualHW and
vSphere versions. So instead of adding new virtualHW.version values to
the parser as they come along, or adding an extra flag to allow unknown
virtualHW.version values just relax the check to require version 4 or
later.
Now that we track a disk mirror as a virStorageSource, we might
as well update the XML to theoretically allow any type of
mirroring destination (not just a local file). A later patch
will also be reusing <mirror> to track the block commit of the
top layer of a chain, which is another case where libvirt needs
to update the backing chain after the job is finally pivoted,
and since backing chains can have network backing files as the
destination to commit into, it makes more sense to display that
in the XML.
This patch changes output-only XML; it was already documented
that <mirror> does not affect a domain definition at this point
(because qemu doesn't provide persistent bitmaps yet). Any
application that was starting a block copy job with older libvirt
and then relying on the domain XML to determine if it was
complete will no longer be able to access the file= and format=
attributes of mirror that were previously used. However, this is
not going to be a problem in practice: the only time a block copy
job works is on a transient domain, and any app that is managing
a transient domain probably already does enough of its own
bookkeeping to know which file it is mirroring into without
having to re-read it from the libvirt XML. The one thing that
was likely to be used in a mirroring job was the ready=
attribute, which is unchanged. Meanwhile, I made sure the schema
and parser still accept the old format, even if we no longer
output it, so that upgrading from an older version of libvirt is
seamless.
* docs/schemas/domaincommon.rng (diskMirror): Alter definition.
* src/conf/domain_conf.c (virDomainDiskDefParseXML): Parse two
styles of mirror elements.
(virDomainDiskDefFormat): Output new style.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror-old.xml: New
file, copied from...
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: ...here
before modernizing.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-mirror-old*: New
files.
* tests/qemuxml2xmltest.c (mymain): Test both styles.
Signed-off-by: Eric Blake <eblake@redhat.com>
The current implementation of 'virsh blockcopy' (virDomainBlockRebase)
is limited to copying to a local file name. But future patches want
to extend it to also copy to network disks. This patch converts over
to a virStorageSourcePtr, although it should have no semantic change
visible to the user, in anticipation of those future patches being
able to use more fields for non-file destinations.
* src/conf/domain_conf.h (_virDomainDiskDef): Change type of
mirror information.
* src/conf/domain_conf.c (virDomainDiskDefParseXML): Localize
mirror parsing into new object.
(virDomainDiskDefFormat): Adjust clients.
* src/qemu/qemu_domain.c (qemuDomainDeviceDefPostParse):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl, qemuDomainBlockCopy): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
As part of the work on backing chains, I'm finding that it would
be easier to directly manipulate chains of pointers (adding a
snapshot merely adjusts pointers to form the correct list) rather
than copy data from one struct to another. This patch converts
domain disk source to be a pointer.
In this patch, the pointer is ALWAYS allocated (thanks in part to
the previous patch forwarding all disk def allocation through a
common point), and all other changse are just mechanical fallout of
the new type; there should be no functional change. It is possible
that we may want to leave the pointer NULL for a cdrom with no
medium in a later patch, but as that requires a closer audit of the
source to ensure we don't fault on a null dereference, I didn't do
it here.
* src/conf/domain_conf.h (_virDomainDiskDef): Change type of src.
* src/conf/domain_conf.c: Adjust all clients.
* src/security/security_selinux.c: Likewise.
* src/qemu/qemu_domain.c: Likewise.
* src/qemu/qemu_command.c: Likewise.
* src/qemu/qemu_conf.c: Likewise.
* src/qemu/qemu_process.c: Likewise.
* src/qemu/qemu_migration.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/lxc/lxc_driver.c: Likewise.
* src/lxc/lxc_controller.c: Likewise.
* tests/securityselinuxlabeltest.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
A future patch wants to create disk definitions with non-zero
default contents; to avoid crashes, all callers that allocate
a disk definition should go through a common point.
I found allocation points by looking for any code that increments
ndisks, as well as any matches for ALLOC.*disk. Most places that
modified ndisks were covered by the parse from XML to domain/device
definition by initial domain creation or device hotplug; I also
hand-checked all drivers that generate a device struct on the
fly during getXMLDesc.
* src/conf/domain_conf.h (virDomainDiskDefNew): New prototype.
* src/conf/domain_conf.c (virDomainDiskDefNew): New function.
(virDomainDiskDefParseXML): Use it.
* src/parallels/parallels_driver.c (parallelsAddHddInfo):
Likewise.
* src/qemu/qemu_command.c (qemuParseCommandLine): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainGetXMLDesc): Likewise.
* src/vmx/vmx.c (virVMXParseDisk): Likewise.
* src/xenxs/xen_sxpr.c (xenParseSxprDisks, xenParseSxpr):
Likewise.
* src/xenxs/xen_xm.c (xenParseXM): Likewise.
* src/libvirt_private.syms (domain_conf.h): Export it.
Signed-off-by: Eric Blake <eblake@redhat.com>
As part of the work on backing chains, I'm finding that it would
be easier to directly manipulate chains of pointers (adding a
snapshot merely adjusts pointers to form the correct list) rather
than copy data from one struct to another. This patch converts
snapshot source to be a pointer.
In this patch, the pointer is ALWAYS allocated (any code that
increases ndisks now also allocates a source pointer for each
new disk), and all other changes are just mechanical fallout of
the new type; there should be no functional change. It is
possible that we may want to leave the pointer NULL for internal
snapshots in a later patch, but as that requires a closer audit
of the source to ensure we don't fault on a null dereference, I
didn't do it here.
* src/conf/snapshot_conf.h (_virDomainSnapshotDiskDef): Change
type of src.
* src/conf/snapshot_conf.c: Adjust all clients.
* src/qemu/qemu_conf.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
A PCI device can be associated with a specific NUMA node. Later, when
a guest is pinned to one NUMA node the PCI device can be assigned on
different NUMA node. This makes DMA transfers travel across nodes and
thus results in suboptimal performance. We should expose the NUMA node
locality for PCI devices so management applications can make better
decisions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This simplifies the usage in {libxl,qemu}DomainGetNumaParameters
and it's needed for consistent error reporting in virBitmapFormat.
Also remove the forgotten ATTRIBUTE_NONNULL marker.
Openstack uses (or will start to using) CPU info from the
capabilities XML. So this section is expanded, added CPU info
about arch, type and info about number of cores, sockets and threads.
Signed-off-by: Eric Blake <eblake@redhat.com>
OpenStack Nova requires this function
to start VM instance. Cpumask info is obtained via prlctl utility.
Unlike KVM, Parallels Cloud Server is unable to set cpu affinity
mask for every VCpu. Mask is unique for all VCpu. You can set it
using 'prlctl set <vm_id|vm_name> --cpumask <{n[,n,n1-n2]|all}>'
command. For example, 'prlctl set SomeDomain --cpumask 0,1,5-7'
would set this mask to yy---yyy.
Signed-off-by: Eric Blake <eblake@redhat.com>
This patch adds initial migration support to the libxl driver,
using the VIR_DRV_FEATURE_MIGRATION_PARAMS family of migration
functions.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Introduce a simple libxlDomainDefCheckABIStability() function that
can be used check ABI stability between two virDomainDef objects.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
On some systems, libnuma can be present but it's so ancient that
it misses some symbols that virNumaGetDistances() needs. To be
more precise: numa_bitmask_isbitset() and numa_nodes_ptr are the
symbols in question. Fortunately, they were both introduced in
the same release so it's sufficient for us to check for only one
of them. And the winner is numa_bitmask_isbitset().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In case the libvirt is built without numactl support, we're
missing the virNumaGetDistances() stub so the linking fails:
CCLD libvirt_lxc
libvirt_lxc-nodeinfo.o: In function `virNodeCapsGetSiblingInfo':
/home/zippy/tmp/libvirt.git/src/nodeinfo.c:1763: undefined reference to `virNumaGetDistances'
collect2: error: ld returned 1 exit status
make[3]: *** [libvirt_lxc] Error 1
The issue was introduced in 77c830d8c4.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If user or management application wants to create a guest,
it may be useful to know the cost of internode latencies
before the guest resources are pinned. For example:
<capabilities>
<host>
...
<topology>
<cells num='2'>
<cell id='0'>
<memory unit='KiB'>4004132</memory>
<distances>
<sibling id='0' value='10'/>
<sibling id='1' value='20'/>
</distances>
<cpus num='2'>
<cpu id='0' socket_id='0' core_id='0' siblings='0'/>
<cpu id='2' socket_id='0' core_id='2' siblings='2'/>
</cpus>
</cell>
<cell id='1'>
<memory unit='KiB'>4030064</memory>
<distances>
<sibling id='0' value='20'/>
<sibling id='1' value='10'/>
</distances>
<cpus num='2'>
<cpu id='1' socket_id='0' core_id='0' siblings='1'/>
<cpu id='3' socket_id='0' core_id='2' siblings='3'/>
</cpus>
</cell>
</cells>
</topology>
...
</host>
...
</capabilities>
We can see the distance from node1 to node0 is 20 and within nodes 10.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The API gets a NUMA node and find distances to other nodes. The
distances are returned in an array. If an item X within the array
equals to value of zero, then there's no such node as X.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If the leasehelper_path couldn't be found the code would leak the
freshly constructed command structure. Re-arrange code to avoid the
problem.
Found by coverity, broken by baafe668fa56767c031468ccd5df3e62eaa11370.
qemuMonitorJSONSendKey declares the "holdtime" argument as unsigned int
while the command was constructed in qemuMonitorJSONMakeCommand using
the "P" modifier which took a unsigned long from the variable
arguments which then made it possible to access uninitialized memory.
This broke the qemumonitorjsontest on 32bit fedora 20:
64) qemuMonitorJSONSendKey
... libvirt: QEMU Driver error : internal error: unsupported data type 'W' for arg 'WVSì D$0èwÿÿÃAå' FAILED
Uncovered by upstream commit f744b831c66d9e82453f7a96cab5eddf7570c253.
Additionally add test for the hold-time option.
The 'libxl_domain_config' object is stack allocated which means its
memory contents are undefined. The libxl_domain_config_dispose() call
is only safe if the memory is initialized to a defined state. Not all
code paths which reach libxl_domain_config_dispose() will ensure that
libxl_domain_config_init() is called. Move the libxl_domain_config_init()
call earlier in the function to ensure all codepaths have defined
memory state.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To allow the test suite to creat the XML option object,
move the virDomainXMLOptionNew call into a libxlCreateXMLConf
method.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To make it easier to test, change libxlBuildDomainConfig so
that it takes a virPortAllocatorPtr instead of the larger
libxlDriverPrivatePtr object.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To make it easier to unit test, change libxlBuildDomainConfig
so that it takes 'virDomainDefPtr' and 'libxl_ctx *' objects
as separate parameters.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some of the APIs already return int since they can produce errors that
need to be propagated. For consistency reasons, this patch changes the
rest of the APIs to also return int even though they do not fail or
report any errors.
In general, we should only remove a backend after seeing DEVICE_DELETED
event for a corresponding frontend.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In general, we should only remove a backend after seeing DEVICE_DELETED
event for a corresponding frontend. This doesn't make any difference for
disks attached using -drive or drive_add since QEMU automatically
removes their backends but it's still better to make our code
consistent. And it may start making difference in case we switch to
attaching disks using -blockdev.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
[1] reported that we are removing network's backend too early. I didn't
really get the reproducer but libvirt behaves strangely when a guest
does not confirm the removal, e.g., it does not support PCI hotplug. In
such case, detaching a network device leaves its frontend in place but
removes the backend, which makes the device unusable for the guest.
Moreover attaching the same device again succeeds and both the guest and
libvirt will see two network interfaces attached but only one of them is
actually working.
I checked with Paolo Bonzini and he confirmed we should only remove a
backend after seeing DEVICE_DELETED event for a corresponding frontend.
[1] https://www.redhat.com/archives/libvir-list/2014-March/msg01740.html
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch adds option to specify that a json qemu command argument is
optional without the need to use if's or ternary operators to pass the
list. Additionally all the modifier characters are documented to avoid
user confusion.
To allow using the array manipulation macros on the arrays returned by
virStringSplit we need to know the count of the elements in the array.
Modify virStringSplit to return this value, rename it and add a helper
with the old name so that we don't need to update all the code.
Use the new backing store parser in the backing chain crawler. This
change needs one test change where information about the NBD image are
now parsed differently.
Add parsers for relative and absolute backing names for local and remote
storage files.
This parser parses relative paths as relative to their parents and
absolute paths according to the protocol or local access.
For remote storage volumes, all URI based backing file names are
supported and for the qemu colon syntax the NBD protocol is supported.