Commit 93e82727 introduced numatune_conf.h file that contains
typedefs already defined in domain_conf.h, such as:
- virDomainNumatune
- virDomainNumatunePtr
- virDomainDef
- virDomainDefPtr
As numatune_conf.h is included by domain_conf.h, clang
complains about redefinition of typedef and the build fails.
In order to fix it, drop typedefs already defined by numatume_conf.h
from domain_conf.h.
Coverity complains about the return value of ioctl not being checked.
Even though we carry on when this fails (just like qemu-img does),
we can log an error.
For non-local storage drivers we can't expect to use the "scrub" tool to
wipe the volume. Split the code into a separate backend function so that
we can add protocol specific code later.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1118710
The next patch will move the storage volume wiping code into the
individual backends. This patch splits out the common code to wipe a
local volume into a separate backend helper so that the next patch is
simpler.
When domain is started with numatune memory mode strict and the
nodeset does not include host NUMA node with DMA and DMA32 zones, KVM
initialization fails. This is because cgroup restrict even kernel
allocations. We are already doing numa_set_membind() which does the
same thing, only it does not restrict kernel allocations.
This patch leaves the userspace numa_set_membind() in place and moves
the cpuset.mems setting after the point where monitor comes up, but
before vcpu and emulator sub-groups are created.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Currently, we only bind the whole QEMU domain to memory nodes
specified in nodemask altogether. That, however, doesn't make much
sense when one wants to control from where the memory for particular
guest nodes should be allocated. QEMU allows us to do that by
specifying 'host-nodes' parameter for the 'memory-backend-ram' object,
so let's use that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When qemu switched to using OptsVisitor for -numa parameter, it did
two things in the same patch. One of them is that the numa parameter
is now visible in "query-command-line-options", the second one is that
it enabled using disjoint cpu ranges for -numa specification. This
will be used in later patch.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The numa patch series in qemu adds "memory-backend-ram" object type by
which we can tell whether we can use such objects.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
That can be lately achieved with by having .param == NULL in the
virQEMUCapsCommandLineProps struct.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There were numerous places where numatune configuration (and thus
domain config as well) was changed in different ways. On some
places this even resulted in persistent domain definition not to be
stable (it would change with daemon's restart).
In order to uniformly change how numatune config is dealt with, all
the internals are now accessible directly only in numatune_conf.c and
outside this file accessors must be used.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since there was already public virDomainNumatune*, I changed the
private virNumaTune to match the same, so all the uses are unified and
public API is kept:
s/vir\(Domain\)\?Numa[tT]une/virDomainNumatune/g
then shrunk long lines, and mainly functions, that were created after
that:
sed -i 's/virDomainNumatuneMemPlacementMode/virDomainNumatunePlacement/g'
And to cope with the enum name, I haad to change the constants as
well:
s/VIR_NUMA_TUNE_MEM_PLACEMENT_MODE/VIR_DOMAIN_NUMATUNE_PLACEMENT/g
Last thing I did was at least a little shortening of already long
name:
s/virDomainNumatuneDef/virDomainNumatune/g
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There are many places with numatune-related code that should be put
into special numatune_conf and this patch creates a basis for that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In XML format, by definition, order of fields should not matter, so
order of parsing the elements doesn't affect the end result. When
specifying guest NUMA cells, we depend only on the order of the 'cell'
elements. With this patch all older domain XMLs are parsed as before,
but with the 'id' attribute they are parsed and formatted according to
that field. This will be useful when we have tuning settings for
particular guest NUMA node.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Excerpt from the virCommandAddArgBuffer() description: "Correctly
transfers memory errors or contents from buf to cmd."
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This patch adds support for the QEMU vhost-user feature to libvirt.
vhost-user enables the communication between a QEMU virtual machine
and other userspace process using the Virtio transport protocol.
It uses a char dev (e.g. Unix socket) for the control plane,
while the data plane based on shared memory.
The XML looks like:
<interface type='vhostuser'>
<mac address='52:54:00:3b:83:1a'/>
<source type='unix' path='/tmp/vhost.sock' mode='server'/>
<model type='virtio'/>
</interface>
Signed-off-by: Michele Paolino <m.paolino@virtualopensystems.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1119173 documents that
commit eaba79d was flawed in the implementation of the
VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag when it comes to completing
a blockcopy. Basically, the qemu pivot action is async (the QMP
command returns immediately, but the user must wait for the
BLOCK_JOB_COMPLETE event to know that all I/O related to the job
has finally been flushed), but the libvirt command was documented
as synchronous by default. As active block commit will also be
using this code, it is worth fixing now.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Don't skip wait
loop after pivot.
Signed-off-by: Eric Blake <eblake@redhat.com>
Now that we've finally fixed all the violators, it's time to
enforce that any pointer to a const object is never freed (it
is aliasing some other memory, where the non-const original
should be freed instead). Alas, the code still needs a normal
vs. Coverity version, but at least we are still guaranteeing
that the macro call evaluates its argument exactly once.
I verified that we still get the following compiler warnings,
which in turn halts the build thanks to -Werror on gcc (hmm,
gcc 4.8.3's placement of the ^ for ?: type mismatch is a bit
off, but that's not our problem):
int oops1 = 0;
VIR_FREE(oops1);
const char *oops2 = NULL;
VIR_FREE(oops2);
struct blah { int dummy; } oops3;
VIR_FREE(oops3);
util/virauthconfig.c:159:35: error: pointer/integer type mismatch in conditional expression [-Werror]
VIR_FREE(oops1);
^
util/virauthconfig.c:161:5: error: passing argument 1 of 'virFree' discards 'const' qualifier from pointer target type [-Werror]
VIR_FREE(oops2);
^
In file included from util/virauthconfig.c:28:0:
util/viralloc.h:79:6: note: expected 'void *' but argument is of type 'const void *'
void virFree(void *ptrptr) ATTRIBUTE_NONNULL(1);
^
util/virauthconfig.c:163:35: error: type mismatch in conditional expression
VIR_FREE(oops3);
^
* src/util/viralloc.h (VIR_FREE): No longer cast away const.
* src/xenapi/xenapi_utils.c (xenSessionFree): Work around bogus
header.
Signed-off-by: Eric Blake <eblake@redhat.com>
Add 'nocow' to storage volume xml so that user can have an option
to set NOCOW flag to the newly created volume. It's useful on btrfs
file system to enhance performance.
Btrfs has low performance when hosting VM images, even more when the guest
in those VM are also using btrfs as file system. One way to mitigate this
bad performance is to turn off COW attributes on VM files. Generally, there
are two ways to turn off COW on btrfs: a) by mounting fs with nodatacow,
then all newly created files will be NOCOW. b) per file. Add the NOCOW file
attribute. It could only be done to empty or new files.
This patch tries the second way, according to 'nocow' option, it could set
NOCOW flag per file:
for raw file images, handle 'nocow' in libvirt code; for non-raw file images,
pass 'nocow=on' option to qemu-img, and let qemu-img to handle that (requires
qemu-img version >= 2.1).
Signed-off-by: Chunyan Liu <cyliu@suse.com>
In many places we define a variable as a 'const char *' when in fact
we modify it just a few lines below. Or even free it. We should not do
that.
There's one exception though, in xenSessionFree() xenapi_utils.c. We
are freeing the xen_session structure which is defined in
xen/api/xen_common.h public header. The structure contains session_id
which is type of 'const char *' when in fact it should have been just
'char *'. So I'm leaving this unmodified, just noticing the fact in
comment.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When the backing store of a volume wasn't accessible while updating the
volume definition the call would fail altogether. In cases where we
currently (incorrectly) treat remote backing stores as local one this
might lead to strange errors.
Ignore the opening errors until we figure out how to track proper volume
metadata.
Use the backing store parser to properly create the information about a
volume's backing store. Unfortunately as the storage driver isn't
prepared to allow volumes backed by networked filesystems add a
workaround that will avoid changing the XML output.
Rework the apparmor lxc profile abstraction to mimic ubuntu's container-default.
This profile allows quite a lot, but strives to restrict access to
dangerous resources.
Removing the explicit authorizations to bash, systemd and cron files,
forces them to keep the lxc profile for all applications inside the
container. PUx permissions where leading to running systemd (and others
tasks) unconfined.
Put the generic files, network and capabilities restrictions directly
in the TEMPLATE.lxc: this way, users can restrict them on a per
container basis.
Rename linuxDomainInterfaceStats to virNetInterfaceStats in order
to allow adding platform specific implementations without
making consumer worrying about specific implementation to be used.
Also, rename util/virstatslinux.c to util/virstats.c so placing
other platform specific implementations into this file don't
look unexpected from the file name.
Code logic in libxlDomainAttachDeviceFlags and libxlDomainDetachDeviceFlags
is wrong with return value in error cases.
'ret' was being set to 0 if 'flags & VIR_DOMAIN_DEVICE_MODIFY_CONFIG' was
false. Then if something like virDomainDeviceDefParse() failed in the
VIR_DOMAIN_DEVICE_MODIFY_LIVE logic, the error would be reported but the
function would return success.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
4cc1f1a01f introduced a crash when doing a
block copy as virStorageSourceInitChainElement was called on
"disk->mirror" that is still NULL at that point instead of "mirror"
which temporarily holds the mirror source struct until it's fully
initialized. This resulted into a crash as a NULL was dereferenced.
Reported by: Shanzi Yu <shyu@redhat.com>
Commit id '3ea661de' refactored the code to use the 'disk->src->path'
instead of getting the path from virDomainDiskGetSource(). The one
call to qemuOpenFile() didn't use the disk source path, rather it used
the path as passed from the caller (in this case 'vda') - this caused
a failure with the virt-test/tp-libvirt as follows:
$ virsh domblkinfo virt-tests-vm1 vda
error: cannot stat file '/home/virt-test/shared/data/images/jeos-20-64.qcow2': Bad file descriptor
$
If the openvswitch service is stopped, and is followed by destroying a
VM, the openvswitch bridge translates into a state where it doesn't
recover the port configuration. While it successfully fetches data
from the internal DB, since the corresponding virtual interface does
not exists anymore the whole recovery process fails leaving restarted
VM with inability to connect to the bridge. The following set of
commands will trigger the problem:
virsh start vm
service openvswitch-switch stop
virsh destroy vm
service openvswitch-switch start
virsh start vm
Signed-off-by: Chunhe Li <lichunhe@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Instead of allocating the virSecurityLabelDef structure ourselves, we
can utilize virSecurityLabelDefNew which even sets the default values
for us.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1113860
We've always done that. Well, until 990e46c45. Point is, if we don't
format model, we may lose a domain on libvirtd restart. If the
seclabel is implicit however, we should skip it's formatting.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Snapshots and block-copy have a flag that forces qemu to re-use existing
file. Our docs weren't exactly clear on what the existing file should
contain for this to actually work.
Re-word the docs a bit to state that the file needs to be pre-created in
the desired format and the backing chain metadata needs to be set prior
to handing it over to qemu.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1084360
Commit dae1568c6c converted the perms
member of the virStorageVolTarget struct into a pointer to make it
optional. But virStorageVolTargetDefFormat did not check perms for
NULL before dereferencing it.
Don't fail when there is nothing to do, as a tweak to the previous
patch regarding output of libvirt-UUID.files for LXC apparmor profiles
Signed-off-by: Eric Blake <eblake@redhat.com>
This was converted to a typedef in 5a2bd4c917 "conf: more enum
cleanups in "src/conf/domain_conf.h"" causing:
libxl/libxl_conf.c: In function 'libxlDiskSetDiscard':
libxl/libxl_conf.c:724:19: error: conversion to incomplete type
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
In lxc, we could not use setmem command
with --config options.
This patch will add support for this.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1066894
With current code it's possible to have for instance:
virsh dumpxml mydomain | grep seclabel
<seclabel type='dynamic' model='selinux' relabel='yes'/>
<seclabel type='dynamic' model='selinux' relabel='yes'/>
<seclabel type='dynamic' model='selinux' relabel='yes'/>
<seclabel type='dynamic' model='selinux' relabel='yes'/>
<seclabel type='dynamic' model='selinux' relabel='yes'/>
what doesn't make any sense. We should reject the XML in the config
parsing phase.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This negation in names of boolean variables is driving me insane. The
code is much more readable if we drop the 'no-' prefix. Well, at least
for me.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For non-local storage drivers we can't expect to use the FDStream
backend for up/downloading volumes. Split the code into a separate
backend function so that we can add protocol specific code later.
This saves a few lines of code and catches the error when:
<spice autoport ='yes' defaultMode='any' ..>
<channel name='main' mode='secure'/>
</spice>
is specified with spice_tls = 0 in qemu.conf.
Instead of this error in qemuBuildGraphicsSPICECommandLine:
error: unsupported configuration: spice secure channels set in XML
configuration, but TLS port is not provided
an error is reported in qemuProcessSPICEAllocatePorts:
error: unsupported configuration: Auto allocation of spice TLS port
requested but spice TLS is disabled in qemu.conf
Inspired by:
https://www.redhat.com/archives/libvir-list/2014-June/msg01408.html
When creating cgroups for vcpu and emulator threads whilst starting a
domain, we explicitly skip creating those cgroups in case priv->cgroup
is NULL (cgroups not supported) because SetAffinity() serves the same
purpose. If the host supports only some cgroups (the ones we need are
either unmounted or disabled in qemu.conf), we error out with weird
message even though we could continue starting the domain.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1097028
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The commit referenced above changed function arguments of
virStorageFileGetMetadataFromBuf() but didn't tweak the
ATTRIBUTE_NONNULL tied to them. This was caught by coverity as it
actually obeys them. We disabled them for GCC and thus it didn't show
up.
Additionally in commit 3ea661deea I passed
NULL to the backingFormat argument which was also marked as nonnull. Use
a dummy int's address when the argument isn't supplied so that the code
doesn't need to change much.
Split out checking of invalid metadata type from the switch statement so
that we can use the typecasted enum value to allow tracking addition of
new items by the compliler.
Also avoids two dead-code break statements.
The default graphics channel mode is 'any', so as to defaultMode attribute.
If defaultMode and channel mode are all the default value 'any',
qemuConnectDomainXMLToNative will set TLSPort.
But in qemuBuildGraphicsSPICECommandLine, if spice_tls is not enabled, libvirtd
will report an error to tell the user that spice TLS is disabled in qemu.conf.
So qemuConnectDomainXMLToNative should check spice_tls is enabled,
then decide to allocate an tlsPort number to this graphics.
If user specified defaultMode is 'secure', qemuConnectDomainXMLToNative
could allocate tlsPort, and then let qemuBuildGraphicsSPICECommandLine reports
the spice_tls disabled error.
The related bug is:
https://bugzilla.redhat.com/show_bug.cgi?id=1113868
Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Now that cgroups/security driver/locking driver support labelling of
individual images and tolerate network storage we don't have to refrain
from passing all image files to it. This allows removing the checking
code as we already make sure that the snapshot function won't be called
with unsupported options.
Now that security, cgroup and locking APIs support working on individual
images and we track the backing chain security info on a per-image basis
we can finally kill swapping the disk source in virDomainDiskDef and use
the virStorageSource directly.
Until now we were changing information about the disk source via
multiple steps of copying data. Now that we changed to a pointer to
store the disk source we might use it to change the approach to track
the data.
Additionally this will allow proper tracking of the backing chain.
When pivoting to a new disk source after a block commit (and possibly
after a soon-to-be-added active block commit) we changed just a few
fields to the new target. In case we'd copy a network disk to a local
file we'd not change the type properly.
To avoid such problems, switch to tracking of the source via changing of
the complete source struct to the one tracking the mirroring info.
Use the source struct and the corresponding function so that we can
avoid using the path separately. Now that
qemuDomainPrepareDiskChainElementPath isn't use anywhere, we can safely
remove it.
Additionally, the removal fixes a misaligned comment as the removed
function was added under a comment for a different function.
Add security driver functions to label separate storage images using the
virStorageSource definition. This will help to avoid the need to do ugly
changes to the disk struct and use the source directly.
Add functions that will allow to set all the required cgroup stuff on
individual images taking a virStorageSourcePtr. Also convert functions
designed to setup whole backing chain to take advantage of the change.
When dispatching events from the event loop, the array of registered
handles is searched to see what handles happened an event on. However,
the array is searched in weird way: the check for the array boundaries
is at the end, so we may touch the elements after the end of the
array:
==10434== Invalid read of size 4
==10434== at 0x52D06B6: virEventPollDispatchHandles (vireventpoll.c:486)
==10434== by 0x52D10E4: virEventPollRunOnce (vireventpoll.c:660)
==10434== by 0x52CF207: virEventRunDefaultImpl (virevent.c:308)
==10434== by 0x1639D1: virNetServerRun (virnetserver.c:1139)
==10434== by 0x1220DC: main (libvirtd.c:1507)
==10434== Address 0xc11ff04 is 4 bytes after a block of size 960 alloc'd
==10434== at 0x4C2CA5E: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==10434== by 0x52AD378: virReallocN (viralloc.c:245)
==10434== by 0x52AD46E: virExpandN (viralloc.c:294)
==10434== by 0x52AD5B1: virResizeN (viralloc.c:352)
==10434== by 0x52CF2EC: virEventPollAddHandle (vireventpoll.c:116)
==10434== by 0x52CEF5B: virEventAddHandle (virevent.c:78)
==10434== by 0x11F69A90: nodeStateInitialize (node_device_udev.c:1797)
==10434== by 0x53C3C89: virStateInitialize (libvirt.c:743)
==10434== by 0x120563: daemonRunStateInit (libvirtd.c:919)
==10434== by 0x5317719: virThreadHelper (virthread.c:197)
==10434== by 0x8376F39: start_thread (in /lib64/libpthread-2.17.so)
==10434== by 0x8A7F9FC: clone (in /lib64/libc-2.17.so)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit a48f445100 introduced a helper
function to convert cgroup device mode to string. The function was only
conditionally compiled on platforms that support cgroup. This broke the
build when attempting to export the symbol:
CCLD libvirt.la
Cannot export virCgroupGetDevicePermsString: symbol not defined
Move the function out of the ifdef, as it doesn't really depend on the
cgroup code being present.
In libxlDomainMigrationConfirm(), a transient domain is removed
from the domain list after successful migration. Later in cleanup,
the domain object is unlocked, resulting in a crash
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fb4208ed700 (LWP 12044)]
0x00007fb4267251e6 in virClassIsDerivedFrom (klass=0xdeadbeef,
parent=0x7fb42830d0c0) at util/virobject.c:169
169 if (klass->magic == parent->magic)
(gdb) bt
0 0x00007fb4267251e6 in virClassIsDerivedFrom (klass=0xdeadbeef,
parent=0x7fb42830d0c0) at util/virobject.c:169
1 0x00007fb42672591b in virObjectIsClass (anyobj=0x7fb4100082b0,
klass=0x7fb42830d0c0) at util/virobject.c:365
2 0x00007fb42672583c in virObjectUnlock (anyobj=0x7fb4100082b0)
at util/virobject.c:338
3 0x00007fb41a8c7d7a in libxlDomainMigrationConfirm (driver=0x7fb4100404c0,
vm=0x7fb4100082b0, flags=1, cancelled=0) at libxl/libxl_migration.c:583
Fix by setting the virDomainObjPtr to NULL after removing it from
the domain list.
During migration, the libxl driver starts a modify job in the
begin phase, ending the job in the confirm phase. This is
essentially VIR_MIGRATE_CHANGE_PROTECTION semantics, but the
driver does not support that flag. Without CHANGE_PROTECTION
support, the job would never be terminated in error conditions
where migrate confirm phase is not executed. Further attempts
to modify the domain would result in failure to acquire a job
after LIBXL_JOB_WAIT_TIME.
Similar to the qemu driver, end the job in the begin phase.
Protecting the domain object across all phases of migration can
be done in a future patch adding CHANGE_PROTECTION support.
In libxlDomainMigrationPrepare(), a new virDomainObj is created
from the incoming domain def and added to the driver's domain
list, but never removed if there are subsequent failures during
the prepare phase.
targethost# virsh list --all
sourcehost# virsh migrate --live dom xen+ssh://targethost/system
error: operation failed: Fail to create socket for incoming migration.
targethost# virsh list --all
error: Failed to list domains
error: name in virGetDomain must not be NULL
After adding code to remove the domain on prepare failure, noticed
that libvirtd crashed due to double free of the virDomainDef. Similar
to the qemu driver, pass a pointer to virDomainDefPtr so it can be set
to NULL once a virDomainObj is created from it.
Qemu will fallback to aio=threads when the cache mode doesn't use
O_DIRECT, even if aio=native was explictly set.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1086704
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Cgroups code uses VIR_CGROUP_DEVICE_* flags to specify the mode but in
the end it needs to be converted to a string. Add a helper to do it and
use it in the cgroup code before introducing it into the rest of the
code.
When discovering a disk backing chain the parent disk's metadata need to
be populated into the guest images so that each piece of the backing
chain contains a copy of those. This will allow us to refactor the
security driver so that it will not need to carry around the original
disk definition.
We are going to modify storage source chains in place. Add a helper that
will copy relevant information such as security labels to the new
element if that doesn't contain it.
In the future we might need to track state of individual images. Move
the readonly and shared flags to the virStorageSource struct so that we
can keep them in a per-image basis.
Some of the further changes will propagate seclabels from a disk source
element into the backing store elements. This would change the XML
output of the backing store as the seclabels would be formatted for each
backing store element. Skip the seclabels formatting until we decide
that it's necessary.
Now that we are able to select images from the backing chain via indexed
access we should also convert possible network sources to
qemu-compatible strings before passing them to qemu.
Now that we are able to select images from the backing chain via indexed
access we should also convert possible network sources to
qemu-compatible strings before passing them to qemu.
Introduce flag for the block rebase API to allow the rebase operation to
leave the chain relatively addressed. Also adds a virsh switch to enable
this behavior.
Introduce flag for the block commit API to allow the commit operation to
leave the chain relatively addressed. Also adds a virsh switch to enable
this behavior.
The qemu block info function relied on working with local storage. Break
this assumption by adding support for remote volumes. Unfortunately we
still need to take a hybrid approach as some of the operations require a
filedescriptor.
Previously you'd get:
$ virsh domblkinfo gl vda
error: cannot stat file '/img10': Bad file descriptor
Now you get some stats:
$ virsh domblkinfo gl vda
Capacity: 10485760
Allocation: 197120
Physical: 197120
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1110198
To allow reusing this function in the qemu driver we need to allow
specifying the storage format. Also separate return of the backing store
path now isn't necessary.
For inactive domains, set both current and maximum memory
to the specified 'maximum memory' value.
This matches the behavior of QEMU driver's SetMaxMemory.
https://bugzilla.redhat.com/show_bug.cgi?id=1091132
For the regular dump operation we migrate the VM to a file. This won't
work when the VM has passthrough devices assigned. Rather than reporting
a cryptic error from qemu run our check whether it can be migrated.
This does not influence the memory-only dump that is allowed with
passthrough devices.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=874418
We invoked virCgroupHasController twice for checking
VIR_CGROUP_CONTROLLER_DEVICES
in lxcDomainAttachDeviceDiskLive.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
To allow changing the name that is recorded in the top of the current
image chain used in a block pull/rebase operation, we need to specify
the backing name to qemu. This is done via the "backing-file" attribute
to the block-stream commad.
To allow changing the name that is recorded in the overlay of the TOP
image used in a block commit operation, we need to specify the backing
name to qemu. This is done via the "backing-file" attribute to the
block-commit command.
This command allows to change the backing file name recorded in the
metadata of a qcow (or other) image. The capability also notifies that
the "block-stream" and "block-commit" commands understand the
"backing-file" attribute.
Extract common operations done when creating an audit message to a
separate generic function that can be reused and convert RNG, disk, FS
and net audit to use it.
There's a lot of places where we skip doing actions based on the
locality of given storage type. The usual pattern is to skip it if:
virStorageSourceGetActualType(src) == VIR_STORAGE_TYPE_NETWORK
Add a simple helper to simplify the pattern to
virStorageSourceIsLocalStorage(src)
Replace the authType, chap, and cephx unions in virStoragePoolSource
with a single pointer to a virStorageAuthDefPtr. Adjust all users of
the previous chap/cephx and secret unions with the source->auth data.
Replace the inline "auth" struct in virStorageSource with a pointer
to a virStorageAuthDefPtr and utilize between the domain_conf, qemu_conf,
and qemu_command sources for finding the auth data for a domain disk
Introduce virStorageAuthDef and friends. Future patches will merge/utilize
their view of storage source/pool auth/secret definitions.
New API's include:
virStorageAuthDefParse: Parse the "<auth/>" XML data for either the
domain disk or storage pool returning a
virStorageAuthDefPtr
virStorageAuthDefCopy: Copy a virStorageAuthDefPtr - to be used by
the qemuTranslateDiskSourcePoolAuth when it
copies storage pool auth data into domain
disk auth data
virStorageAuthDefFormat: Common output of the "<auth" in the domain
disk or storage pool XML
virStorageAuthDefFree: Free memory associated with virStorageAuthDef
Subsequent patches will utilize the new functions for the domain disk and
storage pools.
Future work in the hostdev pass through can then make use of common data
structures and code.
Use the probing functionality added in the last patch to turn on
a capability bit when active commit is present, and gate active
commit on that capability.
For my own reference: the difference between BLOCKJOB_SYNC and
BLOCKJOB_ASYNC is whether qemu generated an event at the
conclusion of blockpull; basically, RHEL 6.2 was the only release
of qemu that has the sync semantics and lacks the event. RHEL
6.3 added blockcopy, but also picked up on the upstream style
of qemu generating events. As no one is likely to backport
active commit to RHEL 6.2, it's safe for blockcommit to always
require async blockjob support.
Modifying qemucapabilitiestest is painful; the .replies files would
be so much easier if they had comments correlating which command
generated the given reply. Maybe I'll fix that up later...
* src/qemu/qemu_capabilities.h (QEMU_CAPS_ACTIVE_COMMIT): New
capability.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Use the new bit
* src/qemu/qemu_capabilities.c (virQEMUCaps): Name the new bit.
(virQEMUCapsProbeQMPCommands): Set it.
* tests/qemucapabilitiesdata/caps_1.3.1-1.replies: Update.
* tests/qemucapabilitiesdata/caps_1.4.2-1.replies: Likewise.
* tests/qemucapabilitiesdata/caps_1.5.3-1.replies: Likewise.
* tests/qemucapabilitiesdata/caps_1.6.0-1.replies: Likewise.
* tests/qemucapabilitiesdata/caps_1.6.50-1.replies: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
We are about to turn on support for active block commit. Although
qemu 2.0 was the first version to mostly support it, that version
mis-handles 0-length files, and doesn't have anything available for
easy probing. But qemu 2.1 fixed bugs, and made life simpler by
letting the 'top' argument be optional. Unless someone begs for
active commit with qemu 2.0, for now we are just going to enable
it only by probing for qemu 2.1 behavior (anyone backporting active
commit can also backport the optional argument behavior). This
requires qemu.git commit 7676e2c597000eff3a7233b40cca768b358f9bc9.
Although all our actual uses of block-commit supply arguments for
both base and top, we can omit both arguments and use a bogus
device string to trigger an interesting behavior in qemu. All QMP
commands first do argument validation, failing with GenericError
if a mandatory argument is missing. Once that passes, the code
in the specific command gets to do further checking, and the qemu
developers made sure that if device is the only supplied argument,
then the block-commit code will look up the device first, with a
failure of DeviceNotFound, before attempting any further argument
validation (most other validations fail with GenericError). Thus,
the category of error class can reliably be used to decipher
whether the top argument was optional, which in turn implies a
working active commit. Since we expect our bogus device string to
trigger an error either way, the code is written to return a
distinct return value without spamming the logs.
* src/qemu/qemu_monitor.h (qemuMonitorSupportsActiveCommit): New
prototype.
* src/qemu/qemu_monitor.c (qemuMonitorSupportsActiveCommit):
Implement it.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Allow NULL for top and base, for probing purposes.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise, implementing the probe.
* tests/qemumonitorjsontest.c (mymain): Enable...
(testQemuMonitorJSONqemuMonitorSupportsActiveCommit): ...a new test.
Signed-off-by: Eric Blake <eblake@redhat.com>
So far only information on disks and host devices are exposed in the
capabilities XML. Well, at least something. Even a new test is
introduced. The qemu capabilities are stolen from already existing
qemucapabilities test. There's one tricky point though. Functions that
checks host's KVM and VFIO capabilities, are impossible to mock
currently. So in the test, we are setting the capabilities by hand.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Sometimes it may be useful to get a default machine for given qemu
binary. Fortunately, the default machine is stored always on the first
position in the supported machines array.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This internal API is meant to answer the question 'Is this machine
type supported by given qemu?'.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The API may come handy if somebody has an architecture and wants to
look through available qemus if the architecture is supported or not.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This new module holds and formats capabilities for emulator. If you
are about to create a new domain, you may want to know what is the
host or hypervisor capable of. To make sure we don't regress on the
XML, the formatting is not something left for each driver to
implement, rather there's general format function.
The domain capabilities is a lockable object (even though the locking
is not necessary yet) which uses reference counter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In the lastest rework (9e7ecabf) a cleanup label was left over which
results in compilation error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Replace:
if (virBufferError(&buf)) {
virBufferFreeAndReset(&buf);
virReportOOMError();
...
}
with:
if (virBufferCheckError(&buf) < 0)
...
This should not be a functional change (unless some callers
misused the virBuffer APIs - a different error would be reported
then)
So far, we only report an error if formatting the siblings bitmap
in NUMA topology fails.
Be consistent and always report error in virCapabilitiesFormatXML.
Check if the buffer is in error state and report an error if it is.
This replaces the pattern:
if (virBufferError(buf)) {
virReportOOMError();
goto cleanup;
}
with:
if (virBufferCheckError(buf) < 0)
goto cleanup;
Document typical buffer usage to favor this.
Also remove the redundant FreeAndReset - if an error has
been set via virBufferSetError, the content is already freed.
https://bugzilla.redhat.com/show_bug.cgi?id=1086121
We now support startupPolicy='optional' for disks, but this
should work only for cold boot, not for restore or migrate.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The comments for lxcDomainCreateXMLWithFiles are out of date. So update them.
And add comments for lxcDomainCreateXML
Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
Signed-off-by: Yue wenyuan <yuewenyuan@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This introduces two new attributes "cmd_per_lun" and "max_sectors" same
with the names QEMU uses for virtio-scsi. An example of the XML:
<controller type='scsi' index='0' model='virtio-scsi' cmd_per_lun='50'
max_sectors='512'/>
The corresponding QEMU command line:
-device virtio-scsi-pci,id=scsi0,cmd_per_lun=50,max_sectors=512,
bus=pci.0,addr=0x3
Signed-off-by: Mike Perez <thingee@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
The IDE bus doesn't support readonly disks, so inform the user with an
error message instead of let qemu fail with a more obscure "Device
'ide-hd' could not be initialized" error message.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1112939
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
We have the following matrix of possible arguments handled by the logic
statement touched by this patch:
| flags & _REUSE_EXT | !(flags & _REUSE_EXT)
-------+--------------------+----------------------
format| (1) | (2)
-------+--------------------+----------------------
!format| (3) | (4)
-------+--------------------+----------------------
In cases 1 and 2 the user provided a format, in cases 3 and 4 not. The
user requests to use a pre-existing image in 1 and 3 and libvirt will
create a new image in 2 and 4.
The difference between cases 3 and 4 is that for 3 the format is probed
from the user-provided image, whereas in 4 we just use the existing disk
format.
The current code would treat cases 1,3 and 4 correctly but in case 2 the
format provided by the user would be ignored.
The particular piece of code was broken in commit 35c7701c64
but since it was introduced a few commits before that it was never
released as working.
Since there is code using functions from the libxml library,
libvirt_conf should have that in LIBADD so it can be linked against
even without libvirt_util (which usually deals with the error itself,
since libvirt_util has libxml in LIBADD). The same applies to
storage_backend.c.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
virFileReadAll already logs an error. If reading the 'speed' file
fails with EINVAL, we log an error even though we ignore it. If it
fails with other errors, we log two errors.
Use virFileReadAllQuiet - ignore EINVAL and report just one error
in other cases.
Fixes this error on libvirtd startup:
2014-06-30 12:47:14.583+0000: 20971: error : virFileReadAll:1297 :
Failed to read file '/sys/class/net/wlan0/speed': Invalid argument
This stops the error message spam when running unprivileged
libvirtd:
2014-06-30 12:38:47.990+0000: 631: error : virPCIDeviceConfigOpen:300 :
Failed to open config space file
'/sys/bus/pci/devices/0000:00:00.0/config': Permission denied
Reported by Daniel Berrange:
https://www.redhat.com/archives/libvir-list/2014-June/msg01082.html
Xen PV domains always have a PV console, so add one to the domain
config via post-parse callback if not explicitly specified in
the XML. The legacy Xen driver behaves similarly, causing a
regression when switching to the new Xen toolstack. I.e.
virsh console pv-domain
will no longer work after upgrading a xm/xend stack to xl/libxl.
Noticed the following error when building the vbox driver
in the openSUSE build service
CCLD vboxsnapshotxmltest
/usr/lib64/gcc/x86_64-suse-linux/4.8/../../../../x86_64-suse-linux/bin/ld:
../src/.libs/libvirt_driver_vbox_impl.a
(libvirt_driver_vbox_impl_la-vbox_snapshot_conf.o):
undefined reference to symbol 'xmlXPathRegisterNs@@LIBXML2_2.4.30'
/usr/lib64/libxml2.so: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
Fixed by adding LIBXML_LIBS to libvirt_driver_vbox_impl_la_LIBADD
libxl interface for vcpu pinning is changing in Xen 4.5. Basically,
libxl_set_vcpuaffinity() now wants one more parameter. That is
representative of 'VCPU soft affinity', which libvirt does not use.
To mark such change, the macro LIBXL_HAVE_VCPUINFO_SOFT_AFFINITY is
defined. Use it as a gate and, if present, re-#define the calls from
the old to the new interface, to avoid breaking the build.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
Throwing an error is much friendly than just
"error: An error occurred, but the cause is unknown"
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
Commit 55bbb011b9 introduced a regression
where we forgot to save the persistent domain configuration after an
external snapshot. This would make libvirt forget the snapshots and
effectively revert to the previous state in the following scenario:
1) Start VM
2) Take snapshot
3) Destroy VM
4) Restart libvirtd
Also fix spurious blank line added by patch mentioned above.
Instead of maintaining two very similar APIs, add the "@mac" parameter
to virNetworkGetDHCPLeases and kill virNetworkGetDHCPLeasesForMAC. Both
of those functions would return data the same way, so making @mac an
optional filter simplifies a lot of stuff.
libxl does not support save, restore, or migrate on all architectures,
notably ARM. Detect whether libxl supports these operations using
LIBXL_HAVE_NO_SUSPEND_RESUME. If not supported, drop advertisement of
<migration_features>.
Found by Ian Campbell while improving Xen's OSSTEST infrastructure
http://lists.xen.org/archives/html/xen-devel/2014-06/msg02171.html
Since commit d86c876a66 we are using
guestfwd=tcp:IP:PORT,chardev=ID for guestfwd specification, however,
that has not changed in qemu, so guestfwd does not work since.
Apart from that, guestfwd is not working with older qemu that doesn't
have QEMU_CAPS_DEVICE.
Both regressions exist since late 2009 and nobody found that (until
now), so I'm only fixing the first one.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1112066
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The QEMU VNC client arg code has a long standing typo
of SASL_CONF_DIR when it should be SASL_CONF_PATH for
the env variable name.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When creating a new disk mirror the new struct is stored in a separate
variable until everything went well. The removed hunk would actually
remove existing mirror information for example when the api would be run
if a mirror still exists.
The function headers contain type on the same line as the name. When
combined with usage of ATTRIBUTE_UNUSED, the function headers were very
long. Shorten them by breaking the line after the type.
virSecurityManagerSetDiskLabel and virSecurityManagerRestoreDiskLabel
don't have complementary semantics. Document the semantics to avoid
possible problems.
I'm going to add functions that will deal with individual image files
rather than whole disks. Rename the security function to make room for
the new one.
The new VIR_CONNECT_COMPARE_CPU_FAIL_INCOMPATIBLE flag for
virConnectCompareCPU can be used to get an error
(VIR_ERR_CPU_INCOMPATIBLE) describing the incompatibility instead of the
usual VIR_CPU_COMPARE_INCOMPATIBLE return code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When CPU comparison APIs return VIR_CPU_COMPARE_INCOMPATIBLE, the caller
has no clue why the CPU is considered incompatible with host CPU. And in
some cases, it would be nice to be able to get such info in a client
rather than having to look in logs.
To achieve this, the APIs can be told to return VIR_ERR_CPU_INCOMPATIBLE
error for incompatible CPUs and the reason will be described in the
associated error message.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Fix missing whitespace when parsing 'managed' attribute.
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, only LXC has hostdev mode 'capabilities' support,
so the other drivers should forbid to define it in XML.
The hostdev mode check is added to devicesPostParseCallback()
for each hypervisor driver.
But there are some drivers lack function devicesPostParseCallback(),
so only add check for qemu, libxl, openvz, uml, xen, xenapi.
Signed-off-by: Jincheng Miao <jmiao@redhat.com>
The parent directory doesn't necessarily need to be stored after we
don't mangle the path stored in the image. Remove it and tweak the code
to avoid using it.
Store backing chain paths as non-canonical. The canonicalization step
will be already taken. This will allow to avoid storing unnecessary
amounts of data.
Now that we store only relative names in virStorageSource's member
relPath the backingRelative member is obsolete. Remove it and adapt the
code to the removal.
Due to various refactors and compatibility with the virstoragetest the
relPath field of the virStorageSource structure was always filled either
with the relative name or the full path in case of absolutely backed
storage. Return its original purpose to store only the relative name of
the disk if it is backed relatively and tweak the tests.
This patch introduces a function that will allow us to resolve a
relative difference between two elements of a disk backing chain. This
function will be used to allow relative block commit and block pull
where we need to specify the new relative name of the image to qemu.
This patch also adds unit tests for the function to verify that it works
correctly.
As we are doing with the enum structures, a cleanup in "src/qemu/"
directory was done now. All the enums that were defined in the
header files were converted to typedefs in this directory. This
patch includes all the adjustments to remove conflicts when you do
this kind of change. "Enum-to-typedef"'s conversions were made in
"src/qemu/qemu_{capabilities, domain, migration, hotplug}.h".
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Don't free individual JSON array members as the array will be freed at
the end. This may potentially lead to a crash although it didn't crash
on my setup.
When looking for a port to allocate, the port allocator didn't take in
consideration ports that are statically set by the user. Defining
these two graphics elements in the XML would cause an error, as the
port allocator would try to use the same port for the spice graphics
element:
<graphics type='spice' autoport='yes'/>
<graphics type='vnc' port='5900' autoport='no'/>
The new *[pP]ortReserved variables keep track of the ports that were
successfully tracked as used by the port allocator but that weren't
bound.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1081881
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
virPortAllocatorSetUsed permits to set a port as already used and
prevent the port allocator to use it without any attempt to bind it.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Use virNetworkGetDHCPLeases and virNetworkGetDHCPLeasesForMAC in virsh.
The new feature supports the follwing methods:
1. Retrieve leases info for a given virtual network
2. Retrieve leases info for given network interface
tools/virsh-domain-monitor.c
* Introduce new command : net-dhcp-leases
Example Usage: net-dhcp-leases <network> [mac]
virsh # net-dhcp-leases --network default6
Expiry Time MAC address Protocol IP address Hostname Client ID or DUID
-------------------------------------------------------------------------------------------------------------------
2014-06-16 03:40:14 52:54:00:85:90:e2 ipv4 192.168.150.231/24 fedora20-test 01:52:54:00:85:90:e2
2014-06-16 03:40:17 52:54:00:85:90:e2 ipv6 2001:db8:ca2:2:1::c0/64 fedora20-test 00:04:b1:d8:86:42:e1:6a:aa:cf:d5:86:94:23:6f:94:04:cd
2014-06-16 03:34:42 52:54:00:e8:73:eb ipv4 192.168.150.181/24 ubuntu14-vm -
2014-06-16 03:34:46 52:54:00:e8:73:eb ipv6 2001:db8:ca2:2:1::5b/64 - 00:01:00:01:1b:30:c6:aa:52:54:00:e8:73:eb
tools/virsh.pod
* Document new command
src/internal.h
* Introduce new macro: EMPTYSTR
Query the network driver for the path of the custom leases file for the given
virtual network and parse it to retrieve info.
src/network/bridge_driver.c:
* Implement networkGetDHCPLeases
* Implement networkGetDHCPLeasesForMAC
* Implement networkGetDHCPLeasesHelper
Introduce 3 new APIs, virNetworkGetDHCPLeases, virNetworkGetDHCPLeasesForMAC
and virNetworkDHCPLeaseFree.
* virNetworkGetDHCPLeases: returns the dhcp leases information for a given
virtual network.
For DHCPv4, the information returned:
- Network Interface Name
- Expiry Time
- MAC address
- IAID (NULL)
- IPv4 address (with type and prefix)
- Hostname (can be NULL)
- Client ID (can be NULL)
For DHCPv6, the information returned:
- Network Interface Name
- Expiry Time
- MAC address
- IAID (can be NULL, only in rare cases)
- IPv6 address (with type and prefix)
- Hostname (can be NULL)
- Client DUID
Note: @mac, @iaid, @ipaddr, @clientid are in ASCII form, not raw bytes.
Note: @expirytime can 0, in case the lease is for infinite time.
* virNetworkGetDHCPLeasesForMAC: returns the dhcp leases information for a
given virtual network and specified MAC Address.
* virNetworkDHCPLeaseFree: allows the upper layer application to free the
network interface object conveniently.
There is no support for flags, so user is expected to pass 0 for
both the APIs.
include/libvirt/libvirt.h.in:
* Define virNetworkGetDHCPLeases
* Define virNetworkGetDHCPLeasesForMAC
* Define virNetworkDHCPLeaseFree
src/driver.h:
* Define networkGetDHCPLeases
* Define networkGetDHCPLeasesForMAC
src/libvirt.c:
* Implement virNetworkGetDHCPLeases
* Implement virNetworkGetDHCPLeasesForMAC
* Implement virNetworkDHCPLeaseFree
src/libvirt_public.syms:
* Export the new symbols
Fix lxcDomainGetMemoryParameters and lxcDomainGetSchedulerParametersFlags:
virsh -c lxc:/// memtune DOMAIN
error: Unable to get number of memory parameters
error: unsupported flags (0x4) in function lxcDomainGetMemoryParameters
Introduced by commit 399394.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
If we are running on a system that is not capable of huge pages (e.g.
because the kernel is not configured that way) we still try to open
"/sys/kernel/mm/hugepages/" which however does not exist. We should
be tolerant to this specific use case.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On the Linux kernel, if huge pages are allocated the size they cut off
from memory is accounted under the 'MemUsed' in the meminfo file.
However, we want the sum to be subtracted from 'MemTotal'. This patch
implements this feature. After this change, we can enable reporting
of the ordinary system pages in the capability XML:
<capabilities>
<host>
<uuid>01281cda-f352-cb11-a9db-e905fe22010c</uuid>
<cpu>
<arch>x86_64</arch>
<model>Haswell</model>
<vendor>Intel</vendor>
<topology sockets='1' cores='1' threads='1'/>
<feature/>
<pages unit='KiB' size='4'/>
<pages unit='KiB' size='2048'/>
<pages unit='KiB' size='1048576'/>
</cpu>
<power_management/>
<migration_features/>
<topology>
<cells num='4'>
<cell id='0'>
<memory unit='KiB'>4048248</memory>
<pages unit='KiB' size='4'>748382</pages>
<pages unit='KiB' size='2048'>3</pages>
<pages unit='KiB' size='1048576'>1</pages>
<distances/>
<cpus num='1'>
<cpu id='0' socket_id='0' core_id='0' siblings='0'/>
</cpus>
</cell>
...
</cells>
</topology>
</host>
</capabilities>
You can see the beautiful thing about this: if you sum up all the
<pages/> you'll get <memory/>.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Introduce a common function that will take a callback to resolve links
that will be used to canonicalize paths on various storage systems and
add extensive tests.
To free string lists with some strings stolen from the middle we need to
walk the complete array. Introduce a new helper that takes the string
list size to free such string lists.
The libxl driver currently sets the disk backend to
LIBXL_DISK_BACKEND_TAP when <driver name='file'> is specified
in the <disk> config. qdisk should be prefered with this
configuration, otherwise existing configuration such as the
following, which worked with the old Xen driver, will not work
with the libxl driver
<disk type='file' device='cdrom'>
<driver name='file'/>
<source file='/path/to/some/iso'/>
<target dev='hdc' bus='ide'/>
<readonly/>
</disk>
In addition, tap performs poorly compared to qdisk.
virNumaGetPages calls closedir(dir) in cleanup and dir could
be NULL if we jump there from the failed opendir() call.
While it's not harmful on Linux, FreeBSD libc crashes [1], so
make sure that dir is not NULL before calling closedir.
1: http://lists.freebsd.org/pipermail/freebsd-standards/2014-January/002704.html
When testing language bindings it is useful to be able to build
them against an uninstalled libvirt source tree. Add a dummy
set of pkg-config files to allow for this. This can be used by
setting
export PKG_CONFIG_PATH=/path/to/libvirt/git/src
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
One of previous commits (e6258a33) tried to build the huge page code
only on Linux since it's Linux centric indeed. But it failed miserably
as it used 'WITH_LINUX' which is an automake conditional not a gcc
one. In the sources we need to use __linux__.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There are no options to parse here other than the name of the device,
and all three possible device names have the same prefix
("virtio-balloon" with "-ccw", "-pci", or "-device" appended), so the
code is fairly simple. It has been implemented such that it will be
easier to add handling for other -device entries that aren't otherwise
recognized - just add another "else if (STRPREFIX(opts, ....)" clause.
qemuParseCommandLineString() previously would always add a <memballoon
model='virtio'/> to every result (the comments erroneously say that it
is adding a <memballoon model='none'/>) This has been changed to add
model='none', and 84 test case xml's updated accordingly (so that
qemuxml2argvtest won't fail).
Now that the memballoon device is properly parsed, we can safely add a
test for properly ignoring -nodefconfig and -nodefaults. Rather than
adding an entire new test case for this (and memballoon), we just
randomly pick the clock-utc test and modify it slightly to fulfill the
purpose.
Only three other callers possibly call closedir on a NULL argument.
Even though these probably won't be used on FreeBSD where this crashes,
let's be nice and only call closedir on an actual directory stream.
==== Invalid write of size 4
==== at 0x52E678C: virNumaGetDistances (virnuma.c:479)
==== by 0x5396890: nodeCapsInitNUMA (nodeinfo.c:1796)
==== by 0x203C2B: virQEMUCapsInit (qemu_capabilities.c:960)
==== Address 0xe10a1e0 is 0 bytes after a block of size 0 alloc'd
==== at 0x4C2A6D0: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==== by 0x52A10D6: virAllocN (viralloc.c:191)
==== by 0x52E674D: virNumaGetDistances (virnuma.c:470)
==== by 0x5396890: nodeCapsInitNUMA (nodeinfo.c:1796)
==== by 0x203C2B: virQEMUCapsInit (qemu_capabilities.c:960)
The hugepage sizing and counting code gathers the information from sysfs
and thus isn't portable. Stub it out for non-Linux so that we can report
a better error. This patch also avoids calling sysinfo() on Mingw where
it isn't supported.
It returns NULL on failure. Checking if the negation of it
is less than zero makes no sense. (Found by coverity after moving
the code)
In another case, the return value wasn't checked at all.
Migration code specifies the problematic non-cooperative resume mode
which is a known issue with Xen's libxl [1]. Instead, use the better
supported cooperative mode.
Without this, guests BUG() in xen_irq_resume after failing to bind
still-bound event channels.
[1] http://bugs.xenproject.org/xen/bug/30
So far three ARM processor families are known to libvirt,
however the cpu driver knows only about one of them. This
make host initialization on the other two fail:
2014-06-17 13:35:41.419+0000: 6840: info : libvirt version: 1.2.6
2014-06-17 13:35:41.419+0000: 6840: error : cpuNodeData:342 : this function is not supported by the connection driver: cannot get node CPU data for armv6l architecture
2014-06-17 13:35:41.433+0000: 6840: warning : virQEMUCapsInit:943 : Failed to get host CPU
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The virNodeParseSocket() function tries to get socked ID from
'topology/physical_package_id' file. However, on some architectures
the file contains the -1 constant which makes in turn libvirt think
the info extraction was unsuccessful. If that's the case, we need to
overwrite the obtained integer with zero like we are doing for other
architectures.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As in previous commit, there are again some places where we can do
runtime decision instead of compile time. This time it's whether the
'topology/physical_package_id' is allowed to have '-1' within or not.
Then, core ID is pared differently on s390(x) than on the rest of
architectures.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far, we are doing compile time decisions on which architecture is
used. However, for testing purposes it's much easier if we pass host
architecture as parameter and then let the function decide which code
snippet for extracting host CPU info will be used.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This modifies the formatting function of virInterface to be a proper
mirror of the parse function, including the addition of a
"parentIfType" arg so that we can decide whether or not it is
appropriate to emit the elements that are only in toplevel interfaces,
as well as the <link> element (which isn't allowed for bridge
interfaces).
Since the restructuring of the code necessarily changes the order of
some of the elements, some test case data had to be updated.
the switch cases for the 4 different interface types had repetitive
code which has now been pulled out as common. While touching those
lines, some extra usage of "!= NULL" etc has been eliminated to make
things more compact and inline with current coding practices.
NB: parentIfType == VIR_INTERFACE_TYPE_LAST means that this is a
toplevel interface (not a subordinate of a bridge or bond). Only
toplevel interfaces can have a start mode, mtu, or IP address element.
For some reason the bridge stp mode and delay were put directly into
the "bridge" case of the switch in virInterfaceDefParseXML(), although
they are inside the <bridge> element, and so should be parsed in the
function created for that purpose - virInterfaceBridgeDefFormat().
The interface state for bonds and vlans does seem to reflect the state
of the underlying physical devices, at least in some cases, so it
makes sense to allow reporting it (netcf now does).
The link state/speed for bridge devices is meaningless though, so we
don't even look for it.
I'm going to add functions that will deal with individual image files
rather than whole disks. Rename the security function to make room for
the new one.
The image labels are stored in the virStorageSource struct. Convert the
virDomainDiskDefGetSecurityLabelDef helper not to use the full disk def
and move it appropriately.
Generally, <interface> ... <script> is only supported for
type='ethernet'. Due to the long and pervasive use of
<interface type='bridge'>
...
<script path='foo'/>
</interface>
in Xen domain configuration, it was agreed to allow the use
of <script> with type='bridge' for backwards compatibility. See
the following discussion thread
http://www.redhat.com/archives/libvir-list/2013-April/msg00755.html
This patch limits the use of <script> to interface types ethernet
and bridge, raising an unsupported config error if <script> is
specified for all other interface types.
While at it, use VIR_ERR_CONFIG_UNSUPPORTED instead of
VIR_ERR_INTERNAL_ERROR when reporting unsupported interface types.
The aim of the API is to get information on number of free pages
on the system. The API behaves similar to the
virNodeGetCellsFreeMemory(). User passes starting NUMA cell, the
count of nodes that he's interested in, pages sizes (yes,
multiple sizes can be queried at once) and the counts are
returned in an array.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There are two places where you'll find info on page sizes. The first
one is under <cpu/> element, where all supported pages sizes are
listed. Then the second one is under each <cell/> element which refers
to concrete NUMA node. At this place, the size of page's pool is
reported. So the capabilities XML looks something like this:
<capabilities>
<host>
<uuid>01281cda-f352-cb11-a9db-e905fe22010c</uuid>
<cpu>
<arch>x86_64</arch>
<model>Westmere</model>
<vendor>Intel</vendor>
<topology sockets='1' cores='1' threads='1'/>
...
<pages unit='KiB' size='4'/>
<pages unit='KiB' size='2048'/>
<pages unit='KiB' size='1048576'/>
</cpu>
...
<topology>
<cells num='4'>
<cell id='0'>
<memory unit='KiB'>4054408</memory>
<pages unit='KiB' size='4'>1013602</pages>
<pages unit='KiB' size='2048'>3</pages>
<pages unit='KiB' size='1048576'>1</pages>
<distances/>
<cpus num='1'>
<cpu id='0' socket_id='0' core_id='0' siblings='0'/>
</cpus>
</cell>
<cell id='1'>
<memory unit='KiB'>4071072</memory>
<pages unit='KiB' size='4'>1017768</pages>
<pages unit='KiB' size='2048'>3</pages>
<pages unit='KiB' size='1048576'>1</pages>
<distances/>
<cpus num='1'>
<cpu id='1' socket_id='0' core_id='0' siblings='1'/>
</cpus>
</cell>
...
</cells>
</topology>
...
</host>
<guest/>
</capabilities>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For future work we need two functions that fetches total number of
pages and number of free pages for given NUMA node and page size
(virNumaGetPageInfo()).
Then we need to learn pages of what sizes are supported on given node
(virNumaGetPages()).
Note that system page size is disabled at the moment as there's one
issue connected. If you have a NUMA node with huge pages allocated the
kernel would return the normal size of memory for that node. It
basically ignores the fact that huge pages steal size from the system
memory. Until we resolve this, it's safer to not confuse users and
hence not report any system pages yet.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For future work we want to get info for not only the free memory
but overall memory size too. That's why the function must have
new signature too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Not on all hosts the set of NUMA nodes IDs is continuous. This is
critical, because our code currently assumes the set doesn't contain
holes. For instance in nodeGetFreeMemory() we can see the following
pattern:
if ((max_node = virNumaGetMaxNode()) < 0)
return 0;
for (n = 0; n <= max_node; n++) {
...
}
while it should be something like this:
if ((max_node = virNumaGetMaxNode()) < 0)
return 0;
for (n = 0; n <= max_node; n++) {
if (!virNumaNodeIsAvailable(n))
continue;
...
}
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When the block job event was first added, it was for block pull,
where the active layer of the disk remains the same name. It was
also in a day where we only cared about local files, and so we
always had a canonical absolute file name. But two things have
changed since then: we now have network disks, where determining
a single absolute string does not really make sense; and we have
two-phase jobs (copy and active commit) where the name of the
active layer changes between the first event (ready, on the old
name) and second (complete, on the pivoted name).
Adam Litke reported that having an unstable string between events
makes life harder for clients. Furthermore, all of our API that
operate on a particular disk of a domain accept multiple strings:
not only the absolute name of the active layer, but also the
destination device name (such as 'vda'). As this latter name is
stable, even for network sources, it serves as a better string
to supply in block job events.
But backwards-compatibility demands that we should not change the
name handed to users unless they explicitly request it. Therefore,
this patch adds a new event, BLOCK_JOB_2 (alas, I couldn't think of
any nicer name - but at least Migrate2 and Migrate3 are precedent
for a number suffix). We must double up on emitting both old-style
and new-style events according to what clients have registered for
(see also how IOError and IOErrorReason emits double events, but
there the difference was a larger struct rather than changed
meaning of one of the struct members).
Unfortunately, adding a new event isn't something that can easily
be broken into pieces, so the commit is rather large.
* include/libvirt/libvirt.h.in (virDomainEventID): Add a new id
for VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2.
(virConnectDomainEventBlockJobCallback): Document new semantics.
* src/conf/domain_event.c (_virDomainEventBlockJob): Rename field,
to ensure we catch all clients.
(virDomainEventBlockJobNew): Add parameter.
(virDomainEventBlockJobDispose)
(virDomainEventBlockJobNewFromObj)
(virDomainEventBlockJobNewFromDom)
(virDomainEventDispatchDefaultFunc): Adjust clients.
(virDomainEventBlockJob2NewFromObj)
(virDomainEventBlockJob2NewFromDom): New functions.
* src/conf/domain_event.h: Add new prototypes.
* src/libvirt_private.syms (domain_event.h): Export new functions.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Generate two
different events.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Likewise.
* src/remote/remote_protocol.x
(remote_domain_event_block_job_2_msg): New struct.
(REMOTE_PROC_DOMAIN_EVENT_BLOCK_JOB_2): New RPC.
* src/remote/remote_driver.c
(remoteDomainBuildEventBlockJob2): New handler.
(remoteEvents): Register new event.
* daemon/remote.c (remoteRelayDomainEventBlockJob2): New handler.
(domainEventCallbacks): Register new event.
* tools/virsh-domain.c (vshEventCallbacks): Likewise.
(vshEventBlockJobPrint): Adjust client.
* src/remote_protocol-structs: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit ac63014c introduced a regression in the conversion of Xen
xm config to XML by emitting an empty <cmdline>. Prior to this
commit, <cmdline> was omitted if the xm config was missing (or
contained an empty) 'extra='.