The error message reported when attempting to change/get persistent
configuration of a transient domain suggests that changes are being
made. Reword it to suit getter APIs too.
Before:
$ virsh vcpucount transient-domain --config
error: Requested operation is not valid: cannot change persistent config of a transient domain
After:
$ virsh vcpucount transient-domain --config
error: Requested operation is not valid: transient domains do not have any persistent config
Until now tranisent networks weren't really useful as libvirtd wasn't
able to remember them across restarts. This patch adds support for
loading status files of transient networks (that already were generated)
so that the status isn't lost.
This patch chops up virNetworkObjUpdateParseFile and turns it into
virNetworkLoadState and a few friends that will help us to load status
XMLs and refactors the functions that are loading the configs to use
them.
The cpu_map.xml file is there to separate CPU model definitions from the
code. Having the only interesting data for PowerPC models only in the
source code. This patch moves this data to the XML file and removes the
hardcoded list completely.
PowerPC CPUs are either identical or incompatible and thus we just need
to look up the right model for given PVR without pretending we have
several candidates which we may choose from.
The function is also renamed as ppcDecode to match other functions in
PowerPC CPU driver.
Baseline API is supposed to return guest CPU definition that can be used
on any of the provided host CPUs. Since PowerPC CPUs are either
identical or incompatible, the API just needs to check that all provided
CPUs are identical. Previous implementation was completely bogus.
The function is also renamed as ppcBaseline to match other functions in
PowerPC CPU driver.
When ppcVendorLoad fails to parse the vendor element for whatever
reason, it is supposed to ignore it and return 0 rather than -1. The
patch also removes PowerPC vendor string from the XML as it is not
actually used for anything.
Make getting node CPU data for PowerPC unsupported on other
architectures. The function is also renamed as ppcNodeData to match
other functions in PowerPC CPU driver.
Currently, -device xxx still doesn't work well for ppc64 platform.
It's better use legacy USB option with default for ppc64.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When running unprivileged, virSetUIDGIDWithCaps will fail because it
tries to add the requested capabilities to the permitted and effective
sets.
Detect this case, and invoke the child with cleared permitted and
effective sets. If it is a setuid program, it will get them.
Some care is needed also because you cannot drop capabilities from the
bounding set without CAP_SETPCAP. Because of that, ignore errors from
setting the bounding set.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The need_prctl variable is not really needed. If it is false,
capng_apply will be called twice with the same set, causing
a little extra work but no problem. This keeps the code a bit
simpler.
It is also clearer to invoke capng_apply(CAPNG_SELECT_BOUNDS)
separately, to make sure it is done while we have CAP_SETPCAP.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reusing the result of virArchFromHost instead of calling it multiple times
Signed-off-by: Tal Kain <tal.kain@ravellosystems.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Directories python/tools/examples should include them in <> form,
though this patch allows "" form in these directories by excluding
them, a later patch will do the cleanup.
The recent qemu requires "0x" prefix for the disk wwn, this patch
changes virValidateWWN to allow the prefix, and prepend "0x" if
it's not specified. E.g.
qemu-kvm: -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,\
drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,wwn=6000c60016ea71ad:
Property 'scsi-hd.wwn' doesn't take value '6000c60016ea71ad'
Though it's a qemu regression, but it's nice to allow the prefix,
and doesn't hurt for us to always output "0x".
Detected by a simple Shell script:
for i in $(git ls-files -- '*.[ch]'); do
awk 'BEGIN {
fail=0
}
/# *include.*\.h/{
match($0, /["<][^">]*[">]/)
arr[substr($0, RSTART+1, RLENGTH-2)]++
}
END {
for (key in arr) {
if (arr[key] > 1) {
fail=1
printf("%d %s\n", arr[key], key)
}
}
if (fail == 1)
exit 1
}' $i
if test $? != 0; then
echo "Duplicate header(s) in $i"
fi
done;
A later patch will add the syntax-check to avoid duplicate
headers.
Fix the error
util/vircgroup.c: In function 'virCgroupNewDomainPartition':
util/vircgroup.c:1299:11: error: declaration of 'dirname' shadows a global declaration [-Werror=shadow]
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Check for an unsupported QMP command when using the query-tpm-models
and query-tpm-types commands before checking for general errors
in order to avoid error messages in the log.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Check for QMP query-tpm-models and set a capability flag. Do not use
this QMP command if it is not supported.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The LXC driver currently has code to detect cgroups mounts
and then re-mount them inside the new root filesystem. Replace
this fragile code with a call to virCgroupIsolateMount.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a virCgroupIsolateMount method which looks at where the
current process is place in the cgroups (eg /system/demo.lxc.libvirt)
and then remounts the cgroups such that this sub-directory
becomes the root directory from the current process' POV.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If a cgroup controller is co-mounted with another, eg
/sys/fs/cgroup/cpu,cpuacct
Then it is a requirement that there exist symlinks at
/sys/fs/cgroup/cpu
/sys/fs/cgroup/cpuacct
pointing to the real mount point. Add support to virCgroupPtr
to detect and track these symlinks
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupNewDriver method had a 'bool privileged' param.
If a false value was ever passed in, it would simply not
work, since non-root users don't have any privileges to create
new cgroups. Just delete this broken code entirely and make
the QEMU driver skip cgroup setup in non-privileged mode
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Historically QEMU/LXC guests have been placed in a cgroup layout
that is
$LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$VMNAME
This is bad for a number of reasons
- The cgroup hierarchy gets very deep which seriously
impacts kernel performance due to cgroups scalability
limitations.
- It is hard to setup cgroup policies which apply across
services and virtual machines, since all VMs are underneath
the libvirtd service.
To address this the default cgroup location is changed to
be
/system/$VMNAME.{lxc,qemu}.libvirt
This puts virtual machines at the same level in the hierarchy
as system services, allowing consistent policy to be setup
across all of them.
This also honours the new resource partition location from the
XML configuration, for example
<resource>
<partition>/virtualmachines/production</partitions>
</resource>
will result in the VM being placed at
/virtualmachines/production/$VMNAME.{lxc,qemu}.libvirt
NB, with the exception of the default, /system, path which
is intended to always exist, libvirt will not attempt to
auto-create the partitions in the XML. It is the responsibility
of the admin/app to configure the partitions. Later libvirt
APIs will provide a way todo this.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Allow VMs to be placed into resource groups using the
following syntax
<resource>
<partition>/virtualmachines/production</partition>
</resource>
A resource cgroup will be backed by some hypervisor specific
functionality, such as cgroups with KVM/LXC.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A resource partition is an absolute cgroup path, ignoring the
current process placement. Expose a virCgroupNewPartition API
for constructing such cgroups
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently if virCgroupMakeGroup fails, we can get in a situation
where some controllers have been setup, but others not. Ensure
we call virCgroupRemove to remove what we've done upon failure
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the virCgroupPtr struct contains 3 pieces of
information
- path - path of the cgroup, relative to current process'
cgroup placement
- placement - current process' placement in each controller
- mounts - mount point of each controller
When reading/writing cgroup settings, the path & placement
strings are combined to form the file path. This approach
only works if we assume all cgroups will be relative to
the current process' cgroup placement.
To allow support for managing cgroups at any place in the
heirarchy a change is needed. The 'placement' data should
reflect the absolute path to the cgroup, and the 'path'
value should no longer be used to form the paths to the
cgroup attribute files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Rename all the virCgroupForXXX methods to use the form
virCgroupNewXXX since they are all constructors. Also
make sure the output parameter is the last one in the
list, and annotate all pointers as non-null. Fix up
all callers, and make sure they use true/false not 0/1
for the boolean parameters
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The definition of structs for cgroups are kept in vircgroup.c since
they are intended to be private from users of the API. To enable
effective testing, however, they need to be accessible. To address
the latter issue, without compronmising the former, this introduces
a new vircgrouppriv.h file to hold the struct definitions.
To prevent other files including this private header, it requires
that __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ be defined before inclusion
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of calling virCgroupForDomain every time we need
the virCgrouPtr instance, just do it once at Vm startup
and cache a reference to the object in virLXCDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the LXC driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of calling virCgroupForDomain every time we need
the virCgrouPtr instance, just do it once at Vm startup
and cache a reference to the object in qemuDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the QEMU driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupForDriver method recently gained an 'int controllers'
parameter, but the stub impl did not
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Though they are the same thing, mixed use of them is uncomfortable.
"unsigned" is used a lot in old codes, this just tries to change the
ones in utils.
If libvirt makes any gcry_control() calls, then this
prevents gnutls for doing any initialization. As such
we must take care to do full initialization of libcrypt
on a par with what gnutls would have done. In particular
we must disable "sec mem" for cases where the user does
not have mlock() permission. We also skip our init of
libgcrypt if something else (ie the app using libvirt)
has beaten us to it.
https://bugzilla.redhat.com/show_bug.cgi?id=951630
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Report the errors as:
Domain not found: no domain with matching uuid '41414141-4141-4141-4141-414141414141' (crashtest)
instead of:
Domain not found: no domain with matching uuid '41414141-4141-4141-4141-414141414141'
Use the helper to lookup the domain object in the remaining places.
This patch also fixes error reporting when the domain was not found in several
functions that were printing the raw UUID buffer instead of the formatted
string. The offending functions were:
qemuDomainGetInterfaceParameters
qemuDomainSetInterfaceParameters
qemuGetSchedulerParametersFlags
qemuSetSchedulerParametersFlags
qemuDomainGetNumaParameters
qemuDomainSetNumaParameters
qemuDomainGetMemoryParameters
qemuDomainSetMemoryParameters
qemuDomainGetBlkioParameters
qemuDomainSetBlkioParameters
qemuDomainGetCPUStats
Some refactoring for virDomainChrSourceDef type of devices so
we can use common code.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
When a VM with a TPM passthrough device is started, the audit daemon
logs the following type of message:
type=VIRT_RESOURCE msg=audit(1365170222.460:3378): pid=16382 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm resrc=dev reason=start vm="TPM-PT" uuid=a4d7cd22-da89-3094-6212-079a48a309a1 device="/dev/tpm0" exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Parse the domain XML with TPM passthrough support.
The TPM passthrough XML may look like this:
<tpm model='tpm-tis'>
<backend type='passthrough'>
<device path='/dev/tpm0'/>
</backend>
</tpm>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Probe for QEMU's QMP TPM support by querying the lists of
supported TPM models (query-tpm-models) and backend types
(query-tpm-types).
The setting of the capability flags following the strings
returned from the commands above is only provided in the
patch where domain_conf.c gets TPM support due to dependencies
on functions only introduced there.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
This patch adds the ability to configure non-contiguous boot orders on boot
devices. This allows unplugging devices that have boot order specified without
breaking migration.
The new code now uses a slightly less memory efficient approach to store the
boot order fields in a hashtable instead of a bitmap.
To avoid the collision for creating USB controllers in machine->init()
and -device xx command line, it needs to set usb=off to avoid one USB
controller created in machine->init(). So that libvirt can use -device
or -usb to create USB controller sucessfully.
So QEMU_CAPS_MACHINE_USB_OPT capability is added, and it is for QEMU
v1.3.0 onwards which supports USB option.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
When migrating a domain with disk images stored locally (and using
storage migration), we should not complain about unsafe migration no
matter what cache policy is used for that disk.
https://bugzilla.redhat.com/show_bug.cgi?id=920441
Currently, we are discarding listen attribute from qemu cookie even though
we strive to gather it. This result in not so cool bug: if user have
different networks, one for management/migration, and one for VNC/SPICE we
pass incorrect host to the qemu in client_migrate_info. What we actually
pass is remote hostname, while we should be passing remote listen address.
It doesn't matter as long as these two are the same, but they don't need
necessary to be like that.
==5306== 8 bytes in 1 blocks are definitely lost in loss record 24 of 277
==5306== at 0x4C28B2F: calloc (vg_replace_malloc.c:593)
==5306== by 0x5293CAF: virAllocN (viralloc.c:152)
==5306== by 0x52DFEAE: virXPathNodeSet (virxml.c:611)
==5306== by 0x5313DD9: virNetworkDefParseXML (network_conf.c:1408)
==5306== by 0x53170F6: virNetworkObjUpdateParseFile (network_conf.c:2031)
==5306== by 0x131DA63C: networkStartup (bridge_driver.c:279)
==5306== by 0x53481DF: virStateInitialize (libvirt.c:822)
==5306== by 0x40DF44: daemonRunStateInit (libvirtd.c:877)
==5306== by 0x52D2FF5: virThreadHelper (virthreadpthread.c:161)
==5306== by 0x5D00C52: start_thread (in /usr/lib64/libpthread-2.17.so)
==5306== by 0x6410ECC: clone (in /usr/lib64/libc-2.17.so)
This patch fixes crash of the daemon that happens due to the following race
condition:
Let's have two threads in the libvirtd daemon's qemu driver:
A - thread executing undefine on the same domain
B - thread executing a API call to get information about a domain
Assume following serialization of operations done by the threads:
1) A has the lock on the domain object and is executing some code prior to
virDomainObjListRemove()
2) B takes the lock on the domain object list, looks up the domain object
pointer and blocks in the attempt to lock the domain object as A is holding the
lock
3) A reaches virDomainObjListRemove() and unlocks the lock on the domain object
4) A blocks on the attempt to get the domain list lock
5) B is able to lock the domain object now and unlocks the domain list
6) A is now able to lock the domain list, and sheds the last reference on the
domain object, this triggers the freeing function.
6) B starts executing the code on the pointer that is being freed
7) The libvirtd daemon crashes while attempting to access invalid pointer in
thread B.
This patch fixes the race by acquiring a reference on the domain object before
unlocking it in virDomainObjListRemove() and re-locks the object prior to
removing and freeing it. This ensures that no thread holds a lock on the domain
object at the time it is removed from the list, and that doing a list lookup
will never find a domain that is about to vanish.
This is a minimal fix of the problem, but a better solution will be to switch to
full reference counting for domain objects.
Commit 9a3ff01d7f (which was ACKed at
the end of January, but for some reason didn't get pushed until during
the 1.0.4 freeze) fixed the logic in virPCIGetVirtualFunctions().
Unfortunately, a typo in the fix (replacing VIR_REALLOC_N with
VIR_ALLOC_N during code movement) caused not only a memory leak, but
also resulted in most of the elements of the result array being
replaced with NULL. virNetDevGetVirtualFunctions() assumed (and I think
rightly so) that virPCIGetVirtualFunctions() wouldn't return any NULL
elements in the array, so it ended up segfaulting.
This was found when attempting to use a virtual network with an
auto-created pool of SRIOV VFs, e.g.:
<forward mode='hostdev' managed='yes'>
<pf dev='eth4'/>
</forward>
(the pool of PCI addresses is discovered by calling
virNetDevGetVirtualFunctions() on the PF dev).
The helper function to look up disk controller model may be used by scsi
hostdev. But it should be changed to use device info.
Signed-off-by: Han Cheng <hanc.fnst@cn.fujitsu.com>
Even though http://libvirt.org/formatdomain.html#elementsMetadata
states that it requires RFC4122 compliance UUIDs that are generated
by virUUIDGenerate() are not. Following patch modifies generated
UUIDs to conform to rules described in RFC.
Signed-off-by: Milos Vyletel <milos.vyletel@sde.cz>
If the user requests a mount for /run, this may hide any existing
mounts that are lower down in /run. The result is that the
container still sees the mounts in /proc/mounts, but cannot
access them
sh-4.2# df
df: '/run/user/501/gvfs': No such file or directory
df: '/run/media/berrange/LIVE': No such file or directory
df: '/run/media/berrange/SecureDiskA1': No such file or directory
df: '/run/libvirt/lxc/sandbox': No such file or directory
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/vg_t500wlan-lv_root 151476396 135390200 8384900 95% /
tmpfs 1970888 3204 1967684 1% /run
/dev/sda1 194241 155940 28061 85% /boot
devfs 64 0 64 0% /dev
tmpfs 64 0 64 0% /sys/fs/cgroup
tmpfs 1970888 1200 1969688 1% /etc/libvirt-sandbox/scratch
Before mounting any filesystem at a particular location, we
must recursively unmount anything at or below the target mount
point
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure lxcContainerUnmountSubtree is at the top of the
lxc_container.c file so it is easily referenced from
any other method. No functional change
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This allows a container-type domain to have exclusive access to one of
the host's NICs.
Wire <hostdev caps=net> with the lxc_controller - when moving the newly
created veth devices into a new namespace, also look for any hostdev
devices that should be moved. Note: once the container domain has been
destroyed, there is no code that moves the interfaces back to the
original namespace. This does happen, though, probably due to default
cleanup on namespace destruction.
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
This updates the definitions and supporting structures in the XML
schema and domain configuration files.
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
The virCgroupMounted method is badly named, since a controller can be
mounted, but disabled in the current object. Rename the method to be
virCgroupHasController. Also make it tolerant to a NULL virCgroupPtr
and out-of-range controller index, to avoid duplication of these
checks in all callers
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>