Check for QMP query-tpm-models and set a capability flag. Do not use
this QMP command if it is not supported.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The LXC driver currently has code to detect cgroups mounts
and then re-mount them inside the new root filesystem. Replace
this fragile code with a call to virCgroupIsolateMount.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a virCgroupIsolateMount method which looks at where the
current process is place in the cgroups (eg /system/demo.lxc.libvirt)
and then remounts the cgroups such that this sub-directory
becomes the root directory from the current process' POV.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If a cgroup controller is co-mounted with another, eg
/sys/fs/cgroup/cpu,cpuacct
Then it is a requirement that there exist symlinks at
/sys/fs/cgroup/cpu
/sys/fs/cgroup/cpuacct
pointing to the real mount point. Add support to virCgroupPtr
to detect and track these symlinks
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupNewDriver method had a 'bool privileged' param.
If a false value was ever passed in, it would simply not
work, since non-root users don't have any privileges to create
new cgroups. Just delete this broken code entirely and make
the QEMU driver skip cgroup setup in non-privileged mode
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Historically QEMU/LXC guests have been placed in a cgroup layout
that is
$LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$VMNAME
This is bad for a number of reasons
- The cgroup hierarchy gets very deep which seriously
impacts kernel performance due to cgroups scalability
limitations.
- It is hard to setup cgroup policies which apply across
services and virtual machines, since all VMs are underneath
the libvirtd service.
To address this the default cgroup location is changed to
be
/system/$VMNAME.{lxc,qemu}.libvirt
This puts virtual machines at the same level in the hierarchy
as system services, allowing consistent policy to be setup
across all of them.
This also honours the new resource partition location from the
XML configuration, for example
<resource>
<partition>/virtualmachines/production</partitions>
</resource>
will result in the VM being placed at
/virtualmachines/production/$VMNAME.{lxc,qemu}.libvirt
NB, with the exception of the default, /system, path which
is intended to always exist, libvirt will not attempt to
auto-create the partitions in the XML. It is the responsibility
of the admin/app to configure the partitions. Later libvirt
APIs will provide a way todo this.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Allow VMs to be placed into resource groups using the
following syntax
<resource>
<partition>/virtualmachines/production</partition>
</resource>
A resource cgroup will be backed by some hypervisor specific
functionality, such as cgroups with KVM/LXC.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A resource partition is an absolute cgroup path, ignoring the
current process placement. Expose a virCgroupNewPartition API
for constructing such cgroups
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently if virCgroupMakeGroup fails, we can get in a situation
where some controllers have been setup, but others not. Ensure
we call virCgroupRemove to remove what we've done upon failure
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the virCgroupPtr struct contains 3 pieces of
information
- path - path of the cgroup, relative to current process'
cgroup placement
- placement - current process' placement in each controller
- mounts - mount point of each controller
When reading/writing cgroup settings, the path & placement
strings are combined to form the file path. This approach
only works if we assume all cgroups will be relative to
the current process' cgroup placement.
To allow support for managing cgroups at any place in the
heirarchy a change is needed. The 'placement' data should
reflect the absolute path to the cgroup, and the 'path'
value should no longer be used to form the paths to the
cgroup attribute files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Rename all the virCgroupForXXX methods to use the form
virCgroupNewXXX since they are all constructors. Also
make sure the output parameter is the last one in the
list, and annotate all pointers as non-null. Fix up
all callers, and make sure they use true/false not 0/1
for the boolean parameters
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The definition of structs for cgroups are kept in vircgroup.c since
they are intended to be private from users of the API. To enable
effective testing, however, they need to be accessible. To address
the latter issue, without compronmising the former, this introduces
a new vircgrouppriv.h file to hold the struct definitions.
To prevent other files including this private header, it requires
that __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ be defined before inclusion
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of calling virCgroupForDomain every time we need
the virCgrouPtr instance, just do it once at Vm startup
and cache a reference to the object in virLXCDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the LXC driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of calling virCgroupForDomain every time we need
the virCgrouPtr instance, just do it once at Vm startup
and cache a reference to the object in qemuDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the QEMU driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupForDriver method recently gained an 'int controllers'
parameter, but the stub impl did not
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Though they are the same thing, mixed use of them is uncomfortable.
"unsigned" is used a lot in old codes, this just tries to change the
ones in utils.
If libvirt makes any gcry_control() calls, then this
prevents gnutls for doing any initialization. As such
we must take care to do full initialization of libcrypt
on a par with what gnutls would have done. In particular
we must disable "sec mem" for cases where the user does
not have mlock() permission. We also skip our init of
libgcrypt if something else (ie the app using libvirt)
has beaten us to it.
https://bugzilla.redhat.com/show_bug.cgi?id=951630
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Report the errors as:
Domain not found: no domain with matching uuid '41414141-4141-4141-4141-414141414141' (crashtest)
instead of:
Domain not found: no domain with matching uuid '41414141-4141-4141-4141-414141414141'
Use the helper to lookup the domain object in the remaining places.
This patch also fixes error reporting when the domain was not found in several
functions that were printing the raw UUID buffer instead of the formatted
string. The offending functions were:
qemuDomainGetInterfaceParameters
qemuDomainSetInterfaceParameters
qemuGetSchedulerParametersFlags
qemuSetSchedulerParametersFlags
qemuDomainGetNumaParameters
qemuDomainSetNumaParameters
qemuDomainGetMemoryParameters
qemuDomainSetMemoryParameters
qemuDomainGetBlkioParameters
qemuDomainSetBlkioParameters
qemuDomainGetCPUStats
Some refactoring for virDomainChrSourceDef type of devices so
we can use common code.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
When a VM with a TPM passthrough device is started, the audit daemon
logs the following type of message:
type=VIRT_RESOURCE msg=audit(1365170222.460:3378): pid=16382 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm resrc=dev reason=start vm="TPM-PT" uuid=a4d7cd22-da89-3094-6212-079a48a309a1 device="/dev/tpm0" exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Parse the domain XML with TPM passthrough support.
The TPM passthrough XML may look like this:
<tpm model='tpm-tis'>
<backend type='passthrough'>
<device path='/dev/tpm0'/>
</backend>
</tpm>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Probe for QEMU's QMP TPM support by querying the lists of
supported TPM models (query-tpm-models) and backend types
(query-tpm-types).
The setting of the capability flags following the strings
returned from the commands above is only provided in the
patch where domain_conf.c gets TPM support due to dependencies
on functions only introduced there.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
This patch adds the ability to configure non-contiguous boot orders on boot
devices. This allows unplugging devices that have boot order specified without
breaking migration.
The new code now uses a slightly less memory efficient approach to store the
boot order fields in a hashtable instead of a bitmap.
To avoid the collision for creating USB controllers in machine->init()
and -device xx command line, it needs to set usb=off to avoid one USB
controller created in machine->init(). So that libvirt can use -device
or -usb to create USB controller sucessfully.
So QEMU_CAPS_MACHINE_USB_OPT capability is added, and it is for QEMU
v1.3.0 onwards which supports USB option.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
When migrating a domain with disk images stored locally (and using
storage migration), we should not complain about unsafe migration no
matter what cache policy is used for that disk.
https://bugzilla.redhat.com/show_bug.cgi?id=920441
Currently, we are discarding listen attribute from qemu cookie even though
we strive to gather it. This result in not so cool bug: if user have
different networks, one for management/migration, and one for VNC/SPICE we
pass incorrect host to the qemu in client_migrate_info. What we actually
pass is remote hostname, while we should be passing remote listen address.
It doesn't matter as long as these two are the same, but they don't need
necessary to be like that.
==5306== 8 bytes in 1 blocks are definitely lost in loss record 24 of 277
==5306== at 0x4C28B2F: calloc (vg_replace_malloc.c:593)
==5306== by 0x5293CAF: virAllocN (viralloc.c:152)
==5306== by 0x52DFEAE: virXPathNodeSet (virxml.c:611)
==5306== by 0x5313DD9: virNetworkDefParseXML (network_conf.c:1408)
==5306== by 0x53170F6: virNetworkObjUpdateParseFile (network_conf.c:2031)
==5306== by 0x131DA63C: networkStartup (bridge_driver.c:279)
==5306== by 0x53481DF: virStateInitialize (libvirt.c:822)
==5306== by 0x40DF44: daemonRunStateInit (libvirtd.c:877)
==5306== by 0x52D2FF5: virThreadHelper (virthreadpthread.c:161)
==5306== by 0x5D00C52: start_thread (in /usr/lib64/libpthread-2.17.so)
==5306== by 0x6410ECC: clone (in /usr/lib64/libc-2.17.so)
This patch fixes crash of the daemon that happens due to the following race
condition:
Let's have two threads in the libvirtd daemon's qemu driver:
A - thread executing undefine on the same domain
B - thread executing a API call to get information about a domain
Assume following serialization of operations done by the threads:
1) A has the lock on the domain object and is executing some code prior to
virDomainObjListRemove()
2) B takes the lock on the domain object list, looks up the domain object
pointer and blocks in the attempt to lock the domain object as A is holding the
lock
3) A reaches virDomainObjListRemove() and unlocks the lock on the domain object
4) A blocks on the attempt to get the domain list lock
5) B is able to lock the domain object now and unlocks the domain list
6) A is now able to lock the domain list, and sheds the last reference on the
domain object, this triggers the freeing function.
6) B starts executing the code on the pointer that is being freed
7) The libvirtd daemon crashes while attempting to access invalid pointer in
thread B.
This patch fixes the race by acquiring a reference on the domain object before
unlocking it in virDomainObjListRemove() and re-locks the object prior to
removing and freeing it. This ensures that no thread holds a lock on the domain
object at the time it is removed from the list, and that doing a list lookup
will never find a domain that is about to vanish.
This is a minimal fix of the problem, but a better solution will be to switch to
full reference counting for domain objects.
Commit 9a3ff01d7f (which was ACKed at
the end of January, but for some reason didn't get pushed until during
the 1.0.4 freeze) fixed the logic in virPCIGetVirtualFunctions().
Unfortunately, a typo in the fix (replacing VIR_REALLOC_N with
VIR_ALLOC_N during code movement) caused not only a memory leak, but
also resulted in most of the elements of the result array being
replaced with NULL. virNetDevGetVirtualFunctions() assumed (and I think
rightly so) that virPCIGetVirtualFunctions() wouldn't return any NULL
elements in the array, so it ended up segfaulting.
This was found when attempting to use a virtual network with an
auto-created pool of SRIOV VFs, e.g.:
<forward mode='hostdev' managed='yes'>
<pf dev='eth4'/>
</forward>
(the pool of PCI addresses is discovered by calling
virNetDevGetVirtualFunctions() on the PF dev).
The helper function to look up disk controller model may be used by scsi
hostdev. But it should be changed to use device info.
Signed-off-by: Han Cheng <hanc.fnst@cn.fujitsu.com>
Even though http://libvirt.org/formatdomain.html#elementsMetadata
states that it requires RFC4122 compliance UUIDs that are generated
by virUUIDGenerate() are not. Following patch modifies generated
UUIDs to conform to rules described in RFC.
Signed-off-by: Milos Vyletel <milos.vyletel@sde.cz>
If the user requests a mount for /run, this may hide any existing
mounts that are lower down in /run. The result is that the
container still sees the mounts in /proc/mounts, but cannot
access them
sh-4.2# df
df: '/run/user/501/gvfs': No such file or directory
df: '/run/media/berrange/LIVE': No such file or directory
df: '/run/media/berrange/SecureDiskA1': No such file or directory
df: '/run/libvirt/lxc/sandbox': No such file or directory
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/vg_t500wlan-lv_root 151476396 135390200 8384900 95% /
tmpfs 1970888 3204 1967684 1% /run
/dev/sda1 194241 155940 28061 85% /boot
devfs 64 0 64 0% /dev
tmpfs 64 0 64 0% /sys/fs/cgroup
tmpfs 1970888 1200 1969688 1% /etc/libvirt-sandbox/scratch
Before mounting any filesystem at a particular location, we
must recursively unmount anything at or below the target mount
point
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure lxcContainerUnmountSubtree is at the top of the
lxc_container.c file so it is easily referenced from
any other method. No functional change
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This allows a container-type domain to have exclusive access to one of
the host's NICs.
Wire <hostdev caps=net> with the lxc_controller - when moving the newly
created veth devices into a new namespace, also look for any hostdev
devices that should be moved. Note: once the container domain has been
destroyed, there is no code that moves the interfaces back to the
original namespace. This does happen, though, probably due to default
cleanup on namespace destruction.
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
This updates the definitions and supporting structures in the XML
schema and domain configuration files.
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
The virCgroupMounted method is badly named, since a controller can be
mounted, but disabled in the current object. Rename the method to be
virCgroupHasController. Also make it tolerant to a NULL virCgroupPtr
and out-of-range controller index, to avoid duplication of these
checks in all callers
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To support "shareable" for volume type disk, we have to translate
the source before trying to add the shared disk entry. To achieve
the goal, this moves the helper qemuTranslateDiskSourcePool into
src/qemu/qemu_conf.c, and introduce an internal only member (voltype)
for struct _virDomainDiskSourcePoolDef, to record the underlying
volume type for use when building the drive string.
Later patch will support "shareable" volume type disk.
This adds a new helper qemuTranslateDiskSourcePool which uses the
storage pool/vol APIs to translate the disk source before building
the drive string. Network volume is not supported yet. Disk chain
for volume type disk may be supported later, but before I'm confident
it doesn't break anything, it's just disabled now.
With this patch, one can specify the disk source using libvirt
storage like:
<disk type='volume' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source pool='default' volume='fc18.img'/>
<target dev='vdb' bus='virtio'/>
</disk>
"seclabels" and "startupPolicy" are not supported for this new
disk type ("volume"). They will be supported in later patches.
docs/formatdomain.html.in:
* Add documents for new XMLs
docs/schemas/domaincommon.rng:
* Add rng for new XMLs;
src/conf/domain_conf.h:
* New struct for 'volume' type disk source (virDomainDiskSourcePoolDef)
* Add VIR_DOMAIN_DISK_TYPE_VOLUME for enum virDomainDiskType
src/conf/domain_conf.c:
* New helper virDomainDiskSourcePoolDefParse to parse the 'volume'
type disk source.
* New helper virDomainDiskSourcePoolDefFree to free the source def
if 'volume' type disk.
tests/qemuxml2argvdata/qemuxml2argv-disk-source-pool.xml:
tests/qemuxml2xmltest.c:
* New test
This finds the parent for vHBA by iterating over all the HBA
which supports vport_ops capability on the host, and return
the first one which is online, not saturated (vports in use
is less than max_vports).
startPool creates the vHBA if it's not existed yet, stopPool destroys
the vHBA. Also to support autostart, checkPool will creates the vHBA
if it's not existed yet.
The helper iterates over sysfs, to find out the matched scsi host
name by comparing the wwnn,wwpn pair. It will be used by checkPool
and refreshPool of storage scsi backend. New helper getAdapterName
is introduced in storage_backend_scsi.c, which uses the new util
helper virGetFCHostNameByWWN to get the fc_host adapter name.
node device driver names the HBA like "scsi_host5", but storage
driver uses "host5", which could make the user confused. This
changes them to be consistent. However, for back-compat reason,
adapter name like "host5" is still supported.
This introduces 4 new attributes for storage pool source adapter.
E.g.
<adapter type='fc_host' parent='scsi_host5' wwnn='20000000c9831b4b' wwpn='10000000c9831b4b'/>
Attribute 'type' can be either 'scsi_host' or 'fc_host', and defaults
to 'scsi_host' if attribute 'name' is specified. I.e. It's optional
for 'scsi_host' adapter, for back-compat reason. However, mandatory
for 'fc_host' adapter and any new future adapter types. Attribute
'parent' is to specify the parent for the fc_host adapter.
* docs/formatstorage.html.in:
- Add documents for the 4 new attrs
* docs/schemas/storagepool.rng:
- Add RNG schema
* src/conf/storage_conf.c:
- Parse and format the new XMLs
* src/conf/storage_conf.h:
- New struct virStoragePoolSourceAdapter, replace "char *adapter" with it;
- New enum virStoragePoolSourceAdapterType
* src/libvirt_private.syms:
- Export TypeToString and TypeFromString
* src/phyp/phyp_driver.c:
- Replace "adapter" with "adapter.data.name", which is member of the union
of the new struct virStoragePoolSourceAdapter now. Later patch will
add the checking, as "adapter.data.name" is only valid for "scsi_host"
adapter.
* src/storage/storage_backend_scsi.c:
- Like above
* tests/storagepoolxml2xmlin/pool-scsi-type-scsi-host.xml:
* tests/storagepoolxml2xmlin/pool-scsi-type-fc-host.xml:
- New test for 'fc_host' and "scsi_host" adapter
* tests/storagepoolxml2xmlout/pool-scsi.xml:
- Change the expected output, as the 'type' defaults to 'scsi_host' if 'name"
specified now
* tests/storagepoolxml2xmlout/pool-scsi-type-scsi-host.xml:
* tests/storagepoolxml2xmlout/pool-scsi-type-fc-host.xml:
- New test
* tests/storagepoolxml2xmltest.c:
- Include the test
There are a number of places which generate cast alignment
warnings, which are difficult or impossible to address. Use
pragmas to disable the warnings in these few places
conf/nwfilter_conf.c: In function 'virNWFilterRuleDetailsParse':
conf/nwfilter_conf.c:1806:16: warning: cast increases required alignment of target type [-Wcast-align]
item = (nwItemDesc *)((char *)nwf + att[idx].dataIdx);
conf/nwfilter_conf.c: In function 'virNWFilterRuleDefDetailsFormat':
conf/nwfilter_conf.c:3238:16: warning: cast increases required alignment of target type [-Wcast-align]
item = (nwItemDesc *)((char *)def + att[i].dataIdx);
storage/storage_backend_mpath.c: In function 'virStorageBackendCreateVols':
storage/storage_backend_mpath.c:247:17: warning: cast increases required alignment of target type [-Wcast-align]
names = (struct dm_names *)(((char *)names) + next);
nwfilter/nwfilter_dhcpsnoop.c: In function 'virNWFilterSnoopDHCPDecode':
nwfilter/nwfilter_dhcpsnoop.c:994:15: warning: cast increases required alignment of target type [-Wcast-align]
pip = (struct iphdr *) pep->eh_data;
nwfilter/nwfilter_dhcpsnoop.c:1004:11: warning: cast increases required alignment of target type [-Wcast-align]
pup = (struct udphdr *) ((char *) pip + (pip->ihl << 2));
nwfilter/nwfilter_learnipaddr.c: In function 'procDHCPOpts':
nwfilter/nwfilter_learnipaddr.c:327:33: warning: cast increases required alignment of target type [-Wcast-align]
uint32_t *tmp = (uint32_t *)&dhcpopt->value;
nwfilter/nwfilter_learnipaddr.c: In function 'learnIPAddressThread':
nwfilter/nwfilter_learnipaddr.c:501:43: warning: cast increases required alignment of target type [-Wcast-align]
struct iphdr *iphdr = (struct iphdr*)(packet +
nwfilter/nwfilter_learnipaddr.c:538:43: warning: cast increases required alignment of target type [-Wcast-align]
struct iphdr *iphdr = (struct iphdr*)(packet +
nwfilter/nwfilter_learnipaddr.c:544:48: warning: cast increases required alignment of target type [-Wcast-align]
struct udphdr *udphdr= (struct udphdr *)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When reading the inotify FD, we get back a sequence of
struct inotify_event, each with variable length data following.
It is not safe to simply cast from the char *buf to the
struct inotify_event struct since this may violate data
alignment rules. Thus we must copy from the char *buf
into the struct inotify_event instance before accessing
the data.
uml/uml_driver.c: In function 'umlInotifyEvent':
uml/uml_driver.c:327:13: warning: cast increases required alignment of target type [-Wcast-align]
e = (struct inotify_event *)tmp;
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current way virObject instances are allocated using
VIR_ALLOC_N causes alignment warnings
util/virobject.c: In function 'virObjectNew':
util/virobject.c:195:11: error: cast increases required alignment of target type [-Werror=cast-align]
Changing to use VIR_ALLOC_VAR will avoid the need todo
the casts entirely.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virNetlinkCommand() method takes an 'unsigned char **'
parameter to be filled with the received netlink message.
The callers then immediately cast this to 'struct nlmsghdr',
triggering (bogus) warnings about increasing alignment
requirements
util/virnetdev.c: In function 'virNetDevLinkDump':
util/virnetdev.c:1300:12: warning: cast increases required alignment of target type [-Wcast-align]
resp = (struct nlmsghdr *)*recvbuf;
^
util/virnetdev.c: In function 'virNetDevSetVfConfig':
util/virnetdev.c:1429:12: warning: cast increases required alignment of target type [-Wcast-align]
resp = (struct nlmsghdr *)recvbuf;
Since all callers cast to 'struct nlmsghdr' we can avoid
the warning problem entirely by simply changing the
signature of virNetlinkCommand to return a 'struct nlmsghdr **'
instead of 'unsigned char **'. The way we do the cast inside
virNetlinkCommand does not have any alignment issues.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Playing games with field offsets in a struct causes all sorts
of alignment warnings on ARM platforms
util/virkeycode.c: In function '__virKeycodeValueFromString':
util/virkeycode.c:26:7: warning: cast increases required alignment of target type [-Wcast-align]
(*(typeof(field_type) *)((char *)(object) + field_offset))
^
util/virkeycode.c:91:28: note: in expansion of macro 'getfield'
const char *name = getfield(virKeycodes + i, const char *, name_offset);
^
util/virkeycode.c:26:7: warning: cast increases required alignment of target type [-Wcast-align]
(*(typeof(field_type) *)((char *)(object) + field_offset))
^
util/virkeycode.c:94:20: note: in expansion of macro 'getfield'
return getfield(virKeycodes + i, unsigned short, code_offset);
^
util/virkeycode.c: In function '__virKeycodeValueTranslate':
util/virkeycode.c:26:7: warning: cast increases required alignment of target type [-Wcast-align]
(*(typeof(field_type) *)((char *)(object) + field_offset))
^
util/virkeycode.c:127:13: note: in expansion of macro 'getfield'
if (getfield(virKeycodes + i, unsigned short, from_offset) == key_value)
^
util/virkeycode.c:26:7: warning: cast increases required alignment of target type [-Wcast-align]
(*(typeof(field_type) *)((char *)(object) + field_offset))
^
util/virkeycode.c:128:20: note: in expansion of macro 'getfield'
return getfield(virKeycodes + i, unsigned short, to_offset);
There is no compelling reason to use a struct for the keycode
tables. It can easily just use an array of arrays instead,
avoiding all alignment problems
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
For both "live" and "config" changes of vcpupin and emulatorpin, an
all clear bitmap doesn't make sense, and it can just cause corruptions.
E.g (similar for emulatorpin).
% virsh vcpupin hame 0 8,^8 --config
% virsh vcpupin hame
VCPU: CPU Affinity
----------------------------------
0:
1: 0-63
2: 0-63
3: 0-63
% virsh dumpxml hame | grep cpuset
<vcpupin vcpu='0' cpuset=''/>
% virsh start hame
error: Failed to start domain hame
error: An error occurred, but the cause is unknown
This introduce a new attribute "num_queues" (same with the good name
QEMU uses) for virtio-scsi controller. An example of the XML:
<controller type='scsi' index='0' model='virtio-scsi' num_queues='8'/>
The corresponding QEMU command line:
-device virtio-scsi-pci,id=scsi0,num_queues=8,bus=pci.0,addr=0x3 \
By default, libtool builds two .o files for every .lo rule:
src/foo.o - static builds
src/.libs/foo.o - shared library builds
But since commit ad42b34b disabled static builds, src/foo.o is
no longer built by default. On a fresh checkout, this means our
protocol check rules using pdwtags were testing a missing file,
and thanks to a lousy behavior of pdwtags happily giving no output
and 0 exit status (http://bugzilla.redhat.com/949034), we were
merely claiming that "dwarves is too old" and skipping the test.
However, if you swap between branches and do incremental builds,
such as building v0.10.2-maint and then switching back to master,
you end up with src/foo.o being leftover from its 0.10.2 state,
and then 'make check' fails because the .o file does not match
the protocol-structs file due to API additions in the meantime.
A simpler fix would be to always look in .libs for the .o to
be parsed; but since it is possible to pass ./configure options
to tell libtool to do a static-only build with no shared .o,
I went with the approach of finding the newest of the two files,
whenever both exist.
* src/Makefile.am (PDWTAGS): Ensure we test just-built file.
When setting processor count for a domain using the API libvirt enforced
a maximum processor count, while it isn't enforced when taking the XML path.
This patch removes the check to match the XML.
Currently when getting an instance of virCgroupPtr we will
create the path in all cgroup controllers. Only at the virt
driver layer are we attempting to filter controllers. This
is bad because the mere act of creating the dirs in the
controllers can have a functional impact on the kernel,
particularly for performance.
Update the virCgroupForDriver() method to accept a bitmask
of controllers to use. Only create dirs in the controllers
that are requested. When creating cgroups for domains,
respect the active controller list from the parent cgroup
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupGetAppRoot is not clear in its meaning. Change
to virCgroupForSelf to highlight that this returns the
cgroup config for the caller's process
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The last Viktor's effort to fix the race and memory corruption unfortunately
wasn't complete in the case the close callback was not registered in an
connection. At that time, the trail of event's that I'll describe later could
still happen and corrupt the memory or cause a crash of the client (including
the daemon in case of a p2p migration).
Consider the following prerequisities and trail of events:
Let's have a remote connection to a hypervisor that doesn't have a close
callback registered and the client is using the event loop. The crash happens in
cooperation of 2 threads. Thread E is the event loop and thread W is the worker
that does some stuff. R denotes the remote client.
1.) W - The client finishes everything and sheds the last reference on the client
2.) W - The virObject stuff invokes virConnectDispose that invokes doRemoteClose
3.) W - the remote close method invokes the REMOTE_PROC_CLOSE RPC method.
4.) W - The thread is preempted at this point.
5.) R - The remote side receives the close and closes the socket.
6.) E - poll() wakes up due to the closed socket and invokes the close callback
7.) E - The event loop is preempted right before remoteClientCloseFunc is called
8.) W - The worker now finishes, and frees the conn object.
9.) E - The remoteClientCloseFunc accesses the now-freed conn object in the
attempt to retrieve pointer for the real close callback.
10.) Kaboom, corrupted memory/segfault.
This patch tries to fix this by introducing a new object that survives the
freeing of the connection object. We can't increase the reference count on the
connection object itself or the connection would never be closed, as the
connection is closed only when the reference count reaches zero.
The new object - virConnectCloseCallbackData - is a lockable object that keeps
the pointers to the real user registered callback and ensures that the
connection callback is either not called if the connection was already freed or
that the connection isn't freed while this is being called.