As a consequence of the cgroup layout changes from commit '632f78ca', the
qemuDomainGetSchedulerParameters[Flags]()' and qemuGetSchedulerType() APIs
failed to return data for a non running domain. This can be seen through
a 'virsh schedinfo <domain>' command which returns:
Scheduler : Unknown
error: Requested operation is not valid: cgroup CPU controller is not mounted
Prior to that change a non running domain would return:
Scheduler : posix
cpu_shares : 0
vcpu_period : 0
vcpu_quota : 0
emulator_period: 0
emulator_quota : 0
This patch will restore the capability to return configuration only data
for a non running domain regardless of whether cgroups are available.
Paolo Bonzini pointed out that it's actually possible to migrate a qemu
instance that was paused due to I/O error and it will be able to work on
the destination if the storage is accessible.
This patch introduces flag VIR_MIGRATE_ABORT_ON_ERROR that cancels the
migration in case an I/O error happens while it's being performed and
allows migration without this flag. This flag can be possibly used for
other error reasons that may be introduced in the future.
Convert input XML to migratable before using it in
qemuDomainSaveImageOpen.
XML in the save image is migratable, i.e. doesn't contain implicit
controllers. If these controllers were in a non-default order in the
input XML, the ABI check would fail. Removing and re-adding these
controllers fixes it.
https://bugzilla.redhat.com/show_bug.cgi?id=834196
This patch fixes changes done in commit 29c1e913e4
that was pushed without implementing review feedback.
The flag introduced by the patch is changed to VIR_DOMAIN_VCPU_GUEST and
documentation makes the difference between regular hotplug and this new
functionality more explicit.
The virsh options that enable the use of the new flag are changed to
"--guest" and the documentation is fixed too.
Currently, there's a path to use the ncpuinfo variable uninitialized,
which leads to a compiler warning:
qemu/qemu_driver.c: In function 'qemuDomainGetVcpusFlags':
qemu/qemu_driver.c:4573:9: error: 'ncpuinfo' may be used
uninitialized in this function [-Werror=maybe-uninitialized]
for (i = 0; i < ncpuinfo; i++) {
^
The work was done at the time of snapshot xmlstring parsing
if (offline && def->memory &&
def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_NONE) {
virReportError(...);
}
If snapshot creation failed for example due to invalid use of the
"REUSE_EXTERNAL" flag, libvirt killed access to the original image file
instead of the new image file. On machines with selinux this kills the
whole VM as the selinux context is enforced immediately.
* qemu_driver.c:qemuDomainSnapshotUndoSingleDiskActive():
- Kill access to the new image file instead of the old one.
Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=906639
This is a recurring problem for cygwin :)
For example, see commit 23a4df88.
qemu/qemu_driver.c: In function 'qemuStateInitialize':
qemu/qemu_driver.c:691:13: error: format '%d' expects type 'int', but argument 8 has type 'uid_t' [-Wformat]
* src/qemu/qemu_driver.c (qemuStateInitialize): Add casts.
* daemon/remote.c (remoteDispatchAuthList): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Function qemuDomainSetBlockIoTune() was checking QEMU capabilities
even when !(flags & VIR_DOMAIN_AFFECT_LIVE) and the domain was
shutoff, resulting in the following problem:
virsh # domstate asdf; blkdeviotune asdf vda --write-bytes-sec 100
shut off
error: Unable to change block I/O throttle
error: unsupported configuration: block I/O throttling not supported with this QEMU binary
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=965016
Since 0d70656afd, it starts to access the sysfs files to build
the qemu command line (by virSCSIDeviceGetSgName, which is to find
out the scsi generic device name by adpater🚌target:unit), there
is no way to work around, qemu wants to see the scsi generic device
like "/dev/sg6" anyway.
And there might be other places which need to access sysfs files
when building qemu command line in future.
Instead of increasing the arguments of qemuBuildCommandLine, this
introduces a new callback for qemuBuildCommandLine, and thus tests
can register their own callbacks for sysfs test input files accessing.
* src/qemu/qemu_command.h: (New callback struct
qemuBuildCommandLineCallbacks;
extern buildCommandLineCallbacks)
* src/qemu/qemu_command.c: (wire up the callback struct)
* src/qemu/qemu_driver.c: (Use the new syntax of qemuBuildCommandLine)
* src/qemu/qemu_hotplug.c: Likewise
* src/qemu/qemu_process.c: Likewise
* tests/testutilsqemu.[ch]: (Helper testSCSIDeviceGetSgName;
callback struct testCallbacks;)
* tests/qemuxml2argvtest.c: (Use testCallbacks)
* src/tests/qemuxmlnstest.c: (Like above)
Commit 632f78c introduced a regression which causes schedinfo being
unable to set some parameters. When migrating to priv->cgroup there
was missing variable left out and due to passed NULL to underlying
function, the setting failed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=963592
This changes the helpers qemu{Add,Remove}SharedDisk into
qemu{Add,Remove}SharedDevice, as most of the code in the helpers
can be reused for scsi host device.
To track the shared scsi host device, first it finds out the
device path (e.g. /dev/s[dr]*) which is mapped to the sg device,
and use device ID of the found device path (/dev/s[dr]*) as the
hash key. This is because of the device ID is not unique between
between /dev/s[dr]* and /dev/sg*, e.g.
% sg_map
/dev/sg0 /dev/sda
/dev/sg1 /dev/sr0
% ls -l /dev/sda
brw-rw----. 1 root disk 8, 0 May 2 19:26 /dev/sda
%ls -l /dev/sg0
crw-rw----. 1 root disk 21, 0 May 2 19:26 /dev/sg0
"Shared disk" is not only the thing we should care about after "scsi
hostdev" is introduced. A same scsi device can be used as "disk" for
one domain, and as "scsi hostdev" for another domain at the same time.
That's why this patch renames qemu_driver->sharedDisks. Related functions
and structs are also renamed.
Adding a VNC WebSocket support for QEMU driver. This functionality is
in upstream qemu from commit described as v1.3.0-982-g7536ee4, so the
capability is being recognized based on QEMU version for now.
Although virtio-scsi supports SCSI PR (Persistent Reservations),
the device on host may do not support it. To avoid losing data,
Just like PCI and USB pass through devices, only one live guest
is allowed per SCSI host pass through device."
Signed-off-by: Han Cheng <hanc.fnst@cn.fujitsu.com>
It is possible to build a kernel without swap cgroup controls
present. This causes a fatal error when querying memory
parameters. Treat missing swap controls as meaning "unlimited".
The fatal error remains if the user tries to actually change
the limit.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=851411https://bugzilla.redhat.com/show_bug.cgi?id=955500
The first problem was that virFileOpenAs was returning fd (-1) in one
of the error cases rather than ret (-errno), so the caller thought
that the error was EPERM rather than ENOENT.
The second problem was that some log messages in the general purpose
qemuOpenFile() function would always say "Failed to create" even if
the caller hadn't included O_CREAT (i.e. they were trying to open an
existing file).
This fixes virFileOpenAs to jump down to the error return (which
returns ret instead of fd) in the previously mentioned incorrect
failure case of virFileOpenAs(), removes all error logging from
virFileOpenAs() (since the callers report it), and modifies
qemuOpenFile to appropriately use "open" or "create" in its log
messages.
NB: I seriously considered removing logging from all callers of
virFileOpenAs(), but there is at least one case where the caller
doesn't want virFileOpenAs() to log any errors, because it's just
going to try again (qemuOpenFile()). We can't simply make a silent
variation of virFileOpenAs() though, because qemuOpenFile() can't make
the decision about whether or not it wants to retry until after
virFileOpenAs() has already returned an error code.
Likewise, I also considered changing virFileOpenAs() to return -1 with
errno set on return, and may still do that, but only as a separate
patch, as it obscures the intent of this patch too much.
The LXC, QEMU, and LibXL drivers have all merged their handling of
the attach/update/modify device APIs into one large
'xxxxDomainModifyDeviceFlags'
which then does a 'switch()' based on the actual API being invoked.
While this saves some lines of code, it is not really all that
significant in the context of the driver API impls as a whole.
This merger of the handling of different APIs creates pain when
wanting to automated analysis of the code and do things which
are specific to individual APIs. The slight duplication of code
from unmerged the API impls, is preferrable to allow for easier
automated analysis.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the parsing of XML is pushed down into the various
migration helper APIs. This makes it difficult to insert the
correct access control checks, since one helper API services
many public APIs. Pull the parsing of XML up to the top level
of the QEMU driver APIs
The individual hypervisor drivers were directly referencing
APIs in virnodesuspend.c in their virDriverPtr struct. Separate
these methods, so there is always a wrapper in the hypervisor
driver. This allows the unused virConnectPtr args to be removed
from the virnodesuspend.c file. Again this will ensure that
ACL checks will only be performed on invocations that are
directly associated with public API usage.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The individual hypervisor drivers were directly referencing
APIs in src/nodeinfo.c in their virDriverPtr struct. Separate
these methods, so there is always a wrapper in the hypervisor
driver. This allows the unused virConnectPtr args to be
removed from the nodeinfo.c file. Again this will ensure that
ACL checks will only be performed on invocations that are
directly associated with public API usage.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the virGetHostname() API has a bogus virConnectPtr
parameter. This is because virtualization drivers directly
reference this API in their virDriverPtr tables, tieing its
API design to the public virConnectGetHostname API design.
This also causes problems for access control checks since
these must only be done for invocations from the public
API, not internal invocation.
Remove the bogus virConnectPtr parameter, and make each
hypervisor driver provide a dedicated function for the
driver API impl. This will allow access control checks
to be easily inserted later.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When attempting to generate the native command line from an XML file
that uses graphics port auto allocation, the generated commandline
wouldn't be valid.
This patch adds fake autoallocation of ports as done when starting the
actual machine.
The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
virPCIDeviceReattach and virPCIDeviceUnbindFromStub (called by
virPCIDeviceReattach) had previously required the name of the stub
driver as input. This is unnecessary, because the name of the driver
the device is currently bound to can be found by looking at the link:
/sys/bus/pci/dddd:bb:ss.ff/driver
Instead of requiring that the name of the expected stub driver name
and only unbinding if that one name is matched, we no longer take a
driver name in the arglist for either of these
functions. virPCIDeviceUnbindFromStub just compares the name of the
currently bound driver to a list of "well known" stubs (right now
contains "pci-stub" and "vfio-pci" for qemu, and "pciback" for xen),
and only performs the unbind if it's one of those devices.
This allows virsh nodedevice-reattach to work properly across a
libvirtd restart, and fixes a couple of cases where we were
erroneously still hard-coding "pci-stub" as the drive name.
For some unknown reason, virPCIDeviceReattach had been calling
modprobe on the stub driver prior to unbinding the device. This was
problematic because we no longer know the name of the stub driver in
that function. However, it is pointless to probe for the stub driver
at that time anyway - because the device is bound to the stub driver,
we are guaranteed that it is already loaded, and so that call to
modprobe has been removed.
The differences from virNodeDeviceDettach are very minor:
1) Check that the flags are 0.
2) Set the virPCIDevice's stubDriver according to the driverName that
is passed in.
3) Call virPCIDeviceDetach with a NULL stubDriver, indicating it
should get the name of the stub driver from the virPCIDevice
object.
Ensure that all drivers implementing public APIs use a
naming convention for their implementation that matches
the public API name.
eg for the public API virDomainCreate make sure QEMU
uses qemuDomainCreate and not qemuDomainStart
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that the driver struct field names match the public
API names. For an API virXXXX we must have a driver struct
field xXXXX. ie strip the leading 'vir' and lowercase any
leading uppercase letters.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Refactoring done in 19c6ad9ac7 didn't
correctly take into account the order cgroup limit modification needs to
be done in. This resulted into errors when decreasing the limits.
The operations need to take place in this order:
decrease hard limit
change swap hard limit
or
change swap hard limit
increase hard limit
This patch also fixes the check if the hard_limit is less than
swap_hard_limit to print better error messages. For this purpose I
introduced a helper function virCompareLimitUlong to compare limit
values where value of 0 is equal to unlimited. Additionally the check is
now applied also when the user does not provide all of the tunables
through the API and in that case the currently set values are used.
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=950478
Wrong use of the parentheses causes "rc" always having a boolean value,
either "1" or "0", and thus we can't get the detailed error message
when it fails:
Before (I only have 1 node):
% virsh numatune f18 --nodeset 12
error: Unable to change numa parameters
error: unable to set numa tunable: Unknown error -1
After:
virsh numatune f18 --nodeset 12
error: Unable to change numa parameters
error: unable to set numa tunable: Invalid argument
Detected by a simple Shell script:
for i in $(git ls-files -- '*.[ch]'); do
awk 'BEGIN {
fail=0
}
/# *include.*\.h/{
match($0, /["<][^">]*[">]/)
arr[substr($0, RSTART+1, RLENGTH-2)]++
}
END {
for (key in arr) {
if (arr[key] > 1) {
fail=1
printf("%d %s\n", arr[key], key)
}
}
if (fail == 1)
exit 1
}' $i
if test $? != 0; then
echo "Duplicate header(s) in $i"
fi
done;
A later patch will add the syntax-check to avoid duplicate
headers.
Rename all the virCgroupForXXX methods to use the form
virCgroupNewXXX since they are all constructors. Also
make sure the output parameter is the last one in the
list, and annotate all pointers as non-null. Fix up
all callers, and make sure they use true/false not 0/1
for the boolean parameters
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of calling virCgroupForDomain every time we need
the virCgrouPtr instance, just do it once at Vm startup
and cache a reference to the object in qemuDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the QEMU driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Report the errors as:
Domain not found: no domain with matching uuid '41414141-4141-4141-4141-414141414141' (crashtest)
instead of:
Domain not found: no domain with matching uuid '41414141-4141-4141-4141-414141414141'
Use the helper to lookup the domain object in the remaining places.
This patch also fixes error reporting when the domain was not found in several
functions that were printing the raw UUID buffer instead of the formatted
string. The offending functions were:
qemuDomainGetInterfaceParameters
qemuDomainSetInterfaceParameters
qemuGetSchedulerParametersFlags
qemuSetSchedulerParametersFlags
qemuDomainGetNumaParameters
qemuDomainSetNumaParameters
qemuDomainGetMemoryParameters
qemuDomainSetMemoryParameters
qemuDomainGetBlkioParameters
qemuDomainSetBlkioParameters
qemuDomainGetCPUStats
To support "shareable" for volume type disk, we have to translate
the source before trying to add the shared disk entry. To achieve
the goal, this moves the helper qemuTranslateDiskSourcePool into
src/qemu/qemu_conf.c, and introduce an internal only member (voltype)
for struct _virDomainDiskSourcePoolDef, to record the underlying
volume type for use when building the drive string.
Later patch will support "shareable" volume type disk.
For both "live" and "config" changes of vcpupin and emulatorpin, an
all clear bitmap doesn't make sense, and it can just cause corruptions.
E.g (similar for emulatorpin).
% virsh vcpupin hame 0 8,^8 --config
% virsh vcpupin hame
VCPU: CPU Affinity
----------------------------------
0:
1: 0-63
2: 0-63
3: 0-63
% virsh dumpxml hame | grep cpuset
<vcpupin vcpu='0' cpuset=''/>
% virsh start hame
error: Failed to start domain hame
error: An error occurred, but the cause is unknown
When setting processor count for a domain using the API libvirt enforced
a maximum processor count, while it isn't enforced when taking the XML path.
This patch removes the check to match the XML.
Currently when getting an instance of virCgroupPtr we will
create the path in all cgroup controllers. Only at the virt
driver layer are we attempting to filter controllers. This
is bad because the mere act of creating the dirs in the
controllers can have a functional impact on the kernel,
particularly for performance.
Update the virCgroupForDriver() method to accept a bitmask
of controllers to use. Only create dirs in the controllers
that are requested. When creating cgroups for domains,
respect the active controller list from the parent cgroup
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch removes the defaultDiskDriverName from the virCaps
structure. This particular default value is used only in the qemu driver
so this patch uses the recently added callback to fill the driver name
if it's needed instead of propagating it through virCaps.
This patch adds instrumentation that will allow hypervisor drivers to
fill and validate domain and device definitions after parsed by the XML
parser.
With this patch, after the XML is parsed, a callback to the driver is
issued requesting to fill and validate driver specific details of the
configuration. This allows to use sensible defaults and checks on a per
driver basis at the time the XML is parsed.
Two callback pointers are stored in the new virDomainXMLConf object:
* virDomainDeviceDefPostParseCallback (devicesPostParseCallback)
- called for a single device parsed and for every single device in a
domain config. A virDomainDeviceDefPtr is passed along with the
domain definition and virCaps.
* virDomainDefPostParseCallback, (domainPostParseCallback)
- A callback that is meant to process the domain config after it's
parsed. A virDomainDefPtr is passed along with virCaps.
Both types of callbacks support arbitrary opaque data passed for the
callback functions.
Errors may be reported in those callbacks resulting in a XML parsing
failure.
This patch is the result of running:
for i in $(git ls-files | grep -v html | grep -v \.po$ ); do
sed -i -e "s/virDomainXMLConf/virDomainXMLOption/g" -e "s/xmlconf/xmlopt/g" $i
done
and a few manual tweaks.
Mimic the fix done in 02b9097274 to fix crash by
accessing an already freed structure. Also copy the explaining comment why the
pointer can't be accessed any more.
The VIR_ERR_NO_SUPPORT error code is reserved for cases where an
API is not implemented in a driver. It definitely should not be
used when an API execution fails due to unsupported operation.
We should record the new disk src in the shared disk table for
updating disk (CD-ROM or Floppy) API. Fortunately, we only allow
to update the disk source now, otherwise we might also want to
set the unpriv_sgio setting.
Intend to reduce the redundant code,use virNumaSetupMemoryPolicy
to replace virLXCControllerSetupNUMAPolicy and
qemuProcessInitNumaMemoryPolicy.
This patch also moves the numa related codes to the
file virnuma.c and virnuma.h
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
We didn't yet expose the virtio device attach and detach functionality
for s390 domains as the device hotplug was very limited with the old
virtio-s390 bus. With the CCW bus there's full hotplug support for
virtio devices in QEMU, so we are adding this to libvirt too.
Since the virtio hotplug isn't limited to PCI anymore, we change the
function names from xxxPCIyyy to xxxVirtioyyy, where we handle all
three virtio bus types.
Signed-off-by: J.B. Joret <jb@linux.vnet.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
With our recent patch (1715c83b5f) we thrive to get the correct
number of maximal VCPUs. However, we are using a constant from
linux/kvm.h which may be not defined in every distro. Hence, we
should guard usage of the constant with ifdef preprocessor
directive. This was introduced in kernel:
commit 8c3ba334f8588e1d5099f8602cf01897720e0eca
Author: Sasha Levin <levinsasha928@gmail.com>
Date: Mon Jul 18 17:17:15 2011 +0300
KVM: x86: Raise the hard VCPU count limit
The patch raises the hard limit of VCPU count to 254.
This will allow developers to easily work on scalability
and will allow users to test high VCPU setups easily without
patching the kernel.
To prevent possible issues with current setups, KVM_CAP_NR_VCPUS
now returns the recommended VCPU limit (which is still 64) - this
should be a safe value for everybody, while a new KVM_CAP_MAX_VCPUS
returns the hard limit which is now 254.
$ git desc 8c3ba334f
v3.1-rc7-48-g8c3ba33
The virCaps structure gathered a ton of irrelevant data over time that.
The original reason is that it was propagated to the XML parser
functions.
This patch aims to create a new data structure virDomainXMLConf that
will contain immutable data that are used by the XML parser. This will
allow two things we need:
1) Get rid of the stuff from virCaps
2) Allow us to add callbacks to check and add driver specific stuff
after domain XML is parsed.
This first attempt removes pointers to private data allocation functions
to this new structure and update all callers and function that require
them.
When there are two concurrent threads, we may dereference a NULL
pointer, even though it has been checked before:
1. Thread1: starts executing qemuDomainBlockStatsFlags() with nparams != 0.
It finds given disk and successfully pass check for disk->info.alias
not being NULL.
2. Thread2: starts executing qemuDomainDetachDeviceFlags() on the very same
disk as Thread1 is working on.
3. Thread1: gets to qemuDomainObjBeginJob() where it sets a job on a
domain.
4. Thread2: also tries to set a job. However, we are not guaranteed which
thread wins. So assume it's Thread2 who can continue.
5. Thread2: does the actual detach and frees disk->info.alias
6. Thread2: quits the job
7. Thread1: now successfully acquires the job, and accesses a NULL pointer.
19c6ad9a (qemu: Refactor qemuDomainSetMemoryParameters) introduced
a new macro, VIR_GET_LIMIT_PARAMETER(PARAM, VALUE). But if statement
in the macro is not correct and so set_XXXX flags are set to false
in the wrong. As a result, libvirt ignores all memtune parameters.
This patch fixes the conditional expression to work correctly.
Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
Currently, qemuDomainShutdownFlags() chooses the agent method of
shutdown whenever the agent is configured. However, this
assumption is not enough as the guest agent may be unresponsive
at the moment. So unless guest agent method has been explicitly
requested, we should fall back to the ACPI method.
The unitialized local variable qemuVersion can cause an random value
to be returned for the hypervisor version, observable with virsh version.
Introduced by commit b46f7f4a0b
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
This change tried to fix a crash with changing CDROM media but
failed to actually do so
commit d0172d2b1b
Author: Osier Yang <jyang@redhat.com>
Date: Tue Feb 19 20:27:45 2013 +0800
qemu: Remove the shared disk entry if the operation is ejecting or updating
It was still accessing disk->src, when the entire 'disk' object
has been free'd already. Even if it weren't free'd, accessing
the 'src' value of virDomainDiskDef is not allowed without
first validating disk->type is file or block. Just remove the
broken code entirely.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The new TypedParam helper APIs allow to simplify this function
significantly.
This patch integrates the fix in 75e5bec97b
by correctly ordering the setting functions instead of reordering the
parameters.
To avoid having to hold the qemu driver lock while iterating through
close callbacks and calling them. This fixes a real deadlock when a
domain which is being migrated from another host gets autodestoyed as a
result of broken connection to the other host.
The max value of number of cpus to compute(id) should not
be equal or greater than max cpu number.
The bug ocurrs when id value is equal to max cpu number which
leads to the off-by-one error in the following for loop.
# virsh cpu-stats guest --start 1
error: Failed to virDomainGetCPUStats()
error: internal error cpuacct parse error
Currently, if lzop decompression binary produces a warning, it
doesn't exit with zero status but 2 instead. Terrifying, but
true. However, warnings may be ignored using '--ignore-warn'
command line argument. Moreover, in which case, the exit status
will be zero.
For both AttachDevice and UpdateDevice APIs, if the disk device
is 'cdrom' or 'floppy', the operations could be ejecting, updating,
and inserting. For either ejecting or updating, the shared disk
entry of the original disk src has to be removed, because it's
not useful anymore.
And since the original disk def will be changed, new disk def passed
as argument will be free'ed in qemuDomainChangeEjectableMedia, so
we need to copy the orignal disk def before
qemuDomainChangeEjectableMedia, to use it for qemuRemoveSharedDisk.
The disk def could be free'ed by qemuDomainChangeEjectableMedia,
which can thus cause crash if we reference the disk pointer. On
the other hand, we have to remove the added shared disk entry from
the table on error codepath.
The hash entry is changed from "ref" to {ref, @domains}. With this, the
caller can simply call qemuRemoveSharedDisk, without afraid of removing
the entry belongs to other domains. qemuProcessStart will obviously
benifit from it on error codepath (which calls qemuProcessStop to do
the cleanup).
Based on moving various checking into qemuAddSharedDisk, this
avoids the caller using it in wrong ways. Also this adds two
new checking for qemuCheckSharedDisk (disk device not 'lun'
and kernel doesn't support unpriv_sgio simply returns 0).
With the majority of fields in the virQEMUDriverPtr struct
now immutable or self-locking, there is no need for practically
any methods to be using the QEMU driver lock. Only a handful
of helper APIs in qemu_conf.c now need it
Currently the APIs for managing the shared disk list take
a virHashTablePtr as the primary argument. This is bad
because it requires the caller to deal with locking of
the QEMU driver. Switch the APIs to take the full
virQEMUDriverPtr instance
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To enable locking to be introduced to the security manager
objects later, turn virSecurityManager into a virObjectLockable
class
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The 'driver->caps' pointer can be changed on the fly. Accessing
it currently requires the global driver lock. Isolate this
access in a single helper, so a future patch can relax the
locking constraints.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To avoid confusion between 'virCapsPtr' and 'qemuCapsPtr'
do some renaming of various fucntions/variables. All
instances of 'qemuCapsPtr' are renamed to 'qemuCaps'. To
avoid that clashing with the 'qemuCaps' typedef though,
rename the latter to virQEMUCaps.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To enable virCapabilities instances to be reference counted,
turn it into a virObject. All cases of virCapabilitiesFree
turn into virObjectUnref
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We are requesting for stderr catching for all cases in
virFileWrapperFdNew(). There is no need to have a separate
function just to report an error, esp. when we can do it in
virFileWrapperFdClose().
Currently the activePciHostdevs, inactivePciHostdevsd and
activeUsbHostdevs lists are all implicitly protected by the
QEMU driver lock. Now that the lists all inherit from the
virObjectLockable, we can make the locking explicit, removing
the dependency on the QEMU driver lock for correctness.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To allow modifications to the lists to be synchronized, convert
virPCIDeviceList and virUSBDeviceList into virObjectLockable
classes. The locking, however, will not be self-contained. The
users of these classes will have to call virObjectLock/Unlock
in the critical regions.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The QEMU driver struct has a 'qemuVersion' field that was previously
used to cache the version lookup from capabilities. With the recent
QEMU capabilities rewrite the caching happens at a lower level so
this field is pointless. Removing it avoids worries about locking
when updating it.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The duplicate VM checking should be done atomically with
virDomainObjListAdd, so shoud not be a separate function.
Instead just use flags to indicate what kind of checks are
required.
This pair, used in virDomainCreateXML:
if (virDomainObjListIsDuplicate(privconn->domains, def, 1) < 0)
goto cleanup;
if (!(dom = virDomainObjListAdd(privconn->domains,
privconn->caps,
def, false)))
goto cleanup;
Changes to
if (!(dom = virDomainObjListAdd(privconn->domains,
privconn->caps,
def,
VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE,
NULL)))
goto cleanup;
This pair, used in virDomainRestoreFlags:
if (virDomainObjListIsDuplicate(privconn->domains, def, 1) < 0)
goto cleanup;
if (!(dom = virDomainObjListAdd(privconn->domains,
privconn->caps,
def, true)))
goto cleanup;
Changes to
if (!(dom = virDomainObjListAdd(privconn->domains,
privconn->caps,
def,
VIR_DOMAIN_OBJ_LIST_ADD_LIVE |
VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE,
NULL)))
goto cleanup;
This pair, used in virDomainDefineXML:
if (virDomainObjListIsDuplicate(privconn->domains, def, 0) < 0)
goto cleanup;
if (!(dom = virDomainObjListAdd(privconn->domains,
privconn->caps,
def, false)))
goto cleanup;
Changes to
if (!(dom = virDomainObjListAdd(privconn->domains,
privconn->caps,
def,
0, NULL)))
goto cleanup;
As a step towards making virDomainObjList thread-safe turn it
into an opaque virObject, preventing any direct access to its
internals.
As part of this a new method virDomainObjListForEach is
introduced to replace all existing usage of virHashForEach
Currently the virQEMUDriverPtr struct contains an wide variety
of data with varying access needs. Move all the static config
data into a dedicated virQEMUDriverConfigPtr object. The only
locking requirement is to hold the driver lock, while obtaining
an instance of virQEMUDriverConfigPtr. Once a reference is held
on the config object, it can be used completely lockless since
it is immutable.
NB, not all APIs correctly hold the driver lock while getting
a reference to the config object in this patch. This is safe
for now since the config is never updated on the fly. Later
patches will address this fully.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If a decompression binary prints something to stderr, currently
it is discarded. However, it can contain useful data from
debugging POV, so we should catch it.
https://bugzilla.redhat.com/show_bug.cgi?id=894723
Currently, if qemuProcessStart() succeeds, but it's decompression
binary that returns nonzero status, we don't kill the qemu process,
but remove it from internal domain list, leaving the qemu process
hanging around totally uncontrolled.
Currently QEMU capabilities are initialized before the QEMU driver
sets ownership on its various directories. The upshot is that if
you change the user/group in the qemu.conf file, libvirtd will fail
to probe QEMU the first time it is run after the config change.
Moving QEMU capabilities initialization to after the chown() calls
fixes this
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently, there is no reason to hold qemu driver locked
throughout whole API execution. Moreover, we can use the
new qemuDomObjFromDomain() internal API to lookup domain then.
https://bugzilla.redhat.com/show_bug.cgi?id=876829 complains that
if a guest is put into S3 state (such as via virsh dompmsuspend)
and then an external snapshot is taken, qemu forcefully transitions
the domain to paused, but libvirt doesn't reflect that change
internally. Thus, a user has to use 'virsh suspend' to get libvirt
back in sync with qemu state, and if the user doesn't know this
trick, then the guest appears hung.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateActiveExternal):
Track fact that qemu wakes up a suspended domain on migration.
The previous fix to avoid leaking securityDriverNames forgot to
handle the case of securityDriverNames being NULL, leading to
a crash
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Should have been done in commit 56fd513 already, but was missed
due to oversight: qemuDomainSendKey didn't release the driver lock
in its cleanup section. This fixes an issue introduced by commit
8c5d2ba.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
The snapshot name is used to create path to the definition save file.
When the name contains slashes the creation of the file fails. Reject
such names.
When the snapshot definition can't be saved, the
qemuDomainSnapshotCreate function succeeded without filling some of the
fields in the internal definition.
This patch removes the snapshot and returns failure if the XML file
cannot be written.
When running virDomainDestroy, we need to make sure that no other
background thread cleans up the domain while we're doing our work.
This can happen if we release the domain object while in the
middle of work, because the monitor might detect EOF in this window.
For this reason we have a 'beingDestroyed' flag to stop the monitor
from doing its normal cleanup. Unfortunately this flag was only
being used to protect qemuDomainBeginJob, and not qemuProcessKill
This left open a race condition where either libvirtd could crash,
or alternatively report bogus error messages about the domain already
having been destroyed to the caller
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The driver mutex was unlocked in qemuDomainModifyDeviceFlags before
entering qemuDomainObjBeginJobWithDriver where it will be unlocked once
more leaving it in an undefined state. The result was that two
threads were simultaneously looking up the domain hash table during
multiple parallel device attach/detach operations.
Luckily this triggered a virHashIterationError.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
The virDomainObj, qemuAgent, qemuMonitor, lxcMonitor classes
all require a mutex, so can be switched to use virObjectLockable
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
After live change of cpu counts, the number of processor threads is
verified. This patch makes use of this approach to check if qemu ignored
the request for cpu hot-unplug and report an appropriate message.
Pass stub driver name directly to pciDettachDevice and pciReAttachDevice to fit
for different libvirt drivers. For example, qemu driver prefers pci-stub, but
Xen prefers pciback.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
The current code is initializing capabilities before setting
directory permissions. Thus the QEMU binaries being run may
not have the ability to create the UNIX monitor socket on
the first run of libvirtd.
This prevents domain starting and disk attaching if the shared disk's
setting conflicts with other active domain(s), E.g. A domain with
"sgio" set as "filtered", however, another active domain is using
it set as "unfiltered".
This introduces a hash table for qemu driver, to store the shared
disk's info as (@major:minor, @ref_count). @ref_count is the number
of domains which shares the disk.
Since we only care about if the disk support unprivileged SG_IO
commands, and the SG_IO commands only make sense for block disk,
this patch only manages (add/remove hash entry) the shared disk for
block disk.
* src/qemu/qemu_conf.h: (Add member 'sharedDisks' of type
virHashTablePtr; Declare helpers
qemuGetSharedDiskKey, qemuAddSharedDisk
and qemuRemoveSharedDisk)
* src/qemu/qemu_conf.c (Implement the 3 helpers)
* src/qemu/qemu_process.c (Update 'sharedDisks' when domain
starting and shutdown)
* src/qemu/qemu_driver.c (Update 'sharedDisks' when attaching
or detaching disk).
When the disk alignment check done while redefining an existing snapshot
failed, the qemu driver attempted to free the existing snapshot. As in
the cleanup path the definition of the snapshot wasn't assigned, the
cleanup code dereferenced a NULL pointer.
This patch changes the behavior on error paths while redefining snapshot
in two ways:
1) On failure, modifications done on the snapshot definition object are
rolled back.
2) The previous definition of the data isn't freed until it's certain it
won't be needed any more.
This change avoids the segfault and additionally the snapshot doesn't
vanish if redefinition fails for some reason.
This also changes the function signature to take a
virDomainChrSourceDefPtr instead of just a path, since it needs to
differentiate behavior based on source->type.
The functionality provided in virchrdev.c (previously virconsole.c) is
applicable to other types of character devices besides consoles, such
as channels. This patch is just code motion, renaming things such as
"console" or "pty", instead using more general terms such as
"character device" or "device path".
Many internal qemu APIs must find domain object from passed
virDomainPtr. And with function Peter's introduced, we can use it
instead of copying multiple lines among code.
https://bugzilla.redhat.com/show_bug.cgi?id=888426
The code for doing a block-copy was supposed to track the destination
file in drive->mirror, but was set up to do all mallocs prior to
starting the copy so that OOM wouldn't leave things partially started.
However, the wrong variable was being written; later in the code we
silently did 'disk->mirror = mirror' which was still NULL, and thus
leaking memory and leaving libvirt to think that the mirror job was
never started, which prevented a pivot operation after a copy.
Problem introduced in commit 35c7701c6.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Initialize correct
variable.