Add capabilities for virtio-keyboard, virtio-mouse
and virtio-tablet devices:
name "virtio-keyboard-device", bus virtio-bus
name "virtio-keyboard-pci", bus PCI
name "virtio-mouse-device", bus virtio-bus
name "virtio-mouse-pci", bus PCI
name "virtio-tablet-device", bus virtio-bus
name "virtio-tablet-pci", bus PCI
Map both -device and -pci versions of the device to one capability.
https://bugzilla.redhat.com/show_bug.cgi?id=1231114
Check if virtio-gpu provides the virgl option, and add a qemu command
line formatter.
It is enabled with the existing accel3d attribute:
<model type='virtio' heads='1'>
<acceleration accel3d='yes'/>
</model>
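With virgl enabled, the generated command line would contain something
roughly like the following (illustrative, assuming the virtio-vga primary
device):
-device virtio-vga,virgl=on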
Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
qemu 2.5 provides a virtio video device. It can be used with -device
virtio-vga for primary devices, or -device virtio-gpu for non-vga
devices. However, only the primary device (VGA) is supported with this
patch.
Reference:
https://bugzilla.redhat.com/show_bug.cgi?id=1195176
Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently the QEMU stdout/stderr streams are written directly to
a regular file (e.g. /var/log/libvirt/qemu/$GUEST.log). While those
can be rotated by logrotate (using the copytruncate option) this is
not very efficient. It also leaves open a window of opportunity
for a compromised/broken QEMU to DoS the host filesystem by
writing lots of text to stdout/stderr.
This makes it possible to connect the stdout/stderr file handles
to a pipe that is provided by virtlogd. The virtlogd daemon will
read from this pipe and write data to the log file, performing
file rotation whenever a pre-determined size limit is reached.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the QEMU monitor is given an FD to the logfile. This
won't work in the future with virtlogd, so it needs to use the
qemuDomainLogContextPtr instead, but it shouldn't directly
access that object either. So define a callback that the
monitor can use for reporting errors from the log file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When the qemuProcessAttach/Stop methods write a marker into
the log file, they can use qemuDomainLogContextWrite to
write a formatted message.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of writing directly to a log file descriptor, change
qemuLogOperation to use qemuDomainLogContextWrite().
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemuDomainTaint APIs currently expect to be passed a log file
descriptor. Change them to instead use a qemuDomainLogContextPtr
to hide the implementation details.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the places which create/open log files to use the new
qemuDomainLogContextPtr object instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce a qemuDomainLogContext object to encapsulate
handling of I/O to/from the domain log file. This will
hide details of the log file implementation from the
rest of the driver, making it easier to introduce
support for virtlogd later.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are two pretty similar functions qemuProcessReadLog and
qemuProcessReadChildErrors. Both read from the QEMU log file
and try to strip out libvirt messages. The latter then reports
an error, while the former lets the callers report an error.
Re-write qemuProcessReadLog so that it uses a single read
into a dynamically allocated buffer. Then introduce a new
qemuProcessReportLogError that calls qemuProcessReadLog
and reports an error.
Convert all callers to use qemuProcessReportLogError.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The rename operation only works on inactive virtual machines,
but it nonetheless writes to the log file used by the QEMU
processes. This log file is not intended to provide a general
purpose audit trail of operations performed on VMs. The audit
subsystem records important operations. If we want
to extend that to cover all significant public APIs that is
a valid thing to consider, but we shouldn't arbitrarily log
specific APIs into the QEMU log file in the meantime.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Using qemuProcess{Init,Launch,FinishStartup} allows us to run
pre-migration commands on destination before asking QEMU to wait for
incoming migration data.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
NBD storage migration will not work with offline migration anyway and we
already checked that the user did not ask for it. Thus it doesn't make
sense to keep the code after 'done' label where we jump in case of
offline migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Some failure paths in qemuMigrationPrepareAny forgot to kill the just
started QEMU process. This patch fixes this by combining the 'stop' and
'endjob' labels into a new label 'stopjob'. This name was chosen to avoid
confusion with the most common semantics of 'endjob'. Normally, 'endjob'
is always called at the end of an API to stop the job we entered at the
beginning. In qemuMigrationPrepareAny we only want to stop the job in
the failure path; on success we need to carry the job over to the Finish
phase.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Once qemuProcessInit has been called, qemuProcessLaunch will launch a new
QEMU process with stopped virtual CPUs.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuProcessStart is going to be split into three parts: qemuProcessInit,
qemuProcessLaunch, and qemuProcessFinish so that migration Prepare phase
can insert additional code in the process. qemuProcessStart will be a
small wrapper for all other callers.
qemuProcessInit prepares the domain up to the point when priv->qemuCaps
is initialized.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
A 'model' attribute was added to the panic device, but only one panic
device is allowed. This patch changes panic device presence
from 'optional' to 'zeroOrMore'.
The panic device type used depends on the 'model' attribute.
If no model is specified then the device type depends on the hypervisor
and guest arch: the 'pseries' model is used for pSeries guests and
the 'isa' model is used in other cases.
XML:
<devices>
<panic model='hyperv'/>
</devices>
QEMU command line:
qemu -cpu <cpu_model>,hv_crash
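For comparison, the 'isa' model maps to QEMU's pvpanic ISA device; the
generated command line would then look roughly like (illustrative):
qemu -device pvpanic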
Now that new domains are started inside a QEMU_ASYNC_JOB_START job,
we need to pass it down to qemuProcessStartCPUs too.
This removes the warning:
qemuDomainObjEnterMonitorInternal:1750 : This thread seems to be the
async job owner; entering monitor without asking for a nested job is
dangerous
Introduced by commit 04c721f, before that this code path was only
executed with QEMU_ASYNC_JOB_NONE.
(This code is not executed on migration, because qemuMigrationPrepareAny
sets the VIR_QEMU_PROCESS_START_PAUSED flag.)
The domain definition is not needed in any of these functions.
Only pass it to qemuSetupChardevCgroup, which is used as a callback
for virDomainChrDefForeach.
Use the right type for passing virDomainObjPtr instead of
void* where possible.
The amount of memory a ppc64 domain might need to lock is different
than that of an equally-sized x86 domain, so we need to check the
domain's architecture and act accordingly.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1273480
The function is used everywhere else to check whether the locked
memory limit should be set / updated, and it should be used here
as well.
Moreover, qemuDomainGetMlockLimitBytes() expects the hostdev to
have already been added to the domain definition, but we only do
that at the end of qemuDomainAttachHostPCIDevice(). Work around
the issue by adding the hostdev before adjusting the locked memory
limit and removing it immediately afterwards.
Remembering to call qemuMonitorSetDomainLog in the right paths before
calling qemuProcessStop is annoying and easy to forget. And I already
forgot to do so in commit v1.2.8-52-g0389060: logfd may be leaked if the
QEMU process dies between the Prepare and Finish migration phases.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuProcessStart is so big that any nontrivial code should be moved to
dedicated functions to make the code easier to read and maintain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuProcessStart is so big that any nontrivial code should be moved to
dedicated functions to make the code easier to read and maintain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuProcessStart is so big that any nontrivial code should be moved to
dedicated functions to make the code easier to read and maintain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuProcessStart is so big that any nontrivial code should be moved to
dedicated functions to make the code easier to read and maintain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Traditionally, we pass the incoming migration URI on the QEMU command line,
which has some drawbacks. Depending on the URI, QEMU may initialize its
migration state immediately without giving us a chance to set any
additional migration parameters (this applies mainly to fd: URIs). For
some URIs the monitor may be completely blocked from the beginning until
migration is finished, which means we may be stuck in the qmp_capabilities
command without being able to send any QMP commands.
QEMU solved this by introducing a "defer" parameter for the -incoming command
line option. This tells QEMU to prepare for an incoming migration
while the actual incoming URI is sent using the migrate-incoming QMP
command. Before calling this command we can normally talk to the
monitor and even set any migration parameters which will be honored by
the incoming migration.
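For illustration (the port is made up), the command line option and the
QMP exchange would look roughly like:
-incoming defer
{ "execute": "migrate-incoming", "arguments": { "uri": "tcp:[::]:49152" } }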
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We only started an async job for incoming migration from another host.
When we were starting a domain from scratch or restoring from a saved
state (migration from file) we didn't set any async job. Let's introduce
a new QEMU_ASYNC_JOB_START for these cases.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Incoming migration may require quite a few parameters (URI, fd, path) to
be considered while starting QEMU and we will soon add another one.
Let's group all of them in a single struct.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Make callers of qemuBuildCommandLine responsible for providing the URI
which should be passed as a parameter for -incoming.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Adjust the config code so that it does not enforce that a target memory
node is specified. To avoid breakage, adjust the qemu memory hotplug
config checker to disallow such a config for now.
Since we already make sure beforehand that the domain configuration is
valid, we can always execute it at the cost of doing 0 iterations of the
for loop.
This patch will simplify later refactor as it will avoid whitespace
changes.
Make the function usable so that -1 can be passed to it as cell ID so
that we can later enable memory hotplug on non-NUMA guests for certain
architectures.
Logging current async job while in BeginJob is useful, but the async job
we want to start is even more interesting.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The previous commit
commit 4e8993a250
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Mon Nov 9 16:20:08 2015 +0000
qemu: assume various QEMU 0.10 features are always available
Added broken handling of -sdl. Instead of duplicating existing
SDL handling code, just ensure it is invoked in the right
scenarios.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The -sdl and -net ...name=XXX arguments were both introduced
in QEMU 0.10, so the QEMU driver can assume they are always
available.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As of QEMU 0.10.0 the -vga argument was introduced, so the
QEMU driver can assume it is always available.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As of QEMU 0.10.0 the -drive format= parameter was added,
so the QEMU driver can assume it is always available.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As of QEMU 0.10.0, the -drive cache option stopped using
the on/off value names, so the QEMU driver can assume
use of the new value names.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since we require QEMU 0.12.0, we can assume that QEMU supports
all of the fd, tcp, unix and exec migration protocols.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We have twice previously attempted to remove Xenner
support
commit de9be0ab4d
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Wed Aug 22 17:29:01 2012 +0100
Remove xenner support
commit 92572c3d71
Author: Ján Tomko <jtomko@redhat.com>
Date: Wed Feb 18 16:33:50 2015 +0100
Remove code handling the QEMU_CAPS_DOMID capability
This change really does remove the last traces of it
in the capabilities handling code
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As of QEMU 0.9.1 the -drive argument can be used to configure
all disks, so the QEMU driver can assume it is always available
and drop support for -hda/-cdrom/etc.
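For illustration (paths made up), a disk that used to be expressed as:
qemu -hda /var/lib/libvirt/images/guest.img
is now always expressed with -drive, roughly:
qemu -drive file=/var/lib/libvirt/images/guest.img,if=ide,index=0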
Many of the tests need updating because a great many were
running without CAPS_DRIVE set, so using the -hda legacy
syntax.
Fixing the tests uncovered a bug in the argv -> xml
converter which failed to handle disks with if=floppy.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The QEMU argv -> virDomainDef conversion code was not handling
-drive arguments using the floppy bus. This caused them to be
added as hard disks instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The -no-reboot arg was added in QEMU 0.9.0, so the QEMU driver
can now assume it is always present.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As of QEMU 0.11.0 the 'info chardev' monitor command can be
used to report on allocated chardev paths, so we can drop
support for parsing QEMU stderr to locate the PTY paths.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As of QEMU 0.9.0 the -vnc option accepts a ':' to separate port
from listen address, so the QEMU driver can assume that support
for listen addresses is always available.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The kQEMU accelerator was deleted in QEMU 0.12, so we no
longer need to support it in the QEMU driver.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Check the QEMU version and refuse to work with QEMU versions
older than 0.12.0. This is approximately the vintage of QEMU
that is available in RHEL-6 era distros.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If mlock is required, either due to the use of VFIO hostdevs or because
it's explicitly enabled, the limit needs to be tweaked prior to adding new
memory or after removing a memory module. Add a helper to determine when
it's necessary and reuse it both on hotplug and hotunplug.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1273491
The code reported that a migration flag is unsupported but didn't jump
to the error label. Probably an oversight in commit f88af9dc that
introduced the flag checking.
Since the flag was not enabled when 'eating' the migration cookie,
libvirt reported a bogus error when memory hotplug was enabled:
unsupported migration cookie feature memory-hotplug
The error was ignored though due to a bug in the code so it slipped
through testing.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1278404
nodeset should be freed in both success and failure paths.
While tmppath is freed immediately after it's consumed, moving it from
error to cleanup label is a bit more consistent and robust.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Generally, we use a "ret" variable for storing the value we are going to
return at the end of a function, but this is not the case in
qemuProcessStart. Let's rename "ret" to "rv".
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuProcessStart was passing char * migrateFrom as the third argument to
qemuPrepareNVRAM. We should explicitly convert the pointer to bool which
is what the function expects.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1270715
Commit id '9deb96f' removed the code to fetch the nodeset from the
CpusetMems cgroup for a running vm in favor of using the return from
virDomainNumatuneFormatNodeset introduced by commit id '43b67f2e7'.
However, that API will return the value of the passed 'auto_nodeset'
when placement is VIR_DOMAIN_NUMATUNE_PLACEMENT_AUTO, which happens
to be NULL.
Since commit id 'c74d58ad' started using priv->autoNodeset in order
to manage the auto placement value during qemuProcessStart, it should
be passed along in order to return the correct value if the domain
requests the auto placement.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
In commit f41be296, we moved the vm->persistent check into
qemuDomainRemoveInactive, but in some places we didn't change
vm->persistent before calling qemuDomainRemoveInactive and just
called it to remove the inactive vm.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
This calls the PCI-, USB- and SCSI-specific functions just
like qemuHostdev{Prepare,ReAttach}DomainDevices() already do,
and was the missing piece for the qemuHostdev API to nicely
mirror the virHostdev API.
Update qemuProcessReconnect() to use the new function.
Adopt the same names used for virHostdevUpdateActive*Devices() for
consistency's sake and to make it easier to jump between the two.
No functional changes.
Adopt the same names used for virHostdevReAttach*Devices() for
consistency's sake and to make it easier to jump between the two.
No functional changes.
We have macros for both positive and negative string matching.
Therefore there is no need to use !STREQ or !STRNEQ. At the same
time as we are dropping this, a new syntax-check rule is
introduced to make sure we won't introduce it again.
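A minimal illustration of the conversion (the condition is made up):
    /* before: double negation obscures the intent */
    if (!STREQ(def->os.type, "hvm"))
        return -1;
    /* after: use the dedicated negative-matching macro */
    if (STRNEQ(def->os.type, "hvm"))
        return -1;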
Signed-off-by: Ishmanpreet Kaur Khera <khera.ishman@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Tunnelled migration can hang if the destination qemu exits despite all the
ABI checks. This happens whenever the destination qemu exits before the
complete transfer is noticed by source qemu. The savevm state checks at
runtime can fail at destination and cause qemu to error out.
The source qemu can't notice it as the EPIPE is not propagated to it.
The qemuMigrationIOFunc() notices the stream being broken from virStreamSend()
and it cleans up the stream alone. The qemuMigrationWaitForCompletion() would
never get to 100% transfer completion.
The qemuMigrationWaitForCompletion() never breaks out either, since
the ssh connection to the destination is healthy, and the source qemu also
thinks the migration is ongoing as the FD to which it transfers is never
closed or broken. So, the migration will hang forever. Even Ctrl-C on the
virsh migrate wouldn't be honoured. Close the source side FD when there is
an error in the stream. That way, the source qemu updates itself and
qemuMigrationWaitForCompletion() notices the failure.
Close the FD for all kinds of errors to be sure. The error message is not
copied for EPIPE so that the destination error is copied instead later.
Note:
Reproducible with repeated migrations between Power hosts running in different
subcores-per-core modes.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1249981
When qemuDomainPinIOThread was added in commit id 'fb562614', a check
for the IOThread capability was not needed since a check for iothreadpids
covered the condition where the support for IOThreads was not present.
The iothreadpids array was only created if qemuProcessDetectIOThreadPIDs
was able to query the monitor for IOThreads. It would only do that if
the QEMU_CAPS_OBJECT_IOTHREAD capability was set.
However, when iothreadids were added in commit id '8d4614a5' and the
check for iothreadpids was replaced by a search through the iothreadids[]
array for the matching iothread_id, that left open the possibility that
an iothreadids[] array was defined, but the entries essentially pointed
to elements with only the 'iothread_id' defined, leaving the 'thread_id'
value at 0 and eventually the cpumap entry NULL.
This was because the original IOThreads commit id '72edaae7' only
checked if IOThreads were defined and if the emulator had the IOThreads
capability, then IOThread objects were added at startup. The "capability
failure" check was only done when a disk was assigned to an IOThread in
qemuCheckIOThreads. This was because the initial implementation had no way
to dynamically add IOThreads, but it was possible to dynamically add a
disk to the domain. So the decision was if the domain supported it, then
add the IOThread objects. Then if a disk with an IOThread defined was
added, it could check the capability and fail to add if not there. This
just meant the 'iothreads' value was essentially ignored.
Eventually commit id 'a27ed6e7' allowed for the dynamic addition and
deletion of IOThread objects. So it was no longer necessary to generate
IOThread objects to dynamically attach a disk to. However, the startup
and disk check code was not modified to reflect this.
This patch will move the capability failure check to when IOThread
objects are being added to the command line. Thus a domain that has
IOThreads defined will not be started if the emulator doesn't support
the capability. This means when qemuCheckIOThreads is called to add
a disk, it's no longer necessary to check the capability. Instead the
code can use the IOThreadFind call to indicate that the IOThread
doesn't exist.
Finally, because it could be possible (if libvirtd is restarted) to have a
domain running with iothreadids[] defined prior to this change, each entry
having mostly empty elements, qemuProcessDetectIOThreadPIDs will check
whether there are niothreadids when the QEMU_CAPS_OBJECT_IOTHREAD capability
check fails and remove the elements and the array if they exist.
With these changes in place, it turns out the cputune-numatune test
was failing because the right bit wasn't set in the test. So the
opportunity was used to fix that and create a test that is expected to
fail when some sort of iothreads is defined and used without the
correct capability.
Although theoretically both should be the same value, the niothreadids
should be used in favor of iothreads when performing comparisons. This
leaves the iothreads as a purely numeric value to be saved in the config
file. The one exception to the rule is virDomainIOThreadIDDefArrayInit
where the iothreadids are being generated from the iothreads count since
iothreadids were added after initial iothreads support.
The internal representation of a JSON array counts the items in
size_t. However, for some reason, when asking for the count it's
reported as int. Firstly, we need the function to return a signed
type as it returns -1 on an error. But not every system has
an integer the same size as size_t. Therefore, let's return ssize_t.
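A sketch of the resulting change (the exact parameter type is beside the
point here):
    /* before */
    int virJSONValueArraySize(const virJSONValuePtr array);
    /* after */
    ssize_t virJSONValueArraySize(const virJSONValuePtr array);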
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Coverity notices that net->ifname is potentially referenced after a
VIR_FREE(). Since net->ifname will eventually be freed during
virDomainDefFree when calling virDomainNetDefFree, let's just let that
processing take care of the free.
Signed-off-by: John Ferlan <jferlan@redhat.com>
So imagine you want to create a new security manager:
if (!(mgr = virSecurityManagerNew("selinux", "QEMU", false, true, false, true)));
Hard to parse, right? What about this:
if (!(mgr = virSecurityManagerNew("selinux", "QEMU",
VIR_SECURITY_MANAGER_DEFAULT_CONFINED |
VIR_SECURITY_MANAGER_PRIVILEGED)));
Now that's better! This is what the commit does.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This gets rid of the partially enforced alignment and makes it less
likely for a bogus value to be introduced in the enumeration.
Capabilities are divided into five-element groups for better readability.
Use #define for QEMU_CAPS_NET_NAME and QEMU_CAPS_HOST_NET_ADD, both
of which are aliases for QEMU_CAPS_0_10.
qemuMigrationIsAllowed would disallow offline migration if the VM
contained host devices or memory modules. Since during offline migration
we don't transfer any state we can safely migrate VMs with such
configuration.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1265049
Use the migration @flags for checking various migration aspects rather
than picking them out as booleans. Document the new semantics in the
function header.
Now that qemuMigrationIsAllowed is always called with @vm, we can drop
the @def argument and simplify the control flow.
Additionally the comment is invalid so drop it.
Extract the hostdev check from qemuMigrationIsAllowed into a separate
function since that is the only part that needs to be done in the v2
migration protocol prepare phase on the destination. All other checks
were added when the v3 protocol existed so they don't need to be
extracted.
This change will allow us to drop the @def argument for
qemuMigrationIsAllowed and further simplify the function.
Even though QEMU on the source host reports completed migration and thus
we move to the Finish phase, QEMU on the destination host may still be
processing migration data. Thus before we can start guest CPUs on the
destination, we have to wait for a completed migration event.
https://bugzilla.redhat.com/show_bug.cgi?id=1265902
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
With new QEMU which supports migration events,
qemuMigrationCheckJobStatus needs to explicitly query QEMU for migration
statistics once migration is completed to make sure the caller sees
up-to-date statistics with both old and new QEMU. However, some callers
are not interested in the statistics at all and once we start waiting
for a completed migration on the destination host too, checking the
statistics would even fail. Let's push the decision whether to update
the statistics or not to the caller.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function already has two bool parameters and we will need to add a
new one. Let's switch to flags to make the callers readable.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The destination host gets detailed statistics about the current
migration from the source host via the migration cookie and copies them to
the domain object so that they can be queried using
virDomainGetJobStats. However, we should only copy statistics to the
domain object when migration finished successfully.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Even if we are migrating a domain with VIR_MIGRATE_PAUSED flag set, we
should still update the total time of the migration. Updating downtime
doesn't hurt either, even though we don't actually start guest CPUs.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemu-kvm can be used to run ppc64 guests on ppc64le hosts and vice
versa, since the hardware is actually the same and the endianness
is chosen by the guest kernel.
Up until now, however, libvirt didn't allow the use of qemu-kvm
to run guests if their endianness didn't match the host's.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1267882
Since we'd disallow migration of a guest that would have possibly
invalid config but still be able to work, relax the WWN check to be
performed only on new starts of the VM.
We are using memory-backing-file even when it's not needed, for example
if user requests hugepages for memory backing, but does not specify any
pagesize or memory node pinning. This causes migrations to fail when
migrating from older libvirt that did not do this. So similarly to
commit 7832fac847 which does it for
memory-backend-ram, this commit makes it more generic and
backend-agnostic, so the backend is not used if there is no specific
pagesize of hugepages requested, no nodeset the memory node should be
bound to, no memory access change required, and so on.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1266856
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
So since the introduction of the memory-backend-file object until now we
only added '-mem-path' for non-NUMA guests and we used the parameters of
the memory-backend-file object to specify the path to the hugetlbfs
mount. But hugepages can also be used without the memory-backend-file
object, as it used to be before its introduction. Let's just get this
part of the code back and properly append the '-mem-path' for NUMA
guests as well, but only when the memory backend is not needed.
This parameter is already being applied when no numa is requested and,
because we still use memory-backend-file unconditionally for
hugepage-backed NUMA guests, this should not fire until later.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
That function is called qemuBuildMemPathStr() and will be used in
other places in the future. The change in the test suite is proper due
to the fact that -mem-prealloc only makes sense with -mem-path (from
qemu documentation -- html/qemu-doc.html).
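For reference, when hugepages are requested and no memory backend is
needed, the generated arguments look roughly like this (the mount path is
illustrative):
-mem-prealloc -mem-path /dev/hugepages/libvirt/qemu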
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Support for GICv3 has recently been introduced in qemu using the gic-version
option for the 'virt' machine. The option can actually take values of
'2', '3' and 'host'; however, since in libvirt this is a numeric
parameter, we limit it only to 2 and 3. A value of 2 is not added to the
command line in order to keep backward compatibility with older qemu
versions.
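An illustrative configuration (the element placement is a sketch, not
authoritative):
  <features>
    <gic version='3'/>
  </features>
which would map to a command line along the lines of:
qemu -machine virt,gic-version=3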
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Unfortunately qemu currently doesn't offer introspection for machine types,
so we have to rely on the version number, similar to QEMU_CAPS_MACHINE_USB_OPT.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Commit 307fb904 (Sep 10) added a 'privileged' variable when creating
the DAC driver:
@@ -153,6 +157,7 @@ virSecurityManagerNewDAC(const char *virtDriver,
bool defaultConfined,
bool requireConfined,
bool dynamicOwnership,
+ bool privileged,
virSecurityManagerDACChownCallback chownCallback)
But the argument order is mixed up at the caller, swapping the dynamicOwnership
and privileged values. This corrects the argument order.
https://bugzilla.redhat.com/show_bug.cgi?id=1266628
This seemed to be more of a false positive, as for some reason Coverity
was missing the "ret < 0" goto error condition and somehow believed that
event could be overwritten. At first I thought it was just the ret != 0
condition difference, but it wasn't.
In any case, make use of the recent change to qemuDomainEventQueue to
check event == NULL and just pass it as a parameter directly in the
error path. That avoids the error.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Coverity complains that the return from virHookCall is not checked in
one place in qemuProcessStop. Since the comment notes that we cannot
stop the operation even if it fails, just add the ignore_value.
So while working on my previous patches, I've noticed that the
virDomainRestore implementation in the qemu and test drivers has the
same problem as the one I am fixing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far we have the following pattern occurring over and over
again:
    if (!vm->persistent)
        qemuDomainRemoveInactive(driver, vm);
It's safe to put the check into the function and save some LoC.
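After this change the callers reduce to a plain call (sketch):
    qemuDomainRemoveInactive(driver, vm); /* no-op for persistent domains */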
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=871452
So, you want to create a domain from XML. The domain already
exists in libvirt's database of domains. It's okay, because the name
and UUID match. However, on domain startup, the internal
representation of the domain is overwritten with your XML even
though we claim that the XML you've provided is a transient one.
The bug is to be found across nearly all the drivers.
Le sigh.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=871452
Okay, so we allow users to 'virsh create' an already existing
domain, providing completely different XML than the one stored in
Libvirt. Well, as long as the name and UUID match. However, in some
drivers the code that handles errors unconditionally removes the
domain that failed to start even though the domain might have
been persistent. Fortunately, the domain is removed just from the
internal list of domains and the config file is kept around.
Steps to reproduce:
1) virsh dumpxml $dom > /tmp/dom.xml
2) change XML so that it is still parse-able but won't boot, e.g.
change guest agent path to /foo/bar
3) virsh create /tmp/dom.xml
4) virsh dumpxml $dom
5) Observe "No such domain" error
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I initially added this in order to make the code less error-prone for
following additions, but it seems it's still frowned upon.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Now that virQEMUDriverCreateXMLConf is never called with NULL
(after 086f37e97a) we can safely drop the useless check in
qemuDomainDeviceDefPostParse as we are guaranteed to always be
called with the driver initialized. Therefore checking whether the driver
is NULL makes no sense, especially when we mix it with direct driver
dereferences. And after that, we are sure that @cfg won't be
NULL either, therefore we can drop the checks for that too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Qemu unfortunately doesn't update its internal state right after migration,
and so the actual balloon size as returned by 'query-balloon' is
invalid for a while after the CPUs are started after migration. If we
refreshed our internal state at this point we would report an invalid
current memory size until the next balloon event arrived.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1242940
My original implementation was based on a qemu version that still did
not have all the checks in place. Using sizes that would align to odd
megabyte increments will produce the following error:
qemu-kvm: -device pc-dimm,node=0,memdev=memdimm0,id=dimm0: backend memory size must be multiple of 0x200000
qemu-kvm: -device pc-dimm,node=0,memdev=memdimm0,id=dimm0: Device 'pc-dimm' could not be initialized
Introduce an alignment retrieval function for memory devices and use it
to align the devices separately and modify a test case to verify it.
For some machine types, ppc64 machines now require that memory sizes are
aligned to 256MiB increments (due to the dynamically reconfigurable
memory). As we now treat existing configs reasonably with regard to
migration, we can round all the sizes unconditionally. The only drawback
will be that the memory size of a VM can potentially increase by
(256MiB - 1byte) * number_of_NUMA_nodes.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1249006
When we are starting a qemu process for an incoming migration or
snapshot reloading we should not modify the memory sizes in the domain
since we could potentially change the guest ABI that was tediously
checked before. Additionally the function now updates the initial memory
size according to the NUMA node size, which should not happen if we are
restoring state.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1252685
When implementing memory hotplug I've opted to recalculate the initial
memory size (contents of the <memory> element) as a sum of the sizes of
NUMA nodes when NUMA was enabled. This was based on an assumption that
qemu did not allow starting when the NUMA node size total didn't equal
the initial memory size. Unfortunately the check was introduced to
qemu only recently.
This patch uses the new XML parser flag to decide whether it's safe to
update the memory size total from the NUMA cell sizes or not.
As an additional improvement we now report an error in case the
size of hotplug memory would exceed the total memory size.
The rest of the changes ensure that the function is called with the
correct flags.
Add 'initial_memory' member to struct virDomainMemtune so that the
memory size can be pre-calculated once instead of inferring it always
again and again.
Separating the fields will also allow finer granularity of decisions
in later patches, where it will allow keeping the old initial memory
value in cases where we are handling incoming migration from older
versions that did not always update the size from NUMA as the code did
previously.
The change also requires modification of the qemu memory alignment
function since at the point where we are modifying the size of NUMA
nodes the total size needs to be recalculated too.
The refactoring done in this patch also fixes a crash in the hyperv
driver that did not properly initialize def->numa and thus
virDomainNumaGetMemorySize(def->numa) crashed.
In summary this patch should have no functional impact at this point.
Add a new parser flag that will mark code paths that parse XML files
which will not be used with existing VM state so that post-parse
callbacks can possibly do ABI-incompatible changes if needed.
Extract the size determination into a separate function and reuse it
across the memory device alignment functions. Since later we will need
to decide the alignment size according to architecture let's pass def to
the functions.
Since the test suite now correctly creates the capabilities cache, the hack
is not needed any more.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
The main purpose of this patch is to introduce test mode to
virQEMUCapsCacheLookup(). This is done by adding a global variable, which
effectively overrides binary name. This variable is supposed to be set by
test suite.
The second addition is qemuTestCapsCacheInsert() function which allows the
test suite to actually populate the cache.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
So far this function was not kept in sync with changing
virDomainDiskDef. Fill in all the missing checks and reorganize
their order so it's easier to track which items are not being
checked for.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I always felt like this function is qemu specific rather than
libvirt-wide. Other drivers may act differently on virDomainDef
change and in fact may require talking to the underlying hypervisor
even if something other than disk->src has changed. I know that
the function is still incomplete, but let's break that into two
commits that are easier to review. This one is pure code
movement.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Firstly, our coding guidelines suggest using a 'cleanup' label
instead of 'end'. Then, @ret should be set to a value representing
success as the last statement before the 'cleanup' label.
And while I am at this function, let's enumerate all the possible
enum items (virDomainDiskDevice) and avoid using 'default' in
switch(). Pooh. Also, nothing bad happens if we look up the disk
to change in the domain upfront. In fact, it's going to be
helpful later when we want to keep some old values for performing
a rollback.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This new private API should return true iff the sources of two disks
differ in the sense that qemu should be instructed to change the
disk backend. For instance, ejecting a CDROM is such case, or
pointing disk into a different ISO location, and so on.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While we currently only allow changing media in a disk, this is
going to change in a while, so the function name would be
invalid. Moreover, the old name does not match the pattern laid
out by other update functions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1159219
So, in 11e058ca58 I've tried to make UpdateDevice update
startupPolicy too. And it worked well until somebody came around
and pushed d0dc6c0369 which accidentally removed my
contribution. Redo my commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When persistently migrating a domain to a destination host where the
same domain already exists (i.e., it is persistent and shutdown at the
destination), we would happily throw away the original persistent
definition without properly freeing it. And when updating the definition
fails for some reason we don't properly revert to the original state
leaving the domain broken.
In addition to fixing these issues, the patch also makes sure the domain
definition parsed from a migration cookie is either used or freed.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
For quite a long time we haven't needed to postpone queueing events until
the end of the function since we no longer have the big driver lock.
Let's make the code of qemuMigrationFinish simpler by queuing events at
the time we generate them.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Every single call to qemuDomainEventQueue() uses the following pattern:
    if (event)
        qemuDomainEventQueue(driver, event);
Let's move the check for valid event to qemuDomainEventQueue and
simplify all callers.
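After the change, callers reduce to a plain call (sketch):
    qemuDomainEventQueue(driver, event); /* becomes a no-op when event is NULL */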
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Finish is the final state in v2 of our migration protocol. If something
fails, we have no option to abort the migration and resume the original
domain. Non fatal errors (such as failure to start guest CPUs or make
the domain persistent) has to be treated as success. Keeping the domain
running while reporting the failure was just asking for trouble.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Whenever something fails during incoming migration in Finish phase
before we started guest CPUs, we need to kill the domain in addition to
reporting the failure.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When we save status XML at the point during migration where we have
already started the domain on destination, we can't really go back and
abort migration. Thus the only thing we can do is to log a warning and
report success.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Offline migration is quite special because we don't really need to do
anything but make the domain persistent. Let's do it separately from
normal migration to avoid cluttering the code with
!(flags & VIR_MIGRATE_OFFLINE).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit id 'f1f68ca33' added code to remove the directory paths for
auto-generated sockets, but that code could be called before the
paths were created, resulting in error messages from
virFileDeleteTree indicating that the file doesn't exist.
Rather than "enforce" that all callers make the non-NULL and existence
checks, modify the virFileDeleteTree API to silently ignore NULL on
input and non-existent directory trees.
When looking for a QEMU binary suitable for running ppc64le guests
we have to take into account the fact that we use the QEMU target
as key for the hash, so direct comparison is not good enough.
Factor out the logic from virQEMUCapsFindBinaryForArch() to a new
virQEMUCapsFindTarget() function and use that both when looking
for QEMU binaries available on the system and when looking up
QEMU capabilities later.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1260753
Fixes the following error when attempting to add a disk with bus='virtio'
to a machine which actually supports virtio-mmio (caught with ARM virt):
virtio disk cannot have an address of type 'virtio-mmio'
The problem was likely introduced by
e8d5517254. Before that
qemuAssignDevicePCISlots() was never called for ARM "virt" machine.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
We may want to make some decisions in drivers based on whether we
are running as a privileged user or not. Propagate this info there.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit 8125113c added code that should remove the disk backend if the
frontend hotplug failed for any reason. The code had a bug though as it
used the disk string for unplug rather than the backend alias. Fix the
code by pre-creating an alias string and using it instead of the disk
string. In cases where qemu does not support QEMU_CAPS_DEVICE, we ignore
the unplug of the backend since we can't really create an alias in that
case.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1262399
There's a couple reports of things failing in this area (bug 1259070),
but it's tough to tell what's going wrong without stderr from
qemu-bridge-helper. So let's report stderr in the error message.
A couple of new examples:
virbr0 is inactive:
internal error: /usr/libexec/qemu-bridge-helper --use-vnet --br=virbr0 --fd=21: failed to communicate with bridge helper: Transport endpoint is not connected
stderr=failed to get mtu of bridge `virbr0': No such device
bridge isn't on the ACL:
internal error: /usr/libexec/qemu-bridge-helper --use-vnet --br=br0 --fd=21: failed to communicate with bridge helper: Transport endpoint is not connected
stderr=access denied by acl file
Up until now, the default has been rtl8139, but no check was in
place to make sure that device was actually available.
Now we try rtl8139, e1000 and virtio-net in turn, checking for
availability before using any of them: this means we have a much
better chance for the guest to be able to boot.
Commit f1f68ca334 did not report an error if virFileMakePath()
returned -1. Well, who would've guessed that a function with a name starting
with 'vir' sets errno instead of reporting an error the libvirt way.
Anyway, let's fix it, so the output changes from:
$ virsh start arm
error: Failed to start domain arm
error: An error occurred, but the cause is unknown
to:
$ virsh start arm
error: Failed to start domain arm
error: Cannot create directory '/var/lib/libvirt/qemu/domain-arm': Not
a directory
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1146886
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If the current live definition does not have memory hotplug enabled, but
the persistent one does, libvirt would reject migration if the
destination does not support memory hotplug, even if the user didn't want
to persist the VM at the destination and thus the XML containing the
memory hotplug definition would not be used. To fix this corner case the
code will check for memory hotplug in the newDef only if
VIR_MIGRATE_PERSIST_DEST was used.
Commit id '2e7cea243' added a check for an error from Finish instead
of 'unexpected error'; however, if for some reason there wasn't an
error, then virGetLastError could return NULL, resulting in a
NULL pointer deref of err->domain.
https://bugzilla.redhat.com/show_bug.cgi?id=1258361
When attaching a disk, controller, or rng using an address type of ccw
or s390, we need to ensure the support is provided by both the machine.os
and the emulator capabilities (a corollary to the unconditional setting when
an address was not provided for the correct machine.os and emulator).
For an inactive guest, an addition followed by a start would cause the
startup to fail after qemu_command builds the command line and attempts
to start the guest. For an active guest, libvirtd would crash.
Rather than have different usages of STR* functions in order to determine
whether the domain is s390-ccw or s390-ccw-virtio, make a single API
which will check the machine.os prefix. Then use the function.
Adds a new interface type using UDP sockets; this seems only applicable
to QEMU, but the tree has been edited throughout to support the new
interface type. The interface type required the addition of a "localaddr"
(local address); this then maps into the following XML and qemu call.
  <interface type='udp'>
    <mac address='52:54:00:5c:67:56'/>
    <source address='127.0.0.1' port='11112'>
      <local address='127.0.0.1' port='22222'/>
    </source>
    <model type='virtio'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
  </interface>
QEMU call:
-net socket,udp=127.0.0.1:11112,localaddr=127.0.0.1:22222
Notice the xml "local" entry becomes the "localaddr" for the qemu call.
reference:
http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg00629.html
Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
This reverts commit 1ce7c1d20c,
which introduced a significant semantic change to the
virDomainGetInfo() API. Additionally, the change was only
made to 2 of the 15 virt drivers.
Conflicts:
src/qemu/qemu_driver.c
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1226234#c3
If the qemu monitor fails to remove the memory from the guest for
any reason, the audit log message will incorrectly use the current
actual memory (via virDomainDefGetMemoryActual) instead of the
value we were attempting to reduce to. The result is that the 'new-mem'
and 'old-mem' values for the audit log message would be identical.
This patch creates a local 'newmem' which accounts for the current
memory size minus the memory which is being removed. NB, for the
success case this results in the same value that would be returned
by virDomainDefGetMemoryActual without the need to do the math. This
follows the existing code which would subtract the size for cur_balloon.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1226234#c3
Prior to this patch, after successfully hot plugging memory
the audit log indicated that the update failed, e.g.:
type=VIRT_RESOURCE ... old-mem=1024000 new-mem=1548288 \
exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=pts/2 res=failed
This patch will adjust where virDomainAuditMemory is called to
ensure the proper 'ret' value is used based on success or failure.
Additionally, the audit message should include the size of the
memory we were attempting to change to rather than the current
actual size. On failure to add, the message showed the same value
for old-mem and new-mem.
In order to do this, introduce a 'newmem' local which will compute
the new size based on the oldmem size plus the size of memory we
are about to add. NB: This would be the same as calling the
virDomainDefGetMemoryActual again on success, but avoids the
overhead of recalculating. Plus cur_balloon is already adjusted
by the same value, so this follows that.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Commit f1f68ca334 overused mdir_name() even
though it was not needed in the latest version, hence labelling the
directory one level up in the tree and not the one it should have.
If anyone with SELinux managed to run a domain with a guest agent set
up, it's highly possible that they will need to run 'restorecon -F
/var/lib/libvirt/qemu/channel/target' to fix what was done.
Reported-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1253107
Make a call to virCgroupGetBlkioWeight to re-read blkio.weight right
after it is set in order to keep internal data up-to-date.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Commit aa2cc7 modified a previously unnecessary but innocuous check
for interface IP address during interface update incorrectly, causing
all attempted updates (e.g. changing link state) to interfaces of
type='ethernet' for QEMU to fail.
This patch fixes the issue by completely removing the check for IP
address, which is pointless since QEMU doesn't support setting
interface IP addresses from the domain interface XML anyway.
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Laine Stump <laine@laine.org>
We would try to set the node in cpuset.mems without checking whether
it is available, even though we already have a helper to check this.
Call virNumaNodesetIsAvailable to check whether the node is available,
then try to change it in the cgroup.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
We are automatically generating some socket paths for domains, but all
those paths end up in a directory that's the same for multiple domains.
The problem is that multiple domains can each run with different
seclabels (users, selinux contexts, etc.). The idea here is to create a
per-domain directory labelled in a way that each domain can access its
own unix sockets.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1146886
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The problem here is that there are some values that the kernel accepts but
does not set, for example 18446744073709551615, which acts the same
way as zero. Let's do the same thing we do with other tuning options
and re-read them right after they are set in order to keep our internal
structures up-to-date.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1165580
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1251886
Since iothread_id == 0 is an invalid value for QEMU let's point
that out specifically. For the IOThreadDel code, the failure would
have ended up being a failure to find the IOThread ID; however, for
the IOThreadAdd code an IOThread 0 was added, and that isn't good.
It seems during many reviews/edits to the code the check for
iothread_id = 0 being invalid was lost - it could have originally
been in the API code, but requested to be moved - I cannot remember.
Just like in commit 704cf06, if virCgroup*() fails, the error is
already reported. There's no need to overwrite the error with a
generic one and possibly hiding the true root cause of the error.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
It may happen that a user (mistakenly) wants to rename a domain to
itself, which is no renaming at all. We should reject that with
a meaningful error message.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Coverity complained that 'vm' wasn't initialized before jumping to
cleanup: and calling virDomainObjEndAPI if the VIR_STRDUP fails.
So I initialized vm = NULL and also moved the VIR_STRDUP closer to
usage and used endjob for goto. Lots of other reasons for failures.
Currently supports only renaming inactive domains without snapshots.
Signed-off-by: Tomas Meszaros <exo@tty.sk>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Otherwise the error is just
error: Failed to create domain from test1.xml
error: failed to retrieve file descriptor for interface: Transport endpoint is not connected
since we don't get a sensible error after the fork.
Pinning information returned for emulatorpin and vcpupin calls is being
returned from our data without querying cgroups for some time. However,
not all the data were utilized. When automatic placement is used the
information is not returned for the calls mentioned above. Since the
numad hint in private data is properly saved/restored, we can safely use
it to return true information.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1162947
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The numad hint stored in priv->autoNodeset is information that gets lost
during daemon restart. And because we would like to use that
information in the future, we also need to save it in the status XML.
For the sake of tests, we need to initialize nnumaCell_max to some
value, so that the restoration doesn't fail in our test suite. There is
no need to fill in the actual numa cell data since the recalculating
function virCapabilitiesGetCpusForNodemask() will not fail, it will just
skip filling the data in the bitmap which we don't use in tests anyway.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When parsing private domain data, there are two paths that are flawed.
They are both error paths, just from different parts of the function.
One of them can call free() on an uninitialized pointer. Initialization
to NULL is enough here. The other one is a bit trickier to explain, but
as easy as the first one to fix. We create capabilities, parse them and
then assign them into the private data pointer inside the domain object.
If, however, we get to fail from now on, the error path unrefs the
capabilities and then, when the domain object is being cleaned,
qemuDomainObjPrivateFree() tries to unref them as well. That causes a
segfault. Setting the pointer to NULL upon successful addition to the
private data is enough.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1210587 (completed)
When generating the default drive address for a SCSI <disk> device,
check the generated address to ensure it doesn't conflict with a SCSI
<hostdev> address. The <disk> address generation algorithm uses the
<target> "dev" name in order to determine which controller and unit
in order to place the device. Since a SCSI <hostdev> device doesn't
require a target device name, its placement on the guest SCSI address
"could" conflict. For instance, if a SCSI <hostdev> exists at
controller=0 unit=0 and an attempt to hotplug 'sda' into the guest
made, there would be a conflict if the <hostdev> is already using
/dev/sda.
Hot-unplugging a disk from a guest that supports hot-unplugging generates an error
in the libvirt log when running QEMU with the "-msg timestamp=on" flag.
2015-08-06 10:48:59.945+0000: 11662: error : qemuMonitorTextDriveDel:2594 :
operation failed: deleting drive-virtio-disk4 drive failed:
2015-08-06T10:48:59.945058Z Device 'drive-virtio-disk4' not found
This error is caused by the HMP results being prefixed with a timestamp,
which makes parsing the output with STRPREFIX unreliable. Using strstr
ensures that parsing works whether or not the results are prefixed.
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Frank Schreuder <fschreuder@transip.nl>
This reverts commit ede34470fd, which
was apparently written based on testing performed before commits
1e15be1 and 9a12b6 were pushed upstream. Once those two patches are in
place, commit ede34470 is redundant, and can even cause
incorrect/unexpected behavior when auto-assigning addresses for
virtio-net devices.
Commit e8d5517 updated the domain post-parse to automatically add
pcie-root et al for certain ARM "virt" machinetypes, but didn't update
the function qemuDomainSupportsPCI() which is called later on when we
are auto-assigning PCI addresses and default settings for the PCI
controller <model> and <target> attributes. The result was that PCI
addresses weren't assigned, and the controllers didn't have their
attribute default values set, leading to an error when the domain was
started, e.g.:
internal error: autogenerated dmi-to-pci-bridge options not set
This patch adds the same check made in the earlier patch to
qemuDomainSupportsPCI(), so that PCI address auto-assignment and
target/model default values will be set.
Well, there are just two places that need adjustment:
qemuDomainGetInterfaceParameters - to report the @floor
qemuDomainSetInterfaceParameters - now that the function has been
fixed, we can allow updating @floor too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As sketched in previous commits, imagine the following scenario:
virsh # domiftune gentoo vnet0
inbound.average: 100
inbound.peak : 0
inbound.burst : 0
outbound.average: 100
outbound.peak : 0
outbound.burst : 0
virsh # domiftune gentoo vnet0 --inbound 0
virsh # shutdown gentoo
Domain gentoo is being shutdown
virsh # list --all
error: Failed to list domains
error: Cannot recv data: Connection reset by peer
Program received signal SIGSEGV, Segmentation fault.
0x00007fffe80ea221 in networkUnplugBandwidth (net=0x7fff9400c1a0, iface=0x7fff940ea3e0) at network/bridge_driver.c:4881
4881 net->floor_sum -= ifaceBand->in->floor;
This is rather unfortunate. We should not SIGSEGV here. The problem is
that while in the second step the inbound QoS was cleared out, the
network part of it was not updated (moreover, we don't report that vnet0
had inbound.floor set). The internal structure therefore still had some
fragments left (e.g. class_id). So when qemuProcessStop() started to
clean up the environment it got to networkUnplugBandwidth(). Here,
class_id is set, therefore the function assumes that there is an inbound
QoS. This actually is a fair assumption to make; there's no need for a
special QoS box in the network's QoS when there's no QoS to set. Anyway,
the problem is not in networkUnplugBandwidth() but rather in
qemuDomainSetInterfaceParameters(), which completely forgot that QoS is
dispersed (some parts are set directly on the interface itself, some on
the bridge the interface is plugged into).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
nwfilter uses iptables and ebtables, which only work properly on
tap-based network connections (*not* on macvtap, for example), but we
just ignore any <filterref> elements for other types of networks,
potentially giving users a false sense of security.
This patch checks the network type and fails/logs an error if any
domain <interface> has a <filterref> when the connection isn't using a
tap device.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1180011
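As an illustration (interface and filter names are arbitrary), a
macvtap interface carrying a <filterref> like the one below is now
rejected instead of the filter being silently ignored:
<interface type='direct'>
  <source dev='eth0' mode='bridge'/>
  <model type='virtio'/>
  <filterref filter='clean-traffic'/>
</interface>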
There's no need to set mon->fd to a dummy value since
it's initialized to proper value just a few lines below.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
This controller can be connected only to a port on a
pcie-switch-upstream-port. It provides a single hotpluggable port that
will accept any PCI or PCIe device, as well as any device requiring a
pcie-*-port (the only current example of such a device is the
pcie-switch-upstream-port).
The downstream ports of an x3130-upstream switch can each have one of
these plugged into them (and that is the only place they can be
connected). Each xio3130-downstream provides a single PCIe port that
can have PCI or PCIe devices hotplugged into it. Apparently an entire
set of x3130-upstream + several xio3130-downstreams can be hotplugged
as a unit, but it's not clear to me yet how that would be done, since
qemu only allows attaching a single device at a time.
This device will be used to implement the
"pcie-switch-downstream-port" model of pci controller.
This controller can be connected only to a pcie-root-port or a
pcie-switch-downstream-port (which will be added in a later patch),
which is the reason for the new connect type
VIR_PCI_CONNECT_TYPE_PCIE_PORT. A pcie-switch-upstream-port provides
32 ports (slot=0 to slot=31) on the downstream side, which can only
have pci controllers of model "pcie-switch-downstream-port" plugged
into them, which is the reason for the other new connect type
VIR_PCI_CONNECT_TYPE_PCIE_SWITCH.
This is the upstream part of a PCIe switch. It connects to a PCIe port
(but not PCI) on the upstream side, and can have up to 31
xio3130-downstream controllers (but no other types of devices)
connected to its downstream side.
This device will be used to implement the "pcie-switch-upstream-port"
model of pci controller.
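A minimal sketch (index and address values are hypothetical, with bus
0x01 assumed to come from a pcie-root-port) of the corresponding
controller in domain XML:
<controller type='pci' index='2' model='pcie-switch-upstream-port'>
  <model name='x3130-upstream'/>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</controller>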
This is backed by the qemu device ioh3420.
chassis and port from the <target> subelement are used to store/set the
respective qemu device options for the ioh3420. Currently, chassis is
set to be the index of the controller, and port is set to
"(slot << 3) + function" (per suggestion from Alex Williamson).
This controller can be connected (at domain startup time only - not
hotpluggable) only to a port on the pcie root complex ("pcie-root" in
libvirt config), hence the new connect type
VIR_PCI_CONNECT_TYPE_PCIE_ROOT. It provides a hotpluggable port that
will accept any PCI or PCIe device.
New attributes must be added to the controller <target> subelement for
this - chassis and port are guest-visible option values that will be
set by libvirt with values derived from the controller's index and pci
address information.
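For illustration (index and PCI address are hypothetical), the
resulting controller XML could look like this, with chassis taken from
the controller's index and port computed as (slot << 3) + function,
i.e. (0x02 << 3) + 0 = 0x10:
<controller type='pci' index='1' model='pcie-root-port'>
  <model name='ioh3420'/>
  <target chassis='1' port='0x10'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</controller>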
This is a PCIE "root port". It connects only to a port of the
integrated pcie.0 bus of a Q35 machine (can't be hotplugged), and
provides a single PCIe port that can have PCI or PCIe devices
hotplugged into it.
This device will be used to implement the "pcie-root-port" model of
pci controller.
This uses the new subelement/attribute in two ways:
1) If a "pci-bridge" pci controller has no chassisNr attribute, it
will automatically be set to the controller's index as soon as the
controller's PCI address is known (during
qemuDomainAssignPCIAddresses()).
2) when creating the commandline for a pci-bridge device, chassisNr
will be used to set qemu's chassis_nr option (rather than the previous
practice of hard-coding it to the controller's index).
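A minimal sketch (index and address are hypothetical) of the outcome
for a pci-bridge controller, with chassisNr defaulted to the index:
<controller type='pci' index='1' model='pci-bridge'>
  <target chassisNr='1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</controller>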
This patch provides qemu support for the contents of <model> in
<controller> for the two existing PCI controller types that need it
(i.e. the two controller types that are backed by a device that must
be specified on the qemu commandline):
1) pci-bridge - sets <model> name attribute default as "pci-bridge"
2) dmi-to-pci-bridge - sets <model> name attribute default as
"i82801b11-bridge".
These both match current hardcoded practice.
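For example (addresses are illustrative), the generated defaults
correspond to XML like:
<controller type='pci' index='1' model='dmi-to-pci-bridge'>
  <model name='i82801b11-bridge'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x1e' function='0x0'/>
</controller>
<controller type='pci' index='2' model='pci-bridge'>
  <model name='pci-bridge'/>
  <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
</controller>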
The defaults are set at the end of qemuDomainAssignPCIAddresses().
This can't be done earlier because some of the options that will be
autogenerated need full PCI address info for the controller, and
because qemuDomainAssignPCIAddresses() might create extra controllers
which would need default settings added, and that hasn't yet been done
at the time the PostParse callbacks are being run.
qemuDomainAssignPCIAddresses() is still called prior to the XML being
written to disk, though, so the autogenerated defaults are persistent.
qemu capabilities bits aren't checked when the domain is defined, but
rather when the commandline is actually created (so the domain can
possibly be defined on a host that doesn't yet have support for the
given device, or a host different from the one where it will
eventually be run). When the commandline is being generated we compare
the modelName to known qemu device names implementing the given type
of controller, and check the capabilities bit for that device.
Qemu reports physical size 0 for block devices. Since 15fa84acbb
changed the behavior of qemuDomainGetBlockInfo to just query the
monitor, this created a regression as we no longer reported the size
correctly.
This patch adds code to refresh the physical size of a block device by
opening it and seeking to the end, and uses it both in
qemuDomainGetBlockInfo and in qemuDomainGetStatsOneBlock, which had been
broken in this respect since it was introduced.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1250982
The virtio-net-pci adapter is capable of using irqfd with vhost-net
only in MSI-X mode, which appears to be available only on a PCIe bus,
at least on ARM.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
The legacy -net option works correctly only with embedded device
models, which do not require any bus specification. Therefore, we
should use -device for PCI hardware.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Here we assume that if qemu supports a generic PCI host controller,
it is part of the virt machine and can be used for adding PCI devices.
In qemu this is actually a PCIe bus, so we also declare the multibus
capability so that the 0'th bus is specified to qemu correctly as
'pcie.0'.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This capability specifies that qemu can implement a generic PCI host
controller. It is often used in virtual environments, including ARM.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Libvirt doesn't reliably know the location of the backing chain when
pre-creating images for non-shared migration. This isn't a problem for
full copy, but incremental copy requires the information.
Forbid pre-creating the image in cases where incremental migration is
required. This limitation can perhaps be lifted once libvirt fully
supports loading of backing chain information from the XML.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1249587
Rather than provide a somewhat generic error message when the API
returns false, allow the caller to supply a "report = true" option in
order to have virReportError describe which of the 3 paths caused the
failure.
Some callers don't care about what caused the failure, they just want
to have a true/false - for those, calling with report = false should
be sufficient.
PowerPC pseries based VMs do not support a floppy disk controller.
This prohibits libvirt from creating a qemu command line with a floppy
device.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1180486
Signed-off-by: Ján Tomko <jtomko@redhat.com>
PowerPC pseries based VMs do not support a floppy disk controller.
This prohibits libvirt from adding a floppy disk to a PowerPC pseries
VM.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
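For reference (the image path is illustrative), a floppy definition
like the following is now rejected for pseries guests:
<disk type='file' device='floppy'>
  <source file='/var/lib/libvirt/images/boot.img'/>
  <target dev='fda' bus='fdc'/>
</disk>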
The virDomainObjListRemove() function unlocks a domain that it's given
due to legacy code. And because of that code, which should be
refactored, that last virObjectUnlock() cannot be just removed. So
instead, lock it right back for qemu for now. All calls to
qemuDomainRemoveInactive() are followed by code that unlocks the domain
again, plus the domain should be locked during qemuDomainObjEndJob(), so
the right place to lock it is right after virDomainObjListRemove().
The only place where this would cause a problem is the autodestroy
callback, so we need to get another reference there and unref+unlock it
afterwards. Luckily, returning NULL from that function doesn't mean an
error; it only means that it doesn't need to be unlocked anymore.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If cpuset is disabled or not available, libvirt must not use it. This
applies mainly to actions that do not need it and can use
sched_setaffinity() or numa_membind() instead, because otherwise they
will fail without good reason.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1244664
Signed-off-by: Luyao Huang <lhuang@redhat.com>
When stopping a domain on the destination host after a failed migration,
we need to avoid resetting security labels since the domain is still
running on the source host. While we were correctly doing so in some
cases, there were still some paths which did this wrong.
https://bugzilla.redhat.com/show_bug.cgi?id=1242904
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In addition to checking the current asynchronous job
qemuMigrationJobIsActive reports an error if the current job does not
match the one we asked for. Let's just check the job directly since we
are not interested in the error in qemuProcessHandleMonitorEOF.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
If the destination libvirt doesn't support memory hotplug, it would
attempt to start qemu with an invalid configuration, since all the
support was introduced by adding new elements. The worse part is that
qemu might hang in such a situation.
Fix this by sending a required migration feature called 'memory-hotplug'
to the destination. If the destination doesn't recognize it, it will
fail the migration.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1248350
Our atomic increment (virAtomicIntInc) uses (if available) gcc
__sync_add_and_fetch builtin. In qemu driver though, we'd profit more
from __sync_fetch_and_add builtin. To keep it simplistic, this patch
adjusts qemu driver initialization rather than adding a new atomic
increment macro.
Commit d506a51aeb meant to check for QEMU_CAPS_DRIVE_IOTUNE_MAX, but
checked for QEMU_CAPS_DRIVE_IOTUNE instead. That's clearly visible from
the diff, but it got in anyway. Because of that, we were supplying
options unknown to QEMU if it wasn't new enough, and we couldn't even
properly handle the error, leading to "Unexpected error". Also,
iops_size came at the same time as all the other "_max" options, so
also check that we're not setting it if QEMU_CAPS_DRIVE_IOTUNE_MAX is
not supported.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1224053
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This loop occurs just after we've assured that all devices that
require a PCI device have been assigned and all necessary PCI
controllers have been added. It is the perfect place to add other
potentially auto-generated PCI controller attributes that are
dependent on the controller's PCI address (upcoming patch).
There is a convenient loop through all controllers at the end of the
function, but the patch to add new functionality will be cleaner if we
first rearrange that loop a bit.
Note that the loop originally was accessing info.addr.pci.bus prior to
determining that the pci part of the object was valid. This isn't
dangerous in any way, but seemed a bit ugly, so I fixed it.
This reverts commit 7b401c3bda.
Until libvirt is able to differentiate whether heads='1' is just a
leftover from a previous libvirt or whether it was added by the user on
purpose, and also whether the domain was started with support for
qxl's max_outputs, we cannot incorporate this patch into the tree
for compatibility reasons.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1245476
After commit 0d7f45ae we no longer return the errno, and a clearer
error is already set in the vircgroup* code. Also, we would always
report the error "Operation not permitted", because the return value
is -1.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
The scope name, even according to our docs, is
"machine-$DRIVER\x2d$VMNAME.scope", but virSystemdMakeScopeName would
use the resource partition name instead of "machine-" if one was
specified, thus creating invalid scope paths.
This makes libvirt drop cgroups for a VM that uses a custom resource
partition upon reconnecting, since the detected scope name would not
match the expected name generated by virSystemdMakeScopeName.
The error is exposed by the following log entry:
debug : virCgroupValidateMachineGroup:302 : Name 'machine-qemu\x2dtestvm.scope' for controller 'cpu' does not match 'testvm', 'testvm.libvirt-qemu' or 'machine-test-qemu\x2dtestvm.scope'
for a "/machine/test" resource and "testvm" vm.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1238570
A few parts of the code looked at the current progress and assumed that
a two-phase blockjob is in the _READY state as soon as the progress
reaches 100% (info.cur == info.end). In current versions of qemu this
assumption is invalid and qemu exposes a new flag 'ready' in the
query-block-jobs output that is set to true if the job is actually
finished.
This patch adds internal data handling for reading the 'ready' flag and
acting appropriately as long as the flag is present.
While this still doesn't fix the virsh client problem with two phase
block jobs and the --pivot option, it at least improves the error
message:
$ virsh blockcommit --wait --verbose vm vda --base vda[1] --active --pivot
Block commit: [100 %]error: failed to pivot job for disk vda
error: internal error: unable to execute QEMU command 'block-job-complete': The active block job for device 'drive-virtio-disk0' cannot be completed
to
$ virsh blockcommit --wait --verbose VM vda --base vda[1] --active --pivot
Block commit: [100 %]error: failed to pivot job for disk vda
error: block copy still active: disk 'vda' not ready for pivot yet
If one calls update-device with information that is not updatable,
libvirt reports success even though no data were updated. The example
used in the bug linked below updates a device with <boot order='2'/>,
which, in my opinion, is a valid thing to request from the user's
perspective, mainly since we properly error out if the user wants to
update such data on a network device, for example.
And since there are many things that might happen (update-device on a
disk basically knows just how to change removable media), check what is
actually changing. Moreover, since the function might be usable in
other drivers (updating only the disk path is a valid possibility),
let's abstract it for any two disks.
We can't possibly check for everything since for many fields our code
does not properly differentiate between default and unspecified values.
Even though this could be changed, I don't feel like it's worth the
complexity so it's not the aim of this patch.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1007228
Allow specifying the maximum number of heads for the QXL driver.
This can actually be a compatibility problem, as heads in the XML
configuration used to be set to '1' by default.
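As an illustration (the number of heads is arbitrary), the limit is
given via the existing heads attribute:
<video>
  <model type='qxl' heads='4'/>
</video>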
Signed-off-by: Frediano Ziglio <fziglio@redhat.com>
For s390-ccw-virtio machines the default bus type is set to ccw.
Specifying an address element allows overriding the default.
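For instance (the device number is chosen arbitrarily), an explicit
address element on a virtio disk can look like this:
<disk type='file' device='disk'>
  <source file='/var/lib/libvirt/images/guest.img'/>
  <target dev='vda' bus='virtio'/>
  <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0007'/>
</disk>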
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@us.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Adding the 9pfs support for virtio-ccw that was recently added in qemu.
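For example (the source path and device number are illustrative), such
a 9p share can be defined as:
<filesystem type='mount' accessmode='passthrough'>
  <source dir='/export/to/guest'/>
  <target dir='hostshare'/>
  <address type='ccw' cssid='0xfe' ssid='0x0' devno='0x0002'/>
</filesystem>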
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@us.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
If we are migrating to a UNIX socket, we accept() a connection
from qemu and use that FD to set up a tunnel. However, the FD is
not closed as often as it should be.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>