This improves commit 706b5b6277 in a way that we check qemu capabilities
instead of what architecture we are running on to detect whether we can
use *virtio-vga* model or not. This is not a case only for arm/aarch64.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Commit 21373feb added support for primary virtio-vga device but it was
checking for virtio-gpu. Let's check for existence of virtio-vga if we
want to use it.
Virtio video device is currently represented by three different models
*virtio-gpu-device*, *virtio-gpu-pci* and *virtio-vga*. The first two
models are tied together and if virtio video devices is compiled in they
both exist. However, the *virtio-vga* model doesn't have to exist on
some architectures even if the first two models exist. So we cannot
group all three together.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Before this patch we've checked qemu capabilities for video devices
only while constructing qemu command line using "-device" option.
Since we support qemu only if "-device" option is present we can use
the same capabilities to check also video devices while using "-vga"
option to construct qemu command line.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
All definition validation that doesn't depend on qemu capabilities
and was allowed previously as valid definition should be placed into
qemuDomainDefValidate.
The check whether video type is supported or not was based on an enum
that translates type into model. Use switch to ensure that if new
video type is added, it will be properly handled.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
We generally uses QEMU_CAPS_DEVICE_$NAME to probe for existence of some
device and QEMU_CAPS_$NAME_$PROP to probe for existence of some property
of that device.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
If QEMU in question supports QMP, this capability is set if
QEMU_CAPS_DEVICE_QXL was set based on existence of "-device qxl". If
libvirt needs to parse *help*, because there is no QMP support, it
checks for existence of "-vga qxl", but it also parses output of
"-device ?" and sets QEMU_CAPS_DEVICE_QXL too.
Now that libvirt supports only QEMU that has "-device" implemented it's
safe to drop this capability and stop using it.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This patch simplifies QEMU capabilities for QXL video device. QEMU
exposes this device as *qxl-vga* and *qxl* and they are both the same
device with the same set of parameters, the only difference is that
*qxl-vga* includes VGA compatibility.
Based on QEMU code they are tied together so it's safe to check only for
presence of only one of them.
This patch also removes an invalid test case "video-qxl-sec-nodevice"
where there is only *qxl-vga* device and *qxl* device is not present.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Qemu supports *xen* video device only with XEN and this code was part
of xenner code. We dropped support for xenner in commit de9be0a.
Before this patch if you used 'xen' video type you ended up with
domain without any video device at all. Now we don't allow to start
such domain.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
It was never safe anyway and as such shouldn't have been enabled in the
first place. Future patches will allow hot-(un)pluging of some ivshmem
devices as a workaround.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Due to the switch of parameters in a call to virDomainShmemDefEquals()
no device was found when looking for device with all the information
except address. Also fix the indentation.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If the last event callback is unregistered while the event loop is
dispatching, it is only marked as deleted, but not removed. The number
of callbacks is more than zero in that case, so the timer is not
removed. Because it can be removed in this function now (but also
accessed afterwards so that we set 'isDispatching = false' and have it
locked), we need to temporarily increase the reference counter of the
state for the duration of this function.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There is a repeating pattern of code that removes the timer if it's not
needed. So let's move it to a new function. We'll also use it later.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There should be one more reference because it is being kept in the list
of callbacks as an opaque. We also unref it properly using
virObjectFreeCallback.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Currently Libvirt allows attempts to migrate read only disks. Qemu
cannot handle this as read only disks cannot be written to on the
destination system. The end result is a cryptic error message and a
failed migration.
This patch causes migration to fail earlier and provides a meaningful
error message stating that migrating read only disks is not supported.
Signed-off-by: Corey S. McQuay <csmcquay@linux.vnet.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Make sure that the topology results into a sane number of cpus (up to
UINT_MAX) so that it can be sanely compared to the vcpu count of the VM.
Additionally the helper added in this patch allows to fetch the total
number the topology results to so that it does not have to be
reimplemented later.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1378290
In both virLogParseOutput and virLogParseFilter, rather than returning
NULL, goto cleanup since it's possible that for each the first condition
passes, but the || condition doesn't and thus we leak memory.
The dnsmasq man page recommends that dhcp-authoritative "should be
set when dnsmasq is definitely the only DHCP server on a network".
This is the case for libvirt-managed virtual networks.
The effect of this is that VMs that fail to renew their DHCP lease
in time (e.g. if the VM or host is suspended) will be able to
re-acquire the lease even if it's expired, unless the IP address has
been taken by some other host. This avoids various annoyances caused
by changing VM IP addresses.
Sometimes virObjectEventStateFlush can be called without timer (if the
last event was unregistered right when the timer fired). There is a
check for timer == -1, but that triggers warning and other log messages,
which is unnecessary.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Handling of outputs and filters has been changed in a way that splits
parsing and defining. Do the same thing for logging priority as well, this
however, doesn't need much of a preparation.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This is mainly virLogAddOutputTo* which were replaced by virLogNewOutputTo* and
the previously poorly named ones virLogParseAndDefine* functions. All of these
are unnecessary now, since all the original callers were transparently switched
to the new model of separate parsing and defining logic.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Similar to outputs, parser should do parsing only, thus the 'define' logic
is going to be stripped from virLogParseAndDefineFilters by replacing calls to
this method to virLogSetFilters instead.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since virLogParseAndDefineOutputs is going to be stripped from 'output defining'
logic, replace all relevant occurrences with virLogSetOutputs call to make the
change transparent to all original callers (daemons mostly).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This method will eventually replace virLogParseAndDefineFilters which
currently does both parsing and defining.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This API is the entry point to output modification of the logger. Currently,
everything is done by virLogParseAndDefineOutputs. Parsing and defining will be
split into two operations both handled by this method transparently.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Abstraction added over parsing a single filter. The method parses potentially a
set of logging filters, while adding each filter logging object to a
caller-provided array.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Another abstraction added on the top of parsing a single logging output. This
method takes and parses the whole set of outputs, adding each single output
that has already been parsed into a caller-provided array. If the user-supplied
string contained duplicate outputs, only the last occurrence is taken into
account (all the others are removed from the list), so we silently avoid
duplicate logs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Same as for outputs, introduce a new method, that is basically the same as
virLogParseAndDefineFilter with the difference that it does not define the
filter. It rather returns a newly created object that needs to be inserted into
a list and then defined separately.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Introduce a method to parse an individual logging output. The difference
compared to the virLogParseAndDefineOutput is that this method does not define
the output, instead it makes use of the virLogNewOutputTo* methods introduced
in the previous patch and just returns the virLogOutput object that has to be
added to a list of object which then can be defined as a whole via
virLogDefineOutputs. The idea remains still the same - split parsing and
defining of the logging primitives (outputs, filters).
Additionally, since virLogNewOutputTo* methods are now finally used,
ATTRIBUTE_UNUSED can be successfully removed from the methods' definitions,
since that was just to avoid compiler complaints about unused static functions.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Now that we're in the critical section, syslog connection can be re-opened
by issuing openlog, which is something that cannot be done beforehand, since
syslog keeps its file descriptor private and changing the tag earlier might
introduce a log inconsistency if something went wrong with preparing a new set
of logging outputs in order to replace the existing one.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Continuing with the effort to split output parsing and defining, these new
functions return a logging object reference instead of defining the output.
Eventually, these functions will replace the existing ones (virLogAddOutputTo*)
which will then be dropped.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Prepare a method that only defines a set of filters. It takes a list of
filters, preferably created by virLogParseFilters. The original set of filters
is reset and replaced by the new user-provided set of filters.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Prepare a method that only defines a set of outputs. It takes a list of
outputs, preferably created by virLogParseOutputs. The original set of outputs
is reset and replaced by the new user-provided set of outputs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Outputs are a bit trickier than filters, since the user(config)-specified
set of outputs can contain duplicates. That would lead to logging the same
message twice. For compatibility reasons, we cannot just error out and forbid
the daemon to start if we find duplicate outputs which do not make sense.
Instead, we could silently take into account only the last occurrence of the
duplicate output and remove all the previous ones, so that the logger will not
try to use them when it is looping over all of its registered outputs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In order to later split output parsing and output defining, introduce a new
function which will create a new virLogOutput object which the parser will
insert into a list with the list being eventually defined.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
There is really no reason why we could not keep journald's fd within the
journald output object the same way as we do for regular file-based outputs.
By doing this we later won't have to special case the journald-based output
(due to the fd being globally shared) when replacing the existing set of outputs
with a new one. Additionally, by making this change, we don't need the
virLogCloseJournald routine anymore, plain virLogCloseFd will suffice.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Right now virLogParse* functions are doing both parsing and defining of filters
and outputs which should be two separate operations. Since the naming is
apparently a bit poor this patch renames these functions to
virLogParseAndDefine* which eventually will be replaced by virLogSet*.
Additionally, virLogParse{Filter,Output} will be later (after the split) reused,
so that these functions do exactly what the their name suggests.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
During first stage of virlog.c refactor, commit 0b231195 forgot to remove the
macro definition along with its usage.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
If the event is already disabled, then don't bother with setting it
disabled again. Causes unnecessary error on systems that don't support
the feature anyway.
The intel-iommu device has existed since QEMU 2.2.0, but
it was only possible to create it with -device since
QEMU 2.7.0, thanks to:
commit 621d983a1f9051f4cfc3f402569b46b77d8449fc
Author: Marcel Apfelbaum <marcel@redhat.com>
Date: Mon Jun 27 18:38:34 2016 +0300
hw/iommu: enable iommu with -device
Use the standard '-device intel-iommu' to create the IOMMU device.
The legacy '-machine,iommu=on' can still be used.
The libvirt capability check & command line formatting code
is thus broken for all QEMU versions 2.2.0 -> 2.6.0 inclusive.
This fixes it to use iommu=on instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since introduction of chardev hotplug the code was wrong for the UDP
case and basically created a TCP socket instead. Use proper objects and
type for UDP.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1377602
Rather than copy-paste - use a macro
Unfortunately due to how the RNG schema was written keeping the 'value'
and 'value'_max next to each other in the XML causes a schema failure,
so the FORMAT has to write out singly rather than optimizing to write
out both values at once
Signed-off-by: John Ferlan <jferlan@redhat.com>
We're about to add more options, let's avoid having multiple if-then-else
which each try to set up the qemuMonitorJSONMakeCommand call with all the
parameters it knows about.
Instead, use the fact that when a NULL is found in the argument list that
processing of the remaining arguments stops and just have call.
Signed-off-by: John Ferlan <jferlan@redhat.com>
We're about to add 6 new options and it appears (from testing) one cannot
utilize both the shorthand (alias) and (much) longer names for the arguments.
So modify the command builder to use the longer name and of course alter the
test output .args to have the similarly innocuous long name.
Also utilize a macro to build that name makes it so much more visually
appealing and saves a few characters or potential cut-n-paste issues.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When I added support for the pcie-expander-bus controller in commit
bc07251f, I incorrectly thought that it only had a single slot
available. Actually it has 32 slots, just like the root complex aka
pcie-root (the part that I *did* get correct is that unlike pcie-root
a pcie-expander-bus doesn't allow any integrated endpoint devices -
only pcie-root-ports and dmi-to-pci-controllers are allowed).
Reduce some cut-n-paste code by creating common helper. Make use of the
recently added virJSONValueObjectStealArray to grab the devices list as
part of the common code (we we can Free the reply) and return devices for
each of the callers to continue to parse.
NB: This also adds error checking to qemuMonitorJSONDiskNameLookup
Signed-off-by: John Ferlan <jferlan@redhat.com>
Provide the Steal API for any code paths that will desire to grab the
object array and then free it afterwards rather than relying to freeing
the whole chain from the reply.
Rather than use stack allocated state context pointers, let's allocate and
free the state context pointer. In doing so, we'll shrink the code a bit
since many routines perform the same initialization sequence.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since none of the callers check the status, let's just alter it to
a static void.
While we're at it - scrap the local runtime variable and just do the
math in the VIR_DEBUG directly.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If attaching to a qemu process fails after opening the monitor socket
libvirt does not clean up the monitor. As the monitor also holds a
reference to the domain object the qemu attach API basically leaks it.
QEMU also does not interact on a second monitor connection and thus a
further attempt to attach to it would lock up.
Prevent libvirt from leaking the monitor by explicitly closing it.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1378401
Attaching to a existing qemu process allows to get us into a situation
when qemu is new enough to have JSON monitor and new vCPU hotplug but
the json monitor is not used. The vCPU detection code would require it
though. This broke attaching to qemu processes.
Make the condition less strict and just skip the vCPU hotplug detection
if JSON monitor is not available.
Resolves one of the symptoms in:
https://bugzilla.redhat.com/show_bug.cgi?id=1378401
Libvirt, on its own, shouldn't decide whether an expired lease should
stay in the custom leases database or not. It should rather rely on
the 'DEL' event from dnsmasq.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
We are about to add 6 new values to fetch. This will put us over the
current limit of 16 (we're at 13 now).
Once there are more than 16 parameters, this will affect existing clients
that attempt to fetch blockiotune config values for the domain from the
remote host since the server side has no mechanism to determine whether
the capability for the emulator exists and thus would attempt to return
all known values from the persistentDef. If attempting to fetch the
blockiotune values from a running domain, the code will check the emulator
capabilities and set maxparams (in qemuDomainGetBlockIoTune) appropriately.
On the client side of the remote connection, it uses this constant in
xdr_remote_domain_get_block_io_tune_ret and virTypedParamsDeserialize
calls, so if a remote server returns more than 16 parameters, then the
client will fail with "Unable to decode message payload".
Signed-off-by: John Ferlan <jferlan@redhat.com>
The REMOTE_DOMAIN_MEMORY_PARAMETERS_MAX was erroneously used in the
remoteDomainBlockStatsFlags and remoteDomainGetBlockIoTune calls. Change
the constant to be the right one.
Fortunately, all 3 are defined as 16.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This breaks vCPU hotplug, because when starting a domain, we
create a copy of domain definition (which becomes live XML) and
during the post parse callbacks we might adjust some tunings so
that vCPU hotplug is possible.
This reverts commit 581b7756af.
This breaks vCPU hotplug, because when starting a domain, we
create a copy of domain definition (which becomes live XML) and
during the post parse callbacks we might adjust some tunings so
that vCPU hotplug is possible.
This reverts commit c0f90799bc.
Certain operations may make the vcpu order information invalid. Since
the order is primarily used to ensure migration compatibility and has
basically no other user benefits, clear the order prior to certain
operations and document that it may be cleared.
All the operations that would clear the order can still be properly
executed by defining a new domain configuration rather than using the
helper APIs.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1370357
virDomainDefSetVcpus was not designed to handle coldplug of vcpus now
that we can set state of vcpus individually.
Introduce qemuDomainSetVcpusConfig that properly handles state changes
of vcpus when coldplugging so that invalid configurations are not
created.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375939
The current code that validates duplicate vcpu order would not work
properly if the order would exceed def->maxvcpus. Limit the order to the
interval described.
The bitmap indexes for the order duplicate check are shifted to 0 since
vcpu order 0 is not allowed. The error message doesn't need such
treating though.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1370360
https://bugzilla.redhat.com/show_bug.cgi?id=1292984
Hold on to your hats, because this is gonna be wild.
In bd3e16a3 I've tried to expose sanlock io_timeout. What I had
not realized (because there is like no documentation for sanlock
at all) was very unusual way their APIs work. Basically, what we
do currently is:
sanlock_add_lockspace_timeout(&ls, io_timeout);
which adds a lockspace to sanlock daemon. One would expect that
io_timeout sets the io_timeout for it. Nah! That's where you are
completely off the tracks. It sets timeout for next lockspace you
will probably add later. Therefore:
sanlock_add_lockspace_timeout(&ls, io_timeout = 10);
/* adds new lockspace with default io_timeout */
sanlock_add_lockspace_timeout(&ls, io_timeout = 20);
/* adds new lockspace with io_timeout = 10 */
sanlock_add_lockspace_timeout(&ls, io_timeout = 40);
/* adds new lockspace with io_timeout = 20 */
And so on. You get the picture.
Fortunately, we don't allow setting io_timeout per domain or per
domain disk. So we just need to set the default used in the very
first step and hope for the best (as all the io_timeout-s used
later will have the same value).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently, we are checking for sanlock_add_lockspace_timeout
which is good for now. But in a subsequent patch we are going to
use sanlock_write_lockspace (which sets an initial value for io
timeout for sanlock). Now, there is no reason to check for both
functions in sanlock library as the sanlock_write_lockspace was
introduced in 2.7 release and the one we are currently checking
for in the 2.5 release. Therefore it is safe to assume presence
of sanlock_add_lockspace_timeout when sanlock_write_lockspace
is detected.
Moreover, the macro for conditional compilation is renamed to
HAVE_SANLOCK_IO_TIMEOUT (as it now encapsulates two functions).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If this reminds you of a commit message from around a year ago, it's
41c2aa729f and yes, we're dealing with
"the same thing" again. Or f309db1f4d and
it's similar.
There is a logic in place that if there is no real need for
memory-backend-file, qemuBuildMemoryBackendStr() returns 0. However
that wasn't the case with hugepage backing. The reason for that was
that we abused the 'pagesize' variable for storing that information, but
we should rather have a separate one that specifies whether we really
need the new object for hugepage backing. And that variable should be
set only if this particular NUMA cell needs special treatment WRT
hugepages.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1372153
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Due to a copy and paste error, the scheduler 'cap' parameter
was over-writing the 'weight' parameter when preparing the
return parameters in libxlDomainGetSchedulerParametersFlags.
As a result, the scheduler weight was never shown when getting
schedinfo and setting the weight failed as well
virsh schedinfo testvm
Scheduler : credit
cap : 0
virsh schedinfo testvm --cap 50 --weight 500
Scheduler : credit
error: invalid scheduler option: weight
The obvious fix is to assign the 'caps' parameter to the correct
item in the parameter list.
Reported-by: Volo M. <vm@vovs.net>
Add support for formating/parsing libxl channels.
Syntax on xen libxl goes as following:
channel=["connection=pty|socket,path=/path/to/socket,name=XXX",...]
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Acked-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
And allow libxl to handle channel element which creates a Xen
console visible to the guest as a low-bandwitdh communication
channel. If type is PTY we also fetch the tty after boot using
libxl_channel_getinfo to fetch the tty path. On socket case,
we autogenerate a path if not specified in the XML. Path autogenerated
is slightly different from qemu driver: qemu stores also on
"channels/target" but it creates then a directory per domain with
each channel target name. libxl doesn't appear to have a clear
definition of private files associated with each domain, so for
simplicity we do it slightly different. On qemu each autogenerated
channel goes like:
channels/target/<domain-name>/<target name>
Whereas for libxl:
channels/target/<domain-name>-<target name>
Should note that if path is not specified it won't persist,
existing only on live XML, unless user had initially specified it.
Since support for libxl channels only came on Xen >= 4.5 we therefore
need to conditionally compile it with LIBXL_HAVE_DEVICE_CHANNEL.
After this patch and having a qemu guest agent:
$ cat domain.xml | grep -a1 channel | head -n 5 | tail -n 4
<channel type='unix'>
<source mode='bind' path='/tmp/channel'/>
<target type='xen' name='org.qemu.guest_agent.0'/>
</channel>
$ virsh create domain.xml
$ echo '{"execute":"guest-network-get-interfaces"}' | socat
stdio,ignoreeof unix-connect:/tmp/channel
{"execute":"guest-network-get-interfaces"}
{"return": [{"name": "lo", "ip-addresses": [{"ip-address-type": "ipv4",
"ip-address": "127.0.0.1", "prefix": 8}, {"ip-address-type": "ipv6",
"ip-address": "::1", "prefix": 128}], "hardware-address":
"00:00:00:00:00:00"}, {"name": "eth0", "ip-addresses":
[{"ip-address-type": "ipv4", "ip-address": "10.100.0.6", "prefix": 24},
{"ip-address-type": "ipv6", "ip-address": "fe80::216:3eff:fe40:88eb",
"prefix": 64}], "hardware-address": "00:16:3e:40:88:eb"}, {"name":
"sit0"}]}
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
So far only guestfwd and virtio were supported. Add an additional
for Xen as libxl channels create a Xen console visible to the guest.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
The qemucapsprobe helper calls virQEMUCapsNewForBinaryInternal with
caps == NULL, causing the following crash:
Program received signal SIGSEGV, Segmentation fault.
#0 0x00007ffff788775f in virQEMUCapsInitHostCPUModel
(qemuCaps=qemuCaps@entry=0x649680, host=host@entry=0x10) at
src/qemu/qemu_capabilities.c:2969
#1 0x00007ffff7889dbf in virQEMUCapsNewForBinaryInternal
(caps=caps@entry=0x0, binary=<optimized out>,
libDir=libDir@entry=0x4033f6 "/tmp", cacheDir=cacheDir@entry=0x0,
runUid=runUid@entry=4294967295, runGid=runGid@entry=4294967295,
qmpOnly=true) at src/qemu/qemu_capabilities.c:4039
#2 0x0000000000401702 in main (argc=2, argv=0x7fffffffd968) at
tests/qemucapsprobe.c:73
Caused by v2.2.0-182-g68c7011.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1368417
So far, when it comes to 'virsh update-device --config' of disks
we are limiting ourselves for just the disk source update and
just for CDROMs and floppies. This makes no sense. Especially if
you look around and see that we already allow full update to
graphics and net devices. So let's just take whatever XML user
wants to have there and replace our internal definition with it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
libxl events are delivered to libvirt via the libxlDomainEventHandler
callback registered with libxl. Documenation in
$xensrc/tools/libxl/libxl_event.h states that the callback "may occur
on any thread in which the application calls libxl". This can result
in deadlock since many of the libvirt callees of libxl hold a lock on
the virDomainObj they are working on. When the callback is invoked, it
attempts to find a virDomainObj corresponding to the domain ID provided
by libxl. Searching the domain obj list results in locking each obj
before checking if it is active, and its ID equals the requested ID.
Deadlock is possible when attempting to lock an obj that is already
locked further up the call stack. Indeed, Max Ustermann reported an
instance of this deadlock
https://www.redhat.com/archives/libvir-list/2015-November/msg00130.html
Guido Rossmueller also recently stumbled across it
https://www.redhat.com/archives/libvir-list/2016-September/msg00287.html
Fix the deadlock by moving the lookup of virDomainObj to the
libxlDomainShutdownThread. After this patch, libxl events are
enqueued on the libvirt side and processed by dedicated thread,
avoiding the described deadlock.
Reported-by: Max Ustermann <ustermann78@web.de>
Reported-by: Guido Rossmueller <Guido.Rossmueller@gdata.de>
enum types are unsigned and the qemuGetCompressionProgram
function can return -1 on error. It is therefore inappropriate
to return an enum type. This fixes a build error where the
internal 'ret' variable was used in a comparison with -1
../../src/qemu/qemu_driver.c: In function 'qemuGetCompressionProgram':
../../src/qemu/qemu_driver.c:3280:5: error: comparison of unsigned expression < 0 is always false [-Werror=type-limits]
../../src/qemu/qemu_driver.c:3289:5: error: comparison of unsigned expression < 0 is always false [-Werror=type-limits]
cc1: all warnings being treated as errors
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When creating a copy of virDomainDef we save ourselves the
trouble of writing deep-copy functions and just format and parse
back domain/device XML. However, the XML we are parsing was
already fully formatted - there is no reason to run post parse
callbacks (which fill in blanks - there are none!).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is an internal flag that prevents our two entry points to
XML parsing (virDomainDefParse and virDomainDeviceDefParse) from
running post parse callbacks. This is expected to be used in
cases when we already have full domain/device XML and we are just
parsing it back (i.e. virDomainDefCopy or virDomainDeviceDefCopy)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Just like we did two commits ago, don't try to fetch capabilities
for non-existing binary. Re-use the ones we have for running
domain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Just like we did two commits ago, don't try to fetch capabilities
for non-existing binary. Re-use the ones we have for running
domain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We can't rely on def->emulator path. It may be provided by user
as we give them opportunity to provide their own XML for
migration. Therefore the path may point to just whatever binary
(or even to a non-existent file). Moreover, this path is meant
for destination, but the capabilities lookup is done on source.
What we can do is to assume same capabilities for post parse
callbacks as the running domain has. They will be used just to
add some default models/controllers/devices/... anyway.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Just like virDomainDefPostParseCallback has gained new
parseOpaque argument, we need to follow the logic with
virDomainDeviceDefPostParse.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We want to pass the proper opaque pointer instead of NULL to
virDomainDefParse and subsequently virDomainDefParseNode too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We want to pass the proper opaque pointer instead of NULL to
virDomainDefParseXML and subsequently virDomainDefPostParse too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Some callers might want to pass yet another pointer to opaque
data to post parse callbacks. The driver generic one is not
enough because two threads executing post parse callback might
want to see different data (e.g. domain object pointer that
domain def belongs to).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Based upon a patch from Chen Hanxiao <chenhanxiao@gmail.com>, rather than
need to call virFindFileInPath twice, let's just save the path and pass it
along with the compressed type. (NB: the second call would be in virExec as
called from virCommandRunAsync which is called from qemuMigrationToFile
using the argument 'compressor' which up to this point would be the string
from the cfg file that isn't the fully qualified path).
Since we now have the path, we can remove qemuCompressProgramName which
would return NULL or the string representation of the compress type.
There's only one caller and the code is duplicitous just converting the
recently converted cfg image name back into it's string value in order to
get/find the path to the image. A subsequent patch can return this path.
Let's do some more code reuse - there are 3 other callers that care to
check/get the compress program. Each of those though cares whether the
requested cfg image is valid and exists. So, add a parameter to handle
those cases.
NB: We won't need to initialize the returned value in the case where
the cfg image doesn't exist since the called program will handle that.
Add a new parameter 'styleFormat' to be used when printing the
warning message so that it's "clearer" what style of compression
call caused the error. Add that style to both messages as a paremter.
Also a VIR_WARN error message doesn't need to be translated
(e.g. inside _()), so remove the need for the translation.
There's only one caller now anyway... Besides it's just a shell for
getting the compress type. Subsequent patches will return the path
to the compression program.
Split out the guts of getCompressionType to perform the same functionality
in the new helper program with a subsequent patch goal to be reusable for
other callers making similar checks/calls to ensure the compression type
is valid and that the compression program cannot be found.
If passing an empty usbdevice_list to libxl, qemu will always get an
-usb parameter for HVM guests with only non-USB input devices. This
causes qemu to crash when passing pvusb device on HVM guests.
The solution is to allocate the list only when an item to put in it
is found.
Both cpuCompare* APIs are renamed to virCPUCompare*. And they should now
work for any guest CPU definition, i.e., even for host-passthrough
(trivial) and host-model CPUs. The implementation in x86 driver is
enhanced to provide a hint about -noTSX Broadwell and Haswell models
when appropriate.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function is similar to virCPUDataCheckFeature, but it works directly
on CPU definition rather than requiring it to be transformed into CPU
data first.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The API is supposed to make sure the provided CPU definition does not
use a CPU model which is not supported by the hypervisor (if at all
possible, of course).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Keeping nfeatures_max set to 0 while nfeatures > 0 and some features are
already stored in features array is just asking for problems once we
want to add a new feature into the array.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The reworked API is now called virCPUUpdate and it should change the
provided CPU definition into a one which can be consumed by the QEMU
command line builder:
- host-passthrough remains unchanged
- host-model is turned into custom CPU with a model and features
copied from host
- custom CPU with minimum match is converted similarly to host-model
- optional features are updated according to host's CPU
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
x86ModelFromCPU is used to provide CPUID data for features matching
@policy. This patch allows callers to set @policy to -1 to get combined
CPUID for all CPU features (including those implicitly provided a CPU
model) specified in CPU def.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The domain capabilities XML is capable of showing whether each guest CPU
mode is supported or not with a possibility to provide additional
details. This patch enhances host-model capability to advertise the
exact CPU model which will be used as a host-model:
<cpu>
...
<mode name='host-model' supported='yes'>
<model fallback='allow'>Broadwell</model>
<vendor>Intel</vendor>
<feature policy='disable' name='aes'/>
<feature policy='require' name='vmx'/>
</mode>
...
</cpu>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The ARM CPU driver wrongly reported host CPU model as "host", which made
host-model to be just an alias for host-passthrough. Let's drop this
insanity.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Host capabilities provide libvirt's view of the host CPU, but for a
useful support for host-model CPUs we really need a hypervisor's view of
the CPU. And since the view can be differ with emulator, qemu
capabilities is the best place to store the host CPU model.
This patch just copies the CPU model from host capabilities, but this
will change in the future.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function filters all CPU features through a given callback while
copying CPU model related parts of a CPU definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function moves CPU model related parts from one CPU definition to
another. It can be used to avoid unnecessary copies from a temporary CPU
definitions which will be freed anyway.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Useful for copying a CPU definition without model related parts (i.e.,
without model name, feature list, vendor).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In case a hypervisor is able to tell us a list of supported CPU models
and whether each CPU models can be used on the current host, we can
propagate this to domain capabilities. This is a better alternative
to calling virConnectCompareCPU for each supported CPU model.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Listing all CPU models supported by QEMU in domain capabilities makes
little sense when libvirt will refuse any model it doesn't know about.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Some CPU drivers (such as arm) do not provide list of CPUs libvirt
supports and just pass any CPU model from domain XML directly to QEMU.
Such driver need to return models == NULL and success from cpuGetModels.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemu_command.c should deal with translating our domain definition into a
QEMU command line and nothing else.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The list of supported CPU models in domain capabilities is stored in
virDomainCapsCPUModels. Let's use the same object for storing CPU models
in QEMU capabilities.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The patch adds <cpu> element to domain capabilities XML:
<cpu>
<mode name='host-passthrough' supported='yes'/>
<mode name='host-model' supported='yes'/>
<mode name='custom' supported='yes'>
<model>Broadwell</model>
<model>Broadwell-noTSX</model>
...
</mode>
</cpu>
Applications can use it to inspect what CPU configuration modes are
supported for a specific combination of domain type, emulator binary,
guest architecture and machine type.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Our internal APIs mostly use virArch rather than strings. Switching
cpuGetModels to virArch will save us from unnecessary conversions in the
future.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
By default, virt-manager (and likely other libvirt-based apps) sets
the VIR_MIGRATE_PERSIST_DEST flag when invoking the migrate API, which
fails in a Xen setup since the libxl driver does not support the flag.
Persisting a domain is a trivial task in the grand scheme of migration,
so be nice to libvirt apps and add support for VIR_MIGRATE_PERSIST_DEST
in the libxl driver.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
We have a few of senarios that libvirtd would invoke qemuProcessStop
and leave a "shutting down" in /var/log/libvirt/qemu/$DOMAIN.log.
The shutoff reason showing in debug log is also very important
for us to know why VM shutting down in domain log,
as we seldom enable debug log of libvirtd.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1322717
During offline migration, no storage is copied. Nor disks, nor
NVRAM file, nor anything. We use qemu for that and because domain
is not running there's nobody to copy that for us.
We should document this to avoid confusing users.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Calling virDomainGetEmulatorPinInfo on a live VM with automatic NUMA
pinning and VIR_DOMAIN_AFFECT_CONFIG would return the automatic pinning
data in some cases which is bogus. Use the autoCpuset property only when
called on a live definition.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1365779
Calling virDomainGetVcpuPinInfo on a live VM with automatic NUMA pinning
and VIR_DOMAIN_AFFECT_CONFIG would return the automatic pinning data
in some cases which is bogus. Use the autoCpuset property only when
called on a live definition.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1365779
Sometimes adding a separate variable to access vm->privateData is not
necessary. Add a macro that will do the typecasting rather than having
to add a temp variable to force the compiler to typecast it.
Old libvirt represents
<graphics type='spice'>
<listen type='none'/>
</graphics>
as
<graphics type='spice' autoport='no'/>
In this mode, QEMU doesn't listen for SPICE connection anywhere and
clients have to use virDomainOpenGraphics* APIs to attach to the domain.
That is, the client has to run on the same host where the domains runs
and it's impossible to tell the client to reconnect to the destination
QEMU during migration (unless there is some kind of proxy on the host).
While current libvirt correctly ignores such graphics devices when
creating graphics migration cookie, old libvirt just sends
<graphics type='spice' port='0' listen='0.0.0.0' tlsPort='-1'/>
in the cookie. After seeing this cookie, we happily would call
client_migrate_info QMP command and wait for SPICE_MIGRATE_COMPLETED
event, which is quite pointless since the doesn't know where to connecti
anyway. We should just ignore such cookies.
https://bugzilla.redhat.com/show_bug.cgi?id=1376083
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Checking if a domain's definition or if it is active before we got a job
is pointless since the domain might have changed in the meantime.
Luckily libvirtd didn't crash when the API tried to talk to an inactive
domain:
debug : qemuDomainObjBeginJobInternal:2914 : Started job: modify
(async=none vm=0x7f8f340140c0 name=ble)
debug : qemuDomainObjEnterMonitorInternal:3137 : Entering monitor
(mon=(nil) vm=0x7f8f340140c0 name=ble)
warning : virObjectLock:319 : Object (nil) ((unknown)) is not a
virObjectLockable instance
debug : qemuMonitorOpenGraphics:3505 : protocol=spice fd=27
fdname=graphicsfd skipauth=1
error : qemuMonitorOpenGraphics:3508 : invalid argument: monitor must
not be NULL
debug : qemuDomainObjExitMonitorInternal:3160 : Exited monitor
(mon=(nil) vm=0x7f8f340140c0 name=ble)
debug : qemuDomainObjEndJob:3068 : Stopping job: modify (async=none
vm=0x7f8f340140c0 name=ble)
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We can receive NULL as sync reply in two situations. First
is garbage sync reply and this situation is handled by
resending sync message. Second is different cases
of rebooting guest, destroing domain etc and we can
give more meaningful error message. Actually we have
this error message in qemuAgentCommand already which checks
for the same sitatuion. AFAIK case with mon->running
is just to be safe on adding some future(?) cases of
returning NULL reply.
We can easily handle receiving garbage on sync. We don't
have to make client deal with this situation. We just
need to resend sync command but this time garbage is
not be possible.
When we wait for sync reply we can receive delayed
reply to syncs or commands that were sent erlier. We can
safely skip them until we receive sync reply with correct id.
There is no much sense report this situation to client.
Actually with a bit of "luck" if we involve client into
this the play can go on forever: send sync 0, receive
sync reply -1, send sync 1, receive reply 0 ...
After sync is sent we can receive garbare and this is not error.
Consider next regular case:
1. libvirtd sends sync
2. qga sends partial sync reply and die
3. libvirtd sends sync
4. qga sends sync reply
5. libvirtd receives garbage
(half of first reply and second reply together)
We should handle this situation as it is recoverable.
Next sync can succeed. Let's report reply is NULL,
it will be converted to the VIR_ERR_AGENT_UNSYNCED
which signals client to retry.
Errors in qemuAgentIOProcessLine stop agent IO processing just
like any regular IO error, however some of current errors
that this functions spawns are false positives. Consider
next case for example:
1. send sync (unsynced state)
2. receive sync reply (sync established)
3. command send, but timeout occured (unsynced state)
4. receive command reply
Last IO triggers error because current code ignores
only delayed syncs when unsynced
We should not treat any delayed reply as error in unsynced
state. Until client and qga are not in sync delayed reply to any
command is possible. msg == NULL is the exact criterion
that we are not in sync.
Put it into qemuDomainPrepareShmemChardev() so it can be used later.
Also don't fill in the path unless the server option is enabled.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Some checks will need to be performed for newer device types as well, so
let's not duplicate them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit 839a060 tied the lifecycle of virtlogd more
closely to that of libvirtd. Unfortunately, while starting
virtlogd when libvirtd is started is definitely a good idea,
restarting virtlogd or shutting it down at any time outside
of system poweroff is not.
Revert part of that commit by removing the PartOf= lines,
meaning that only startup requests will be propagated from
libvirtd to virtlogd.
Resolves: https://bugzilla.redhat.com/1372576
Now that we have two same implementations for getting path for
huge pages backed guest memory, lets merge them into one function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When trying to migrate a huge page enabled guest, I've noticed
the following crash. Apparently, if no specific hugepages are
requested:
<memoryBacking>
<hugepages/>
</memoryBacking>
and there are no hugepages configured on the destination, we try
to dereference a NULL pointer.
Program received signal SIGSEGV, Segmentation fault.
0x00007fcc907fb20e in qemuGetHugepagePath (hugepage=0x0) at qemu/qemu_conf.c:1447
1447 if (virAsprintf(&ret, "%s/libvirt/qemu", hugepage->mnt_dir) < 0)
(gdb) bt
#0 0x00007fcc907fb20e in qemuGetHugepagePath (hugepage=0x0) at qemu/qemu_conf.c:1447
#1 0x00007fcc907fb2f5 in qemuGetDefaultHugepath (hugetlbfs=0x0, nhugetlbfs=0) at qemu/qemu_conf.c:1466
#2 0x00007fcc907b4afa in qemuBuildMemoryBackendStr (size=4194304, pagesize=0, guestNode=0, userNodeset=0x0, autoNodeset=0x0, def=0x7fcc70019070, qemuCaps=0x7fcc70004000, cfg=0x7fcc5c011800, backendType=0x7fcc95087228, backendProps=0x7fcc95087218,
force=false) at qemu/qemu_command.c:3297
#3 0x00007fcc907b4f91 in qemuBuildMemoryCellBackendStr (def=0x7fcc70019070, qemuCaps=0x7fcc70004000, cfg=0x7fcc5c011800, cell=0, auto_nodeset=0x0, backendStr=0x7fcc70020360) at qemu/qemu_command.c:3413
#4 0x00007fcc907c0406 in qemuBuildNumaArgStr (cfg=0x7fcc5c011800, def=0x7fcc70019070, cmd=0x7fcc700040c0, qemuCaps=0x7fcc70004000, auto_nodeset=0x0) at qemu/qemu_command.c:7470
#5 0x00007fcc907c5fdf in qemuBuildCommandLine (driver=0x7fcc5c07b8a0, logManager=0x7fcc70003c00, def=0x7fcc70019070, monitor_chr=0x7fcc70004bb0, monitor_json=true, qemuCaps=0x7fcc70004000, migrateURI=0x7fcc700199c0 "defer", snapshot=0x0,
vmop=VIR_NETDEV_VPORT_PROFILE_OP_MIGRATE_IN_START, standalone=false, enableFips=false, nodeset=0x0, nnicindexes=0x7fcc95087498, nicindexes=0x7fcc950874a0, domainLibDir=0x7fcc700047c0 "/var/lib/libvirt/qemu/domain-1-fedora") at qemu/qemu_command.c:9547
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Both qemu monitor and agent print the same
log on HUANGUP event, which would be confusing
when reading libvirtd log.
This patch will give a different log message to them.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Most of QEMU's PCI display device models, such as:
libvirt video/model/@type QEMU -device
------------------------- ------------
cirrus cirrus-vga
vga VGA
qxl qxl-vga
virtio virtio-vga
come with a linear framebuffer (sometimes called "VGA compatibility
framebuffer"). This linear framebuffer lives in one of the PCI device's
MMIO BARs, and allows guest code (primarily: firmware drivers, and
non-accelerated OS drivers) to display graphics with direct memory access.
Due to architectural reasons on aarch64/KVM hosts, this kind of
framebuffer doesn't / can't work in
qemu-system-(arm|aarch64) -M virt
machines. Cache coherency issues guarantee a corrupted / unusable display.
The problem has been researched by several people, including kvm-arm
maintainers, and it's been decided that the best way (practically the only
way) to have boot time graphics for such guests is to consolidate on
QEMU's "virtio-gpu-pci" device.
>From <https://bugzilla.redhat.com/show_bug.cgi?id=1195176>, libvirt
supports
<devices>
<video>
<model type='virtio'/>
</video>
</devices>
but libvirt unconditionally maps @type='virtio' to QEMU's "virtio-vga"
device model. (See the qemuBuildDeviceVideoStr() function and the
"qemuDeviceVideo" enum impl.)
According to the above, this is not right for the "virt" machine type; the
qemu-system-(arm|aarch64) binaries don't even recognize the "virtio-vga"
device model (justifiedly). Whereas "virtio-gpu-pci", which is a pure
virtio device without a compatibility framebuffer, is available, and works
fine.
(The ArmVirtQemu ("AAVMF") platform of edk2 -- that is, the UEFI firmware
for "virt" -- supports "virtio-gpu-pci", as of upstream commit
3ef3209d3028. See
<https://tianocore.acgmultimedia.com/show_bug.cgi?id=66>.)
Override the default mapping of "virtio", from "virtio-vga" to
"virtio-gpu-pci", if qemuDomainMachineIsVirt() evaluates to true.
Cc: Andrea Bolognani <abologna@redhat.com>
Cc: Drew Jones <drjones@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Martin Kletzander <mkletzan@redhat.com>
Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1372901
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Martin Kletzander <mkletzan@redhat.com>
There is nothing Linux-specific in that function. Also since commit
8c3b5bf481 mingw build is broken due to
the fact that this function is not compiled in the library.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In testOpenDefault we create a virtual computer that is later
presented to user. We also pretend to have NUMA cells and
initialize them somehow. But whilst doing so a magical constant
is used. Drop it.
Signed-off-by: Tomáš Ryšavý <tom.rysavy.0@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We will need this function shortly when implementing
nodeGetCPUStats in the test driver.
Signed-off-by: Tomáš Ryšavý <tom.rysavy.0@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Use the state information (online, hotpluggable) provided by the monitor
code rather than trying to infer it. This fixes an issue where on
architectures that require hotplug of multiple threads at once the
sub-cores would get updated as offline on daemon restart thus creating
an invalid configuration.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375783
Return whether a vcpu entry is hotpluggable or online so that upper
layers don't have to infer the information from other data.
Advantage is that this code can be tested by unit tests.
The algorithm that matches data from query-cpus and
query-hotpluggable-cpus is quite complex. Start using descriptive
iterator names to avoid confusion.
https://bugzilla.redhat.com/show_bug.cgi?id=1372613
Apparently, some management applications use the following code
pattern when waiting for a block job to finish:
while (1) {
virDomainGetBlockJobInfo(dom, disk, info, flags);
if (info.cur == info.end)
break;
sleep(1);
}
Problem with this approach is in its corner cases. In case of
QEMU, libvirt merely pass what has been reported on the monitor.
However, if the block job hasn't started yet, qemu reports cur ==
end == 0 which tricks mgmt apps into thinking job is complete.
The solution is to mangle cur/end values as described here [1].
1: https://www.redhat.com/archives/libvir-list/2016-September/msg00017.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Even though we merely just pass to users whatever qemu provided
on the monitor, we still do some translation. For instance we
turn bytes into mebibytes, or fix job type if needed. However, in
the future there is more fixing to be done so this code deserves
its own function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Name it virNumaGetHostMemoryNodeset and return only NUMA nodes which
have memory installed. This is necessary as the kernel is not very happy
to set the memory cgroup setting for nodes which do not have any memory.
This would break vcpu hotplug with following message on such
configruation:
Invalid value '0,8' for 'cpuset.mems': Invalid argument
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375268
In a full domain config, libvirt allows overriding the normal PCI
vs. PCI Express rules when a device address is explicitly provided
(so, e.g., you can force a legacy PCI device to plug into a PCIe port,
although libvirt would never do that on its own). However, due to a
bug libvirt doesn't give this same leeway when hotplugging devices. On
top of that, current libvirt assumes that *all* devices are legacy
PCI. The result of all this is that it's impossible to hotplug a
device into a PCIe port, even if you manually add the PCI address.
This can all be traced to the function
virDomainPCIAddressEnsureAddr(), and the fact that it calls
virDomainPCIaddressReserveSlot() for manually set addresses, and that
function hardcodes the argument "fromConfig" to false (meaning "this
address was auto-assigned, so it should be subject to stricter
validation").
Since virDomainPCIAddressReserveSlot() is just a one line simple
wrapper around virDomainPCIAddressReserveAddr() (adding in a hardcoded
reserveEntireSlot = true and fromConfig = false), all that's needed to
solve the problem with no unwanted side effects is to replace that
call for virDomainPCIAddressReserveSlot() with a direct call to
virDomainPCIAddressReserveAddr(), but with reserveEntireSlot = true,
fromConfig = true. That's what this patch does.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1337490
virQEMUDriverConfigNew() always initializes the bitmap in its
cgroupControllers member to -1 (i.e. all 1's).
Prior to commit a9331394, if qemu.conf had a line with
"cgroup_controllers", cgroupControllers would get reset to 0 before
going through a loop setting a bit for each named cgroup controller.
commit a9331394 left out the "reset to 0" part, so cgroupControllers
would always be -1; if you didn't want a controller included, there
was no longer a way to make that happen.
This was discovered by users who were using qemu commandline
passthrough to use the "input-linux" method of directing
keyboard/mouse input to a virtual machine:
https://www.redhat.com/archives/vfio-users/2016-April/msg00105.html
Here's the first report I found of the problem encountered after
upgrading libvirt beyond v2.0.0:
https://www.redhat.com/archives/vfio-users/2016-August/msg00053.html
Thanks to sL1pKn07 SpinFlo <sl1pkn07@gmail.com> for bringing the
problem up in IRC, and then taking the time to do a git bisect and
find the patch that started the problem.
previous commit:
commit 2c3223785c
Author: John Ferlan <jferlan@redhat.com>
Date: Mon Jun 13 12:30:34 2016 -0400
qemu: Add the ability to hotplug the TLS X.509 environment
added a parameter "bool listen" in some methods. This
unfortunately clashes with the listen() method, causing
compile failures on certain platforms (RHEL-6 for example)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit id 'a48c7141' altered how to determine if a volume was encrypted
by adding a peek at an offset into the file at a specific buffer location.
Unfortunately, all that was compared was the first "char" of the buffer
against the expect "int" value.
Restore the virReadBufInt32BE to get the complete field in order to
compare against the expected value from the qcow2EncryptionInfo or
qcow1EncryptionInfo "modeValue" field.
This restores the capability to create a volume with encryption, then
refresh the pool, and still find the encryption for the volume.
A LUKS volume uses the volume secret type just like the QCOW2 secret, so
adjust the loading of the default secrets to handle any volume that the
virStorageFileGetMetadataFromBuf code has deemed to be an encrypted volume
to search for the volume's secret. This lookup is done by volume usage
where the usage is expected to be the path to volume.
When a new filter is being defined, the return code is not handled properly,
thus triggering OOM error reporting routine (bug introduced by 51b2606f).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When migration fails, we need to poke QEMU monitor to check for a reason
of the failure. We did this using query-migrate QMP command, which is
not supposed to return any meaningful result on the destination side.
Thus if the monitor was still functional when we detected the migration
failure, parsing the answer from query-migrate always failed with the
following error message:
"info migration reply was missing return status"
This irrelevant message was then used as the reason for the migration
failure replacing any message we might have had.
Let's use harmless query-status for poking the monitor to make sure we
only get an error if the monitor connection is broken.
https://bugzilla.redhat.com/show_bug.cgi?id=1374613
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit id 'b00d7f29' shifted the opening of the /sys/devices/intel_cqm/type
file from event enable to perf event initialization. If the file did not
exist, then an error would be written to the domain log:
2016-09-06 20:51:21.677+0000: 7310: error : virFileReadAll:1360 : Failed to open file '/sys/devices/intel_cqm/type': No such file or directory
Since the error is now handled in virPerfEventEnable by checking if the
event_attr->attrType == 0 for CMT, MBML, and MBMT events - we can just
use the Quiet API in order to not log the error we're going to throw away.
Additionally, rather than using virReportSystemError, use virReportError
and VIR_ERR_ARGUMENT_UNSUPPORTED in order to signify that support isn't there
for that type of perf event - adjust the error message as well.
Akin to previous commit but for "virsh cpu-baseline" which
computes a baseline CPU for a set of host cpu elements.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Implement support for "virsh cpu-compare" so that we can calculate
common cpu element between a pool of hosts, which had a requirement
of providing host cpu description.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Parse libxl_hwcap accounting for versions since Xen 4.4 - Xen 4.7.
libxl_hwcaps is a set of cpuid leaves output that is described in [0] or
[1] in Xen 4.7. This is a collection of CPUID leaves that we version
in libvirt whenever feature words are reordered or added. Thus we keep the
common ones in one struct and others for each version. Since
libxl_hwcaps doesn't appear to have a stable format across all supported
versions thus we need to keep track of changes as a compromise until it's
exported in xen libxl API. We don't fail in initializing the driver in case
parsing of hwcaps failed for that reason. In addition, change the notation
on PAE feature such that is easier to read which bit it corresponds.
[0] xen/include/asm-x86/cpufeature.h
[1] xen/include/public/arch-x86/cpufeatureset.h
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Add support for describing cpu topology in host cpu element. In doing
so, refactor hwcaps part to its own helper namely libxlCapsInitCPU to
handle all host cpu related operations, including topology.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Qemu always opens the tray if forced to. Skip the waiting step in such
case.
This also helps if qemu does not report the tray change event when
opening the cdrom forcibly (the documentation says that the event will
not be sent although qemu in fact does trigger it even if @force is
selceted).
This is a workaround for a qemu issue where qemu does not send the tray
change event in some cases (after migration with empty closed locked
drive) and thus renders the cdrom useless from libvirt's point of view.
Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1368368
When a source image is dropped when missing due to startup policy the
policy needs to be cleared since it was relevant only for the given
storage source. New sources need to update it if needed.
Just like in the previous commit, teach qemu driver to detect
whether qemu supports this configuration knob or not.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add a new secret usage type known as "tls" - it will handle adding the
secret objects for various TLS objects that need to provide some sort
of passphrase in order to access the credentials.
The format is:
<secret ephemeral='no' private='no'>
<description>Sample TLS secret</description>
<usage type='tls'>
<name>mumblyfratz</name>
</usage>
</secret>
Once defined and a passphrase set, future patches will allow the UUID
to be set in the qemu.conf file and thus used as a secret for various
TLS options such as a chardev serial TCP connection, a NBD client/server
connection, and migration.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If the incoming XML defined a path to a TLS X.509 certificate environment,
add the necessary 'tls-creds-x509' object to the VIR_DOMAIN_CHR_TYPE_TCP
character device.
Likewise, if the environment exists the hot unplug needs adjustment as
well. Note that all the return ret were changed to goto cleanup since
the cfg needs to be unref'd
Signed-off-by: John Ferlan <jferlan@redhat.com>
When building a chardev device string for tcp, add the necessary pieces to
access provide the TLS X.509 path to qemu. This includes generating the
'tls-creds-x509' object and then adding the 'tls-creds' parameter to the
VIR_DOMAIN_CHR_TYPE_TCP command line.
Finally add the tests for the qemu command line. This test will make use
of the "new(ish)" /etc/pki/qemu setting for a TLS certificate environment
by *not* "resetting" the chardevTLSx509certdir prior to running the test.
Also use the default "verify" option (which is "no").
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a new TLS X.509 certificate type - "chardev". This will handle the
creation of a TLS certificate capability (and possibly repository) for
properly configured character device TCP backends.
Unlike the vnc and spice there is no "listen" or "passwd" associated. The
credentials eventually will be handled via a libvirt secret provided to
a specific backend.
Make use of the default verify option as well.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than specify perhaps multiple TLS X.509 certificate directories,
let's create a "default" directory which can then be used if the service
(e.g. for now vnc and spice) does not supply a default directory.
Since the default for vnc and spice may have existed before without being
supplied, the default check will first check if the service specific path
exists and if so, set the cfg entry to that; otherwise, the default will
be set to the (now) new defaultTLSx509certdir.
Additionally add a "default_tls_x509_verify" entry which can also be used
to force the peer verification option (for vnc it's a x509verify option).
Add/alter the macro for the option being found in the config file to accept
the default value.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If a migration of a domain which is already defined on the destination
host failed early (before we tried to start QEMU), we would forget to
remove the incoming transient definition. Later on when someone starts
the domain on the destination host, we will use the stale incoming
definition and the persistent definition will just be ignored.
https://bugzilla.redhat.com/show_bug.cgi?id=1368774
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The code for replacing domain's transient definition with the persistent
one is repeated in several places and we'll need to add one more. Let's
make a nice helper for it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
There is an issue with a wrong label inside vah_add_path().
The compilation fails with the error:
make[3]: Entering directory '/tmp/libvirt/src'
CC security/virt_aa_helper-virt-aa-helper.o
security/virt-aa-helper.c: In function 'vah_add_path':
security/virt-aa-helper.c:769:9: error: label 'clean' used but not defined
goto clean;
This patch moves 'clean' label to 'cleanup' label.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
This patch fixes a segfault in virt-aa-helper caused by attempting to
modify a static string literal. It is triggered when a domain has a
<filesystem> with type='mount' configured read-only and libvirt is
using the AppArmor security driver for sVirt confinement. An "R" is
passed into the function and converted to 'r'.
Similarly to vcpu hotplug the emulator thread cgroup numa mapping needs
to be relaxed while hot-adding vcpus so that the threads can allocate
data in the DMA zone.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1370084
When hot-adding vcpus qemu needs to allocate some structures in the DMA
zone which may be outside of the numa pinning. Extract the code doing
this in a set of helpers so that it can be reused.
There is a possibility that qemu driver frees by unreferencing its
closeCallbacks pointer as it has the only reference to the object,
while in fact not all users of CloseCallbacks called thier
virCloseCallbacksUnset.
Backtrace is the following:
Thread #1:
0 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
1 in virCondWait (c=<optimized out>, m=<optimized out>)
at util/virthread.c:154
2 in virThreadPoolFree (pool=0x7f0810110b50)
at util/virthreadpool.c:266
3 in qemuStateCleanup () at qemu/qemu_driver.c:1116
4 in virStateCleanup () at libvirt.c:808
5 in main (argc=<optimized out>, argv=<optimized out>)
at libvirtd.c:1660
Thread #2:
0 in virClassIsDerivedFrom (klass=0xdeadbeef, parent=0x7f0837c694d0) at util/virobject.c:169
1 in virObjectIsClass (anyobj=anyobj@entry=0x7f08101d4760, klass=<optimized out>) at util/virobject.c:365
2 in virObjectLock (anyobj=0x7f08101d4760) at util/virobject.c:317
3 in virCloseCallbacksUnset (closeCallbacks=0x7f08101d4760, vm=vm@entry=0x7f08101d47b0, cb=cb@entry=0x7f081d078fc0 <qemuProcessAutoDestroy>) at util/virclosecallbacks.c:163
4 in qemuProcessAutoDestroyRemove (driver=driver@entry=0x7f081018be50, vm=vm@entry=0x7f08101d47b0) at qemu/qemu_process.c:6368
5 in qemuProcessStop (driver=driver@entry=0x7f081018be50, vm=vm@entry=0x7f08101d47b0, reason=reason@entry=VIR_DOMAIN_SHUTOFF_SHUTDOWN, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE, flags=flags@entry=0) at qemu/qemu_process.c:5854
6 in processMonitorEOFEvent (vm=0x7f08101d47b0, driver=0x7f081018be50) at qemu/qemu_driver.c:4585
7 qemuProcessEventHandler (data=<optimized out>, opaque=0x7f081018be50) at qemu/qemu_driver.c:4629
8 in virThreadPoolWorker (opaque=opaque@entry=0x7f0837c4f820) at util/virthreadpool.c:145
9 in virThreadHelper (data=<optimized out>) at util/virthread.c:206
10 in start_thread () from /lib64/libpthread.so.0
Let's reference CloseCallbacks object in virCloseCallbacksSet and
unreference in virCloseCallbacksUnset.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
In the latest glibc, major() and minor() functions are marked as
deprecated (glibc commit dbab6577):
CC util/libvirt_util_la-vircgroup.lo
util/vircgroup.c: In function 'virCgroupGetBlockDevString':
util/vircgroup.c:768:5: error: '__major_from_sys_types' is deprecated:
In the GNU C Library, `major' is defined by <sys/sysmacros.h>.
For historical compatibility, it is currently defined by
<sys/types.h> as well, but we plan to remove this soon.
To use `major', include <sys/sysmacros.h> directly.
If you did not intend to use a system-defined macro `major',
you should #undef it after including <sys/types.h>.
[-Werror=deprecated-declarations]
if (virAsprintf(&ret, "%d:%d ", major(sb.st_rdev), minor(sb.st_rdev)) < 0)
^~
In file included from /usr/include/features.h:397:0,
from /usr/include/bits/libc-header-start.h:33,
from /usr/include/stdio.h:28,
from ../gnulib/lib/stdio.h:43,
from util/vircgroup.c:26:
/usr/include/sys/sysmacros.h:87:1: note: declared here
__SYSMACROS_DEFINE_MAJOR (__SYSMACROS_FST_IMPL_TEMPL)
^
Moreover, in the glibc commit, there's suggestion to keep
ordering of including of header files as implemented here.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Current implementation uses the dev.cpu.0.freq sysctl that is
provided by the cpufreq(4) framework and returns the actual
CPU frequency. However, there are environments where it's not available,
e.g. when running nested in KVM. In this case fall back to hw.clockrate
that reports CPU frequency at the boot time.
Resolves (hopefully):
https://bugzilla.redhat.com/show_bug.cgi?id=1369964
We already guarantee that virtlogd.socket is enabled/disabled
along with libvirtd.service, but if libvirtd.service has just
been installed and is started before rebooting, then
virtlogd.socket will not be running and guest startup will
fail.
Add Requires=virtlogd.socket to libvirtd.service to make sure
virtlogd.socket is always started along with libvirtd.service,
and add Before=libvirtd.service to both virtlogd.socket and
virtlogd.service so that virtlogd never disappears before
libvirtd has exited.
Also add PartOf=libvirtd.service to both virtlogd.socket and
virtlogd.service, so that virtlogd can be shut down when not
needed.
Resolves: https://bugzilla.redhat.com/1372576
We already have the ability to turn off dumping of guest
RAM via the domain XML. This is not particularly useful
though, as it is under control of the management application.
What is needed is a way for the sysadmin to turn off guest
RAM defaults globally, regardless of whether the mgmt app
provides its own way to set this in the domain XML.
So this adds a 'dump_guest_core' option in /etc/libvirt/qemu.conf
which defaults to false. ie guest RAM will never be included in
the QEMU core dumps by default. This default is different from
historical practice, but is considered to be more suitable as
a default because
a) guest RAM can be huge and so inflicts a DOS on the host
I/O subsystem when dumping core for QEMU crashes
b) guest RAM can contain alot of sensitive data belonging
to the VM owner. This should not generally be copied
around inside QEMU core dumps submitted to vendors for
debugging
c) guest RAM contents are rarely useful in diagnosing
QEMU crashes
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the QEMU processes inherit their core dump rlimit
from libvirtd, which is really suboptimal. This change allows
their limit to be directly controlled from qemu.conf instead.