This patch adds new xml element, and so we can have the option of
also having perf events enabled immediately at startup.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Message-id: 1459171833-26416-6-git-send-email-qiaowei.ren@intel.com
This patch adds support for "vpindex", "runtime", "synic",
"stimer", and "vendor_id" features available in qemu 2.5+.
- When Hyper-V "vpindex" is on, guest can use MSR HV_X64_MSR_VP_INDEX
to get virtual processor ID.
- Hyper-V "runtime" enlightement feature allows to use MSR
HV_X64_MSR_VP_RUNTIME to get the time the virtual processor consumes
running guest code, as well as the time the hypervisor spends running
code on behalf of that guest.
- Hyper-V "synic" stands for Synthetic Interrupt Controller, which is
lapic extension controlled via MSRs.
- Hyper-V "stimer" switches on Hyper-V SynIC timers MSR's support.
Guest can setup and use fired by host events (SynIC interrupt and
appropriate timer expiration message) as guest clock events
- Hyper-V "reset" allows guest to reset VM.
- Hyper-V "vendor_id" exposes hypervisor vendor id to guest.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
If a user specify network type ethernet, then create it via libvirt and run
script if it provided. After this commit user does not need to
run external script to create tap device or add root permissions to qemu
process.
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
This will skip few steps from qemuProcessStart in order to create only
qemu CMD. Use a VIR_QEMU_PROCESS_START_PRETEND for all the qemuProcess*
functions called by this one to not modify or check host.
This new function will be used later on for XMLToNative API and also for
qemuxml2argvtest to make sure that both API and test uses the same code
as qemuProcessStart.
We need also update qemuProcessInit to wrap few lines of code with check
that VIR_QEMU_PROCESS_START_PRETEND that makes sense only for
qemuProcessStart.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Move all code that checks host and domain. Do not check host if we use
VIR_QEMU_PROCESS_START_PRETEND flag.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Move all code that modifies only live XML to this function. The new
VIR_QEMU_PROCESS_START_PRETEND flag will be used by qemuXMLToNative and
qemuxml2argvtest later in order to reuse the same code as
qemuProcessStart uses.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The postParse callback is the correct place to generate default values
that should be present in offline XML.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
QEMU changed the error message to:
"Tray of device 'drive-sata0-0-1' is not open"
and they may change the error massage in the future.
This updates the code to not depend on the text from the error message
but only on error itself.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The function already takes two bool arguments, switching to flags makes
it a lot easier to read. Especially in case we need to add another
boolean in the future.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In post-copy mode none of the hosts has a complete guest state and
rolling back migration is impossible. Thus aborting it would be
equivalent to destroying the domain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When migration fails in the post-copy mode, it's impossible to just kill
the destination domain and resume the source since the source no longer
contains current guest state. Let's mark domains on both sides as
VIR_DOMAIN_PAUSED_POSTCOPY_FAILED to let the upper layer decide what to
do with them.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When destination libvirtd is restarted during migration in Finish phase
just after the point we started guest CPUs, we should not kill the
domain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Migration enters "postcopy-active" state after QEMU switches to
post-copy and pauses guest CPUs. From libvirt's point of view this state
is similar to "completed" because we need to transfer guest execution to
the destination host.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
If a <graphics type='spice'> has no port nor tlsPort set, the generated
QEMU command line will contain -spice port=0.
This is later going to be ignored by spice-server, but it's better not
to add it at all in this situation.
As an empty -spice is not allowed, we still need to append port=0 if we
did not add any other argument.
The end goal is to avoid adding -spice port=0,addr=127.0.0.1 to QEMU command
line when no SPICE port is specified in libvirt XML.
Currently, the code relies on port=xx to always be present, so subsequent
args can be unconditionally appended with a leading ','. Since port=0
will no longer be added in a subsequent commit, we append a ',' to every
arg instead of prepending, and remove the last one before adding it to
the arg list.
It's just a combination of AddImplicitControllers, and AddConsoleCompat.
Every caller that wants ImplicitControllers also wants the ConsoleCompat
AFAICT, so lump them together. We also need it for future patches.
Commit 'ef2ab8fd' moved just the virDomainConfNWFilterTeardown and left
the logic to save/restore the current error essentially doing nothing
in the error path for qemuBuildCommandLine. So move it to where it
was meant to be.
Although the original code would reset the filter on command creation
errors after building the network command portion and commit 'ef2ab8fd'
altered that logic, the teardown is called during qemuProcessStop from
virDomainConfVMNWFilterTeardown and that code has the save/restore
last error logic, so just allow that code to handle the teardown rather
than running it twice. The qemuProcessStop would be called in the failure
path of qemuBuildCommandLine.
We include the file in plenty of places. This is mostly due to
historical reasons. The only place that needs something from the
header file is storage_backend_fs which opens _PATH_MOUNTED. But
it gets the file included indirectly via mntent.h. At no other
place in our code we need _PATH_.*. Drop the include and
configure check then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Mostly it is just passing new parameter here and there. In case
of zero value we fallback to auto selecting port and thus keep
backward compatibility.
Also we need to fix places of auto selected port managment.
We should bother only when auto selected was done that is
when externally specified port is not 0.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
In qemuConnectDomainXMLToNative() we set up the monitor, but we never
memset() it to zeros. Thanks to the introduction of the logfile
parameter of chardevs (and the logfile member of the struct), we started
checking whether that's non-NULL and that exposed this old error.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reverting to a snapshot may change domain configuration. New
configuration should be saved if domain has persistent flag.
VIR_DOMAIN_EVENT_DEFINED_FROM_SNAPSHOT is emitted in case of
configuration update.
Add new function to manage adding the panic device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the NVRAM device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the RNG device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also modify the qemuBuildRNGDevStr to use const virDomainDef instead
of virDomainDefPtr.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the memballoon device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also modify the qemuBuildMemballoonDevStr to use const virDomainDef
instead of virDomainDefPtr.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the host device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also modify qemuBuildPCIHostdevDevStr, qemuBuildUSBHostdevDevStr,
and qemuBuildSCSIHostdevDevStr to use const virDomainDef instead
of virDomainDefPtr.
Make qemuBuildPCIHostdevPCIDevStr and qemuBuildUSBHostdevUSBDevStr
static to the qemu_command.c.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the redirdev device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also move the qemuBuildRedirdevDevStr closer to the new function and
modify to use the const virDomainDef instead of virDomainDefPtr
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the watchdog device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also since qemuBuildWatchdogDevStr was only local here, make it static as
well as modifying the const virDomainDef.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the sound device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also since qemuBuildSoundDevStr was only local here, make it static as
well as modifying the const virDomainDef.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This function can be called over a domain definition that has no
video configured. The
tests/qemuxml2argvdata/qemuxml2argv-minimal.xml file could serve
as an example. Problem is, before the check that domain has some
or none video configured, def->videos is dereferenced causing a
segmentation fault in case there's none video configured.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add new function to manage adding the video device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the input device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Make qemuBuildUSBInputDevStr static since only this module calls it.
Also the change to use const virDomainDef forces other changes.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Modify the argument order and types to match other similar helpers.
Also modify called functions to use the def->emulator instead of passing
def->emulator and def.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the console device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the channel device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the parallels device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Alter logic slight to reduce indention level.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the serial device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Using const virDomainDef causes collateral damage in other called APIs
which need to make the similar adjustment
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the smartcard device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Alter the logic slightly to make !nsmartcards check first so that remainder
of the code is less indented.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the network device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the -fsdev options to the
command line removing that task from the mainline qemuBuildCommandLine.
Alter the code slightly to perform the !caps and fsdev failure check
up front.
Since both qemuBuildFSStr and qemuBuildFSDevStr are local, make them
static and fix their prototypes to use the const virDomainDef as well.
Make some minor formatting changes for long lines.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the disk -drive options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also since using const virDomainDef in new function, that means other
functions called needed to change their usage.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the hub -device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also make qemuBuildHubDevStr static to the module since it's only
used here.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the controller -device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also adjust to using const virDomainDef instead of virDomainDefPtr.
This causes collateral damage in order to modify called APIs to use
the const virDomainDef instead as well.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the -global controller options to
the command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the -boot options to the command
line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the power management options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the '-clock' options to the command
line removing that task from the mainline qemuBuildCommandLine.
Also includes some minor formatting cleanups.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When debug-threads is enabled, individual threads are given a separate
name (on Linux)
Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1140121
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
QEMU (somewhere around 2.0) added a new sub-option to the -name flag
-name debug-threads=on.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Currently the file based character devices let QEMU write
directly to a file on disk. This allows a malicious QEMU
to inflict a denial of service by consuming all free space.
Switch QEMU to use a pipe to virtlogd, which will enforce
file rollover.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If use of virtlogd is enabled, then use it for backing the
character device log files too. This avoids the possibility
of a guest denial of service by writing too much data to
the log file.
The functions for handling FD passing when building command line
arguments need to be used by many different bits of code, so need
to be at the start of the source file
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The act of formatting a chardev backend value may need to
append command line arguments for passing FDs. If we append
the -chardev arg before formatting the value, then the
resulting arguments will end up interspersed
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Honour the <log file='...'/> element in chardevs to output
data to a file. This requires QEMU >= 2.6
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Now that the function was extracted we can get rid of some temp
variables. Additionally formatting of the bitmap string for the event
code should be checked.
Allow pinning for inactive vcpus. The pinning mask will be automatically
applied as we would apply the default mask in case of a cpu hotplug.
Setting the scheduler settings for a vcpu has the same semantics.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1306556
Introduce a helper to check supported device and domain config and move
the memory hotplug checks to it.
The advantage of this approach is that by default all new features are
considered unsupported by all hypervisors unless specifically changed
rather than the previous approach where every hypervisor would need to
declare that a given feature is unsupported.
We would happily report and free statistics of a completed migration
even before it actually completed (on the source host while migration is
in the Finish phase).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Computing a total downtime during a migration requires us to store a
time stamp when guest CPUs get stopped. The value (and all other
statistics) is then transferred to the destination to compute the
downtime. Because the stopped time stamp is stored by a STOP event
handler while the statistics which will be sent over to the destination
are copied synchronously within qemuMigrationWaitForCompletion.
Depending on the timing of STOP and MIGRATION events, we may end up
copying (and transferring) statistics without the stopped time stamp
set. Let's make sure we always use the correct time stamp.
https://bugzilla.redhat.com/show_bug.cgi?id=1282744
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
With a very old QEMU which doesn't support events we need to explicitly
call qemuMigrationSetOffline at the end of migration to update our
internal state. On the other hand, if we talk to QEMU using QMP, we
should just wait for the STOP event and let the event handler update the
state and trigger a libvirt event.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We should not overwrite all migration statistics on the source with the
numbers sent by the destination since the source may have an updated
view in some cases (such as post-copy migration). It's safer to update
just the timing info we need to get from the destination and be prepared
for the future. And we should only do all this after a successful
migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Statistics for a completed migration only make sense if the migration
was successful. Let's not store them in priv->job.completed until we
are sure it was a success.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The code does not handle renaming of the save state file. In addition to
that the resuming code would need to be tweaked to handle the name
change since the XML is extracted from the save image. The easies option
is to make the rename API even less useful by forbiding this.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1314594
While trying to build with -Os couple of compile errors showed
up.
conf/domain_conf.c: In function 'virDomainChrRemove':
conf/domain_conf.c:13666:24: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]
virDomainChrDefPtr ret, **arrPtr = NULL;
^
Compiler fails to see that @ret is used only if set in the loop,
but whatever, there's no harm in initializing the variable.
In vboxAttachDrivesNew and _vboxAttachDrivesOld compiler thinks
that @rc may be used uninitialized. Well, not directly, but maybe
after some optimization. Yet again, no harm in initializing a
variable.
In file included from ./util/virthread.h:26:0,
from ./datatypes.h:28,
from vbox/vbox_tmpl.c:43,
from vbox/vbox_V3_1.c:37:
vbox/vbox_tmpl.c: In function '_vboxAttachDrivesOld':
./util/virerror.h:181:5: error: 'rc' may be used uninitialized in this function [-Werror=maybe-uninitialized]
virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \
^
In file included from vbox/vbox_V3_1.c:37:0:
vbox/vbox_tmpl.c:1041:14: note: 'rc' was declared here
nsresult rc;
^
Yet again, one uninitialized variable:
qemu/qemu_driver.c: In function 'qemuDomainBlockCommit':
qemu/qemu_driver.c:17194:9: error: 'baseSource' may be used uninitialized in this function [-Werror=maybe-uninitialized]
qemuDomainPrepareDiskChainElement(driver, vm, baseSource,
^
And another one:
storage/storage_backend_logical.c: In function 'virStorageBackendLogicalMatchPoolSource.isra.2':
storage/storage_backend_logical.c:618:33: error: 'thisSource' may be used uninitialized in this function [-Werror=maybe-uninitialized]
thisSource->devices[j].path))
^
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When SPICE graphics is configured for a domain but we did not ask the
client to switch to the destination, we should not wait for
SPICE_MIGRATE_COMPLETED event (which will never come).
https://bugzilla.redhat.com/show_bug.cgi?id=1151723
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Migration statistics are not available on the destination host and
starting a query job during incoming migration is not allowed. Trying to
do that would result in
Timed out during operation: cannot acquire state change lock (held
by remoteDispatchDomainMigratePrepare3Params)
error. We should not even try to start the job.
https://bugzilla.redhat.com/show_bug.cgi?id=1278727
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuProcessSetupEmulator runs at a point in time where there is only
the qemu main thread. Use virCgroupAddTask to put just that one task
into the emulator cgroup. That patch makes virCgroupMoveTask and
virCgroupAddTaskStrController obsolete.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Move qemuProcessSetupEmulator up under qemuSetupCgroup. That way
we move the one main thread right into the emulator cgroup, instead
of moving multiple threads later on. And we do not actually want any
threads running in the parent cgroups (cpu cpuacct cpuset).
Signed-off-by: Henning Schild <henning.schild@siemens.com>
This attribute is used to extend secondary PCI bar and expose it to the
guest as 64bit memory. It works like this: attribute vram is there to
set size of secondary PCI bar and guest sees it as 32bit memory,
attribute vram64 can extend this secondary PCI bar. If both attributes
are used, guest sees two memory bars, both address the same memory, with
the difference that the 32bit bar can address only the first part of the
whole memory.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1260749
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
We always place primary video device at first place, to make it easier
to create a qemu command or format an xml, but we should also set the
primary boolean for primary video device to 'true'.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
We honour the placement bitmaps when starting up, so there's no point in
having this check. Additionally the check was buggy since it checked
vm->def all the time even if the user requested to modify the persistent
definition which had different configuration.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1308317
Add Spice graphics gl attribute. qemu 2.6 should have -spice gl=on argument to
enable opengl rendering context (patches on the ML). This is necessary to
actually enable virgl rendering.
Add a qemuxml2argv test for virtio-gpu + spice with virgl.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Per-domain directories were introduced in order to be able to
completely separate security labels for each domain (commit
f1f68ca334). However when the domain
name is long (let's say a ridiculous 110 characters), we cannot
connect to the monitor socket because on length of UNIX socket address
is limited. In order to get around this, let's shorten it in similar
fashion and in order to avoid conflicts, throw in an ID there as well.
Also save that into the status XML and load the old status XMLs
properly (to clean up after older domains). That way we can change it
in the future.
The shortening can be seen in qemuxml2argv tests, for example in the
hugepages-pages2 case.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit f1a89a8 allowed parsing configs from /etc/libvirt
without validating the emulator capabilities.
Check for the presence of a machine type in the qemu driver's
post parse function instead of crashing.
https://bugzilla.redhat.com/show_bug.cgi?id=1267256
There's this check when building command line that whenever
domain has no graphics card configured we put -nographics onto
qemu command line. The check is 'if (!def->graphics)'. This
makes coverity think that def->graphics can be NULL, which is
true. But later in the code every access to def->graphics is
guarded by check for def->ngraphics, so no crash occurs. But this
is something that coverity fails to deduct.
In order to shut coverity up lets change the condition to
'if (!def->ngraphics)'.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In case you will specify graphics like this:
<graphics type='spice' port='-1'/>
or
<graphics type='spice' port='-1' tlsPort='6000'/>
libvirt will automatically add autoport='no'. This leads to an issue
that in qemuProcessStop() we don't release that port because we are
releasing both port if autoport=yes or only port marked as reserved.
If autoport=no but we request to generate port via '-1' we need to mark
that port as reserved in order to release it.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1299696
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Checking whether x > 0 before looping over [0..x] items doesn't make
sense and multi-line body must have curly brackets around it.
Best viewed with '-w'.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This does nothing more than adding the new device and capability.
The device is present since QEMU 2.6.0.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
GIC v2 is the default, but checking against that specific version when
we want to know whether the default has been selected is potentially
error prone; using an alias instead makes it safer.
Similarly to VM startup always set the legacy affinity. Additionally we
don't need to report an explicit error since virProcessSetAffinity
reports them themselves.
Similarly to VM startup always set the legacy affinity. Additionally we
don't need to report an explicit error since virProcessSetAffinity
reports them themselves.
PostParse handles it for us now.
This causes some test suite churn; qemu's custom PostParse could is
now invoked before the generic AddImplicitControllers, so PCI
controllers end up sequentially in the XML before the generically
added IDE controllers. So it's just some XML reordering
Calling qemuProcessStop without a job opens a way to race conditions
with qemuDomainObjExitMonitor called in another thread. A real world
example of such a race condition:
- migration thread (A) calls qemuMigrationWaitForSpice
- another thread (B) starts processing qemuDomainAbortJob API
- thread B signals thread A via qemuDomainObjAbortAsyncJob
- thread B enters monitor (qemuDomainObjEnterMonitor)
- thread B calls qemuMonitorSend
- thread A awakens and calls qemuProcessStop
- thread A calls qemuMonitorClose and sets priv->mon to NULL
- thread B calls qemuDomainObjExitMonitor with priv->mon == NULL
=> monitor stays ref'ed and locked
Depending on how lucky we are, the race may result in a memory leak or
it can even deadlock libvirtd's event loop if it tries to lock the
monitor to process an event received before qemuMonitorClose was called.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Stopping a domain without a job risks a race condition with another
thread which started a job a which does not expect anyone else to be
messing around with the same domain object.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Only a small portion of processGuestPanicEvent was enclosed within a
job, let's make sure we use the job for all operations to avoid race
conditions.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When destroying a domain we need to make sure we will be able to start a
job no matter what other operations are running or even stuck in a job.
This is done by killing the domain before starting the destroy job.
Let's introduce qemuProcessBeginStopJob which combines killing a domain
and starting a job in a single API which can be called everywhere we
need a job to stop a domain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Ending a nested job is no different from ending any other (non-async)
job, after all the code in qemuDomainBeginJobInternal does not handle
them differently either. Thus we should call qemuDomainObjEndJob to stop
nested jobs.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuDomainHelperGetVcpus would correctly return an array of
virVcpuInfoPtr structs for online vcpus even for sparse topologies, but
the loop that fills the returned typed parameters would number the vcpus
incorrectly. Fortunately sparse topologies aren't supported yet.
Add new function to manage adding the '-mon' or '-monitor' options to
the command line removing that task from the mainline qemuBuildCommandLine.
Also adjusted qemuBuildChrChardevStr and qemuBuildChrArgStr to use
const virDomainChrSourceDef *def rather than virDomainChrSourceDefPtr def.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the '-device sga' to the command
line removing that task from the mainline qemuBuildCommandLine
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the '-smbios' options to the command
line removing that task from the mainline qemuBuildCommandLine
Also while I was looking at it, move the uuid processing closer to usage.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the '-numa' options to the command
line removing that task from the mainline qemuBuildCommandLine
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the IOThread '-object' to the command
line removing that task from the mainline qemuBuildCommandLine
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rename function and move code in from qemuBuildCommandLine to
keep smp related code together. Also make a few style changes
for long lines, return value change, and 2 spaces between functions.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the '-m' memory options to the command
line removing that task from the mainline qemuBuildCommandLine
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create qemuBuildCommandLineValidate to make some checks before trying
to build the command. This will move some logic from much later to much
earlier - we shouldn't be adjusting any data so that shouldn't matter.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Now that the file migration doesn't require us to use 'dd' and other
legacy stuff for too old qemus we don't even have to calcuate the
offsets and other stuff.
With the currently supported qemus we always migrate to file
descriptors so the old function is not required any more.
Additionally QEMU_MONITOR_MIGRATE_TO_FILE_TRANSFER_SIZE macro is now
unused.
Our existing virHashForEach method iterates through all items disregarding the
fact, that some of the iterators might have actually failed. Errors are usually
dispatched through an error element in opaque data which then causes the
original caller of virHashForEach to return -1. In that case, virHashForEach
could return as soon as one of the iterators fail. This patch changes the
iterator return type and adjusts all of its instances accordingly, so the
actual refactor of virHashForEach method can be dealt with later.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When adding disk images to ACL we may call those functions on NFS
shares. In that case we might get an EACCES, which isn't really relevant
since NFS would not hold a block device. This patch adds a flag that
allows to stop reporting an error on EACCES to avoid spaming logs.
Currently there's no functional change.
Since commit 47e5b5ae virCgroupAllowDevice allows to pass -1 as either
the minor or major device number and it automatically uses '*' in place
of that. Reuse the new approach through the code and drop the duplicated
functions.
After removing capability check for fd migration the code that was left
behind didn't make quite sense. The old exec migration would be used in
case when pipe() failed. Remove the old code and make failure of pipe()
a hard error.
This additionally removes usage of virCgroupAllowDevicePath outside of
qemu_cgroup.c.
Create new modules qemu_domain_address.c and qemu_domain_address.h to
contain all the new functions and header data. Additionally move any
supporting static functions.
Make qemuDomainSupportsPCI non static.
Also, move and rename the following:
qemuSetSCSIControllerModel to qemuDomainSetSCSIControllerModel
qemuCollectPCIAddress to qemuDomainCollectPCIAddress
qemuValidateDevicePCISlotsPIIX3 to qemuDomainValidateDevicePCISlotsPIIX3
qemuAssignDevicePCISlots to qemuDomainAssignDevicePCISlots
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the misplaced function from qemu_command.c to qemu_interface.c
since it's closer in functionality there and had less to do with building
the command line.
Rename function to qemuInterfaceBridgeConnect and modify callers.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the misplaced function from qemu_command.c to qemu_interface.c
since it's closer in functionality there and had less to do with building
the command line.
Rename function to qemuInterfaceDirectConnect and modify callers.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move function closer to where it's used in qemuBuildTPMCommandLine
Also fix function header to match current coding practices
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move function closer to where it's called in qemuBuildTPMCommandLine
Also adjust function header to fit current coding guidelines
Signed-off-by: John Ferlan <jferlan@redhat.com>
Currently, on hot unplug of PCI devices with VFIO driver for QEMU, libvirt is
trying to restore the host devices to it's previous value (basically a chown
on the previous user/group).
However for devices with VFIO driver, when the device is unbinded it is
removed from the /dev/vfio file system causing the restore label to fail.
The fix is to not restore the label for those PCI devices since they are going
to be teared down anyway.
Signed-off-by: Ludovic Beliveau <ludovic.beliveau@windriver.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1293351
Since we already have virtio channel events, we know when guest
agent within guest has (dis-)connected. Instead of us blindly
connecting to a socket that no one is listening to, we can just
follow what qemu-ga does. This has a nice benefit that we don't
need to 'guest-ping' the agent just to timeout and find out
nobody is listening.
The way that this commit is implemented:
- don't connect in qemuProcessLaunch directly, defer that to event
callback (which already follows the agent) -
processSerialChangedEvent
- after migration is settled, before we resume vCPUs, ask qemu
whether somebody is listening on the socket and if so, connect
to it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Extract out the qemuParseCommandLine{String|Pid} into their own
separate module - taking with it all the various static functions.
Causes a ripple effect with a few other modules to include the
new qemu_parse_command.h.
Narrowed down the list of #include's in the split out module to
those that are necessary for build.
Recent refactors in the vbox code to check the return status for the
function tipped Coverity's scales of justice for any functions that
do not check status - such as this one.
While I'm at it, since the call is essentially the same other than
whether starting from val or val+1 when val[0] = '[', just adjust
the val pointer by one and have one call instead of two.
Additionally, the call to virDomainGraphicsListenGetAddress is redundant
since it checking that the address field got filled. It's a leftover
from the strndup -> ListenSetAddress conversion (commit id 'ef79fb5b5')
Signed-off-by: John Ferlan <jferlan@redhat.com>
Refactor qemuParseCommandLine to pull out the "-vnc" argument parsing
into its own helper function. Modify the code to use "cleanup" instead
of "error" and use the standard return processing to indicate success
or failure by using ret
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since majority of the steps is shared, the function can be reused to
simplify code.
Similarly to previous path doing this same for vCPUs this also fixes the
a similar bug (which is not tracked).
Rather than iterating 3 times for various settings this function
aggregates all the code into single place. One of the other advantages
is that it can then be reused for properly setting IOThread info on
hotplug.
Since majority of the steps is shared, the function can be reused to
simplify code.
Additionally this resolves
https://bugzilla.redhat.com/show_bug.cgi?id=1244128 since the cpu
bandwidth limiting with cgroups would not be set on the hotplug path.
Additionally the code now sets the thread affinity and honors autoCpuset
as in the regular startup code path.
Rather than iterating 3 times for various settings this function
aggregates all the code into single place. One of the other advantages
is that it can then be reused for properly setting vCPU info on hotplug.
With this approach autoCpuset is also used when setting the process
affinity rather than just via cgroups.
Due to bad design the vcpu sched element is orthogonal to the way how
the data belongs to the corresponding objects. Now that vcpus are a
struct that allow to store other info too, let's convert the data to the
sane structure.
The helpers for the conversion are made universal so that they can be
reused for iothreads too.
This patch also resolves https://bugzilla.redhat.com/show_bug.cgi?id=1235180
since with the correct storage approach you can't have dangling data.
Now that qemuDomainDetectVcpuPids is able to refresh the vCPU pid
information it can be reused in the hotplug and hotunplug code paths
rather than open-coding a very similar algorithm.
A slight algorithm change is necessary for unplug since the vCPU needs
to be marked offline prior to calling the thread detector function and
eventually rolled back if something fails.
Move the logic from virDomainDiskDefDstDuplicates into
virDomainDiskDefCheckDuplicateInfo so that we don't have to loop
multiple times through the array of disks. Since the original function
was called in qemuBuildDriveDevStr, it was actually called for every
single disk which was quite wasteful.
Additionally the target uniqueness check needed to be duplicated in
the disk hotplug case, since the disk was inserted into the domain
definition after the device string was formatted and thus
virDomainDiskDefDstDuplicates didn't do anything in that case.
When starting a qemu process there are certain checks done to ensure
that the configuration makes sense. Extract them into a separate
function so that they can be reused in the test code.
In b3d2a42e I've refactored the code and moved the 'cleanup' label.
Unfortunately the code that was originally in the 'endjob' label and
wanted to jump to cleanup is now in the cleanup label. Remove the jump
and let the function finish.
If we are attempting to thaw the filesystems on error, the code would
overwrite the error code that caused the snapshot to fail with the error
of thawing the filesystem. Since the thawing function allows control of
error reporting behavior we can use this feature.
So, systemd-machined has this philosophy that machine names are like
hostnames and hence should follow the same rules. But we always allowed
international characters in domain names. Thus we need to modify the
machine name we are passing to systemd.
In order to change some machine names that we will be passing to systemd,
we also need to call TerminateMachine at the end of a lifetime of a
domain. Even for domains that were started with older libvirt. That
can be achieved thanks to virSystemdGetMachineNameByPID(). And because
we can change machine names, we can get rid of the inconsistent and
pointless escaping of domain names when creating machine names.
So this patch modifies the naming in the following way. It creates the
name as <drivername>-<id>-<name> where invalid hostname characters are
stripped out of the name and if the resulting name is longer, it
truncates it to 64 characters. That way we can start domains we
couldn't start before. Well, at least on systemd.
To make it work all together, the machineName (which is needed only with
systemd) is saved in domain's private data. That way the generation is
moved to the driver and we don't need to pass various unnecessary
arguments to cgroup functions.
The only thing this complicates a bit is the scope generation when
validating a cgroup where we must check both old and new naming, so a
slight modification was needed there.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The virDomainSnapshotDefFormat calls into virDomainDefFormat,
so should be providing a non-NULL virCapsPtr instance. On the
qemu driver we change qemuDomainSnapshotWriteMetadata to also
include caps since it calls virDomainSnapshotDefFormat.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
The virDomainObjFormat and virDomainSaveStatus methods
both call into virDomainDefFormat, so should be providing
a non-NULL virCapsPtr instance.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
virDomainSaveConfig calls virDomainDefFormat which was setting the caps
to NULL, thus keeping the old behaviour (i.e. not looking at
netprefix). This patch adds the virCapsPtr to the function and allows
the configuration to be saved and skipping interface names that were
registered with virCapabilitiesSetNetPrefix().
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
And use the newly added caps->host.netprefix (if it exists) for
interface names that match the autogenerated target names.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
A pretty nasty deadlock occurs while trying to rename a VM in parallel
with virDomainObjListNumOfDomains.
The short description of the problem is as follows:
Thread #1:
qemuDomainRename:
------> aquires domain lock by qemuDomObjFromDomain
---------> waits for domain list lock in any of the listed functions:
- virDomainObjListFindByName
- virDomainObjListRenameAddNew
- virDomainObjListRenameRemove
Thread #2:
virDomainObjListNumOfDomains:
------> aquires domain list lock
---------> waits for domain lock in virDomainObjListCount
Introduce generic virDomainObjListRename function for renaming domains.
It aquires list lock in right order to avoid deadlock. Callback is used
to make driver specific domain updates.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Free the old vcpupids array in case when this function is called again
during the run of the VM. It will be later reused in the vCPU hotplug
code. The function now returns the number of detected VCPUs.
In case of guest panicked, preserved crashed domain has stopped CPUs.
It's not possible to use tools like WinDbg for the problem investigation
until we start CPUs back.
Error paths after sending the event that domain is started written as if ret = -1
which is set at the beginning of the function. It's common idioma to keep 'ret'
equal to -1 until the end of function where it is set to 0. But here we use ret
to keep result of restore operation too and thus breaks the idioma and its users :)
Let's use different variable to hold restore result.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Having on_crash set to either coredump-destroy or coredump-restart
creates core dumps with option memory-only in the directory specified
by auto_dump_path. When a watchdog is triggered with the action dump
the core dump is also placed into the directory specified by auto_dump_path
but is created without the option memory-only.
This patch sets the option memory-only also for core dumps created by the
watchdog event.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
This patch creates two bitmaps, one for macvlan device names and one
for macvtap. The bitmap position is used to indicate that libvirt is
currently using a device with the name macvtap%d/macvlan%d, where %d
is the position in the bitmap. When requested to create a new
macvtap/macvlan device, libvirt will now look for the first clear bit
in the appropriate bitmap and derive the device name from that rather
than just starting at 0 and counting up until one works.
When libvirtd is restarted, the qemu driver code that reattaches to
active domains calls the appropriate function to "re-reserve" the
device names as it is scanning the status of running domains.
Note that it may seem strange that the retry counter now starts at
8191 instead of 5. This is because we now don't do a "pre-check" for
the existence of a device once we've reserved it in the bitmap - we
move straight to creating it; although very unlikely, it's possible
that someone has a running system where they have a large number of
network devices *created outside libvirt* named "macvtap%d" or
"macvlan%d" - such a setup would still allow creating more devices
with the old code, while a low retry max in the new code would cause a
failure. Since the objective of the retry max is just to prevent an
infinite loop, and it's highly unlikely to do more than 1 iteration
anyway, having a high max is a reasonable concession in order to
prevent lots of new failures.
The current code was a little bit odd. At first we've removed all
possible implicit input devices from domain definition to add them later
back if there was any graphics device defined while parsing XML
description. That's not all, while formating domain definition to XML
description we at first ignore any input devices with bus different to
USB and VIRTIO and few lines later we add implicit input devices to XML.
This seems to me as a lot of code for nothing. This patch may look
to be more complicated than original approach, but this is a preferred
way to modify/add driver specific stuff only in those drivers and not
deal with them in common parsing/formating functions.
The update is to add those implicit input devices into config XML to
follow the real HW configuration visible by guest OS.
There was also inconsistence between our behavior and QEMU's in the way,
that in QEMU there is no way how to disable those implicit input devices
for x86 architecture and they are available always, even without graphics
device. This applies also to XEN hypervisor. VZ driver already does its
part by putting correct implicit devices into live XML.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The VIR_DOMAIN_STATS_VCPU flag to virDomainListGetStats
enables reporting of stats about vCPUs. Currently we
only report the cumulative CPU running time and the
execution state.
This adds reporting of the wait time - time the vCPU
wants to run, but the host scheduler has something else
running ahead of it.
The data is reported per-vCPU eg
$ virsh domstats --vcpu demo
Domain: 'demo'
vcpu.current=4
vcpu.maximum=4
vcpu.0.state=1
vcpu.0.time=1420000000
vcpu.0.wait=18403928
vcpu.1.state=1
vcpu.1.time=130000000
vcpu.1.wait=10612111
vcpu.2.state=1
vcpu.2.time=110000000
vcpu.2.wait=12759501
vcpu.3.state=1
vcpu.3.time=90000000
vcpu.3.wait=21825087
In implementing this I notice our reporting of CPU execute
time has very poor granularity, since we are getting it
from /proc/$PID/stat. As a future enhancement we should
prefer to get CPU execute time from /proc/$PID/schedstat
or /proc/$PID/sched (if either exist on the running kernel)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since 'savevm' was not converted to QMP libvirt has to parse for error
strings in the text monitor output. One of the unhandled errors is
produced when qemu treats a device as unmigratable.
As current qemu actually does support AHCI migration this bug is
applicable only to older versions of qemu.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1293899
When acpi is used to reboot/shutdown qemu domain, qemu emits
SHUTDOWN event. Libvirt uses fakeReboot variable in order to
differentiate reboot or shutdown. fakeReboot value is reseted
to false after domain restart/reset.
When mode=agent is used to reboot qemu domain, qemu doesn't emit
SHUTDOWN event and libvirt doesn't reset fakeReboot value to false.
In this case next 'shutdown -h now' performs reboot. That's why
we don't need to set fakeReboot=true for mode=agent.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
So I can observe this crasher that with freshly started daemon
(and virtlogd enabled) I am trying to startup a domain that
immediately dies (because it's said to use huge pages but I
haven't allocated a single one in the pool). Hardly reproducible
with -O0 or under valgrind. But I just got lucky:
==20469== Invalid write of size 8
==20469== at 0x4C2E99B: memcpy@GLIBC_2.2.5 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20469== by 0x217EDD07: qemuProcessReadLog (qemu_process.c:1670)
==20469== by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==20469== by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==20469== by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)
==20469== by 0x217F71A4: qemuProcessStart (qemu_process.c:5152)
==20469== by 0x21846582: qemuDomainObjStart (qemu_driver.c:7396)
==20469== by 0x218467DE: qemuDomainCreateWithFlags (qemu_driver.c:7450)
==20469== by 0x21846845: qemuDomainCreate (qemu_driver.c:7468)
==20469== by 0x5611CD0: virDomainCreate (libvirt-domain.c:6753)
==20469== by 0x125D9A: remoteDispatchDomainCreate (remote_dispatch.h:3613)
==20469== by 0x125CB7: remoteDispatchDomainCreateHelper (remote_dispatch.h:3589)
==20469== Address 0x27a52ad0 is 0 bytes after a block of size 5,584 alloc'd
==20469== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20469== by 0x9B8D1DB: xdr_string (in /lib64/libc-2.21.so)
==20469== by 0x563B39C: xdr_virLogManagerProtocolNonNullString (log_protocol.c:24)
==20469== by 0x563B6B7: xdr_virLogManagerProtocolDomainReadLogFileRet (log_protocol.c:123)
==20469== by 0x164B34: virNetMessageDecodePayload (virnetmessage.c:407)
==20469== by 0x5682360: virNetClientProgramCall (virnetclientprogram.c:379)
==20469== by 0x563B30E: virLogManagerDomainReadLogFile (log_manager.c:272)
==20469== by 0x217CD613: qemuDomainLogContextRead (qemu_domain.c:2485)
==20469== by 0x217EDC76: qemuProcessReadLog (qemu_process.c:1660)
==20469== by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==20469== by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==20469== by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)
This points to memmove() in qemuProcessReadLog(). Imagine we just
read the following string from qemu:
"abc\n2016-01-18T09:40:44.022744Z qemu-system-x86_64: Error\n"
After the first pass of the while() loop in the
qemuProcessReadLog() (in which we have taken the false branch in
the if) @buf still points to the beginning of the string,
@filter_next points to the beginning of the second line. So we
start second iteration because there is yet another newline
character at the end. In this iteration @eol points to it
actually. Now, the control gets inside true branch of if(). Just
to remind you:
got = 58
filter_next = buf + 5,
eol = buf + 58.
Therefore skip = 54 which is correct. The message we want to skip
is 54 bytes long. However:
memmove(filter_next, eol + 1, (got - skip) +1);
which is
memmove(filter_next, eol + 1, 5)
is obviously wrong as there is only one byte we can access, not 5!
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We have this function qemuAgentNotifyEvent() which is supposed to
be called from thread pool responsible for processing qemu
monitor events. The function then should wake up other thread
that is waiting for a guest to shutdown or reboot. However, if we
have received a different error a warning is printed out. This
warning lacks info on which event is expected.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit id '90b721e43' moved where the virCgroupAddTask was made until
after the check for the vcpupin checks. However, in doing so it missed
an option where if the cpumap didn't exist, then the code would continue
back to the top of the current vcpu loop. The results was that the
virCgroupAddTask wouldn't be called.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This reverts commit a41c00b472.
After much testing and upstream discussion this has been deemed to be
the incorrect operation since it means we no longer have any guarantee
about which resource controllers the QEMU processes in general are in.
So, you try to start a domain, but before we even get to the part
where chardev part of qemu command line is generated (and
possibly missing path to unix sockets is made up) an error occurs
which results in calling qemuProcessStop. This will then try to
clean up the mess and possibly ends up calling unlink(NULL).
==8085== Thread 3:
==8085== Syscall param unlink(pathname) points to unaddressable byte(s)
==8085== at 0xA85EA57: unlink (in /lib64/libc-2.21.so)
==8085== by 0x213D3C24: qemuProcessCleanupChardevDevice (qemu_process.c:2866)
==8085== by 0x558D6B1: virDomainChrDefForeach (domain_conf.c:22924)
==8085== by 0x213DA9AE: qemuProcessStop (qemu_process.c:5326)
==8085== by 0x213DA2F2: qemuProcessStart (qemu_process.c:5190)
==8085== by 0x2142957F: qemuDomainObjStart (qemu_driver.c:7396)
==8085== by 0x214297DB: qemuDomainCreateWithFlags (qemu_driver.c:7450)
==8085== by 0x21429842: qemuDomainCreate (qemu_driver.c:7468)
==8085== by 0x5611B95: virDomainCreate (libvirt-domain.c:6753)
==8085== by 0x125D9A: remoteDispatchDomainCreate (remote_dispatch.h:3613)
==8085== by 0x125CB7: remoteDispatchDomainCreateHelper (remote_dispatch.h:3589)
==8085== by 0x568BF41: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==8085== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==8085==
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Autodeflate can be enabled/disabled for memballon device
of model 'virtio'.
xml:
<devices>
<memballoon model='virtio' autodeflate='on'/>
</devices>
qemu:
qemu -device virtio-balloon-pci,...,deflate-on-oom=on
Autodeflate cannot be enabled/disabled for running domain.
Use virDomainDefAddUSBController() to add an EHCI1+UHCI1+UHCI2+UHCI3
controller set to newly defined Q35 domains that don't have any USB
controllers defined.
The real Q35 machine puts the first USB controller set (EHCI+(UHCIx4))
on bus 0 slot 0x1D, and the 2nd USB controller set on bus 0 slot 0x1A,
so let's attempt to make the virtual machine match that for
controllers with auto-assigned addresses when possible.
Three test cases were added to assure that the proper addresses are
assigned - one with a single set of unaddressed USB controllers, one
with 3 (to grab both preferred slots plus one more), and one with the
order of the controller definitions reordered, to assure that the
auto-assignment isn't mixed up by order.
When qemuAssignDevicePCISlots() is looking for companion controllers
for a USB controller that has no PCI address specified, it initializes
a virDevicePCIAddress to 0000:00:00.0, fills it in with the
companion's address if one is found, then checks whether or not there
was a find based on slot == 0. On a system with a single PCI bus, that
is a valid way to check, because slot 0 is reserved, but on most other
PCI buses, slot 0 is not reserved, and is open for use by any
device. This patch adds a separate bool that is set when a companion
is found rather than relying on the faulty information provided with
"slot == 0".
While this is no functional change, whole channel definition is
going to be needed very soon. Moreover, while touching this obey
const correctness rule in qemuAgentOpen() - so far it was passed
regular pointer to channel config even though the function is
expected to not change pointee at all. Pass const pointer
instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In qemu driver we listen to virtio channel events like an agent
connected to or disconnected from the guest part of socket.
However, with a little exception - when we find out that the
socket in question is the guest agent one, we connect or
disconnect guest agent which is done prior setting new state in
internal structure. Due to a bug in our code it may happen that
we got the event but failed to set it in internal structure
representing the channel.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Earlier commit 7140807917 forgot to deal
properly with status XMLs where we want the libvirt-internal paths to be
kept in place and not cleared, otherwise we could end up copying a NULL
string and segfaulting th daemon.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If the q35 specific disable s3/s4 setting isn't supported, fallback to
specifying the PIIX setting, which is the previous behavior. It doesn't
have any effect, but qemu will just warn about it rather than error:
qemu-system-x86_64: Warning: global PIIX4_PM.disable_s3=1 not used
qemu-system-x86_64: Warning: global PIIX4_PM.disable_s4=1 not used
Since it doesn't error, I don't think we should either, since there
may be configs in the wild that already have q35 + disable_s3/4 (via
virt-manager)
This function may be called with @dconnuri == NULL, e.g. from
virDomainMigrateToURI3() if the flags are missing
VIR_MIGRATE_PEER2PEER flag. Moreover, all later functions called
from here do wrap it into NULLSTR() so why not do the same here?
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The condition was checking for UHCI (and OHCI for ppc64) availability so
that it can specify the proper device instead of legacy usb. However,
for ppc64, we don't need to check both OHCI and UHCI, but only OHCI as
that is the legacy default. The condition is so big that it was just a
matter of time when someone will make a mistake there, so let's use more
lines so that it is visible what the condition checks for.
This fixes usage of -device instead of -usb for ppc64 that supports
pci-usb-ohci and does not support piix3-usb-uhci.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1297020
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
memory_dirty_rate corresponds to dirty-pages-rate in QEMU and
memory_iteration is what QEMU reports in dirty-sync-count.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The structure actually contains migration statistics rather than just
the status as the name suggests. Renaming it as
qemuMonitorMigrationStats removes the confusion.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
A migration is in "setup" state after it was "inactive" and before it
becomes "active". Let's reflect this in our migration status enum.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
My commit 674afcb09e moved computing the
default listen address from qemuMigrationPrepareAny to
qemuMigrationPrepareIncoming. However, I didn't notice listenAddress was
later passed to qemuMigrationStartNBDServer. Thus, it would be called
with the original value of listenAddress (NULL).
Let's add the updated listen address to qemuProcessIncomingDef and use
it when starting NBD servers.
Reported-by: Michael Chapman <mike@very.puzzling.org>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
If user defines a virtio channel with UNIX socket backend and doesn't
care about the path for the socket (e.g. qemu-agent channel), we still
generate it into the persistent XML. Moreover when then user renames
the domain, due to its persistent socket path saved into the per-domain
directory, it will not start. So let's forget about old generated paths
and also stop putting them into the persistent definition.
https://bugzilla.redhat.com/show_bug.cgi?id=1278068
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Just recently, qemu forbade specifying format for sourceless
disks (qemu commit 39c4ae941ed992a3bb5). It kind of makes sense.
If there's no file to open, why specify its format. Anyway, I
have a domain like this:
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<target dev='hda' bus='ide'/>
<readonly/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
and obviously I am unable to start it. Therefore, a fix on our
side is needed too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For completeness, use the VIR_TRISTATE_SWITCH_ABSENT for data.file.append
comparisons. Commit ids '70ffa02f' and '53a15aed' just went with the non
zero comparison.
While reviewing 1b43885d17 I've noticed a virReportError()
followed by a goto endjob; without setting the correct return
value. Problem is, if block job is so fast that it's bandwidth
does not fit into ulong, an error is reported. However, by that
time @ret is already set to 1 which means success. Since the
scenario can be hardly considered successful, we should return a
value meaning error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Valgrind complained:
==18990== 20 (16 direct, 4 indirect) bytes in 1 blocks are definitely lost in loss record 188 of 996
==18990== at 0x4A057BB: calloc (vg_replace_malloc.c:593)
==18990== by 0x5292E9B: virAllocN (viralloc.c:191)
==18990== by 0x2221E731: qemuMigrationCookieXMLParseStr (qemu_migration.c:1012)
==18990== by 0x2221F390: qemuMigrationEatCookie (qemu_migration.c:1413)
==18990== by 0x222228CE: qemuMigrationPrepareAny (qemu_migration.c:3463)
==18990== by 0x22224121: qemuMigrationPrepareDirect (qemu_migration.c:3865)
==18990== by 0x22251C25: qemuDomainMigratePrepare3Params (qemu_driver.c:12414)
==18990== by 0x5389EE0: virDomainMigratePrepare3Params (libvirt-domain.c:5107)
==18990== by 0x1278DB: remoteDispatchDomainMigratePrepare3ParamsHelper (remote.c:5425)
==18990== by 0x53FF287: virNetServerProgramDispatch (virnetserverprogram.c:437)
==18990== by 0x540523D: virNetServerProcessMsg (virnetserver.c:135)
==18990== by 0x54052C7: virNetServerHandleJob (virnetserver.c:156)
==18990==
==18990== 20 (16 direct, 4 indirect) bytes in 1 blocks are definitely lost in loss record 189 of 996
==18990== at 0x4A057BB: calloc (vg_replace_malloc.c:593)
==18990== by 0x5292E9B: virAllocN (viralloc.c:191)
==18990== by 0x2221E731: qemuMigrationCookieXMLParseStr (qemu_migration.c:1012)
==18990== by 0x2221F390: qemuMigrationEatCookie (qemu_migration.c:1413)
==18990== by 0x222249D2: qemuMigrationRun (qemu_migration.c:4395)
==18990== by 0x22226365: doNativeMigrate (qemu_migration.c:4693)
==18990== by 0x22228E45: qemuMigrationPerform (qemu_migration.c:5553)
==18990== by 0x2225144B: qemuDomainMigratePerform3Params (qemu_driver.c:12621)
==18990== by 0x539F5D8: virDomainMigratePerform3Params (libvirt-domain.c:5206)
==18990== by 0x127305: remoteDispatchDomainMigratePerform3ParamsHelper (remote.c:5557)
==18990== by 0x53FF287: virNetServerProgramDispatch (virnetserverprogram.c:437)
==18990== by 0x540523D: virNetServerProcessMsg (virnetserver.c:135)
If we're replacing the NBD data, it's simplest to free the old object
(including the disk list) and allocate a new one.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Valgrind complained:
==23975== Conditional jump or move depends on uninitialised value(s)
==23975== at 0x22255FA6: qemuDomainGetBlockJobInfo (qemu_driver.c:16538)
==23975== by 0x538E97C: virDomainGetBlockJobInfo (libvirt-domain.c:9685)
==23975== by 0x12F740: remoteDispatchDomainGetBlockJobInfoHelper (remote.c:2834)
==23975== by 0x53FF287: virNetServerProgramDispatch (virnetserverprogram.c:437)
==23975== by 0x540523D: virNetServerProcessMsg (virnetserver.c:135)
==23975== by 0x54052C7: virNetServerHandleJob (virnetserver.c:156)
==23975== by 0x52F515B: virThreadPoolWorker (virthreadpool.c:145)
==23975== by 0x52F4668: virThreadHelper (virthread.c:206)
==23975== by 0x6E08A50: start_thread (in /lib64/libpthread-2.12.so)
==23975== by 0x82BE93C: clone (in /lib64/libc-2.12.so)
==23975==
==23975== Conditional jump or move depends on uninitialised value(s)
==23975== at 0x22255FB4: qemuDomainGetBlockJobInfo (qemu_driver.c:16542)
==23975== by 0x538E97C: virDomainGetBlockJobInfo (libvirt-domain.c:9685)
==23975== by 0x12F740: remoteDispatchDomainGetBlockJobInfoHelper (remote.c:2834)
==23975== by 0x53FF287: virNetServerProgramDispatch (virnetserverprogram.c:437)
==23975== by 0x540523D: virNetServerProcessMsg (virnetserver.c:135)
==23975== by 0x54052C7: virNetServerHandleJob (virnetserver.c:156)
==23975== by 0x52F515B: virThreadPoolWorker (virthreadpool.c:145)
==23975== by 0x52F4668: virThreadHelper (virthread.c:206)
==23975== by 0x6E08A50: start_thread (in /lib64/libpthread-2.12.so)
==23975== by 0x82BE93C: clone (in /lib64/libc-2.12.so)
If no matching block job is found, qemuMonitorGetBlockJobInfo returns 0
and we should not write anything to the caller-supplied
virDomainBlockJobInfo pointer.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
The manpage for sysconf() suggest including unistd.h as the
function is declared there. Even though we are not hitting any
compile issues currently, let's include the correct header file
instead of relying on some hidden include chain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
By default, QEMU truncates serial file on open. Sometimes, it could be weird -
for example, when we are trying to investigate some event, which occured several
restarts ago. This patch adds an ability to preserve previous content.
Signed-off-by: Dmitry Mishin <dim@virtuozzo.com>
This replaces the virPCIKnownStubs string array that was used
internally for stub driver validation.
Advantages:
* possible values are well-defined
* typos in driver names will be detected at compile time
* avoids having several copies of the same string around
* no error checking required when setting / getting value
The names used mirror those in the
virDomainHostdevSubsysPCIBackendType enumeration.
We only support hotplugging SCSI controllers.
The USB and virtio-serial related code was never reachable because
this function was only called for VIR_DOMAIN_CONTROLLER_TYPE_SCSI
controllers.
This reverts commit ee0d97a and parts of commits 16db8d2
and d6d54cd1.
This function calls qemuDomainAttachControllerDevice for SCSI
controllers and reports an error for all other controllers.
Move the error inside qemuDomainAttachControllerDevice and delete this
wrapper.
A closer review of the code shows that for the transition from paused to
running which was supposed to emit the VIR_DOMAIN_EVENT_RESUMED - no event
would be generated. Rather the event is generated when going from running
to running.
Following the 'was_running' boolean shows it is set when the domain obj
is active and the domain obj state is VIR_DOMAIN_RUNNING. So rather than
using was_running to generate the RESUMED event, use !was_running
When the function changes the memory lock limit for the first time,
it will retrieve the current value and store it inside the
virDomainObj for the domain.
When the function is called again, if memory locking is no longer
needed, it will be able to restore the memory locking limit to its
original value.
We increase the limit before plugging in a PCI hostdev or a memory
module because some memory might need to be locked due to eg. VFIO.
Of course we should do the opposite after unplugging a device: this
was already the case for memory modules, but not for PCI hostdevs.
In commit 686eb7a24f, the break was not considered part of the
condition, hence breaking after first node when searching.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
when appropriate, of course. If the config for a domain specifies boot
order with <boot dev='blah'/> elements, e.g.:
<os>
...
<boot dev='hd'/>
<boot dev='network'/>
</os>
Then the first disk device in the config will have ",bootindex=1"
appended to its qemu commandline -device options, and the first (and
*only* the first) network interface device will get ",bootindex=2".
However, if the first network interface device is a "hostdev" device
(an SRIOV Virtual Function (VF) being assigned to the domain with
vfio), then the bootindex option will *not* be appended. This happens
because the bootindex=n option corresponding to the order of "<boot
dev='network'/>" is added to the -device for the first network device
when network device commandline args are constructed, but if it's a
hostdev network device, its commandline arg is instead constructed in
the loop for hostdevs.
This patch fixes that omission by noticing (in bootHostdevNet) if the
first network device was a hostdev, and if so passing on the proper
bootindex to the commandline generator for hostdev devices - the
result is that ",bootindex=2" will be properly appended to the first
"network" device in the config even if it is really a hostdev
(including if it is assigned from a libvirt network pool). (note that
this is only the case if there is no <bootmenu enabled='yes'/> element
in the config ("-boot menu-on" in qemu) , since the two are mutually
exclusive - when the bootmenu is enabled, the individual per-device
bootindex options can't be used by qemu, and we revert to using "-boot
order=xyz" instead).
If a greater level of control over boot order is desired (e.g., more
than one network device should be tried, or a network device other
than the first one encountered in the config), then <boot
dev='network'/> in the <os> element should not be used; instead, the
individual device elements in the config should be given a "<boot
order='n'/>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1278421
Commit 256496e1 introduced a detection if "is locked" in error replay
from qemu monitor. Commit c4073657 fixed a memory leak, but it was
pointed out by Peter, that this could be done cleaner without
stringifing the replay.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The return value of virJSONValueToString() should be freed when
no longer needed. This is not the case after 256496e1.
==26902== 138 bytes in 2 blocks are definitely lost in loss record 1,051 of 1,239
==26902== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==26902== by 0xAA5F599: strdup (in /lib64/libc-2.21.so)
==26902== by 0x552BAD9: virStrdup (virstring.c:726)
==26902== by 0x54F60A7: virJSONValueToString (virjson.c:1790)
==26902== by 0x1DF6EBB9: qemuMonitorJSONEjectMedia (qemu_monitor_json.c:2225)
==26902== by 0x1DF57A4C: qemuMonitorEjectMedia (qemu_monitor.c:1985)
==26902== by 0x1DF1EF2D: qemuDomainChangeEjectableMedia (qemu_hotplug.c:199)
==26902== by 0x1DF90314: qemuDomainChangeDiskLive (qemu_driver.c:7985)
==26902== by 0x1DF90476: qemuDomainUpdateDeviceLive (qemu_driver.c:8030)
==26902== by 0x1DF91ED7: qemuDomainUpdateDeviceFlags (qemu_driver.c:8677)
==26902== by 0x561785F: virDomainUpdateDeviceFlags (libvirt-domain.c:8559)
==26902== by 0x134210: remoteDispatchDomainUpdateDeviceFlags (remote_dispatch.h:10966)
==26902== 106 bytes in 1 blocks are definitely lost in loss record 1,033 of 1,239
==26902== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==26902== by 0xAA5F599: strdup (in /lib64/libc-2.21.so)
==26902== by 0x552BAD9: virStrdup (virstring.c:726)
==26902== by 0x54F60A7: virJSONValueToString (virjson.c:1790)
==26902== by 0x1DF6EC0C: qemuMonitorJSONEjectMedia (qemu_monitor_json.c:2227)
==26902== by 0x1DF57A4C: qemuMonitorEjectMedia (qemu_monitor.c:1985)
==26902== by 0x1DF1EF2D: qemuDomainChangeEjectableMedia (qemu_hotplug.c:199)
==26902== by 0x1DF90314: qemuDomainChangeDiskLive (qemu_driver.c:7985)
==26902== by 0x1DF90476: qemuDomainUpdateDeviceLive (qemu_driver.c:8030)
==26902== by 0x1DF91ED7: qemuDomainUpdateDeviceFlags (qemu_driver.c:8677)
==26902== by 0x561785F: virDomainUpdateDeviceFlags (libvirt-domain.c:8559)
==26902== by 0x134210: remoteDispatchDomainUpdateDeviceFlags (remote_dispatch.h:10966)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Moving tasks to cgroups implied sched_setaffinity. Changing the cpus in
a set implies the same for all tasks in the group.
The old code put the the thread into the cpuset inherited from the
machine cgroup, which allowed it to run outside of vcpupin for a short
while.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
The machine cgroup is a superset, a parent to the emulator and vcpuX
cgroups. The parent cgroup should never have any tasks directly in it.
In fact the parent cpuset might contain way more cpus than the sum of
emulatorpin and vcpupins. So putting tasks in the superset will allow
them to run outside of <cputune>.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
virCgroupNewMachine used to add the pidleader to the newly created
machine cgroup. Do not do this implicit anymore.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
When user configures vhost-user interface and forgets to also configure
any shared memory, the search for the root cause of non-operational
interface might take unpleasantly long time. Let's enhance user
experience by emitting a warning in the logs.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1266982
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1240439
Ta-da! Now that we know how to open a macvtap device multiple
times, we can finally enable the multiqueue feature. Everything
else is already prepared (e.g. command line generation) from the
previous iteration where the feature was implemented for
TUN/TAP devices.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For the multiqueue on macvtaps we are going to need to open
the device multiple times. Currently, this is not supported.
Rework the function, so that upper layers can be reworked too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So yet again one of integer arguments that we use as a boolean.
Since the argument count of the function is unbearably long
enough, lets turn those booleans into flags.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add qemuDomainHasVCpuPids to do the checking and replace in place checks
with it.
We no longer need checking whether the thread contains fake data
(vcpupids[0] == vm->pid) as in b07f3d821d
and 65686e5a81 this was removed.
The vCPU threads make sense in the counterparts that set the vCPU
bandwidth/quota, not in the emulator one. The emulator tunables are set
all the time anyways.
Drop the extra check and remove the now unneeded vm argument.
Since commit 0c04906fa the check for priv->cgroup doesn't make sense as
the calls to virCgroupHasController return the same information. Remove
it and move it's comment partially to the new check.
The already spurious check was also later copied to the iothreads code.
Refactor the code flow so that 'exit_monitor:' can be removed.
This patch moves the auditing functions into places where it's certain
that hotunplug was or was not successful and reports errors from
qemuMonitorGetCPUInfo properly.
Refactor the code flow so that 'exit_monitor:' can be removed.
This patch also moves the auditing and setting of the new vCPU count
right to the place where the hotplug happens, since it's possible that
the hotplug succeeds and adds a cpu while other stuff fails.
Lastly, failures of qemuMonitorGetCPUInfo are now reported rather than
ignored. The function retuns 0 if it "successfully" detected 0 threads.
qemuDomainHotplugVcpus/qemuDomainHotunplugVcpus are complex enough in
regards of adding one CPU. Additionally it will be desired to reuse
those functions later with specific vCPU hotplug.
Move the loops for adding vCPUs into qemuDomainSetVcpusFlags so that the
helpers can be made simpler and more straightforward.
The cpu hotplug helper functions used negative error handling in a part
of them, although some code that was added later didn't properly set the
error codes in some cases. This would cause improper error messages in
cases where we couldn't modify the numa cpu mask and a few other cases.
Fix the logic by converting it to the regularly used pattern.
With a very unfortunate timing, the agent might vanish before we do the
second call while the locks were down. Re-check that the agent is
available before attempting it again.