Previously, qemuDomainAttachDeviceFlags was doing two things:
handling the job and attaching devices. Now the second part is
in a new function.
This change is required to make it possible to test more complex
device attachment situations, like attaching a device to both
config and live at once.
We want to be able to pass a NULL instead of the connection
and use this function in tests. To achieve this, the virConnectPtr
is passed instead of virDomainPtr, and the driver is a new separate
parameter.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1354238
So we spend some time and effort constructing perfect file name
for an automatic coredump of a domain, but then just leak it and
use the domain name anyway. This is probably due to a silly
mistake that slipped even through review.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Allow to store driver specific data on a per-vcpu basis.
Move of the virDomainDef*Vcpus* functions was necessary as
virDomainXMLOptionPtr was declared below this block and I didn't want to
split the function headers.
Most callers make sure that it's never called with an out of range vCPU.
Every other caller reports a different error explicitly. Drop the error
reporting and clean up some dead code paths.
Some code paths already assume that it is allocated since it was always
allocated by virDomainPerfDefParseXML. Make it member of virDomainDef
directly so that we don't have to allocate it all the time.
This fixes crash when attempting to connect to an existing process via
virDomainQemuAttach since we would not allocate it in that code path.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1350688
in qemuConnectDomainXMLToNative. This function was only accounting for
about 1/10 of all the allocated items in the NetDef prior to memseting
it to all 0's. On top of that, it was going to great pains to learn
the name of the bridge device, but then never doing anything useful
with it (just putting it into data.ethernet.dev, which is *never* used
when building a qemu commandline). (I think this again all started off
as code with good intentions, but it was never completed, and instead
was just Frankensteinically cargo-culted into the odd mish mash we
have today).
The resulting code is much simpler, produces exactly the same output,
and doesn't leak memory.
This patch removes the expanded and duplicated code that all sprung
out of two well-intentioned-but-useless settings of
net->data.(bridge|ethernet).ipaddr.
qemu has never supported even a single IP address in the interface
config, much less a list of them. All of the instances of "clearing
out the IP addresses" that are now in this function originated with
commit d8dbd6 "Basic domain XML conversions for Xen/QEMU drivers" in
May 2009, but even then the single "ipaddr" in the struct for
type='ethernet' and type='bridge' wasn't used in the qemu driver (only
in xen and openvz). Since then anyone who added a new interface type
also tacked on another unnecessary clearing of ipaddr, and when it was
made into a list of IPs (so far supported only by the LXC driver) this
simple setting was turned into a loop (well, multiple loops) to clear
them all.
I'm tired of mistyping this all the time, so let's do it the same all
the time (similar to how we changed all "Pci" to "PCI" awhile back).
(NB: I've left alone some things in the esx and vbox drivers because
I'm unable to compile them and they weren't obviously *not* a part of
some API. I also didn't change a couple of variables named,
e.g. "somethingIptables", because they were derived from the name of
the "iptables" command)
This kvmGetMaxVCPUs() needs to be used at two different places
so move it to utils with appropriate name and mark it as private
global now.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
On PPC the legacy passthrough is not supported and only
VFIO is supported. So, the checks at places to confirm if the
host is passthrough capable checks only legacy, fix it. This
is seen at only one place now.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
One can not issue monitor commands manually during async calls thru
designated API while this could be useful for testing/debugging purposes.
qemuDomainQemuMonitorCommand uses job of type QEMU_JOB_MODIFY and any async
call disable parallel execution of this type of job. The only state that is
changed is taint variable. AFAIU the only place we can mess is resetting
taint flag in qemuProcessStop routine under some async job. But this can not
happen thanx to both virDomainObjIsActive check in qemuDomainQemuMonitorCommand
and resetting active status in qemuProcessStop before taint flag.
Change job type to QEMU_JOB_QUERY and thus make the API call available for
most of async jobs.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Allow gathering available vcpu ids, their state and offlinability via
the qemu guest agent. The maximum id was chosen arbitrarily and ought
to be enough for everybody.
Documentation for the "guest-set-vcpus" command describes a proper
algorithm how to set vcpus. This patch makes the following changes:
- state of cpus that has not changed is not updated
- if the command was partially successful the command is re-tried with
the rest of the arguments to get a proper error message
- code is more robust against malicious guest agent
- fix testsuite to the new semantics
If the domain is not running, but for example the CPUs are stopped, the
ACPI event gets queued and resume of the domain will just shut it off.
https://bugzilla.redhat.com/show_bug.cgi?id=1216281
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since obtaining a job can wait for another job to finish, the state
might change in the meantime. And checking it more than once is
pointless.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The virQEMUDriverConfig object contains lists of
loader:nvram pairs to advertise firmwares supported by
by the driver, and qemu_conf.c contains code to populate
the lists, all of which is useful for other drivers too.
To avoid code duplication, introduce a virFirmware object
to encapsulate firmware details and switch the qemu driver
to use it.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Move all APIs with a virHostMEM name prefix out into new
util/virhostmem.h & util/virhostmem.c files
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move all APIs with a virHostCPU name prefix out into new
util/virhostcpu.h & util/virhostcpu.c files
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In preparation for moving all the CPU related APIs out of
the nodeinfo file, give them a virHostCPU name prefix.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In preparation for moving all the memory related APIs out of
the nodeinfo file, give them a virHostMem name prefix.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Nearly all the methods in the nodeinfo file are given a
'const char *sysfs_prefix' parameter to override the
default sysfs path (/sys/devices/system). Every single
caller passes in NULL for this, except one use in the
unit tests. Furthermore this parameter is totally
Linux-specific, when the APIs are intended to be cross
platform portable.
This removes the sysfs_prefix parameter and instead gives
a new method linuxNodeInfoSetSysFSSystemPath for use by
the test suite.
For two of the methods this hardcodes use of the constant
SYSFS_SYSTEM_PATH, since the test suite does not need to
override the path for thos methods.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If you want to set block device I/O tuning values that end with '_max'
and there is nothing else set, libvirt emits an error. In particular:
error: internal error: Unexpected error
That's an unknown error. That is because *_max values depend on their
respective non-_max values. QEMU even says that in the error message
sent as a response to the monitor command:
"error": {"class": "GenericError", "desc": "bps_max/iops_max require
corresponding bps/iops values"}
the problem was that we didn't know that and there was no check for it.
Adding such check makes sure that there will be less confused users.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The virConnectOpenInternal method opens the libvirt client
config file and uses it to resolve things like URI aliases.
There may be driver specific things that are useful to
store in the config file too, so rather than have them
re-parse the same file, pass the virConfPtr down to the
drivers.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
qemuProcessStart does not unset the infrastructure that retrieves errors
from the qemu log file in case of migration. As this wasn't handled
properly in qemuDomainSaveImageStartVM we kept the logging context/fd
open for the lifetime of the VM rather than closing it after it's not
needed.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1325080
Similarly to the domain definition validator add a device validator. The
change to the prototype of the domain validator is necessary as
virDomainDeviceInfoIterateInternal requires a non-const pointer.
Until now we weren't able to add checks that would reject configuration
once accepted by the parser. This patch adds a new callback and
infrastructure to add such checks. In this patch all the places where
rejecting a now-invalid configuration wouldn't be a good idea are marked
with a new parser flag.
Hand-entering indexes for 20 PCI controllers is not as tedious as
manually determining and entering their PCI addresses, but it's still
annoying, and the algorithm for determining the proper index is
incredibly simple (in all cases except one) - just pick the lowest
unused index.
The one exception is USB2 controllers because multiple controllers in
the same group have the same index. For these we look to see if 1) the
most recently added USB controller is also a USB2 controller, and 2)
the group *that* controller belongs to doesn't yet have a controller
of the exact model we're just now adding - if both are true, the new
controller gets the same index, but in all other cases we just assign
the lowest unused index.
With this patch in place and combined with the automatic PCI address
assignment, we can define a PCIe switch with several ports like this:
<controller type='pci' model='pcie-root-port'/>
<controller type='pci' model='pcie-switch-upstream-port'/>
<controller type='pci' model='pcie-switch-downstream-port'/>
<controller type='pci' model='pcie-switch-downstream-port'/>
<controller type='pci' model='pcie-switch-downstream-port'/>
<controller type='pci' model='pcie-switch-downstream-port'/>
<controller type='pci' model='pcie-switch-downstream-port'/>
...
These will each get a unique index, and PCI addresses that connect
them together appropriately with no pesky numbers required.
The libvirt internal bits can be changed for disks that don't otherwise
support changing media. Remove the switch statement and allow changes of
non-source data for all disks.
qemuDomainChangeDiskLive rolled back few changes to the disk definition
if changing of the media failed. This can be avoided by moving some code
around.
If the stats for a block device can't be acquired from qemu we've
fallen back to loading them from the file on the disk in libvirt.
If qemu is not cooperating due to being stuck on an inaccessible NFS
share we would then attempt to read the files and get stuck too with
the VM object locked. All other APIs would eventually get stuck waiting
on the VM lock.
Avoid this problem by skipping the block stats if the VM is online but
the monitor did not provide any stats.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1337073
Some Intel processor families (e.g. the Intel Xeon processor E5 v3
family) introduced some RDT (Resource Director Technology) features
to monitor or control shared resource. Among these features, MBM
(Memory Bandwidth Monitoring), which is build on the CMT (Cache
Monitoring Technology) infrastructure, provides OS/VMM a way to
monitor bandwidth from one level of cache to another.
With current perf framework, this patch adds support to perf event
for MBM.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
hotplug APIs with the AFFECT_CONFIG flag are essentially replicating
'insert <device> into XML document, and redefine XML'. Thinking of
it this way, it's natural that we call virDomainDefPostParse after
manually editing the XML here.
Not only does doing so allow us to drop a bunch of open coded calls
to qemuDomainAssignAddresses, but it also means we are going through
the standard channels for XML validation and potentially catching
errors in user submitted XML.
This wires up qemuDomainAssignAddresses into the new
virDomainDefAssignAddressesCallback, so it's always triggered
via virDomainDefPostParse. We are essentially doing this already
with open coded calls sprinkled about.
qemu argv parse output changes slightly since previously it wasn't
hitting qemuDomainAssignAddresses.
Extract the relevant parts of the existing checker and reuse them for
blockcopy since copying to a non-block device creates an invalid
configuration.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1209802
Similar to the qemuDomainSecretDiskPrepare, generate the secret
for the Hostdev's prior to call qemuProcessLaunch which calls
qemuBuildCommandLine. Additionally, since the secret is not longer
added as part of building the command, the hotplug code will need
to make the call to add the secret in the hostdevPriv.
Since this then is the last requirement to pass a virConnectPtr
to qemuBuildCommandLine, we now can remove that as part of these
changes. That removal has cascading effects through various callers.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than needing to pass the conn parameter to various command
line building API's, add qemuDomainSecretPrepare just prior to the
qemuProcessLaunch which calls qemuBuilCommandLine. The function
must be called after qemuProcessPrepareHost since it's expected
to eventually need the domain masterKey generated during the prepare
host call. Additionally, future patches may require device aliases
(assigned during the prepare domain call) in order to associate
the secret objects.
The qemuDomainSecretDestroy is called after the qemuProcessLaunch
finishes in order to clear and free memory used by the secrets
that were recently prepared, so they are not kept around in memory
too long.
Placing the setup here is beneficial for future patches which will
need the domain masterKey in order to generate an encrypted secret
along with an initialization vector to be saved and passed (since
the masterKey shouldn't be passed around).
Finally, since the secret is not added during command line build,
the hotplug code will need to get the secret into the private disk data.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When the domain definition describes a machine with NUMA, setting the
maximum vCPU count via the API might lead to an invalid config.
Add a check that will forbid this until we add more advanced cpu config
capabilities.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1327499
If the domain name is long enough, the timestamp can prolong the
filename for automatic coredump to more than the filesystem's limit.
Simply shorten it like we do in other places. The timestamp helps with
the unification, but having the ID in the name won't hurt.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1289363
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This function - in contrast with qemuBuildCommandLine - merely
constructs our internal command representation of a domain. This
is then later compared against expected output. Or, this function
is used also in virConnectDomainXMLToNative(). But due to a copy
paste error this function, just like its image - has @forceFips
argument that if enabled forces FIPS, otherwise mimics FIPS state
in the host. If FIPS is enabled or forced the generated command
line is different to state in which FIPS is disabled. Problem is,
while this could be desired in the virConnectDomainXMLToNative()
case, this is undesirable in the test suite as it will produce
unpredicted results.
Solution to this is to rename argument to @enableFips to
specifically tell whether we expect command line to be build in
either of fashions and make virConnectDomainXMLToNative()
implementation fetch FIPS state and pass it to
qemuProcessCreatePretendCmd().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The values are currently limited to LLONG_MAX which causes some
problems. QEMU conveniently changed their maximum to 1e15 (1 PB) which
is enough for some time and we need to adapt to that so that we don't
throw "Unknown error" messages. Strictly limiting these values actually
fixes some corner case values (off-by-one checks in QEMU probably).
Since values out of the new specified range do not overflow anything,
change the type of error as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1317531
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This an ubuntu/debian packaging convention. At one point it may have
been an actually different binary, but at least as of ubuntu precise
(the oldest supported ubuntu distro, released april 2012) kvm-img is
just a symlink to qemu-img for back compat.
I think it's safe to drop support for it
The common idiom in the driver API implementations is roughly:
- ACL check
- BeginJob (if needed)
- AgentAvailable (if needed)
- !IsActive
A few calls had an extra !IsActive before BeginJob, which doesn't
seem to serve much use. Drop them
Migration API allows to specify a destination domain configuration.
Offline domain has only inactive XML and it is replaced by configuration
specified using VIR_MIGRATE_PARAM_DEST_XML param. In case of live
migration VIR_MIGRATE_PARAM_DEST_XML param is applied for active XML.
This commit introduces the new VIR_MIGRATE_PARAM_PERSIST_XML param
that can be used within live migration to replace persistent/inactive
configuration.
Required for: https://bugzilla.redhat.com/show_bug.cgi?id=835300
@flags have a valid modification impact only after calling
virDomainObjUpdateModificationImpact. virDomainObjGetOneDef calls it but
doesn't update them in the caller.
Even though we have the machine locked throughout whole APIs we
are querying/modifying domain internal state. We should grab a
job whilst doing that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have @flags we can support changing perf events just
in active or inactive configuration regardless of the other.
Previously, calling virDomainSetPerfEvents set events in both
active and inactive configuration at once. Even though we allow
users to set perf events that are to be enabled once domain is
started up. The virDomainGetPerfEvents API was flawed too. It
returned just runtime info.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I've noticed that these APIs are missing @flags argument. Even
though we don't have a use for them, it's our policy that every
new API must have @flags.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since commit v1.3.2-119-g1e34a8f which enabled debug-threads in QEMU
qemuGetProcessInfo would fail to parse stats for any thread with a space
in its name.
https://bugzilla.redhat.com/show_bug.cgi?id=1316803
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch adds new xml element, and so we can have the option of
also having perf events enabled immediately at startup.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Message-id: 1459171833-26416-6-git-send-email-qiaowei.ren@intel.com
Move all code that checks host and domain. Do not check host if we use
VIR_QEMU_PROCESS_START_PRETEND flag.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
In post-copy mode none of the hosts has a complete guest state and
rolling back migration is impossible. Thus aborting it would be
equivalent to destroying the domain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
It's just a combination of AddImplicitControllers, and AddConsoleCompat.
Every caller that wants ImplicitControllers also wants the ConsoleCompat
AFAICT, so lump them together. We also need it for future patches.
We include the file in plenty of places. This is mostly due to
historical reasons. The only place that needs something from the
header file is storage_backend_fs which opens _PATH_MOUNTED. But
it gets the file included indirectly via mntent.h. At no other
place in our code we need _PATH_.*. Drop the include and
configure check then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Mostly it is just passing new parameter here and there. In case
of zero value we fallback to auto selecting port and thus keep
backward compatibility.
Also we need to fix places of auto selected port managment.
We should bother only when auto selected was done that is
when externally specified port is not 0.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
In qemuConnectDomainXMLToNative() we set up the monitor, but we never
memset() it to zeros. Thanks to the introduction of the logfile
parameter of chardevs (and the logfile member of the struct), we started
checking whether that's non-NULL and that exposed this old error.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reverting to a snapshot may change domain configuration. New
configuration should be saved if domain has persistent flag.
VIR_DOMAIN_EVENT_DEFINED_FROM_SNAPSHOT is emitted in case of
configuration update.
If use of virtlogd is enabled, then use it for backing the
character device log files too. This avoids the possibility
of a guest denial of service by writing too much data to
the log file.
Now that the function was extracted we can get rid of some temp
variables. Additionally formatting of the bitmap string for the event
code should be checked.
Allow pinning for inactive vcpus. The pinning mask will be automatically
applied as we would apply the default mask in case of a cpu hotplug.
Setting the scheduler settings for a vcpu has the same semantics.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1306556
We would happily report and free statistics of a completed migration
even before it actually completed (on the source host while migration is
in the Finish phase).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The code does not handle renaming of the save state file. In addition to
that the resuming code would need to be tweaked to handle the name
change since the XML is extracted from the save image. The easies option
is to make the rename API even less useful by forbiding this.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1314594
While trying to build with -Os couple of compile errors showed
up.
conf/domain_conf.c: In function 'virDomainChrRemove':
conf/domain_conf.c:13666:24: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]
virDomainChrDefPtr ret, **arrPtr = NULL;
^
Compiler fails to see that @ret is used only if set in the loop,
but whatever, there's no harm in initializing the variable.
In vboxAttachDrivesNew and _vboxAttachDrivesOld compiler thinks
that @rc may be used uninitialized. Well, not directly, but maybe
after some optimization. Yet again, no harm in initializing a
variable.
In file included from ./util/virthread.h:26:0,
from ./datatypes.h:28,
from vbox/vbox_tmpl.c:43,
from vbox/vbox_V3_1.c:37:
vbox/vbox_tmpl.c: In function '_vboxAttachDrivesOld':
./util/virerror.h:181:5: error: 'rc' may be used uninitialized in this function [-Werror=maybe-uninitialized]
virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \
^
In file included from vbox/vbox_V3_1.c:37:0:
vbox/vbox_tmpl.c:1041:14: note: 'rc' was declared here
nsresult rc;
^
Yet again, one uninitialized variable:
qemu/qemu_driver.c: In function 'qemuDomainBlockCommit':
qemu/qemu_driver.c:17194:9: error: 'baseSource' may be used uninitialized in this function [-Werror=maybe-uninitialized]
qemuDomainPrepareDiskChainElement(driver, vm, baseSource,
^
And another one:
storage/storage_backend_logical.c: In function 'virStorageBackendLogicalMatchPoolSource.isra.2':
storage/storage_backend_logical.c:618:33: error: 'thisSource' may be used uninitialized in this function [-Werror=maybe-uninitialized]
thisSource->devices[j].path))
^
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Migration statistics are not available on the destination host and
starting a query job during incoming migration is not allowed. Trying to
do that would result in
Timed out during operation: cannot acquire state change lock (held
by remoteDispatchDomainMigratePrepare3Params)
error. We should not even try to start the job.
https://bugzilla.redhat.com/show_bug.cgi?id=1278727
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We honour the placement bitmaps when starting up, so there's no point in
having this check. Additionally the check was buggy since it checked
vm->def all the time even if the user requested to modify the persistent
definition which had different configuration.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1308317
Per-domain directories were introduced in order to be able to
completely separate security labels for each domain (commit
f1f68ca334). However when the domain
name is long (let's say a ridiculous 110 characters), we cannot
connect to the monitor socket because on length of UNIX socket address
is limited. In order to get around this, let's shorten it in similar
fashion and in order to avoid conflicts, throw in an ID there as well.
Also save that into the status XML and load the old status XMLs
properly (to clean up after older domains). That way we can change it
in the future.
The shortening can be seen in qemuxml2argv tests, for example in the
hugepages-pages2 case.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Similarly to VM startup always set the legacy affinity. Additionally we
don't need to report an explicit error since virProcessSetAffinity
reports them themselves.
Similarly to VM startup always set the legacy affinity. Additionally we
don't need to report an explicit error since virProcessSetAffinity
reports them themselves.
Calling qemuProcessStop without a job opens a way to race conditions
with qemuDomainObjExitMonitor called in another thread. A real world
example of such a race condition:
- migration thread (A) calls qemuMigrationWaitForSpice
- another thread (B) starts processing qemuDomainAbortJob API
- thread B signals thread A via qemuDomainObjAbortAsyncJob
- thread B enters monitor (qemuDomainObjEnterMonitor)
- thread B calls qemuMonitorSend
- thread A awakens and calls qemuProcessStop
- thread A calls qemuMonitorClose and sets priv->mon to NULL
- thread B calls qemuDomainObjExitMonitor with priv->mon == NULL
=> monitor stays ref'ed and locked
Depending on how lucky we are, the race may result in a memory leak or
it can even deadlock libvirtd's event loop if it tries to lock the
monitor to process an event received before qemuMonitorClose was called.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Stopping a domain without a job risks a race condition with another
thread which started a job a which does not expect anyone else to be
messing around with the same domain object.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Only a small portion of processGuestPanicEvent was enclosed within a
job, let's make sure we use the job for all operations to avoid race
conditions.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When destroying a domain we need to make sure we will be able to start a
job no matter what other operations are running or even stuck in a job.
This is done by killing the domain before starting the destroy job.
Let's introduce qemuProcessBeginStopJob which combines killing a domain
and starting a job in a single API which can be called everywhere we
need a job to stop a domain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuDomainHelperGetVcpus would correctly return an array of
virVcpuInfoPtr structs for online vcpus even for sparse topologies, but
the loop that fills the returned typed parameters would number the vcpus
incorrectly. Fortunately sparse topologies aren't supported yet.
Now that the file migration doesn't require us to use 'dd' and other
legacy stuff for too old qemus we don't even have to calcuate the
offsets and other stuff.
Our existing virHashForEach method iterates through all items disregarding the
fact, that some of the iterators might have actually failed. Errors are usually
dispatched through an error element in opaque data which then causes the
original caller of virHashForEach to return -1. In that case, virHashForEach
could return as soon as one of the iterators fail. This patch changes the
iterator return type and adjusts all of its instances accordingly, so the
actual refactor of virHashForEach method can be dealt with later.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
After removing capability check for fd migration the code that was left
behind didn't make quite sense. The old exec migration would be used in
case when pipe() failed. Remove the old code and make failure of pipe()
a hard error.
This additionally removes usage of virCgroupAllowDevicePath outside of
qemu_cgroup.c.
Extract out the qemuParseCommandLine{String|Pid} into their own
separate module - taking with it all the various static functions.
Causes a ripple effect with a few other modules to include the
new qemu_parse_command.h.
Narrowed down the list of #include's in the split out module to
those that are necessary for build.
Since majority of the steps is shared, the function can be reused to
simplify code.
Similarly to previous path doing this same for vCPUs this also fixes the
a similar bug (which is not tracked).
Since majority of the steps is shared, the function can be reused to
simplify code.
Additionally this resolves
https://bugzilla.redhat.com/show_bug.cgi?id=1244128 since the cpu
bandwidth limiting with cgroups would not be set on the hotplug path.
Additionally the code now sets the thread affinity and honors autoCpuset
as in the regular startup code path.
Due to bad design the vcpu sched element is orthogonal to the way how
the data belongs to the corresponding objects. Now that vcpus are a
struct that allow to store other info too, let's convert the data to the
sane structure.
The helpers for the conversion are made universal so that they can be
reused for iothreads too.
This patch also resolves https://bugzilla.redhat.com/show_bug.cgi?id=1235180
since with the correct storage approach you can't have dangling data.
Now that qemuDomainDetectVcpuPids is able to refresh the vCPU pid
information it can be reused in the hotplug and hotunplug code paths
rather than open-coding a very similar algorithm.
A slight algorithm change is necessary for unplug since the vCPU needs
to be marked offline prior to calling the thread detector function and
eventually rolled back if something fails.
Move the logic from virDomainDiskDefDstDuplicates into
virDomainDiskDefCheckDuplicateInfo so that we don't have to loop
multiple times through the array of disks. Since the original function
was called in qemuBuildDriveDevStr, it was actually called for every
single disk which was quite wasteful.
Additionally the target uniqueness check needed to be duplicated in
the disk hotplug case, since the disk was inserted into the domain
definition after the device string was formatted and thus
virDomainDiskDefDstDuplicates didn't do anything in that case.
In b3d2a42e I've refactored the code and moved the 'cleanup' label.
Unfortunately the code that was originally in the 'endjob' label and
wanted to jump to cleanup is now in the cleanup label. Remove the jump
and let the function finish.
If we are attempting to thaw the filesystems on error, the code would
overwrite the error code that caused the snapshot to fail with the error
of thawing the filesystem. Since the thawing function allows control of
error reporting behavior we can use this feature.
The virDomainSnapshotDefFormat calls into virDomainDefFormat,
so should be providing a non-NULL virCapsPtr instance. On the
qemu driver we change qemuDomainSnapshotWriteMetadata to also
include caps since it calls virDomainSnapshotDefFormat.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
The virDomainObjFormat and virDomainSaveStatus methods
both call into virDomainDefFormat, so should be providing
a non-NULL virCapsPtr instance.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
virDomainSaveConfig calls virDomainDefFormat which was setting the caps
to NULL, thus keeping the old behaviour (i.e. not looking at
netprefix). This patch adds the virCapsPtr to the function and allows
the configuration to be saved and skipping interface names that were
registered with virCapabilitiesSetNetPrefix().
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
A pretty nasty deadlock occurs while trying to rename a VM in parallel
with virDomainObjListNumOfDomains.
The short description of the problem is as follows:
Thread #1:
qemuDomainRename:
------> aquires domain lock by qemuDomObjFromDomain
---------> waits for domain list lock in any of the listed functions:
- virDomainObjListFindByName
- virDomainObjListRenameAddNew
- virDomainObjListRenameRemove
Thread #2:
virDomainObjListNumOfDomains:
------> aquires domain list lock
---------> waits for domain lock in virDomainObjListCount
Introduce generic virDomainObjListRename function for renaming domains.
It aquires list lock in right order to avoid deadlock. Callback is used
to make driver specific domain updates.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In case of guest panicked, preserved crashed domain has stopped CPUs.
It's not possible to use tools like WinDbg for the problem investigation
until we start CPUs back.
Error paths after sending the event that domain is started written as if ret = -1
which is set at the beginning of the function. It's common idioma to keep 'ret'
equal to -1 until the end of function where it is set to 0. But here we use ret
to keep result of restore operation too and thus breaks the idioma and its users :)
Let's use different variable to hold restore result.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Having on_crash set to either coredump-destroy or coredump-restart
creates core dumps with option memory-only in the directory specified
by auto_dump_path. When a watchdog is triggered with the action dump
the core dump is also placed into the directory specified by auto_dump_path
but is created without the option memory-only.
This patch sets the option memory-only also for core dumps created by the
watchdog event.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
The VIR_DOMAIN_STATS_VCPU flag to virDomainListGetStats
enables reporting of stats about vCPUs. Currently we
only report the cumulative CPU running time and the
execution state.
This adds reporting of the wait time - time the vCPU
wants to run, but the host scheduler has something else
running ahead of it.
The data is reported per-vCPU eg
$ virsh domstats --vcpu demo
Domain: 'demo'
vcpu.current=4
vcpu.maximum=4
vcpu.0.state=1
vcpu.0.time=1420000000
vcpu.0.wait=18403928
vcpu.1.state=1
vcpu.1.time=130000000
vcpu.1.wait=10612111
vcpu.2.state=1
vcpu.2.time=110000000
vcpu.2.wait=12759501
vcpu.3.state=1
vcpu.3.time=90000000
vcpu.3.wait=21825087
In implementing this I notice our reporting of CPU execute
time has very poor granularity, since we are getting it
from /proc/$PID/stat. As a future enhancement we should
prefer to get CPU execute time from /proc/$PID/schedstat
or /proc/$PID/sched (if either exist on the running kernel)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When acpi is used to reboot/shutdown qemu domain, qemu emits
SHUTDOWN event. Libvirt uses fakeReboot variable in order to
differentiate reboot or shutdown. fakeReboot value is reseted
to false after domain restart/reset.
When mode=agent is used to reboot qemu domain, qemu doesn't emit
SHUTDOWN event and libvirt doesn't reset fakeReboot value to false.
In this case next 'shutdown -h now' performs reboot. That's why
we don't need to set fakeReboot=true for mode=agent.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In qemu driver we listen to virtio channel events like an agent
connected to or disconnected from the guest part of socket.
However, with a little exception - when we find out that the
socket in question is the guest agent one, we connect or
disconnect guest agent which is done prior setting new state in
internal structure. Due to a bug in our code it may happen that
we got the event but failed to set it in internal structure
representing the channel.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The structure actually contains migration statistics rather than just
the status as the name suggests. Renaming it as
qemuMonitorMigrationStats removes the confusion.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
While reviewing 1b43885d17 I've noticed a virReportError()
followed by a goto endjob; without setting the correct return
value. Problem is, if block job is so fast that it's bandwidth
does not fit into ulong, an error is reported. However, by that
time @ret is already set to 1 which means success. Since the
scenario can be hardly considered successful, we should return a
value meaning error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Valgrind complained:
==23975== Conditional jump or move depends on uninitialised value(s)
==23975== at 0x22255FA6: qemuDomainGetBlockJobInfo (qemu_driver.c:16538)
==23975== by 0x538E97C: virDomainGetBlockJobInfo (libvirt-domain.c:9685)
==23975== by 0x12F740: remoteDispatchDomainGetBlockJobInfoHelper (remote.c:2834)
==23975== by 0x53FF287: virNetServerProgramDispatch (virnetserverprogram.c:437)
==23975== by 0x540523D: virNetServerProcessMsg (virnetserver.c:135)
==23975== by 0x54052C7: virNetServerHandleJob (virnetserver.c:156)
==23975== by 0x52F515B: virThreadPoolWorker (virthreadpool.c:145)
==23975== by 0x52F4668: virThreadHelper (virthread.c:206)
==23975== by 0x6E08A50: start_thread (in /lib64/libpthread-2.12.so)
==23975== by 0x82BE93C: clone (in /lib64/libc-2.12.so)
==23975==
==23975== Conditional jump or move depends on uninitialised value(s)
==23975== at 0x22255FB4: qemuDomainGetBlockJobInfo (qemu_driver.c:16542)
==23975== by 0x538E97C: virDomainGetBlockJobInfo (libvirt-domain.c:9685)
==23975== by 0x12F740: remoteDispatchDomainGetBlockJobInfoHelper (remote.c:2834)
==23975== by 0x53FF287: virNetServerProgramDispatch (virnetserverprogram.c:437)
==23975== by 0x540523D: virNetServerProcessMsg (virnetserver.c:135)
==23975== by 0x54052C7: virNetServerHandleJob (virnetserver.c:156)
==23975== by 0x52F515B: virThreadPoolWorker (virthreadpool.c:145)
==23975== by 0x52F4668: virThreadHelper (virthread.c:206)
==23975== by 0x6E08A50: start_thread (in /lib64/libpthread-2.12.so)
==23975== by 0x82BE93C: clone (in /lib64/libc-2.12.so)
If no matching block job is found, qemuMonitorGetBlockJobInfo returns 0
and we should not write anything to the caller-supplied
virDomainBlockJobInfo pointer.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
This replaces the virPCIKnownStubs string array that was used
internally for stub driver validation.
Advantages:
* possible values are well-defined
* typos in driver names will be detected at compile time
* avoids having several copies of the same string around
* no error checking required when setting / getting value
The names used mirror those in the
virDomainHostdevSubsysPCIBackendType enumeration.
This function calls qemuDomainAttachControllerDevice for SCSI
controllers and reports an error for all other controllers.
Move the error inside qemuDomainAttachControllerDevice and delete this
wrapper.
A closer review of the code shows that for the transition from paused to
running which was supposed to emit the VIR_DOMAIN_EVENT_RESUMED - no event
would be generated. Rather the event is generated when going from running
to running.
Following the 'was_running' boolean shows it is set when the domain obj
is active and the domain obj state is VIR_DOMAIN_RUNNING. So rather than
using was_running to generate the RESUMED event, use !was_running
Add qemuDomainHasVCpuPids to do the checking and replace in place checks
with it.
We no longer need checking whether the thread contains fake data
(vcpupids[0] == vm->pid) as in b07f3d821d
and 65686e5a81 this was removed.
The vCPU threads make sense in the counterparts that set the vCPU
bandwidth/quota, not in the emulator one. The emulator tunables are set
all the time anyways.
Drop the extra check and remove the now unneeded vm argument.
Refactor the code flow so that 'exit_monitor:' can be removed.
This patch moves the auditing functions into places where it's certain
that hotunplug was or was not successful and reports errors from
qemuMonitorGetCPUInfo properly.
Refactor the code flow so that 'exit_monitor:' can be removed.
This patch also moves the auditing and setting of the new vCPU count
right to the place where the hotplug happens, since it's possible that
the hotplug succeeds and adds a cpu while other stuff fails.
Lastly, failures of qemuMonitorGetCPUInfo are now reported rather than
ignored. The function retuns 0 if it "successfully" detected 0 threads.
qemuDomainHotplugVcpus/qemuDomainHotunplugVcpus are complex enough in
regards of adding one CPU. Additionally it will be desired to reuse
those functions later with specific vCPU hotplug.
Move the loops for adding vCPUs into qemuDomainSetVcpusFlags so that the
helpers can be made simpler and more straightforward.
The cpu hotplug helper functions used negative error handling in a part
of them, although some code that was added later didn't properly set the
error codes in some cases. This would cause improper error messages in
cases where we couldn't modify the numa cpu mask and a few other cases.
Fix the logic by converting it to the regularly used pattern.
With a very unfortunate timing, the agent might vanish before we do the
second call while the locks were down. Re-check that the agent is
available before attempting it again.
The qemuDomainTaint APIs currently expect to be passed a log file
descriptor. Change them to instead use a qemuDomainLogContextPtr
to hide the implementation details.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The rename operation only works on inactive virtual machines,
but it none the less writes to the log file used by the QEMU
processes. This log file is not intended to provide a general
purpose audit trail of operations performed on VMs. The audit
subsystem has recording of important operations. If we want
to extend that to cover all significant public APIs that is
a valid thing to consider, but we shouldn't arbitrarily log
specific APIs into the QEMU log file in the meantime.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We only started an async job for incoming migration from another host.
When we were starting a domain from scratch or restoring from a saved
state (migration from file) we didn't set any async job. Let's introduce
a new QEMU_ASYNC_JOB_START for these cases.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Make callers of qemuBuildCommandLine responsible for providing the URI
which should be passed as a parameter for -incoming.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1270715
Commit id '9deb96f' removed the code to fetch the nodeset from the
CpusetMems cgroup for a running vm in favor of using the return from
virDomainNumatuneFormatNodeset introduced by commit id '43b67f2e7'.
However, that API will return the value of the passed 'auto_nodeset'
when placement is VIR_DOMAIN_NUMATUNE_PLACEMENT_AUTO, which happens
to be NULL.
Since commit id 'c74d58ad' started using priv->autoNodeset in order
to manage the auto placement value during qemuProcessStart, it should
be passed along in order to return the correct value if the domain
requests the auto placement.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Although theoretically both should be the same value, the niothreadids
should be used in favor of iothreads when performing comparisons. This
leaves the iothreads as a purely numeric value to be saved in the config
file. The one exception to the rule is virDomainIOThreadIDDefArrayInit
where the iothreadids are being generated from the iothreads count since
iothreadids were added after initial iothreads support.
So imagine you want to crate new security manager:
if (!(mgr = virSecurityManagerNew("selinux", "QEMU", false, true, false, true)));
Hard to parse, right? What about this:
if (!(mgr = virSecurityManagerNew("selinux", "QEMU",
VIR_SECURITY_MANAGER_DEFAULT_CONFINED |
VIR_SECURITY_MANAGER_PRIVILEGED)));
Now that's better! This is what the commit does.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Use the migration @flags for checking various migration aspects rather
than picking them out as booleans. Document the new semantics in the
function header.
Now that qemuMigrationIsAllowed is always called with @vm, we can drop
the @def argument and simplify the control flow.
Additionally the comment is invalid so drop it.
Commit 307fb904 (Sep 10) added a 'privileged' variable when creating
the DAC driver:
@@ -153,6 +157,7 @@ virSecurityManagerNewDAC(const char *virtDriver,
bool defaultConfined,
bool requireConfined,
bool dynamicOwnership,
+ bool privileged,
virSecurityManagerDACChownCallback chownCallback)
But argument order is mixed up at the caller, swapping dynamicOwnership
and privileged values. This corrects the argument order
https://bugzilla.redhat.com/show_bug.cgi?id=1266628
This seemed to be more of a false positive as for some reason Coverity
was missing the "ret < 0" goto error condition and somehow believing that
event could be overwritten. At first I thought it was just the ret != 0
condition difference, but it wasn't.
In any case, make use of the recent change to qemuDomainEventQueue to
check event == NULL and just pass it as a parameter directly in the
error path. That avoids the error.
Signed-off-by: John Ferlan <jferlan@redhat.com>
So while working on my previous patches, I've noticed that
virDomainRestore implementation in qemu and test drivers has the
same problem as I am fixing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far we have the following pattern occurring over and over
again:
if (!vm->persistent)
qemuDomainRemoveInactive(driver, vm);
It's safe to put the check into the function and save some LoC.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=871452
So, you want to create a domain from XML. The domain already
exists in libvirt's database of domains. It's okay, because name
and UUID matches. However, on domain startup, internal
representation of the domain is overwritten with your XML even
though we claim that the XML you've provided is a transient one.
The bug is to be found across nearly all the drivers.
Le sigh.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=871452
Okay, so we allow users to 'virsh create' an already existing
domain, providing completely different XML than the one stored in
Libvirt. Well, as long as name and UUID matches. However, in some
drivers the code that handles errors unconditionally removes the
domain that failed to start even though the domain might have
been persistent. Fortunately, the domain is removed just from the
internal list of domains and the config file is kept around.
Steps to reproduce:
1) virsh dumpxml $dom > /tmp/dom.xml
2) change XML so that it is still parse-able but won't boot, e.g.
change guest agent path to /foo/bar
3) virsh create /tmp/dom.xml
4) virsh dumpxml $dom
5) Observe "No such domain" error
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add a new parser flag that will mark code paths that parse XML files
wich will not be used with existing VM state so that post parse
callbacks can possibly do ABI incompatible changes if needed.
I always felt like this function is qemu specific rather than
libvirt-wide. Other drivers may act differently on virDomainDef
change and in fact may require talking to underlying hypervisor
even if something else's than disk->src has changed. I know that
the function is still incomplete, but lets break that into two
commits that are easier to review. This one is pure code
movement.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Firstly, our coding guidelines suggest using 'cleanup' label
instead of 'end'. Then, @ret should be set to value representing
success as the last statement before the 'cleanup' label.
And while I am at this function, lets enumerate all the possible
enum items (virDomainDiskDevice) and avoid using 'default' in
switch(). Pooh. Also, nothing bad happens if we look up the disk
to change in the domain upfront. In fact, it's going to be
helpful later when we want to keep some old values for performing
a rollback.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While we currently only allow changing a media in a disk, this is
going to change in a while, so the function name would be
invalid. Moreover, the old name does not match the pattern laid
out by other update functions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1159219
So, in 11e058ca58 I've tried to make UpdateDevice update
startupPolicy too. And it worked well until somebody came around
and pushed d0dc6c0369 which accidentally removed my
contribution. Redo my commit.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Every single call to qemuDomainEventQueue() uses the following pattern:
if (event)
qemuDomainEventQueue(driver, event);
Let's move the check for valid event to qemuDomainEventQueue and
simplify all callers.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We may want to do some decisions in drivers based on fact if we
are running as privileged user or not. Propagate this info there.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This reverts commit 1ce7c1d20c,
which introduced a significant semantic change to the
virDomainGetInfo() API. Additionally, the change was only
made to 2 of the 15 virt drivers.
Conflicts:
src/qemu/qemu_driver.c
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1253107
Make a call virCgroupGetBlkioWeight to re-read blkio.weight right
after it is set in order to keep internal data up-to-date.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
We will try to set the node to cpuset.mems without check if
it is available, since we already have helper to check this.
Call virNumaNodesetIsAvailable to check if node is available,
then try to change it in the cgroup.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
The problem here is that there are some values that kernel accepts, but
does not set them, for example 18446744073709551615 which acts the same
way as zero. Let's do the same thing we do with other tuning options
and re-read them right after they are set in order to keep our internal
structures up-to-date.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1165580
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1251886
Since iothread_id == 0 is an invalid value for QEMU let's point
that out specifically. For the IOThreadDel code, the failure would
have ended up being a failure to find the IOThread ID; however, for
the IOThreadAdd code - an IOThread 0 was added and that isn't good.
It seems during many reviews/edits to the code the check for
iothread_id = 0 being invalid was lost - it could have originally
been in the API code, but requested to be moved - I cannot remember.
Just like in commit 704cf06, if virCgroup*() fails, the error is
already reported. There's no need to overwrite the error with a
generic one and possibly hiding the true root cause of the error.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
It may happen that user (mistakenly) wants to rename a domain to
itself. Which is no renaming at all. We should reject that with
some meaningful error message.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Coverity complained that 'vm' wasn't initialized before jumping to
cleanup: and calling virDomainObjEndAPI if the VIR_STRDUP fails.
So I initialized vm = NULL and also moved the VIR_STRDUP closer to
usage and used endjob for goto. Lots of other reasons for failures.
Currently supports only renaming inactive domains without snapshots.
Signed-off-by: Tomas Meszaros <exo@tty.sk>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Pinning information returned for emulatorpin and vcpupin calls is being
returned from our data without querying cgroups for some time. However,
not all the data were utilized. When automatic placement is used the
information is not returned for the calls mentioned above. Since the
numad hint in private data is properly saved/restored, we can safely use
it to return true information.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1162947
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Well, there are just two places that needs adjustment:
qemuDomainGetInterfaceParameters - to report the @floor
qemuDomainSetInterfaceParameters - now that the function has been
fixed, we can allow updating @floor too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As sketched in previous commits, imagine the following scenario:
virsh # domiftune gentoo vnet0
inbound.average: 100
inbound.peak : 0
inbound.burst : 0
outbound.average: 100
outbound.peak : 0
outbound.burst : 0
virsh # domiftune gentoo vnet0 --inbound 0
virsh # shutdown gentoo
Domain gentoo is being shutdown
virsh # list --all
error: Failed to list domains
error: Cannot recv data: Connection reset by peer
Program received signal SIGSEGV, Segmentation fault.
0x00007fffe80ea221 in networkUnplugBandwidth (net=0x7fff9400c1a0, iface=0x7fff940ea3e0) at network/bridge_driver.c:4881
4881 net->floor_sum -= ifaceBand->in->floor;
This is rather unfortunate. We should not SIGSEGV here. The
problem is, that while in the second step the inbound QoS was
cleared out, the network part of it was not updated (moreover, we
don't report that vnet0 had inbound.floor set). Internal
structure therefore still had some fragments left (e.g.
class_id). So when qemuProcessStop() started to clean up the
environment it got to networkUnplugBandwidth(). Here, class_id is
set therefore function assumes that there is an inbound QoS. This
actually is a fair assumption to make, there's no need for a
special QoS box in network's QoS when there's no QoS to set.
Anyway, the problem is not the networkUnplugBandwidth() rather
than qemuDomainSetInterfaceParameters() which completely forgot
about QoS being disperse (some parts are set directly on
interface itself, some on bridge the interface is plugged into).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Qemu reports physical size 0 for block devices. As 15fa84acbb
changed the behavior of qemuDomainGetBlockInfo to just query the monitor
this created a regression since we didn't report the size correctly any
more.
This patch adds code to refresh the physical size of a block device by
opening it and seeking to the end and uses it both in
qemuDomainGetBlockInfo and also in qemuDomainGetStatsOneBlock that was
broken since it was introduced in this respect.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1250982
If cpuset is disabled or not available, it libvirt must not use it.
Mainly for actions that do not need it and can use sched_setaffinity()
or numa_membind() instead, because they will fail without good reason.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1244664
Signed-off-by: Luyao Huang <lhuang@redhat.com>
When stopping a domain on the destination host after a failed migration,
we need to avoid reseting security labels since the domain is still
running on the source host. While we were correctly doing so in some
cases, there were still some paths which did this wrong.
https://bugzilla.redhat.com/show_bug.cgi?id=1242904
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Our atomic increment (virAtomicIntInc) uses (if available) gcc
__sync_add_and_fetch builtin. In qemu driver though, we'd profit more
from __sync_fetch_and_add builtin. To keep it simplistic, this patch
adjusts qemu driver initialization rather than adding a new atomic
increment macro.
Commit d506a51aeb meant to check for
QEMU_CAPS_DRIVE_IOTUNE_MAX, but checked for QEMU_CAPS_DRIVE_IOTUNE
instead. That's clearly visible from the diff, but it got in. Because
of that, we were supplying information unknown for QEMU if it wasn't new
enough and we couldn't even properly handle the error, leading to
"Unexpected error". Also iops_size came at the same time with all the
other "_max" options, so check whether we're not setting that either if
QEMU_CAPS_DRIVE_IOTUNE_MAX is not supported.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1224053
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1245476
We won't return the errno after commit 0d7f45ae, and
the more clearly error will be set in the code in vircgroup*.
Also We will always report error "Operation not permitted",
because the return is -1.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Few parts of the code looked at the current progress of and assumed that
a two phase blockjob is in the _READY state as soon as the progress
reached 100% (info.cur == info.end). In current versions of qemu this
assumption is invalid and qemu exposes a new flag 'ready' in the
query-block-jobs output that is set to true if the job is actually
finished.
This patch adds internal data handling for reading the 'ready' flag and
acting appropriately as long as the flag is present.
While this still doesn't fix the virsh client problem with two phase
block jobs and the --pivot option, it at least improves the error
message:
$ virsh blockcommit --wait --verbose vm vda --base vda[1] --active --pivot
Block commit: [100 %]error: failed to pivot job for disk vda
error: internal error: unable to execute QEMU command 'block-job-complete': The active block job for device 'drive-virtio-disk0' cannot be completed
to
$ virsh blockcommit --wait --verbose VM vda --base vda[1] --active --pivot
Block commit: [100 %]error: failed to pivot job for disk vda
error: block copy still active: disk 'vda' not ready for pivot yet
If one calls update-device with information that is not updatable,
libvirt reports success even though no data were updated. The example
used in the bug linked below uses updating device with <boot order='2'/>
which, in my opinion, is a valid thing to request from user's
perspective. Mainly since we properly error out if user wants to update
such data on a network device for example.
And since there are many things that might happen (update-device on disk
basically knows just how to change removable media), check for what's
changing and moreover, since the function might be usable in other
drivers (updating only disk path is a valid possibility) let's abstract
it for any two disks.
We can't possibly check for everything since for many fields our code
does not properly differentiate between default and unspecified values.
Even though this could be changed, I don't feel like it's worth the
complexity so it's not the aim of this patch.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1007228
https://bugzilla.redhat.com/show_bug.cgi?id=1232663
In one of my previous ptaches (bcd9a564) I've tried to fix the problem
that we blindly assumed strict NUMA mode for guests. This led to
several problems like us pinning a domain onto a nodeset via libnuma
among with CGroups. Once the nodeset was changed by user, well, it did
not result in desired effect. See the original commit for more info.
But, the commit I wrote had a bug: when NUMA parameters are changed on
a running domain we require domain to be strictly pinned onto a
nodeset. Due to a typo a condition was mis-evaluated.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>