Previously libvirt's disk device XML only had a single attribute,
error_policy, to control both read and write error policy, but qemu
has separate options for controlling read and write. In one case
(enospc) a policy is allowed for write errors but not read errors.
This patch adds a separate attribute that sets only the read error
policy. If just error_policy is set, it will apply to both read and
write error policy (previous behavior), but if the new rerror_policy
attribute is set, it will override error_policy for read errors only.
Possible values for rerror_policy are "stop", "report", and "ignore"
("report" is the qemu-controlled default for rerror_policy when
error_policy isn't specified).
For consistency, the value "report" has been added to the possible
values for error_policy as well.
commit 12062ab set rerror=ignore when error_policy="enospace" was
selected (since the rerror option in qemu doesn't accept "enospc", as
the werror option does).
After that patch was already pushed, Paolo Bonzini noticed it and
commented that leaving rerror at the default ("report") would be a
better choice. This patch corrects the problem - if error_policy =
"enospace" is given, rerror is left off the qemu commandline,
effectively setting it to "report". For other values, rerror is still
set to match werror.
Additionally, the parsing of error_policy was changed to no longer
erroneously allow "default" as a choice - as with most other
attributes, if you want the default setting, just don't specify an
error_policy.
Finally, two ommissions in the first patch were corrected - a
long-dormant qemuxml2argv test for enospace was enabled, and fixed to
pass, and the argv2xml parser in qemu_command.c was updated to
recognize the different spelling on the qemu commandline.
Now that RHEL 6.2 Beta is out, it would be nice to test multifunction
devices on that platform. This changes things so that the multifunction
cap bit can be set in two different ways: by version comparison (needed
for qemu 0.13 which lacked a -device query), and by -device query
(provided by qemu.git and backported to the RHEL beta build of
qemu-kvm which still claims to be a modified 0.12, and therefore needed
for RHEL).
* src/qemu/qemu_capabilities.c (qemuCapsParseDeviceStr): Allow
second method of setting multifunction cap bit.
* tests/qemuhelptest.c (mymain): Test it.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel62-beta: New file.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel62-beta-device: Likewise.
If using one of the new non-NAT/routed virtual network
configurations, the LXC driver would not know how to
setup the VETH devices. Adding in calls to setup the
"actual" network configuration at VM startup and cleanup
when shutting down fixes this.
* src/lxc/lxc_driver.c: Setup/cleanup actual net devs
Implements the documentation for snapshot revert vs. force.
Part of the patch tightens existing behavior (previously, reverting
to an old snapshot without <domain> was blindly attempted, now it
requires force), while part of it relaxes behavior (previously, it
was not possible to revert an active domain to an ABI-incompatible
active snapshot, now force allows this transition).
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Check for
risky situations, and allow force to get past them.
Once we know which set of disks belong to a snapshot, reverting or
deleting that snapshot should visit just those disks, rather than
also visiting disks that were hot-plugged in the meantime or
skipping disks that were hot-unplugged in the meantime.
* src/qemu/qemu_domain.c (qemuDomainSnapshotForEachQcow2): Use
snapshot domain details when available. Avoid NULL deref.
Although reverting to a snapshot is a form of data loss, this is
normally expected. However, there are two cases where additional
surprises (failure to run the reverted state, or a break in
connectivity to the domain) can come into play. Requiring extra
acknowledgment in these cases will make it less likely that
someone can get into an unrecoverable state due to a default revert.
Also create a new error code, so users can distinguish when forcing
would make a difference, rather than having to blindly request force.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_SNAPSHOT_REVERT_FORCE):
New flag.
* src/libvirt.c (virDomainRevertToSnapshot): Document it.
* include/libvirt/virterror.h (VIR_ERR_SNAPSHOT_REVERT_RISKY): New
error value.
* src/util/virterror.c (virErrorMsg): Implement it.
* tools/virsh.c (cmdDomainSnapshotRevert): Add --force to virsh.
* tools/virsh.pod (snapshot-revert): Document it.
Commit 9f5e53e introduced the ability to filter snapshots to
just roots, but it was never implemented for VBox until now.
The VBox implementation prohibits deletion of a snapshot with
multiple children. Hence, there can only be at most one root,
which is found by searching for the snapshot with a NULL uuid.
Prior to 4.0, snapshotGet looked up by UUID, and snapshotFind
looked up by name; after that point, snapshotGet disappeared
and snapshotFind handles uuid or name.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotNum)
(vboxDomainSnapshotListNames): Implement limiting list to root.
Qemu driver tries to update balloon data in virDomainGetInfo and if it
can't do so because there is another monitor job running, it just
reports what's known in domain def. However, if there was no job running
but getting the data from qemu fails, we would fail the whole API. This
doesn't make sense. Let's make the failure nonfatal.
No need to request the parent of a snapshot if we aren't going to use it.
* src/esx/esx_vi.c (esxVI_GetSnapshotTreeByName): Make parent
optional.
* src/esx/esx_driver.c (esxDomainSnapshotCreateXML)
(esxDomainSnapshotLookupByName, esxDomainRevertToSnapshot)
(esxDomainSnapshotDelete): Simplify accordingly.
Commit 9f5e53e introduced the ability to filter snapshots to
just roots, but it was never implemented for ESX until now.
* src/esx/esx_vi.h (esxVI_GetNumberOfSnapshotTrees)
(esxVI_GetSnapshotTreeNames): Add parameter.
* src/esx/esx_vi.c (esxVI_GetNumberOfSnapshotTrees)
(esxVI_GetSnapshotTreeNames): Allow choice of recursion or not.
* src/esx/esx_driver.c (esxDomainSnapshotNum)
(esxDomainSnapshotListNames): Use it to limit to roots.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=730909
When support for setting the qemu disk error policy to "enospc" was
added, it was inadvertently spelled "enospace". This patch corrects
that on the qemu commandline (while retaining the "enospace" spelling
for libvirt's XML).
Also, while examining the qemu source, I found that "enospc" is not
allowed for the read error policy, only for write error policy (makes
sense). Since libvirt currently only has a single error policy
setting, when "enospace" is selected, the read error policy is set to
"ignore".
Destination libvirtd remembers the original name in the prepare phase
and clears it in the finish phase. The original name is used when
comparing domain name in migration cookie.
When booting a virtual machine with a kernel/initrd it is possible
to pass command line arguments using the <cmdline>...args...</cmdline>
element in the guest XML. These appear to the kernel / init process
in /proc/cmdline.
When booting a container we do not have a custom /proc/cmdline,
but we can easily set an environment variable for it. Ideally
we could pass individual arguments to the init process as a
regular set of 'char *argv[]' parameters, but that would involve
libvirt parsing the <cmdline> XML text. This can easily be added
later, even if we add the env variable now
* docs/drvlxc.html.in: Document env variables passed to LXC
* src/conf/domain_conf.c: Add <cmdline> to be parsed for
guests of type='exe'
* src/lxc/lxc_container.c: Set LIBVIRT_LXC_CMDLINE env var
This patch is a fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=743176
which was discovered by Dan Berrange while making bandwidth
configuration work for LXC guests.
Background: Although virtportprofile data from a network portgroup is
only applicable for direct mode interfaces, the code that copies
bandwidth data from the portgroup was also only being executed in the
case of direct mode interfaces. The result was that interfaces using
traditional virtual networks (forward mode='nat|route|none'), and
those using a host bridge for forwarding, would not pick up bandwidth
data from a portgroup defined in the network.
This patch moves that code outside the conditional, so that bandwidth
information is *alway* copied from the appropriate portgroup (unless
the <interface> definition itself already has bandwidth information,
which would take precedence over what's in the portgroup anyway).
Code altered so that it is consistent with the associated comment. The
'autoconf' variable is forced to zero.
Signed-off-by: Neil Wilson <neil@brightbox.co.uk>
Do not crash if virStreamFinish is called after error.
==11000== Invalid read of size 4
==11000== at 0x373A8099A0: pthread_mutex_lock (pthread_mutex_lock.c:51)
==11000== by 0x4C7CADE: virMutexLock (threads-pthread.c:85)
==11000== by 0x4D57C31: virNetClientStreamRaiseError (virnetclientstream.c:203)
==11000== by 0x4D385E4: remoteStreamFinish (remote_driver.c:3541)
==11000== by 0x4D182F9: virStreamFinish (libvirt.c:14157)
==11000== by 0x40FDC4: cmdScreenshot (virsh.c:3075)
==11000== by 0x42BA40: vshCommandRun (virsh.c:14922)
==11000== by 0x42ECCA: main (virsh.c:16381)
==11000== Address 0x59b86c0 is 16 bytes inside a block of size 216 free'd
==11000== at 0x4A06928: free (vg_replace_malloc.c:427)
==11000== by 0x4C69E2B: virFree (memory.c:310)
==11000== by 0x4D57B56: virNetClientStreamFree (virnetclientstream.c:184)
==11000== by 0x4D3DB7A: remoteDomainScreenshot (remote_client_bodies.h:1812)
==11000== by 0x4CFD245: virDomainScreenshot (libvirt.c:2903)
==11000== by 0x40FB73: cmdScreenshot (virsh.c:3029)
==11000== by 0x42BA40: vshCommandRun (virsh.c:14922)
==11000== by 0x42ECCA: main (virsh.c:16381)
When support for was added for PCI multifunction cards (in commit
9f8baf, first included in libvirt 0.9.3), it was done by always
turning on the multifunction bit for all PCI devices. Since that time
it has been realized that this is not an ideal solution, and that the
multifunction bit must be selectively turned on. For example, see
https://bugzilla.redhat.com/show_bug.cgi?id=728174
and the discussion before and after
https://www.redhat.com/archives/libvir-list/2011-September/msg01036.html
This patch modifies multifunction support so that the multifunction=on
option is only added to the qemu commandline for a device if its PCI
<address> definition has the attribute "multifunction='on'", e.g.:
<address type='pci' domain='0x0000' bus='0x00'
slot='0x04' function='0x0' multifunction='on'/>
In practice, the multifunction bit should only be turned on if
function='0' AND other functions will be used in the same slot - it
usually isn't needed for functions 1-7 (although there are apparently
some exceptions, e.g. the Intel X53 according to the QEMU source
code), and should never be set if only function 0 will be used in the
slot. The test cases have been changed accordingly to illustrate.
With this patch in place, if a user attempts to assign multiple
functions in a slot without setting the multifunction bit for function
0, libvirt will issue an error when the domain is defined, and the
define operation will fail. In the future, we may decide to detect
this situation and automatically add multifunction=on to avoid the
error; even then it will still be useful to have a manual method of
turning on multifunction since, as stated above, there are some
devices that excpect it to be turned on for all functions in a slot.
A side effect of this patch is that attempts to use the same PCI
address for two different devices will now log an error (previously
this would cause the domain define operation to fail, but there would
be no log message generated). Because the function doing this log was
almost completely rewritten, I didn't think it worthwhile to make a
separate patch for that fix (the entire patch would immediately be
obsoleted).
If the regexes supported (?:pvs)?, then we could handle this by
optionally matching but not returning the initial command name. But it
doesn't. So add a new char* argument to
virStorageBackendRunProgRegex(). If that argument is NULL then we act
as usual. Otherwise, if the string at that argument is found at the
start of a returned line, we drop that before running the regex.
With this patch, virt-manager shows me lvs with command_names 1 or 0.
The definitions of PVS_BASE etc may want to be moved into the configure
scripts (though given how PVS is found, IIUC that could only happen if
pvs was a link to pvs_real), but in any case no sense dealing with that
until we're sure this is an ok way to handle it.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, qemuDomainGetXMLDesc and qemudDomainGetInfo check for
outstanding synchronous job before (eventual) monitor entering.
However, there can be already async job set, e.g. migration.
Before, URIs such as hyperv+ssh:// have been declined by the Hyper-V
driver resulting in the remote driver trying to connect to an
non-existing libvirtd.
Now such URIs trigger an error in the yper-V driver suggesting to
try again without the transport part in the scheme.
Before, URIs such as esx+ssh:// have been declined by the ESX driver
resulting in the remote driver trying to connect to an non-existing
libvirtd.
Now such URIs trigger an error in the ESX driver suggesting to try
again without the transport part in the scheme.
If the daemon is restarted so we reconnect to monitor, cdrom media
can be ejected. In that case we don't want to show it in domain xml,
or require it on migration destination.
To check for disk status use 'info block' monitor command.
* src/qemu/qemu_migration.c: if 'vmdef' is NULL, the function
virDomainSaveConfig still dereferences it, it doesn't make
sense, so should add return value check to make sure 'vmdef'
is non-NULL before calling virDomainSaveConfig, in addition,
in order to debug later, also should record error information
into log.
Signed-off-by: Alex Jia <ajia@redhat.com>
First hypervisor implementation of the new API.
Allows 'virsh snapshot-list --tree' to be more efficient.
* src/qemu/qemu_driver.c (qemuDomainSnapshotGetParent): New
function.
Mostly straight-forward, although this is the first API that
returns a new snapshot based on a snapshot rather than a domain.
* src/remote/remote_protocol.x
(REMOTE_PROC_DOMAIN_SNAPSHOT_GET_PARENT): New rpc.
(remote_domain_snapshot_get_parent_args)
(remote_domain_snapshot_get_parent_ret): New structs.
* src/rpc/gendispatch.pl: Adjust generator.
* src/remote/remote_driver.c (remote_driver): Use it.
* src/remote_protocol-structs: Update.
Although a client can already obtain a snapshot's parent by
dumping and parsing the xml, then doing a snapshot lookup by
name, it is more efficient to get the parent in one step, which
in turn will make operations that must traverse a snapshot
hierarchy easier to perform.
* include/libvirt/libvirt.h.in (virDomainSnapshotGetParent):
Declare.
* src/libvirt.c (virDomainSnapshotGetParent): New function.
* src/libvirt_public.syms: Export it.
* src/driver.h (virDrvDomainSnapshotGetParent): New callback.
This patch fixes the regression with using named pipes for qemu serial
devices noted in:
https://bugzilla.redhat.com/show_bug.cgi?id=740478
The problem was that, while new code in libvirt looks for a single
bidirectional fifo of the name given in the config, then relabels that
and continues without looking for / relabelling the two unidirectional
fifos named ${name}.in and ${name}.out, qemu looks in the opposite
order. So if the user had naively created all three fifos, libvirt
would relabel the bidirectional fifo to allow qemu access, but qemu
would attempt to use the two unidirectional fifos and fail (because it
didn't have proper permissions/rights).
This patch changes the order that libvirt looks for the fifos to match
what qemu does - first it looks for the dual fifos, then it looks for
the single bidirectional fifo. If it finds the dual unidirectional
fifos first, it labels/chowns them and ignores any possible
bidirectional fifo.
(Note commit d37c6a3a (which first appeared in libvirt-0.9.2) added
the code that checked for a bidirectional fifo. Prior to that commit,
bidirectional fifos for serial devices didn't work because libvirt
always required the ${name}.(in|out) fifos to exist, and qemu would
always prefer those.
If a domain started with -no-shutdown shuts down while libvirtd is not
running, it will be seen as paused when libvirtd reconnects to it. Use
the paused reason to detect if a domain was stopped because of shutdown
and finish the process just as if a SHUTDOWN event is delivered from
qemu.
The AppArmor security driver adds only the path specified in the domain
XML for character devices of type 'pipe'. It should be using <path>.in
and <path>.out. We do this by creating a new vah_add_file_chardev() and
use it for char devices instead of vah_add_file(). Also adjust
valid_path() to accept S_FIFO (since qemu chardevs of type 'pipe' use
fifos). This is https://launchpad.net/bugs/832507
This patch was made in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=738095
In short, qemu's default for the rombar setting (which makes the
firmware ROM of a PCI device visible/not on the guest) was previously
0 (not visible), but they recently changed the default to 1
(visible). Unfortunately, there are some PCI devices that fail in the
guest when rombar is 1, so the setting must be exposed in libvirt to
prevent a regression in behavior (it will still require explicitly
setting <rom bar='off'/> in the guest XML).
rombar is forced on/off by adding:
<rom bar='on|off'/>
inside a <hostdev> element that defines a PCI device. It is currently
ignored for all other types of devices.
At the moment there is no clean method to determine whether or not the
rombar option is supported by QEMU - this patch uses the advice of a
QEMU developer to assume support for qemu-0.12+. There is currently a
patch in the works to put this information in the output of "qemu-kvm
-device pci-assign,?", but of course if we switch to keying off that,
we would lose support for setting rombar on all the versions of qemu
between 0.12 and whatever version gets that patch.
SIGTERM handling for -no-shutdown is already fixed in qemu git and
libvirt can safely use it. The downside is that 0.15.50 version of qemu
can be any qemu compiled from git, even that without the fix for
SIGTERM. However, I think this patch is worth it since excluding 0.15.50
from the check makes testing current qemu with libvirt much easier and
someone running qemu from git should be able to rebuild fixed qemu from
git if they hit the problem with a hang on shutdown.
* src/storage/storage_driver.c: As virStorageVolLookupByPath lookups
all the pool objs of the drivers, breaking when failing on getting
the stable path of the pool will just breaks the whole lookup process,
it can cause the API fails even if the vol exists indeed. It won't get
any benefit. This patch is to fix it.
QEMU 0.13 introduced cache=unsafe for -drive, this patch exposes
it in the libvirt layer.
* Introduced a new QEMU capability flag ($prefix_CACHE_UNSAFE),
as even if $prefix_CACHE_V2 is set, we can't know if unsafe
is supported.
* Improved the reliability of qemu cache type detection.
commit 984840a2c2 removed the
notification of waiting calls when VIR_NET_CONTINUE messages
arrive. This was to fix the case of a virStreamAbort() call
being prematurely notified of completion.
The problem is that sometimes there are dummy calls from a
virStreamRecv() call waiting that *do* need to be notified.
These dummy calls should have a status VIR_NET_CONTINUE. So
re-add the notification upon VIR_NET_CONTINUE, but only if
the waiter also has a status of VIR_NET_CONTINUE.
* src/rpc/virnetclient.c: Notify waiting call if stream data
arrives
* src/rpc/virnetclientstream.c: Mark dummy stream read packet
with status VIR_NET_CONTINUE