Previously libvirt's disk device XML only had a single attribute,
error_policy, to control both read and write error policy, but qemu
has separate options for controlling read and write. In one case
(enospc) a policy is allowed for write errors but not read errors.
This patch adds a separate attribute that sets only the read error
policy. If just error_policy is set, it will apply to both read and
write error policy (previous behavior), but if the new rerror_policy
attribute is set, it will override error_policy for read errors only.
Possible values for rerror_policy are "stop", "report", and "ignore"
("report" is the qemu-controlled default for rerror_policy when
error_policy isn't specified).
For consistency, the value "report" has been added to the possible
values for error_policy as well.
commit 12062ab set rerror=ignore when error_policy="enospace" was
selected (since the rerror option in qemu doesn't accept "enospc", as
the werror option does).
After that patch was already pushed, Paolo Bonzini noticed it and
commented that leaving rerror at the default ("report") would be a
better choice. This patch corrects the problem - if error_policy =
"enospace" is given, rerror is left off the qemu commandline,
effectively setting it to "report". For other values, rerror is still
set to match werror.
Additionally, the parsing of error_policy was changed to no longer
erroneously allow "default" as a choice - as with most other
attributes, if you want the default setting, just don't specify an
error_policy.
Finally, two ommissions in the first patch were corrected - a
long-dormant qemuxml2argv test for enospace was enabled, and fixed to
pass, and the argv2xml parser in qemu_command.c was updated to
recognize the different spelling on the qemu commandline.
Now that RHEL 6.2 Beta is out, it would be nice to test multifunction
devices on that platform. This changes things so that the multifunction
cap bit can be set in two different ways: by version comparison (needed
for qemu 0.13 which lacked a -device query), and by -device query
(provided by qemu.git and backported to the RHEL beta build of
qemu-kvm which still claims to be a modified 0.12, and therefore needed
for RHEL).
* src/qemu/qemu_capabilities.c (qemuCapsParseDeviceStr): Allow
second method of setting multifunction cap bit.
* tests/qemuhelptest.c (mymain): Test it.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel62-beta: New file.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel62-beta-device: Likewise.
If using one of the new non-NAT/routed virtual network
configurations, the LXC driver would not know how to
setup the VETH devices. Adding in calls to setup the
"actual" network configuration at VM startup and cleanup
when shutting down fixes this.
* src/lxc/lxc_driver.c: Setup/cleanup actual net devs
https://bugzilla.redhat.com/show_bug.cgi?id=740899 documents that
if qemu uses aio=native for its disks, then it consumes 128 aio
requests per disk. On a host with multiple guests, this can quickly
run out of kernel aio requests with the default aio-max-nr of
65536. Kernel developers have confirmed that there is no up-front
cost to raising this limit (a larger limit merely implies that more
aio requests can be issued in parallel, which in turn will result
in more kernel memory allocation, only if the system really does use
that many requests). Since the system default limit prevents 256
disks, which is well within libvirt's current scalability, this
patch installs a file to raise the limit and document it in case a
system administrator has further cause to tune the limit. The
install only works on platforms new enough to source /etc/sysctl.d/*
alongside /etc/sysctl.conf (F14 and RHEL 6).
* daemon/libvirtd.sysctl: New file.
* daemon/Makefile.am (EXTRA_DIST): Ship it.
(install-init, uninstall-init): Install it.
* libvirt.spec.in (%files): Include it in rpm.
Implements the documentation for snapshot revert vs. force.
Part of the patch tightens existing behavior (previously, reverting
to an old snapshot without <domain> was blindly attempted, now it
requires force), while part of it relaxes behavior (previously, it
was not possible to revert an active domain to an ABI-incompatible
active snapshot, now force allows this transition).
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Check for
risky situations, and allow force to get past them.
Once we know which set of disks belong to a snapshot, reverting or
deleting that snapshot should visit just those disks, rather than
also visiting disks that were hot-plugged in the meantime or
skipping disks that were hot-unplugged in the meantime.
* src/qemu/qemu_domain.c (qemuDomainSnapshotForEachQcow2): Use
snapshot domain details when available. Avoid NULL deref.
Although reverting to a snapshot is a form of data loss, this is
normally expected. However, there are two cases where additional
surprises (failure to run the reverted state, or a break in
connectivity to the domain) can come into play. Requiring extra
acknowledgment in these cases will make it less likely that
someone can get into an unrecoverable state due to a default revert.
Also create a new error code, so users can distinguish when forcing
would make a difference, rather than having to blindly request force.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_SNAPSHOT_REVERT_FORCE):
New flag.
* src/libvirt.c (virDomainRevertToSnapshot): Document it.
* include/libvirt/virterror.h (VIR_ERR_SNAPSHOT_REVERT_RISKY): New
error value.
* src/util/virterror.c (virErrorMsg): Implement it.
* tools/virsh.c (cmdDomainSnapshotRevert): Add --force to virsh.
* tools/virsh.pod (snapshot-revert): Document it.
Commit 9f5e53e introduced the ability to filter snapshots to
just roots, but it was never implemented for VBox until now.
The VBox implementation prohibits deletion of a snapshot with
multiple children. Hence, there can only be at most one root,
which is found by searching for the snapshot with a NULL uuid.
Prior to 4.0, snapshotGet looked up by UUID, and snapshotFind
looked up by name; after that point, snapshotGet disappeared
and snapshotFind handles uuid or name.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotNum)
(vboxDomainSnapshotListNames): Implement limiting list to root.
Qemu driver tries to update balloon data in virDomainGetInfo and if it
can't do so because there is another monitor job running, it just
reports what's known in domain def. However, if there was no job running
but getting the data from qemu fails, we would fail the whole API. This
doesn't make sense. Let's make the failure nonfatal.
No need to request the parent of a snapshot if we aren't going to use it.
* src/esx/esx_vi.c (esxVI_GetSnapshotTreeByName): Make parent
optional.
* src/esx/esx_driver.c (esxDomainSnapshotCreateXML)
(esxDomainSnapshotLookupByName, esxDomainRevertToSnapshot)
(esxDomainSnapshotDelete): Simplify accordingly.
Commit 9f5e53e introduced the ability to filter snapshots to
just roots, but it was never implemented for ESX until now.
* src/esx/esx_vi.h (esxVI_GetNumberOfSnapshotTrees)
(esxVI_GetSnapshotTreeNames): Add parameter.
* src/esx/esx_vi.c (esxVI_GetNumberOfSnapshotTrees)
(esxVI_GetSnapshotTreeNames): Allow choice of recursion or not.
* src/esx/esx_driver.c (esxDomainSnapshotNum)
(esxDomainSnapshotListNames): Use it to limit to roots.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=730909
When support for setting the qemu disk error policy to "enospc" was
added, it was inadvertently spelled "enospace". This patch corrects
that on the qemu commandline (while retaining the "enospace" spelling
for libvirt's XML).
Also, while examining the qemu source, I found that "enospc" is not
allowed for the read error policy, only for write error policy (makes
sense). Since libvirt currently only has a single error policy
setting, when "enospace" is selected, the read error policy is set to
"ignore".
Previously, virsh 'snapshot-parent' and 'snapshot-current' were
completely silent in the case where the code conclusively proved
there was no parent or current snapshot, but differed in exit
status; this silence caused some confusion on whether the commands
worked. Furthermore, commit d1be48f introduced a regression where
snapshot-parent would leak output about an unknown function, but
only on the first attempt, when talking to an older server that
lacks virDomainSnapshotGetParent. This changes things to consistenly
report an error message and exit with status 1 when no snapshot
exists, and to avoid leaking unknown function warnings when using
fallbacks.
* tools/virsh.c (vshGetSnapshotParent): Alter signature, to
distinguish between real error and missing parent. Don't pollute
last_error on success.
(cmdSnapshotParent): Adjust caller. Always output message on
failure.
(cmdSnapshotList): Adjust caller.
(cmdSnapshotCurrent): Always output message on failure.
Destination libvirtd remembers the original name in the prepare phase
and clears it in the finish phase. The original name is used when
comparing domain name in migration cookie.
When booting a virtual machine with a kernel/initrd it is possible
to pass command line arguments using the <cmdline>...args...</cmdline>
element in the guest XML. These appear to the kernel / init process
in /proc/cmdline.
When booting a container we do not have a custom /proc/cmdline,
but we can easily set an environment variable for it. Ideally
we could pass individual arguments to the init process as a
regular set of 'char *argv[]' parameters, but that would involve
libvirt parsing the <cmdline> XML text. This can easily be added
later, even if we add the env variable now
* docs/drvlxc.html.in: Document env variables passed to LXC
* src/conf/domain_conf.c: Add <cmdline> to be parsed for
guests of type='exe'
* src/lxc/lxc_container.c: Set LIBVIRT_LXC_CMDLINE env var
This patch is a fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=743176
which was discovered by Dan Berrange while making bandwidth
configuration work for LXC guests.
Background: Although virtportprofile data from a network portgroup is
only applicable for direct mode interfaces, the code that copies
bandwidth data from the portgroup was also only being executed in the
case of direct mode interfaces. The result was that interfaces using
traditional virtual networks (forward mode='nat|route|none'), and
those using a host bridge for forwarding, would not pick up bandwidth
data from a portgroup defined in the network.
This patch moves that code outside the conditional, so that bandwidth
information is *alway* copied from the appropriate portgroup (unless
the <interface> definition itself already has bandwidth information,
which would take precedence over what's in the portgroup anyway).
Code altered so that it is consistent with the associated comment. The
'autoconf' variable is forced to zero.
Signed-off-by: Neil Wilson <neil@brightbox.co.uk>
Do not crash if virStreamFinish is called after error.
==11000== Invalid read of size 4
==11000== at 0x373A8099A0: pthread_mutex_lock (pthread_mutex_lock.c:51)
==11000== by 0x4C7CADE: virMutexLock (threads-pthread.c:85)
==11000== by 0x4D57C31: virNetClientStreamRaiseError (virnetclientstream.c:203)
==11000== by 0x4D385E4: remoteStreamFinish (remote_driver.c:3541)
==11000== by 0x4D182F9: virStreamFinish (libvirt.c:14157)
==11000== by 0x40FDC4: cmdScreenshot (virsh.c:3075)
==11000== by 0x42BA40: vshCommandRun (virsh.c:14922)
==11000== by 0x42ECCA: main (virsh.c:16381)
==11000== Address 0x59b86c0 is 16 bytes inside a block of size 216 free'd
==11000== at 0x4A06928: free (vg_replace_malloc.c:427)
==11000== by 0x4C69E2B: virFree (memory.c:310)
==11000== by 0x4D57B56: virNetClientStreamFree (virnetclientstream.c:184)
==11000== by 0x4D3DB7A: remoteDomainScreenshot (remote_client_bodies.h:1812)
==11000== by 0x4CFD245: virDomainScreenshot (libvirt.c:2903)
==11000== by 0x40FB73: cmdScreenshot (virsh.c:3029)
==11000== by 0x42BA40: vshCommandRun (virsh.c:14922)
==11000== by 0x42ECCA: main (virsh.c:16381)
When support for was added for PCI multifunction cards (in commit
9f8baf, first included in libvirt 0.9.3), it was done by always
turning on the multifunction bit for all PCI devices. Since that time
it has been realized that this is not an ideal solution, and that the
multifunction bit must be selectively turned on. For example, see
https://bugzilla.redhat.com/show_bug.cgi?id=728174
and the discussion before and after
https://www.redhat.com/archives/libvir-list/2011-September/msg01036.html
This patch modifies multifunction support so that the multifunction=on
option is only added to the qemu commandline for a device if its PCI
<address> definition has the attribute "multifunction='on'", e.g.:
<address type='pci' domain='0x0000' bus='0x00'
slot='0x04' function='0x0' multifunction='on'/>
In practice, the multifunction bit should only be turned on if
function='0' AND other functions will be used in the same slot - it
usually isn't needed for functions 1-7 (although there are apparently
some exceptions, e.g. the Intel X53 according to the QEMU source
code), and should never be set if only function 0 will be used in the
slot. The test cases have been changed accordingly to illustrate.
With this patch in place, if a user attempts to assign multiple
functions in a slot without setting the multifunction bit for function
0, libvirt will issue an error when the domain is defined, and the
define operation will fail. In the future, we may decide to detect
this situation and automatically add multifunction=on to avoid the
error; even then it will still be useful to have a manual method of
turning on multifunction since, as stated above, there are some
devices that excpect it to be turned on for all functions in a slot.
A side effect of this patch is that attempts to use the same PCI
address for two different devices will now log an error (previously
this would cause the domain define operation to fail, but there would
be no log message generated). Because the function doing this log was
almost completely rewritten, I didn't think it worthwhile to make a
separate patch for that fix (the entire patch would immediately be
obsoleted).
error:could not take a screenshot of xp
==6216== Syscall param unlink(pathname) points to unaddressable byte(s)
==6216== at 0x373A0D4937: unlink (syscall-template.S:82)
==6216== by 0x40FD73: cmdScreenshot (virsh.c:3070)
==6216== by 0x42BA0D: vshCommandRun (virsh.c:14920)
==6216== by 0x42EC97: main (virsh.c:16379)
==6216== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==6216==
error:Requested operation is not valid: domain is not running
If the regexes supported (?:pvs)?, then we could handle this by
optionally matching but not returning the initial command name. But it
doesn't. So add a new char* argument to
virStorageBackendRunProgRegex(). If that argument is NULL then we act
as usual. Otherwise, if the string at that argument is found at the
start of a returned line, we drop that before running the regex.
With this patch, virt-manager shows me lvs with command_names 1 or 0.
The definitions of PVS_BASE etc may want to be moved into the configure
scripts (though given how PVS is found, IIUC that could only happen if
pvs was a link to pvs_real), but in any case no sense dealing with that
until we're sure this is an ok way to handle it.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, qemuDomainGetXMLDesc and qemudDomainGetInfo check for
outstanding synchronous job before (eventual) monitor entering.
However, there can be already async job set, e.g. migration.
Before, URIs such as hyperv+ssh:// have been declined by the Hyper-V
driver resulting in the remote driver trying to connect to an
non-existing libvirtd.
Now such URIs trigger an error in the yper-V driver suggesting to
try again without the transport part in the scheme.
Before, URIs such as esx+ssh:// have been declined by the ESX driver
resulting in the remote driver trying to connect to an non-existing
libvirtd.
Now such URIs trigger an error in the ESX driver suggesting to try
again without the transport part in the scheme.
there is no option "none":
>From libvirt/src/conf/domain_conf.c
<snip>
VIR_ENUM_IMPL(virDomainTimerTickpolicy,
VIR_DOMAIN_TIMER_TICKPOLICY_LAST,
"delay",
"catchup",
"merge",
"discard");
</snip>
Replacing with delay.
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
This patch is based on a improvement suggested by Kazuhiro Kikuchi
of Fujitsu, it gives a description of the target parameter for that
command
* tools/virsh.pod: add description for target parameter of
attach-interface
The man page suggest that the cpu_shares parameter of schedinfo
allows values 0-262144, but the kernel remaps values 0 and 1 to
the minimum 2, just document that behaviour:
[root@test ~]# cat /cgroup/cpu/libvirt/qemu/cpu.shares
1024
[root@test ~]# echo 0 > /cgroup/cpu/libvirt/qemu/cpu.shares
[root@test ~]# cat /cgroup/cpu/libvirt/qemu/cpu.shares
2
[root@test ~]# echo 1 > /cgroup/cpu/libvirt/qemu/cpu.shares
[root@test ~]# cat /cgroup/cpu/libvirt/qemu/cpu.shares
2
[root@test ~]#
* tools/virsh.pod: update description of the cpu_shares parameter
to indicate the values 0 and 1 are automatically changed by the
kernel to minimal value 2
If the daemon is restarted so we reconnect to monitor, cdrom media
can be ejected. In that case we don't want to show it in domain xml,
or require it on migration destination.
To check for disk status use 'info block' monitor command.
* src/qemu/qemu_migration.c: if 'vmdef' is NULL, the function
virDomainSaveConfig still dereferences it, it doesn't make
sense, so should add return value check to make sure 'vmdef'
is non-NULL before calling virDomainSaveConfig, in addition,
in order to debug later, also should record error information
into log.
Signed-off-by: Alex Jia <ajia@redhat.com>
First hypervisor implementation of the new API.
Allows 'virsh snapshot-list --tree' to be more efficient.
* src/qemu/qemu_driver.c (qemuDomainSnapshotGetParent): New
function.
Reuse the tree listing of nodedev-list, coupled with the new helper
function to efficiently grab snapshot parent names, to produce
tree output for a snapshot hierarchy. For example:
$ virsh snapshot-list dom --tree
root1
|
+- sibling1
+- sibling2
| |
| +- grandchild
|
+- sibling3
root2
|
+- child
* tools/virsh.c (cmdSnapshotList): Add --tree.
* tools/virsh.pod (snapshot-list): Document it.
Make parent computation reusable, using virDomainSnapshotGetParent
when possible.
* tools/virsh.c (vshGetSnapshotParent): New helper.
(cmdSnapshotParent): Use it.
Mostly straight-forward, although this is the first API that
returns a new snapshot based on a snapshot rather than a domain.
* src/remote/remote_protocol.x
(REMOTE_PROC_DOMAIN_SNAPSHOT_GET_PARENT): New rpc.
(remote_domain_snapshot_get_parent_args)
(remote_domain_snapshot_get_parent_ret): New structs.
* src/rpc/gendispatch.pl: Adjust generator.
* src/remote/remote_driver.c (remote_driver): Use it.
* src/remote_protocol-structs: Update.
Although a client can already obtain a snapshot's parent by
dumping and parsing the xml, then doing a snapshot lookup by
name, it is more efficient to get the parent in one step, which
in turn will make operations that must traverse a snapshot
hierarchy easier to perform.
* include/libvirt/libvirt.h.in (virDomainSnapshotGetParent):
Declare.
* src/libvirt.c (virDomainSnapshotGetParent): New function.
* src/libvirt_public.syms: Export it.
* src/driver.h (virDrvDomainSnapshotGetParent): New callback.
Coupled with the recent virsh nodedev-* doc patch, this should now
give a better picture of libvirt node device handling.
* docs/formatnode.html.in: Fill in page.