Commit Graph

5665 Commits

Author SHA1 Message Date
Eric Blake
a31d65695d snapshot: speed up snapshot location
Each snapshot lookup was iterating over the entire hash table, O(n),
instead of honing in directly on the hash key, amortized O(1).

Besides, fixing this means that virDomainSnapshotFindByName can now
be used inside another virHashForeach iteration (without this patch,
attempts to lookup a snapshot by name during a hash iteration will
fail due to nested iteration).

* src/conf/domain_conf.c (virDomainSnapshotFindByName): Simplify.
(virDomainSnapshotObjListSearchName): Delete unused function.
2011-09-02 16:03:50 -06:00
Eric Blake
7dc44eb059 snapshot: fine-tune qemu snapshot revert states
For a system checkpoint of a running or paused domain, it's fairly
easy to honor new flags for altering which state to use after the
revert.  For an inactive snapshot, the revert has to be done while
there is no qemu process, so do back-to-back transitions; this also
lets us revert to inactive snapshots even for transient domains.

* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Support new
flags.
2011-09-02 12:33:37 -06:00
Eric Blake
25fb3ef1e1 snapshot: properly revert qemu to offline snapshots
Commit 5e47785 broke reverts to offline system checkpoint snapshots
with older qemu, since there is no longer any code path to use
qemu -loadvm on next boot.  Meanwhile, reverts to offline system
checkpoints have been broken for newer qemu, both before and
after that commit, since -loadvm no longer works to revert to
disk state without accompanying vm state.  Fix both of these by
using qemu-img to revert disk state.

Meanwhile, consolidate the (now 3) clients of a qemu-img iteration
over all disks of a VM into one function, so that any future
algorithmic fixes to the FIXMEs in that function after partial
loop iterations are dealt with at once.  That does mean that this
patch doesn't handle partial reverts very well, but we're not
making the situation any worse in this patch.

* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Use
qemu-img rather than 'qemu -loadvm' to revert to offline snapshot.
(qemuDomainSnapshotRevertInactive): New helper.
(qemuDomainSnapshotCreateInactive): Factor guts...
(qemuDomainSnapshotForEachQcow2): ...into new helper.
(qemuDomainSnapshotDiscard): Use it.
2011-09-02 12:30:11 -06:00
Eric Blake
88fe7a4ba5 snapshot: improve reverting to qemu paused snapshots
If you take a checkpoint snapshot of a running domain, then pause
qemu, then restore the snapshot, the result should be a running
domain, but the code was leaving things paused.  Furthermore, if
you take a checkpoint of a paused domain, then run, then restore,
there was a brief but non-deterministic window of time where the
domain was running rather than paused.  Fix both of these
discrepancies by always pausing before restoring.

Also, check that the VM is active every time lock is dropped
between two monitor calls.

Finally, straighten out the events that get emitted on each
transition.

* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Always
pause before reversion, and improve events.
2011-09-02 12:05:08 -06:00
Eric Blake
7381aaff33 snapshot: fine-tune qemu saved images starting paused
Implement the new running/paused overrides for saved state management.

Unfortunately, for virDomainSaveImageDefineXML, the saved state
updates are write-only - I don't know of any way to expose a way
to query the current run/pause setting of an existing save image
file to the user without adding a new API or modifying the domain
xml of virDomainSaveImageGetXMLDesc to include a new element to
reflect the state bit encoded into the save image.  However, I
don't think this is a show-stopper, since the API is designed to
leave the state bit alone unless an explicit flag is used to
change it.

* src/qemu/qemu_driver.c (qemuDomainSaveInternal)
(qemuDomainSaveImageOpen): Adjust signature.
(qemuDomainSaveFlags, qemuDomainManagedSave)
(qemuDomainRestoreFlags, qemuDomainSaveImageGetXMLDesc)
(qemuDomainSaveImageDefineXML, qemuDomainObjRestore): Adjust
callers.
2011-09-02 10:00:06 -06:00
Eric Blake
3cff66f487 snapshot: fine-tune ability to start paused
While it is nice that snapshots and saved images remember whether
the domain was running or paused, sometimes the restoration phase
wants to guarantee a particular state (paused to allow hot-plugging,
or running without needing to call resume).  This introduces new
flags to allow the control, and a later patch will implement the
flags for qemu.

* include/libvirt/libvirt.h.in (VIR_DOMAIN_SAVE_RUNNING)
(VIR_DOMAIN_SAVE_PAUSED, VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING)
(VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED): New flags.
* src/libvirt.c (virDomainSaveFlags, virDomainRestoreFlags)
(virDomainManagedSave, virDomainSaveImageDefineXML)
(virDomainRevertToSnapshot): Document their use, and enforce
mutual exclusion.
2011-09-02 10:00:06 -06:00
Eric Blake
c1ff5dc63d snapshot: better events when starting paused
There are two classes of management apps that track events - one
that only cares about on/off (and only needs to track EVENT_STARTED
and EVENT_STOPPED), and one that cares about paused/running (also
tracks EVENT_SUSPENDED/EVENT_RESUMED).  To keep both classes happy,
any transition that can go from inactive to paused must emit two
back-to-back events - one for started and one for suspended (since
later resuming of the domain will only send RESUMED, but the first
class isn't tracking that).

This also fixes a bug where virDomainCreateWithFlags with the
VIR_DOMAIN_START_PAUSED flag failed to start paused when restoring
from a managed save image.

* include/libvirt/libvirt.h.in (VIR_DOMAIN_EVENT_SUSPENDED_RESTORED)
(VIR_DOMAIN_EVENT_SUSPENDED_FROM_SNAPSHOT)
(VIR_DOMAIN_EVENT_RESUMED_FROM_SNAPSHOT): New sub-events.
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Use them.
(qemuDomainSaveImageStartVM): Likewise, and add parameter.
(qemudDomainCreate, qemuDomainObjStart): Send suspended event when
starting paused.
(qemuDomainObjRestore): Add parameter.
(qemuDomainObjStart, qemuDomainRestoreFlags): Update callers.
* examples/domain-events/events-c/event-test.c
(eventDetailToString): Map new detail strings.
2011-09-02 10:00:06 -06:00
Marc-André Lureau
4813b3f094 Learn to use spicevmc as a redirection type for usb-redir 2011-09-02 23:39:03 +08:00
Marc-André Lureau
162efa1a7c Add "redirdev" redirection device
- create a new "redirdev" element for this purpose
2011-09-02 23:39:03 +08:00
Marc-André Lureau
fdd14a9d05 qemu: Don't append 0 at usb id, so that it is compatible with legacy -usb
QEMU uses USB bus name "usb.0" when using the legacy -usb argument.
If we want to allow USB devices to specify their addresses with legacy
-usb, we should either in case of legacy bus name drop the 0 from the
address bus, or just drop the 0 from device id. This patch does the
later.

Another solution would be to permit addressing on non-legacy USB
controllers only.
2011-09-02 23:39:03 +08:00
Marc-André Lureau
f35bbf7be7 qemu: don't reserve slot 1 if a PIIX3 USB controller is defined there
Applies only to piix3 and check if piix3 controller is on correct
address, or report error
2011-09-02 23:39:03 +08:00
Marc-André Lureau
31710a5389 Modify USB port to be defined as a port path
So that devices can be attached to hubs. Example, to attach to first
port of a usb-hub on port 1.

      <hub type='usb'>
         <address type='usb' bus='0' port='1'/>
      </hub>

      <input type='mouse' type='usb'>
         <address type='usb' bus='0' port='1.1'/>
      </hub>

also add a test entry
2011-09-02 23:39:03 +08:00
Marc-André Lureau
fdabeb3c5f Add USB hub device
domain parsing and serialization code, qemu driver backend and
a couple of test
2011-09-02 23:38:52 +08:00
Marc-André Lureau
f3ce59621f Add USB companion controllers support
Companion controllers take an extra 'master' attribute to associate
them.

Also add tests for this
2011-09-02 23:22:56 +08:00
Marc-André Lureau
22c0d433ab USB devices gain a new USB address child element
Expand the domain and the QEmu driver code
Adds a couple of tests
2011-09-02 23:22:56 +08:00
Marc-André Lureau
d6d54cd19e Add a new controller type 'usb' with optionnal 'model'
The model by default is piix3-uchi.

Example:
<controller type='usb' index='0' model='ich9-ehci'/>
2011-09-02 23:22:56 +08:00
Marc-André Lureau
2e4b5243b2 Add USB controller models
List is: piix3-uhci piix4-uhci ehci ich9-ehci1 ich9-uhci1 ich9-uhci2
ich9-uhci3 vt82c686b-uhci pci-ohci
2011-09-02 23:22:56 +08:00
Marc-André Lureau
8631bdc0c8 Rename virDomainControllerModel to virDomainControllerModelSCSI
Since we are about to add USB controller support let's remove the
ambiguity
2011-09-02 23:22:56 +08:00
Marc-André Lureau
329f907b99 Add various USB devices QEMU_CAPS 2011-09-02 23:22:56 +08:00
Eric Blake
c554f6e18b snapshot: fix corner case on OOM during creation
Commit 6766ff10 introduced a corner case bug with snapshot creation:
if a snapshot is created, but then we hit OOM while trying to
create the return value of the function, then we have polluted the
internal directory with the snapshot metadata with no way to clean
it up from the running libvirtd.

* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Don't
write metadata file on OOM condition.
2011-09-02 08:50:01 -06:00
Osier Yang
6ee52c1b76 Add directsync cache mode support for disk driver
Newer QEMU introduced cache=directsync for -drive, this patchset
is to expose it in libvirt layer.

  * Introduced a new QEMU capability flag ($prefix_CACHE_DIRECTSYNC),
    As even $prefix_CACHE_V2 is set, we can't known if directsync
    is supported.
2011-09-02 21:36:58 +08:00
Osier Yang
27758859c7 storage: Add fs pool formatting
This patch adds the ability to make the filesystem for a filesystem
pool during a pool build.

The patch adds two new flags, no overwrite and overwrite, to control
when mkfs gets executed.  By default, the patch preserves the
current behavior, i.e., if no flags are specified, pool build on a
filesystem pool only makes the directory on which the filesystem
will be mounted.

If the no overwrite flag is specified, the target device is checked
to determine if a filesystem of the type specified in the pool is
present.  If a filesystem of that type is already present, mkfs is
not executed and the build call returns an error.  Otherwise, mkfs
is executed and any data present on the device is overwritten.

If the overwrite flag is specified, mkfs is always executed, and any
existing data on the target device is overwritten unconditionally.
2011-09-02 21:16:58 +08:00
Osier Yang
50c82157e1 API: Init conn in case of it might be used uninitialized
There is a goto before "conn" is initialized.
2011-09-02 15:41:29 +08:00
Eric Blake
55d88def95 qemu: detect incomplete save files
Several users have reported problems with 'virsh start' failing because
it was encountering a managed save situation where the managed save file
was incomplete.  Be more robust to this by using two different magic
numbers, so that newer libvirt can gracefully handle an incomplete file
differently than a complete one, while older libvirt will at least fail
up front rather than trying to load only to have qemu fail at the end.

Managed save is a convenience - it exists to preserve as much state
as possible; if the state was not preserved, it is reasonable to just
log that fact, then proceed with a fresh boot.  On the other hand,
user saves are under user control, so we must fail, but by making
the failure message distinct, the user can better decide how to handle
the situation of an incomplete save file.

* src/qemu/qemu_driver.c (QEMUD_SAVE_PARTIAL): New define.
(qemuDomainSaveInternal): Use it to mark incomplete images.
(qemuDomainSaveImageOpen, qemuDomainObjRestore): Add parameter
that controls what to do with partial images.
(qemuDomainRestoreFlags, qemuDomainSaveImageGetXMLDesc)
(qemuDomainSaveImageDefineXML, qemuDomainObjStart): Update callers.
Based on an initial idea by Osier Yang.
2011-09-01 22:08:13 -06:00
Eric Blake
449ae9c2f1 qemu: refactor file opening
In a SELinux or root-squashing NFS environment, libvirt has to go
through some hoops to create a new file that qemu can then open()
by name.  Snapshots are a case where we want to guarantee an empty
file that qemu can open; also, reopening a save file to convert it
from being marked partial to complete requires a reopen to avoid
O_DIRECT headaches.  Refactor some existing code to make it easier
to reuse in later patches.

* src/qemu/qemu_migration.h (qemuMigrationToFile): Drop parameter.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Let cgroup do
the stat, rather than asking caller to do it and pass info down.
* src/qemu/qemu_driver.c (qemuOpenFile): New function, pulled from...
(qemuDomainSaveInternal): ...here.
(doCoreDump, qemuDomainSaveImageOpen): Use it here as well.
2011-09-01 22:08:13 -06:00
Wen Congyang
deff02a365 reserve slot 1 on pci bus0
After supporting multi function pci device, we only reserve function 1 on slot 1.
The user can use the other function on slot 1 in the xml config file. We should
detect this wrong usage.
2011-09-02 11:33:04 +08:00
Scott Moser
f0fe28cb8d lxc: do not require 'ifconfig' or 'ipconfig' in container
Currently, the lxc implementation invokes 'ip' and 'ifconfig' commands
inside a container using 'virRun'.  That has the side effect of requiring
those commands to be present and to function in a manner consistent with
the usage.  Some small roots (such as ttylinux) may not have 'ip' or
'ifconfig'.

This patch replaces the use of these commands with usage of
netdevice.  The result is that lxc containers do not have to implement
those commands, and lxc in libvirt is only dependent on the netdevice
interface.

I've tested this patch locally against the ubuntu libvirt version enough
to verify its generally sane.  I attempted to build upstream today, but
failed with:
  /usr/bin/ld:
    ../src/.libs/libvirt_driver_qemu.a(libvirt_driver_qemu_la-qemu_domain.o):
   undefined reference to symbol 'xmlXPathRegisterNs@@LIBXML2_2.4.30

Thats probably a local issue only, but I wanted to get this patch up and
see what others thought of it.  This is ubuntu bug
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/828211 .
2011-09-01 20:11:50 -06:00
Serge Hallyn
c1665ba872 Create ptmx as a device
Hi,

I'm seeing an issue with udev and libvirt-lxc.  Libvirt-lxc creates
/dev/ptmx as a symlink to /dev/pts/ptmx.  When udev starts up, it
checks the device type, sees ptmx is 'not right', and replaces it
with a 'proper' ptmx.

In lxc, /dev/ptmx is bind-mounted from /dev/pts/ptmx instead of being
symlinked, so udev sees the right device type and leaves it alone.

A patch like the following seems to work for me.  Would there be
any objections to this?

>From 4c5035de52de7e06a0de9c5d0bab8c87a806cba7 Mon Sep 17 00:00:00 2001
From: Ubuntu <ubuntu@domU-12-31-39-14-F0-B3.compute-1.internal>
Date: Wed, 31 Aug 2011 18:15:54 +0000
Subject: [PATCH 1/1] make ptmx a bind mount rather than symlink

udev on some systems checks the device type of /dev/ptmx, and replaces it if
not as expected.  The symlink created by libvirt-lxc therefore gets replaced.
By creating it as a bind mount, the device type is correct and udev leaves it
alone.

Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
2011-09-01 20:11:50 -06:00
Adam Litke
d4b9e06256 BlockPull: Set initial bandwidth limit if specified
The libvirt BlockPull API supports the use of an initial bandwidth limit but the
qemu block_stream API does not.  To get the desired behavior we use the two APIs
strung together: first BlockPull, then BlockJobSetSpeed.  We can do this at the
driver level to avoid duplicated code in each monitor path.

Signed-off-by: Adam Litke <agl@us.ibm.com>
2011-09-01 20:11:50 -06:00
Adam Litke
78d9325d1e BlockJob: Bandwidth parameter is in MB when using text monitor
Due to an unfortunate precedent in qemu, the units for the bandwidth parameter
to block_job_set_speed are different between the text monitor and the qmp
monitor.  While the qmp monitor uses bytes/s, the text monitor expects MB/s.

Correct the units for the text interface.

Signed-off-by: Adam Litke <agl@us.ibm.com>
2011-09-01 20:11:50 -06:00
Jim Fehlig
57c95175e2 Increase size of buffer for xend response
On systems with many pcpus, the sexpr returned by xend can be quite
large for dom0 when it is configured to have #vcpus = #pcpus (default).
E.g. on a 80 pcpu system, where dom0 had 80 vcpus, the sexpr details
for dom0 was 73817 bytes!  Increase maximum buffer size to 256k.
2011-09-01 19:57:30 -06:00
Jim Fehlig
32620dabb1 Don't overwrite errors from xend_{get,req}
xenDaemonDomainFetch() was overwriting errors reported by
xend_get() and xend_req().  E.g. without patch

error: failed Xen syscall xenDaemonDomainFetch failed to find this domain

with patch

error: internal error Xend returned HTTP Content-Length of 73817, which exceeds
maximum of 65536
2011-09-01 18:19:33 -06:00
Eric Blake
7bc1c5cefe build: fix 'make check' with pdwtags
Problem introduced by commit b12354b.

* src/remote_protocol-structs: Remove spurious blank line.
2011-09-01 12:33:46 -06:00
Jim Fehlig
b12354befe Add public API for getting migration speed
Includes impl of python binding since the generator was not
able to cope.

Note: Requires gendispatch.pl patch from Matthias Bolte

https://www.redhat.com/archives/libvir-list/2011-August/msg01367.html
2011-09-01 11:26:21 -06:00
Daniel P. Berrange
b3fb288e52 Fix tracking of RPC messages wrt streams
Commit 2c85644b0b attempted to
fix a problem with tracking RPC messages from streams by doing

-            if (msg->header.type == VIR_NET_REPLY) {
+            if (msg->header.type == VIR_NET_REPLY ||
+                (msg->header.type == VIR_NET_STREAM &&
+                 msg->header.status != VIR_NET_CONTINUE)) {
                 client->nrequests--;

In other words any stream packet, with status NET_OK or NET_ERROR
would cause nrequests to be decremented. This is great if the
packet from from a synchronous virStreamFinish or virStreamAbort
API call, but wildly wrong if from a server initiated abort.
The latter resulted in 'nrequests' being decremented below zero.
This then causes all I/O for that client to be stopped.

Instead of trying to infer whether we need to decrement the
nrequests field, from the message type/status, introduce an
explicit 'bool tracked' field to mark whether the virNetMessagePtr
object is subject to tracking.

Also add a virNetMessageClear function to allow a message
contents to be cleared out, without adversely impacting the
'tracked' field as a naive memset() would do

* src/rpc/virnetmessage.c, src/rpc/virnetmessage.h: Add
  a 'bool tracked' field and virNetMessageClear() API
* daemon/remote.c, daemon/stream.c, src/rpc/virnetclientprogram.c,
  src/rpc/virnetclientstream.c, src/rpc/virnetserverclient.c,
  src/rpc/virnetserverprogram.c: Switch over to use
  virNetMessageClear() and pass in the 'bool tracked' value
  when creating messages.
2011-09-01 10:52:35 +01:00
Daniel P. Berrange
b6263c1801 Fix parted sector size assumption
Parted does not report disk size in 512 byte units, but
rather the disks' logical sector size, which with modern
drives might be 4k.

* src/storage/parthelper.c: Remove hardcoded 512 byte sector
  size
2011-09-01 10:46:31 +01:00
Osier Yang
6f2581edd7 qemu: Fix a regression of domain save
* src/qemu/qemu_driver.c - qemuDomainSaveInternal: Return directly
will keep the domain object locked, introduced by 173015bec6.
2011-09-01 17:38:20 +08:00
Osier Yang
9f3e724339 Revert "test: Cleanup improper VIR_ERR_NO_SUPPORT use"
This reverts commit 172214bd30.
2011-09-01 17:37:11 +08:00
Osier Yang
ffafede112 storage: Fix incorrect error codes
Commit 0376f4a69b intended to fix incorrect use of VIR_ERR_NO_SUPPORT,
but replacing it with VIR_ERR_OPERATION_INVALID is not proper either.
2011-09-01 17:36:38 +08:00
Osier Yang
fd038a337b remote: Fix incorrect error codes
Introduced by d4b53ef6c. For "no internalFlags support", the
error code is changed into INTERNAL_ERROR.
2011-09-01 17:35:56 +08:00
Osier Yang
03388b6424 nodeinfo: Fix incorrect error codes
Introduced by 5e495c8b, except the ones for checking if numa
is supported by host, all the NO_SUPPORT are changed back. For
the ones about numa checking, change them into INTERNAL_ERROR.
2011-09-01 17:35:23 +08:00
Osier Yang
6af0c3e82b lxc: Fix incorrect changes on error codes.
Fix incorrect changes introduced by commit 6ac47762bb.
2011-09-01 17:34:31 +08:00
Osier Yang
c2c713dd00 conf: Substitute OPERATION_INVALID with INTERNAL_ERROR 2011-09-01 17:31:24 +08:00
Daniel P. Berrange
6ff9fc26d3 Stop libxl driver polluting logs on non-Xen hosts
If the libxl driver is compiled in, then everytime libvirtd
starts up on a non-Xen Dom0 host, it logs a error message.
Since this is an expected condition, we should not log at
'error' level, only 'info'.

* src/libxl/libxl_driver.c: Lower log level for certain
  expected errors during driver init
2011-08-31 17:53:01 +01:00
Daniel P. Berrange
d07aa6a96f Fix memory leak parsing 'relabel' attribute in domain security XML
* src/conf/domain_conf.c: Free the 'relabel' attribute
2011-08-31 17:51:09 +01:00
Daniel P. Berrange
c32536e7da Don't leak memory if a cgroup is mounted multiple times
It is possible (expected/likely in Fedora 15) for a cgroup controller
to be mounted in multiple locations at the same time, due to bind
mounts. Currently we leak memory if this happens, because we overwrite
the previous 'mountPoint' string. Instead just accept the first match
we find.

* src/util/cgroup.c: Only accept first match for a cgroup
  controller mount
2011-08-31 17:51:09 +01:00
Eric Blake
cab55fa0a8 security: fix build
Regression introduced in commit 183383889.

* src/libvirt_private.syms (security_manager.h): Drop deleted
symbol. Detected by build-bot.
2011-08-31 08:33:17 -06:00
Daniel P. Berrange
183383889a Remove bogus virSecurityManagerSetProcessFDLabel method
The virSecurityManagerSetProcessFDLabel method was introduced
after a mis-understanding from a conversation about SELinux
socket labelling. The virSecurityManagerSetSocketLabel method
should have been used for all such scenarios.

* src/security/security_apparmor.c, src/security/security_apparmor.c,
  src/security/security_driver.h, src/security/security_manager.c,
  src/security/security_manager.h, src/security/security_selinux.c,
  src/security/security_stack.c: Remove SetProcessFDLabel driver
2011-08-31 11:07:31 +01:00
Daniel P. Berrange
64bdec3841 Fix sanlock socket security labelling
It is not possible to change the label of a TCP socket once it
has been opened. When creating a TCP socket care must be taken
to ensure the socket creation label is set & then cleared.
Remove the bogus call to virSecurityManagerSetProcessFDLabel
from the lock driver guest setup code and instead make use of
virSecurityManagerSetSocketLabel
2011-08-31 11:07:31 +01:00
Daniel P. Berrange
2223b1f71f Fix incorrect path length check in sanlock lockspace setup
The code for creating a sanlock lockspace accidentally used
SANLK_NAME_LEN instead of SANLK_PATH_LEN for a size check.
This meant disk paths were limited to 48 bytes !

* src/locking/lock_driver_sanlock.c: Fix disk path length
  check
2011-08-31 11:07:31 +01:00