Commit Graph

1788 Commits

Author SHA1 Message Date
Osier Yang
02e8d0cfdf qemu: Split scsi-disk into into scsi-hd and scsi-cd
A "scsi-disk" device can be either a hard disk or a CD-ROM,
if there is ",media=cdrom" specified for the backend, it's
a CD-ROM, otherwise it's a hard disk.

But upstream qemu splitted "scsi-disk" into "scsi-hd" and
"scsi-cd" since commit b443ae, and ",media=cdrom" is not
required for scsi-cd anymore. "scsi-disk" is still supported
for backwards compatibility, but no doubt we should go
foward.
2012-04-17 17:21:24 +08:00
Jan Kiszka
dde91ab917 Do not enforce source type of console[0]
If console[0] is an alias for serial[0], do not enforce the former to
have a PTY source type. This breaks serial consoles on stdio and makes
no sense.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
2012-04-16 22:24:20 -06:00
Michal Privoznik
63ddc65d63 qemuProcessStart: Switch to flags instead of bunch booleans
Currently, we have 3 boolean arguments we have to pass
to qemuProcessStart(). As libvirt grows it is harder and harder
to remember them and their position. Therefore we should
switch to flags instead.
2012-04-16 17:20:04 +02:00
Osier Yang
6fbd5737e9 qemu: Avoid the memory allocation and freeing 2012-04-16 18:09:10 +08:00
Osier Yang
ccf80e3630 numad: Convert node list to cpumap before setting affinity
Instead of returning a CPUs list, numad returns NUMA node
list instead, this patch is to convert the node list to
cpumap before affinity setting. Otherwise, the domain
processes will be pinned only to CPU[$numa_cell_num],
which will cause significiant performance losses.

Also because numad will balance the affinity dynamically,
reflecting the cpuset from numad back doesn't make much
sense then, and it may just could produce confusion for
the users. Thus the better way is not to reflect it back
to XML. And in this case, it's better to ignore the cpuset
when parsing XML.

The codes to update the cpuset is removed in this patch
incidentally, and there will be a follow up patch to ignore
the manually specified "cpuset" if "placement" is "auto",
and document will be updated too.
2012-04-16 18:09:05 +08:00
Michal Privoznik
354e6d4ed0 qemu: Fix mem leak in qemuProcessInitCpuAffinity
If placement mode is AUTO, on some return paths char *cpumap or
char *nodeset are leaked.
2012-04-13 12:01:53 +02:00
D. Herrendoerfer
997366ca7d qemu,util: fix netlink callback registration for migration
This patch adds a netlink callback when migrating a VEPA enabled
virtual machine.  It fixes a Bug where a VM would not request a port
association when it was cleared by lldpad.

This patch requires the latest git version of lldpad to work.

Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
2012-04-12 14:32:10 -04:00
Michal Privoznik
b1256816ff qemuOpenFile: Don't force chown on NFS
If dynamic_ownership is off and we are creating a file on NFS
we force chown. This will fail as chown/chmod are not supported
on NFS. However, with no dynamic_ownership we are not required
to do any chown.
2012-04-12 13:53:38 +02:00
Eric Blake
a9d3495e67 blockjob: allow for fast-finishing job
In my testing, I was able to provoke an odd block pull failure:

$ virsh blockpull dom vda --bandwidth 10000
error: Requested operation is not valid: No active operation on device: drive-virtio-disk0

merely by using gdb to artifically wait to do the block job set speed
until after the pull had already finished.  But in reality, that should
be a success, since the pull finished before we had a chance to set
speed.  Furthermore, using a double job lock is not only annoying, but
a bug in itself - if you do parallel virDomainBlockRebase, and hit
the race window just right, the first call grabs the VM job to start
a fast block job, then the second call grabs the VM job to start
a long-running job with unspecified speed, then the first call finally
regrabs the VM job and sets the speed, which ends up running the
second job under the speed from the first call.  By consolidating
things into a single job, we avoid opening that race, as well as reduce
the time between starting the job and changing the speed, for less
likelihood of the speed change happening after block job completion
in the first place.

* src/qemu/qemu_monitor.h (BLOCK_JOB_CMD): Add new mode.
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Move secondary
job call...
(qemuDomainBlockJobImpl): ...here, for fewer locks.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Change
return value on new internal mode.
2012-04-11 21:45:43 -06:00
Eric Blake
a91ce852b5 blockjob: wire up qemu async virDomainBlockJobAbort
Without the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag, libvirt will internally
poll using qemu's "query-block-jobs" API and will not return until the
operation has been completed.  API users are advised that this operation
is unbounded and further interaction with the domain during this period
may block.  Future patches may refactor things to allow other queries in
parallel with this polling.  For older qemu, we synthesize the cancellation
event, since qemu won't generate it.

The choice of polling duration copies from the code in qemu_migration.c.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2012-04-11 21:22:06 -06:00
Eric Blake
ecb39e9d4b blockjob: optimize JSON event handler lookup
Probably in the noise, but this will let us scale more efficiently
as we learn to recognize even more qemu events.

* src/qemu/qemu_monitor_json.c (eventHandlers): Sort.
(qemuMonitorEventCompare): New helper function.
(qemuMonitorJSONIOProcessEvent): Optimize event lookup.
2012-04-11 20:56:03 -06:00
Eric Blake
2b085f5bc5 blockjob: add qemu capabilities related to block pull jobs
RHEL 6.2 was released with an early version of block jobs, which only
worked on the qed file format, where the commands were spelled with
underscore (contrary to QMP style), and where 'block_job_cancel' was
synchronous and did not trigger an event.

The upcoming qemu 1.1 release has fixed these short-comings [1][2]:
the commands now work on multiple file types, are spelled with dash,
and 'block-job-cancel' is asynchronous and emits an event upon conclusion.

[1]qemu commit 370521a1d6f5537ea7271c119f3fbb7b0fa57063
[2]https://lists.gnu.org/archive/html/qemu-devel/2012-04/msg01248.html

This patch recognizes the new spellings, and fixes virDomainBlockRebase
to give a graceful error when talking to a too-old qemu on a partial
rebase attempt.  Fixes for the new semantics will come later.  This
patch also removes a bogus ATTRIBUTE_NONNULL mistakenly added in
commit 10ec36e2.

* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCKJOB_SYNC)
(QEMU_CAPS_BLOCKJOB_ASYNC): New bits.
* src/qemu/qemu_capabilities.c (qemuCaps): Name them.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set
them.
(qemuMonitorJSONBlockJob): Manage both command names.
(qemuMonitorJSONDiskSnapshot): Minor formatting fix.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Alter signature.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Pass through
capability bit.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Update callers.
2012-04-11 20:43:53 -06:00
Peter Krempa
3d3de46a67 qemu: Fix deadlock when qemuDomainOpenConsole cleans up a connection
The new safe console handling introduced a possibility to deadlock the
qemu driver when a new console connection forcibly disconnects a
previous console stream that belongs to an already closed connection.

The virStreamFree function calls subsequently a the virReleaseConnect
function that tries to lock the driver while discarding the connection,
but the driver was already locked in qemuDomainOpenConsole.

Backtrace of the deadlocked thread:
0  0x00007f66e5aa7f14 in __lll_lock_wait () from /lib64/libpthread.so.0
1  0x00007f66e5aa3411 in _L_lock_500 () from /lib64/libpthread.so.0
2  0x00007f66e5aa322a in pthread_mutex_lock () from/lib64/libpthread.so.0
3  0x0000000000462bbd in qemudClose ()
4  0x00007f66e6e178eb in virReleaseConnect () from/usr/lib64/libvirt.so.0
5  0x00007f66e6e19c8c in virUnrefStream () from /usr/lib64/libvirt.so.0
6  0x00007f66e6e3d1de in virStreamFree () from /usr/lib64/libvirt.so.0
7  0x00007f66e6e09a5d in virConsoleHashEntryFree () from/usr/lib64/libvirt.so.0
8  0x00007f66e6db7282 in virHashRemoveEntry () from/usr/lib64/libvirt.so.0
9  0x00007f66e6e09c4e in virConsoleOpen () from /usr/lib64/libvirt.so.0
10 0x00000000004526e9 in qemuDomainOpenConsole ()
11 0x00007f66e6e421f1 in virDomainOpenConsole () from/usr/lib64/libvirt.so.0
12 0x00000000004361e4 in remoteDispatchDomainOpenConsoleHelper ()
13 0x00007f66e6e80375 in virNetServerProgramDispatch () from/usr/lib64/libvirt.so.0
14 0x00007f66e6e7ae11 in virNetServerHandleJob () from/usr/lib64/libvirt.so.0
15 0x00007f66e6da897d in virThreadPoolWorker () from/usr/lib64/libvirt.so.0
16 0x00007f66e6da7ff6 in virThreadHelper () from/usr/lib64/libvirt.so.0
17 0x00007f66e5aa0c5c in start_thread () from /lib64/libpthread.so.0
18 0x00007f66e57e7fcd in clone () from /lib64/libc.so.6

* src/qemu/qemu_driver.c: qemuDomainOpenConsole()
        -- unlock the qemu driver right after acquiring the domain
        object
2012-04-11 10:45:53 +02:00
Jiri Denemark
6eede368bc qemu: Warn on possibly incorrect usage of EnterMonitor*
qemuDomainObjEnterMonitor{,WithDriver} should not be called from async
jobs, only EnterMonitorAsync variant is allowed.
2012-04-11 09:57:39 +02:00
Jiri Denemark
08ec1d787f qemu: Track job owner for better debugging
In case an API fails with "cannot acquire state change lock", searching
for the API that possibly forgot to end its job is not always easy.
Let's keep track of the job owner and print it out for easier
identification.
2012-04-11 09:57:39 +02:00
Jiri Denemark
31796e2c1c qemu: Avoid excessive calls to qemuDomainObjSaveJob()
As reported by Daniel Berrangé, we have a huge performance regression
for virDomainGetInfo() due to the change which makes virDomainEndJob()
save the XML status file every time it is called. Previous to that
change, 2000 calls to virDomainGetInfo() took ~2.5 seconds. After that
change, 2000 calls to virDomainGetInfo() take 2 *minutes* 45 secs.

We made the change to be able to recover from libvirtd restart in the
middle of a job. However, only destroy and async jobs are taken care of.
Thus it makes more sense to only save domain state XML when these jobs
are started/stopped.
2012-04-11 09:57:21 +02:00
Daniel P. Berrange
ddf2dfa1f7 Wire up <loader> to set the QEMU BIOS path
* src/qemu/qemu_command.c: Wire up -bios with <loader>
* tests/qemuxml2argvdata/qemuxml2argv-bios.args,
  tests/qemuxml2argvdata/qemuxml2argv-bios.xml: Expand
  existing BIOS test case to cover <loader>
2012-04-10 16:34:39 +01:00
Eric Blake
1413560966 snapshot: fix memory leak on error
Leak introduced in commit 0436d32.  If we allocate an actions array,
but fail early enough to never consume it with the qemu monitor
transaction call, we leaked memory.

But our semantics of making the transaction command free the caller's
memory is awkward; avoiding the memory leak requires making every
intermediate function in the call chain check for error.  It is much
easier to fix things so that the function that allocates also frees,
while the call chain leaves the caller's data intact.  To do that,
I had to hack our JSON data structure to make it easy to protect a
portion of an arbitrary JSON tree from being freed.

* src/util/json.h (virJSONType): Name the enum.
(_virJSONValue): New field.
* src/util/json.c (virJSONValueFree): Use it to protect a portion
of an array.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONTransaction): Avoid
freeing caller's data.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive):
Free actions array on failure.
2012-04-06 08:39:34 -06:00
Michal Privoznik
650da0e99c qemu_ga: Don't overwrite errors on FSThaw
We can tell qemuDomainSnapshotFSThaw if we want it to report errors or
not. However, if we don't want to and an error has been already set by
previous qemuReportError() we must keep copy of that error not just a
pointer to it. Otherwise, it get overwritten if FSThaw reports an error.
2012-04-06 13:42:04 +02:00
Michal Privoznik
ea3bc548ac qemu: Build activeUsbHostdevs list on process reconnect
If the daemon is restarted it will lose list of active
USB devices assigned to active domains. Therefore we need
to rebuild this list on qemuProcessReconnect().
2012-04-04 15:09:41 +02:00
Michal Privoznik
e2f5dd6134 qemu: Delete USB devices used by domain on stop
To prevent assigning one USB device to two domains,
we keep a list of assigned USB devices. On domain
startup - qemuProcessStart() - we insert devices
used by domain into the list but remove them only
on detach-device. Devices are, however, released
on qemuProcessStop() as well.
2012-04-04 15:09:41 +02:00
Michal Privoznik
b2c7b9ee0e qemu: Don't leak temporary list of USB devices
and add debug message when adding USB device
to the list of active devices.
2012-04-04 15:09:41 +02:00
Jiri Denemark
66cab01ae1 qemu: Start nested job in qemuDomainCheckEjectableMedia
Originally, qemuDomainCheckEjectableMedia was entering monitor with qemu
driver lock. Commit 2067e31bf9, which I
made to fix that, revealed another issue we had (but didn't notice it
since the driver was locked): we didn't set nested job when
qemuDomainCheckEjectableMedia is called during migration. Thus the
original fix I made was wrong.
2012-04-02 21:44:27 +02:00
Philipp Hahn
b8bf79aad7 Support clock=variable relative to localtime
Since Xen 3.1 the clock=variable semantic is supported. In addition to
qemu/kvm Xen also knows about a variant where the offset is relative to
'localtime' instead of 'utc'.

Extends the libvirt structure with a flag 'basis' to specify, if the
offset is relative to 'localtime' or 'utc'.

Extends the libvirt structure with a flag 'reset' to force the reset
behaviour of 'localtime' and 'utc'; this is needed for backward
compatibility with previous versions of libvirt, since they report
incorrect XML.

Adapt the only user 'qemu' to the new name.
Extend the RelaxNG schema accordingly.
Document the new 'basis' attribute in the HTML documentation.
Adapt test for the new attribute.

Signed-off-by: Philipp Hahn <hahn@univention.de>
2012-04-02 09:08:31 -06:00
Eric Blake
095b0bc46a qemu: reflect any memory rounding back to xml
If we round up a user's memory request, we should update the XML
to reflect the actual value in use by the VM, rather than giving
an artificially small value back to the user.

* src/qemu/qemu_command.c (qemuBuildNumaArgStr)
(qemuBuildCommandLine): Reflect rounding back to XML.
2012-03-31 09:17:35 -06:00
Hendrik Schwartke
2711ac8716 qemu: support live change of the bridge used by a guest network device
This patch was created to resolve this upstream bug:

  https://bugzilla.redhat.com/show_bug.cgi?id=784767

and is at least a partial solution to this RHEL RFE:

  https://bugzilla.redhat.com/show_bug.cgi?id=805071

Previously the only attribute of a network device that could be
modified by virUpdateDeviceFlags() ("virsh update-device") was the
link state; attempts to change any other attribute would log an error
and fail.

This patch adds recognition of a change in bridge device name, and
supports reconnecting the guest's interface to the new device.
Standard audit logs for detaching and attaching a network device are
also generated. Although the current auditing function doesn't log the
bridge being attached to, this will later be changed in a separate
patch.
2012-03-30 20:14:36 -04:00
Laine Stump
ecde15910a qemu: eliminate nested switch, simplify code
qemuBuildHostNetStr had a switch-within-a-switch where both were
looking at the same variable. This was apparently to take advantage of
code common to three different cases (while also taking care of some
code that was different). However, there were only 2 lines common to
all, one of those can be eliminated by merging it into the
virAsprintfs that are in each case. On top of that, all the extra
empty cases cause Coverity complaints (because they are unreachable),
but absence of the empty cases causes a compile error due to
"enumeration value not handled in switch".

The solution is to just make each toplevel case independent, folding
in the common code to each.
2012-03-30 12:41:18 -04:00
Laine Stump
3269ee657c qemu: set default name for SPICE agent channel when generating command
commit b0e2bb33 set a default value for the SPICE agent channel by
inserting it during parsing of the channel XML. That method of setting
a default is problematic because it makes a format/parse roundtrip
unclean, and experience with setting other values as a side effect of
parsing has led to headaches (e.g. automatically setting a MAC address
in the parser when one isn't specified in the input XML).

This patch does not revert commit b0e2bb33 (it will be reverted in a
separate patch) but adds the alternate implementation of simply
inserting the default value in the appropriate place on the qemu
commandline when no value is provided.
2012-03-30 12:37:52 -04:00
Michal Privoznik
075c8518c6 qemu_agent: Issue guest-sync prior to every command
If we issue guest command and GA is not running, the issuing thread
will block endlessly. We can check for GA presence by issuing
guest-sync with unique ID (timestamp). We don't want to issue real
command as even if GA is not running, once it is started, it process
all commands written to GA socket.
2012-03-30 18:16:17 +02:00
Daniel P. Berrange
ec8cae93db Consistent style for usage of sizeof operator
The code is splattered with a mix of

  sizeof foo
  sizeof (foo)
  sizeof(foo)

Standardize on sizeof(foo) and add a syntax check rule to
enforce it

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-03-30 11:47:24 +01:00
Wen Congyang
ff68d6eeb5 fix a deadlock when qemu cannot start
When qemu cannot start, we may call qemuProcessStop() twice.
We have check whether the vm is running at the beginning of
qemuProcessStop() to avoid libvirt deadlock. We call
qemuProcessStop() with driver and vm locked. It seems that
we can avoid libvirt deadlock. But unfortunately we may
unlock driver and vm in the function qemuProcessKill() while
vm->def->id is not -1. So qemuProcessStop() will be run twice,
and monitor will be freed unexpectedly. So we should set
vm->def->id to -1 at the beginning of qemuProcessStop().
2012-03-30 14:21:49 +08:00
Christian Benvenuti
a02500d010 qemu: Make migration fail when port profile association fails on the dst host
In the current V3 migration protocol, Libvirt does not
check the result of the function

  qemuMigrationVPAssociatePortProfiles

This means that it is possible for a migration to complete
successfully even when the VM loses network connectivity on
the destination host.

With this change libvirt aborts the migration
(during the "finish" step) when the above function fails, that
is to say when at least one of the port profile associations fails.

Signed-off by: Christian Benvenuti <benve@cisco.com>
2012-03-28 10:45:22 -06:00
Eric Blake
a14eda311e snapshot: don't pass NULL to QMP command creation
Commit d42a2ff caused a regression in creating a disk-only snapshot
of a qcow2 disk; by passing the wrong variable to the monitor call,
libvirt ended up creating JSON that looked like "format":null instead
of the intended "format":"qcow2".

To make it easier to diagnose this in the future, make JSON creation
error out if "s:arg" is paired with NULL (it is still possible to
use "n:arg" in the rare cases where qemu will accept a null).

* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Pass correct value.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMakeCommandRaw):
Improve error message.
2012-03-27 09:34:07 -06:00
D. Herrendoerfer
bd6b0a052e qemu,util: on restart of libvirt restart vepa callbacks
When libvirtd is restarted, also restart the netlink event
message callbacks for existing VEPA connections and send
a message to lldpad for these existing links, so it learns
the new libvirtd pid.

Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
2012-03-27 10:48:39 -04:00
Jiri Denemark
2067e31bf9 qemu: Avoid entering monitor with locked driver
This avoids possible deadlock of the qemu driver in case a domain is
begin migrated (in Begin phase) and unrelated connection to qemu driver
is closed at the right time.

I checked all callers of qemuDomainCheckEjectableMedia() and they are
calling this function with qemu driver locked.
2012-03-27 14:18:12 +02:00
Laine Stump
ecb4d92d57 build: fix "missing initializer" error in qemu_process.c
Found when attempting to build on Fedora 17 alpha with:

   ./autogen.sh --system --enable-compile-warnings=error

(this same build command works without problem on Fedora 16). Since
the consumer of the qemuProcessReconnectData doesn't assume that the
other fields of the struct are initialized (although it uses them
internally), the simpler solution is to just switch to C99-style
struct initialization (which doesn't require specification of all
fields).
2012-03-26 17:08:30 -04:00
Laine Stump
cf57d345b5 build: avoid frame size error when building without -O2
libvirt always adds -Werror-frame-larger-than=4096 to the flags when
it builds. When building on Fedora 17, two functions with multiple
1024 buffers declared inside if {} blocks would generate frame size
errors; apparently the version of gcc on Fedora 16 will merge these
multiple buffers into a single buffer even when optimization is off,
but Fedora 17 won't.

The fix is to declare a single 1024 buffer at the top of the two
offending functions, and reuse the single buffer throughout the
functions.
2012-03-26 17:08:30 -04:00
Martin Kletzander
9943276fd2 Cleanup for a return statement in source files
Return statements with parameter enclosed in parentheses were modified
and parentheses were removed. The whole change was scripted, here is how:

List of files was obtained using this command:
git grep -l -e '\<return\s*([^()]*\(([^()]*)[^()]*\)*)\s*;' |             \
grep -e '\.[ch]$' -e '\.py$'

Found files were modified with this command:
sed -i -e                                                                 \
's_^\(.*\<return\)\s*(\(\([^()]*([^()]*)[^()]*\)*\))\s*\(;.*$\)_\1 \2\4_' \
-e 's_^\(.*\<return\)\s*(\([^()]*\))\s*\(;.*$\)_\1 \2\3_'

Then checked for nonsense.

The whole command looks like this:
git grep -l -e '\<return\s*([^()]*\(([^()]*)[^()]*\)*)\s*;' |             \
grep -e '\.[ch]$' -e '\.py$' | xargs sed -i -e                            \
's_^\(.*\<return\)\s*(\(\([^()]*([^()]*)[^()]*\)*\))\s*\(;.*$\)_\1 \2\4_' \
-e 's_^\(.*\<return\)\s*(\([^()]*\))\s*\(;.*$\)_\1 \2\3_'
2012-03-26 14:45:22 -06:00
Osier Yang
beb76e3742 spec: Add missed dependancy for numad
numad is available since Fedora 17 and RHEL6.X. And it's not supported
on s390[x] and ARM.
2012-03-24 09:35:20 +08:00
Eric Blake
d42a2ffc07 snapshot: improve qemu handling of reused snapshot targets
The oVirt developers have stated that the real reasons they want
to have qemu reuse existing volumes when creating a snapshot are:
1. the management framework is set up so that creation has to be
done from a central node for proper resource tracking, and having
libvirt and/or qemu create things violates the framework, and
2. qemu defaults to creating snapshots with an absolute path to
the backing file, but oVirt wants to manage a backing chain that
uses just relative names, to allow for easier migration of a chain
across storage locations.

When 0.9.10 added VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT (commit
4e9953a4), it only addressed point 1, but libvirt was still using
O_TRUNC which violates point 2.  Meanwhile, the new qemu
'transaction' monitor command includes a new optional mode argument
that will force qemu to reuse the metadata of the file it just
opened (with the burden on the caller to have valid metadata there
in the first place).  So, this tweaks the meaning of the flag to
cover both points as intended for use by oVirt.  It is not strictly
backward-compatible to 0.9.10 behavior, but it can be argued that
the O_TRUNC of 0.9.10 was a bug.

Note that this flag is all-or-nothing, and only selects between
'existing' and the default 'absolute-paths'.  A more flexible
approach that would allow per-disk selections, as well as adding
support for the 'no-backing-file' mode, would be possible by
extending the <domainsnapshot> xml to have a per-disk mode, but
until we have a management application expressing a need for that
additional complexity, it is not worth doing.

* src/libvirt.c (virDomainSnapshotCreateXML): Tweak documentation.
* src/qemu/qemu_monitor.h (qemuMonitorDiskSnapshot): Add
parameters.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): Pass them
through.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot): Use
new monitor command arguments.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive)
(qemuDomainSnapshotCreateSingleDiskActive): Adjust callers.
(qemuDomainSnapshotDiskPrepare): Allow qed, modify rules on reuse.
2012-03-23 16:38:20 -06:00
Eric Blake
0436d328f5 snapshot: wire up qemu transaction command
The hardest part about adding transactions is not using the new
monitor command, but undoing the partial changes we made prior
to a failed transaction.

* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive): Use
transaction when available.
(qemuDomainSnapshotUndoSingleDiskActive): New function.
(qemuDomainSnapshotCreateSingleDiskActive): Pass through actions.
(qemuDomainSnapshotCreateXML): Adjust caller.
2012-03-23 16:38:20 -06:00
Eric Blake
64d5e815b7 snapshot: add support for qemu transaction command
QEmu 1.1 is adding a 'transaction' command to the JSON monitor.
Each element of a transaction corresponds to a top-level command,
with the additional guarantee that the transaction flushes all
pending I/O, then guarantees that all actions will be successful
as a group or that failure will roll back the state to what it
was before the monitor command.  The difference between a
top-level command:

{ "execute": "blockdev-snapshot-sync", "arguments":
  { "device": "virtio0", ... } }

and a transaction:

{ "execute": "transaction", "arguments":
  { "actions": [
    { "type": "blockdev-snapshot-sync", "data":
      { "device": "virtio0", ... } } ] } }

is just a couple of changed key names and nesting the shorter
command inside a JSON array to the longer command.  This patch
just adds the framework; the next patch will actually use a
transaction.

* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMakeCommand): Move
guts...
(qemuMonitorJSONMakeCommandRaw): ...into new helper.  Add support
for array element.
(qemuMonitorJSONTransaction): New command.
(qemuMonitorJSONDiskSnapshot): Support use in a transaction.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot): Add
argument.
(qemuMonitorJSONTransaction): New declaration.
* src/qemu/qemu_monitor.h (qemuMonitorTransaction): Likewise.
(qemuMonitorDiskSnapshot): Add argument.
* src/qemu/qemu_monitor.c (qemuMonitorTransaction): New wrapper.
(qemuMonitorDiskSnapshot): Pass argument on.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Update caller.
2012-03-23 16:38:20 -06:00
Eric Blake
4c4cc1b96d snapshot: rudimentary qemu support for atomic disk snapshot
Taking an external snapshot of just one disk is atomic, without having
to pause and resume the VM.  This also paves the way for later patches
to interact with the new qemu 'transaction' monitor command.

The various scenarios when requesting atomic are:
online, 1 disk, old qemu - safe, allowed by this patch
online, more than 1 disk, old qemu - failure, this patch
offline snapshot - safe, once a future patch implements offline disk snapshot
online, 1 or more disks, new qemu - safe, once future patch uses transaction

Taking an online system checkpoint snapshot is atomic, since it is
done via a single 'savevm' monitor command.  Taking an offline system
checkpoint snapshot is atomic, thanks to the previous patch.

* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Support
new flag for single-disk setups.
(qemuDomainSnapshotDiskPrepare): Check for atomic here.
(qemuDomainSnapshotCreateDiskActive): Skip pausing the VM when
atomic supported.
(qemuDomainSnapshotIsAllowed): Use bool instead of int.
2012-03-23 16:38:20 -06:00
Eric Blake
922d498e1c snapshot: make offline qemu snapshots atomic
Offline internal snapshots can be rolled back with just a little
bit of refactoring, meaning that we are now automatically atomic.

* src/qemu/qemu_domain.c (qemuDomainSnapshotForEachQcow2): Move
guts...
(qemuDomainSnapshotForEachQcow2Raw): ...to new helper, to allow
rollbacks.
2012-03-23 16:38:20 -06:00
Eric Blake
311357d9e3 snapshot: add qemu capability for 'transaction' command
We need a capability bit to gracefully error out if some of the
additions in future patches can't be implemented by the running qemu.

* src/qemu/qemu_capabilities.h (QEMU_CAPS_TRANSACTION): New cap.
* src/qemu/qemu_capabilities.c (qemuCaps): Name it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set
it.
2012-03-23 16:38:19 -06:00
Osier Yang
7c5a0c94e4 qemu: Update domain status to running while wakeup event is emitted
This introduces a new running reason VIR_DOMAIN_RUNNING_WAKEUP,
and new suspend event type VIR_DOMAIN_EVENT_STARTED_WAKEUP.

While a wakeup event is emitted, the domain which entered into
VIR_DOMAIN_PMSUSPENDED will be transferred to "running"
with reason VIR_DOMAIN_RUNNING_WAKEUP, and a new domain lifecycle
event emitted with type VIR_DOMAIN_EVENT_STARTED_WAKEUP.
2012-03-23 23:12:29 +08:00
Osier Yang
321fa64bf5 qemu: Update domain state to pmsuspended while suspend event occurs 2012-03-23 23:12:26 +08:00
Osier Yang
487c063381 Add support for the suspend event
This patch introduces a new event type for the QMP event
SUSPEND:

    VIR_DOMAIN_EVENT_ID_PMSUSPEND

The event doesn't take any data, but considering there might
be reason for wakeup in future, the callback definition is:

typedef void
(*virConnectDomainEventSuspendCallback)(virConnectPtr conn,
                                        virDomainPtr dom,
                                        int reason,
                                        void *opaque);

"reason" is unused currently, always passes "0".
2012-03-23 23:12:18 +08:00
Osier Yang
57ddcc235a Add support for the wakeup event
This patch introduces a new event type for the QMP event
WAKEUP:

    VIR_DOMAIN_EVENT_ID_PMWAKEUP

The event doesn't take any data, but considering there might
be reason for wakeup in future, the callback definition is:

typedef void
(*virConnectDomainEventWakeupCallback)(virConnectPtr conn,
                                       virDomainPtr dom,
                                       int reason,
                                       void *opaque);

"reason" is unused currently, always passes "0".
2012-03-23 23:12:14 +08:00
Osier Yang
2d19e33f97 qemu: Update tray status while tray moved event is emitted
With this patch, libvirt won't start the guest with the medium
source which already ejected by guest when doing migration, or
saving/restoring.
2012-03-23 23:12:09 +08:00
Osier Yang
7fcf943bcd qemu: Prohibit setting tray status as open for block type disk 2012-03-23 23:12:02 +08:00
Osier Yang
ad7db43913 qemu: Do not start with source for removable disks if tray is open
This is similiar with physical world, one will be surprised if the
box starts with medium exists while the tray is open.

New tests are added, tests disk-{cdrom,floppy}-tray are for the qemu
supports "-device" flag, and disk-{cdrom,floppy}-no-device-cap are
for old qemu, i.e. which doesn't support "-device" flag.
2012-03-23 23:11:54 +08:00
Osier Yang
a26a1969c3 Add support for event tray moved of removable disks
This patch introduces a new event type for the QMP event
DEVICE_TRAY_MOVED, which occurs when the tray of a removable
disk is moved (i.e opened or closed):

    VIR_DOMAIN_EVENT_ID_TRAY_CHANGE

The event's data includes the device alias and the reason
for tray status' changing, which indicates why the tray
status was changed. Thus the callback definition for the event
is:

enum {
    VIR_DOMAIN_EVENT_TRAY_CHANGE_OPEN = 0,
    VIR_DOMAIN_EVENT_TRAY_CHANGE_CLOSE,

\#ifdef VIR_ENUM_SENTINELS
    VIR_DOMAIN_EVENT_TRAY_CHANGE_LAST
\#endif
} virDomainEventTrayChangeReason;

typedef void
(*virConnectDomainEventTrayChangeCallback)(virConnectPtr conn,
                                           virDomainPtr dom,
                                           const char *devAlias,
                                           int reason,
                                           void *opaque);
2012-03-23 23:10:26 +08:00
Daniel P. Berrange
1f66c18f79 Centralize error reporting for URI parsing/formatting problems
Move error reporting out of the callers, into virURIParse
and virURIFormat, to get consistency.

* include/libvirt/virterror.h, src/util/virterror.c: Add VIR_FROM_URI
* src/util/viruri.c, src/util/viruri.h: Add error reporting
* src/esx/esx_driver.c, src/libvirt.c, src/libxl/libxl_driver.c,
  src/lxc/lxc_driver.c, src/openvz/openvz_driver.c,
  src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
  src/remote/remote_driver.c, src/uml/uml_driver.c,
  src/vbox/vbox_tmpl.c, src/vmx/vmx.c, src/xen/xen_driver.c,
  src/xen/xend_internal.c, tests/viruritest.c: Remove error
  reporting

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-03-23 12:59:21 +00:00
Daniel P. Berrange
c33dae3175 Use virURIFree instead of xmlFreeURI
Since we defined a custom virURIPtr type, we should use a
virURIFree method instead of assuming it will always be
a typedef for xmlURIPtr

* src/util/viruri.c, src/util/viruri.h, src/libvirt_private.syms:
  Add a virURIFree method
* src/datatypes.c, src/esx/esx_driver.c, src/libvirt.c,
  src/qemu/qemu_migration.c, src/vmx/vmx.c, src/xen/xend_internal.c,
  tests/viruritest.c: s/xmlFreeURI/virURIFree/

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-03-23 12:59:20 +00:00
Jiri Denemark
1fdc53c385 qemu: Avoid dangling migration-out job when client dies
When a client which started non-p2p migration dies in a bad time, the
source libvirtd never clears the migration job and almost nothing can be
done with the domain without restarting the daemon. This patch makes use
of connection close callbacks and ensures that migration job is properly
discarded when the client disconnects.
2012-03-21 17:31:09 +01:00
Jiri Denemark
527d867a94 qemu: Make autodestroy utilize connection close callbacks 2012-03-21 17:31:09 +01:00
Jiri Denemark
791273603e qemu: Add connection close callbacks
Add support for registering arbitrary callback to be called for a domain
when a connection gets closed.
2012-03-21 17:31:09 +01:00
Jiri Denemark
4f061ea641 qemu: Avoid dangling migration-in job on shutoff domains
Destination daemon should not rely on the client or source daemon
(depending on the type of migration) to call Finish when migration
fails, because the client may crash before it can do so. The domain
prepared for incoming migration is set to be destroyed (and migration
job cleaned up) when connection with the client closes but this is not
enough. If the associated qemu process crashes after Prepare step and
the domain is cleaned up before the connection gets closed, autodestroy
is not called for the domain and migration jobs remains set. In case the
domain is defined on destination host (i.e., it is not completely
removed once destroyed) we keep the job set for ever. To fix this, we
register a cleanup callback which is responsible to clean migration-in
job when a domain dies anywhere between Prepare and Finish steps. Note
that we can't blindly clean any job when spotting EOF on monitor since
normally an API is running at that time.
2012-03-21 17:31:09 +01:00
Jiri Denemark
bf9f0a9726 qemu: Add support for domain cleanup callbacks
Add support for registering cleanup callbacks to be run when a domain
transitions to shutoff state.
2012-03-21 17:31:08 +01:00
Jiri Denemark
9f71368d06 qemu: Use unlimited speed when migrating to file
This reverts commit 61f2b6ba5f and most of
commit d8916dc8e2, which effectively
brings back commit ef1065cf5a written by
Jim Fehlig:

The qemu migration speed default is 32MiB/s as defined in migration.c

/* Migration speed throttling */
static int64_t max_throttle = (32 << 20);

There's no need to throttle migration when targeting a file, so set
migration speed to unlimited prior to migration, and restore to libvirt
default value after migration.

Default units is MB for migrate_set_speed monitor command, so
(INT64_MAX / (1024 * 1024)) is used for unlimited migration speed.

This was reverted because migration to file could not be canceled and
even monitored since qemu was not processing any monitor commands until
the migration finished. This is now different as we make sure the
file descriptor we pass to qemu is able to properly report EAGAIN.
Recent qemu changes might have helped as well.

I tested managedsave with this patch in and indeed, it is 10x faster
while I can still monitor its progress.
2012-03-21 17:26:20 +01:00
Eric Blake
7c736bab06 snapshot: make quiesce a bit safer
If a guest is paused, we were silently ignoring the quiesce flag,
which results in unclean snapshots, contrary to the intent of the
flag.  Since we can't quiesce without guest agent support, we should
instead fail if the guest is not running.

Meanwhile, if we attempt a quiesce command, but the guest agent
doesn't respond, and we time out, we may have left the command
pending on the guest's queue, and when the guest resumes parsing
commands, it will freeze even though our command is no longer
around to issue a thaw.  To be safe, we must _always_ pair every
quiesce call with a counterpart thaw, even if the quiesce call
failed due to a timeout, so that if a guest wakes up and starts
processing a command backlog, it will not get stuck in a frozen
state.

* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive):
Always issue thaw after a quiesce, even if quiesce failed.
(qemuDomainSnapshotFSThaw): Add a parameter.
2012-03-19 10:58:18 -06:00
Daniel P. Berrange
f987d17511 Fix handling of blkio deviceWeight empty string
A common coding pattern for changing blkio parameters is

  1. virDomainGetBlkioParameters

  2. change one or more params

  3. virDomainSetBlkioParameters

For this to work, it must be possible to roundtrip through
the methods without error. Unfortunately virDomainGetBlkioParameters
will return "" for the deviceWeight parameter for guests by default,
which virDomainSetBlkioParameters will then reject as invalid.

This fixes the handling of "" to be a no-op, and also improves the
error message to tell you what was invalid
2012-03-16 15:05:05 +00:00
Michal Privoznik
362c3b33e6 qemuDomainDetachPciDiskDevice: Free allocated cgroup
This function potentially allocates new virCgroup but never
frees it.
2012-03-15 17:10:22 +01:00
Laine Stump
89ae6a5a30 Emit graphics events when a SPICE client connects/disconnects
Wire up the domain graphics event notifications for SPICE. Adapted
from a RHEL-only patch written by Dan Berrange that used custom
__com.redhat_SPICE events - equivalent events are now available in
upstream QEMU (including a SPICE_CONNECTED event, which was missing in
the __COM.redhat_SPICE version).

* src/qemu/qemu_monitor_json.c: Wire up SPICE graphics events
2012-03-15 11:27:37 -04:00
Osier Yang
d86120fc52 numad: Fix typo and warning
src/libvirt_private.syms:
  s/virDomainCpuPlacement/virDomainCpuPlacementMode/
src/qemu/qemu_process.c
  def->mem.cur_balloon expects "llu"
--
pushed under build-breaker rule
2012-03-15 19:43:42 +08:00
Osier Yang
0f8e7ae33a qemu: Support numad
numad is an user-level daemon that monitors NUMA topology and
processes resource consumption to facilitate good NUMA resource
alignment of applications/virtual machines to improve performance
and minimize cost of remote memory latencies. It provides a
pre-placement advisory interface, so significant processes can
be pre-bound to nodes with sufficient available resources.

More details: http://fedoraproject.org/wiki/Features/numad

"numad -w ncpus:memory_amount" is the advisory interface numad
provides currently.

This patch add the support by introducing a new XML attribute
for <vcpu>. e.g.

  <vcpu placement="auto">4</vcpu>
  <vcpu placement="static" cpuset="1-10^6">4</vcpu>

The returned advisory nodeset from numad will be printed
in domain's dumped XML. e.g.
  <vcpu placement="auto" cpuset="1-10^6">4</vcpu>

If placement is "auto", the number of vcpus and the current
memory amount specified in domain XML will be used for numad
command line (numad uses MB for memory amount):
  numad -w $num_of_vcpus:$current_memory_amount / 1024

The advisory nodeset returned from numad will be used to set
domain process CPU affinity then. (e.g. qemuProcessInitCpuAffinity).

If the user specifies both CPU affinity policy (e.g.
(<vcpu cpuset="1-10,^7,^8">4</vcpu>) and placement == "auto"
the specified CPU affinity will be overridden.

Only QEMU/KVM drivers support it now.

See docs update in patch for more details.
2012-03-15 12:24:56 +08:00
Osier Yang
3165602a55 qemu: Use scsi-block for lun passthrough instead of scsi-disk
And don't allow to hotplug a usb disk with "device == lun". This
is the missed pieces in previous virtio-scsi patchset:

http://www.redhat.com/archives/libvir-list/2012-February/msg01052.html
2012-03-14 23:32:53 +08:00
Michal Privoznik
823a27c628 qemu: Reverse condition in qemuDomainCheckDiskPresence
With current code, we pass true iff domain is cold booting. However,
if disk is inaccessible and startupPolicy for that disk is set to
'requisite' we have to fail iff cold booting.
2012-03-14 12:52:46 +01:00
Michal Privoznik
2e4defdca7 graphics: Cleanup port policy
Even though we say in documentation setting (tls-)port to -1 is legacy
compat style for enabling autoport, we're roughly doing this for VNC.
However, in case of SPICE auto enable autoport iff both port & tlsPort
are equal -1 as documentation says autoport plays with both.
2012-03-13 09:48:25 +01:00
Guannan Ren
19c7980ee6 qemu: fix segfault when detaching non-existent network device
In qemuDomainDetachNetDevice, detach was being used before it had been
validated. If no matching device was found, this resulted in a
dereference of a NULL pointer.

This behavior was a regression introduced in commit
cf90342be0, so it has not been a part of
any official libvirt release.
2012-03-13 03:06:35 -04:00
Jiri Denemark
041109afef qemu: Fix (managed)save and snapshots with host mode CPU
When host-model and host-passthrouh CPU modes were introduced, qemu
driver was properly modify to update guest CPU definition during
migration so that we use the right CPU at the destination. However,
similar treatment is needed for (managed)save and snapshots since they
need to save the exact CPU so that a domain can be properly restored.
To avoid repetition of such situation, all places that need live XML
share the code which generates it.

As a side effect, this patch fixes error reporting from
qemuDomainSnapshotWriteMetadata().
2012-03-13 07:59:36 +01:00
Eric Blake
759095f636 cpustats: report user and sys times
Thanks to cgroups, providing user vs. system time of the overall
guest is easy to add to our existing API.

* include/libvirt/libvirt.h.in (VIR_DOMAIN_CPU_STATS_USERTIME)
(VIR_DOMAIN_CPU_STATS_SYSTEMTIME): New constants.
* src/util/virtypedparam.h (virTypedParameterArrayValidate)
(virTypedParameterAssign): Enforce checking the result.
* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Fix offender.
(qemuDomainGetTotalcpuStats): Implement new parameters.
* tools/virsh.c (cmdCPUStats): Tweak output accordingly.
2012-03-12 08:46:56 -06:00
Eric Blake
6e0ff1d402 qemu: support disk filenames with comma
If there is a disk file with a comma in the name, QEmu expects a double
comma instead of a single one (e.g., the file "virtual,disk.img" needs
to be specified as "virtual,,disk.img" in QEmu's command line). This
patch fixes libvirt to work with that feature. Fix RHBZ #801036.

Based on an initial patch by Crístian Viana.

* src/util/buf.h (virBufferEscape): Alter signature.
* src/util/buf.c (virBufferEscape): Add parameter.
(virBufferEscapeSexpr): Fix caller.
* src/qemu/qemu_command.c (qemuBuildRBDString): Likewise.  Also
escape commas in file names.
(qemuBuildDriveStr): Escape commas in file names.
* docs/schemas/basictypes.rng (absFilePath): Relax RNG to allow
commas in input file names.
* tests/qemuxml2argvdata/*-disk-drive-network-sheepdog.*: Update
test.

Signed-off-by: Eric Blake <eblake@redhat.com>
2012-03-12 08:09:37 -06:00
Daniel Veillard
dd39f13af0 Fix a few typo in translated strings
this was raised by our hindi localization team
chandan kumar <chandankumar.093047@gmail.com>
2012-03-12 17:41:26 +08:00
Michal Privoznik
ee4907320f qemuBuildCommandLine: Don't add tlsPort if none set
If user hasn't supplied any tlsPort we default to setting it
to zero in our internal structure. However, when building command
line we test it against -1 which is obviously wrong.
2012-03-09 08:49:10 +01:00
Peng Zhou
896e6ac4f8 qemu: spice agent-mouse support
spice agent-mouse support

Usage:
  <graphics type='spice'>
    <mouse mode='client'|'server'/>
  <graphics/>

Signed-off-by: Osier Yang <jyang@redhat.com>
2012-03-09 15:26:24 +08:00
Laine Stump
7a23ba090d qemu: eliminate memory leak in qemuDomainUpdateDeviceConfig
This function was freeing a virDomainNetDef with
VIR_FREE(). virDomainNetDef is a complex structure with many pointers
to other dynamically allocated data; to properly free it
virDomainNetDefFree() must be called instead, otherwise several
strings (and potentially other things) will be leaked.
2012-03-08 16:58:53 -05:00
Laine Stump
edb6fc3a7f qemu: support persistent hotplug of <hostdev> devices
For some reason, although live hotplug of <hostdev> devices is
supported, persistent hotplug is not. This patch adds the proper
VIR_DOMAIN_DEVICE_HOSTDEV cases to the switches in
qemuDomainAttachDeviceConfig and qemuDomainDetachDeviceConfig.
2012-03-08 16:58:40 -05:00
Laine Stump
f985773d06 util: eliminate device object leaks related to virDomain*Remove*()
There are several functions in domain_conf.c that remove a device
object from the domain's list of that object type, but don't free the
object or return it to the caller to free. In many cases this isn't a
problem because the caller already had a pointer to the object and
frees it afterward, but in several cases the removed object was just
left floating around with no references to it.

In particular, the function qemuDomainDetachDeviceConfig() calls
functions to locate and remove net (virDomainNetRemoveByMac), disk
(virDomainDiskRemoveByName()), and lease (virDomainLeaseRemove())
devices, but neither it nor its caller qemuDomainModifyDeviceConfig()
ever obtain a pointer to the device being removed, much less free it.

This patch modifies the following "remove" functions to return a
pointer to the device object being removed from the domain device
arrays, to give the caller the option of freeing the device object
using that pointer if needed. In places where the object was
previously leaked, it is now freed:

  virDomainDiskRemove
  virDomainDiskRemoveByName
  virDomainNetRemove
  virDomainNetRemoveByMac
  virDomainHostdevRemove
  virDomainLeaseRemove
  virDomainLeaseRemoveAt

The functions that had been leaking:

  libxlDomainDetachConfig - leaked a virDomainDiskDef
  qemuDomainDetachDeviceConfig - could leak a virDomainDiskDef,
                            a virDomainNetDef, or a
                            virDomainLeaseDef
  qemuDomainDetachLease   - leaked a virDomainLeaseDef
2012-03-08 16:58:27 -05:00
Laine Stump
b59e59845f qemu: don't 'remove' hostdev objects from domain if operation fails
There were certain paths through the hostdev detach code that could
lead to the lower level function failing (and not removing the object
from the domain's hostdevs list), but the higher level function
free'ing the hostdev object anyway. This would leave a stale
hostdevdef pointer in the list, which would surely cause a problem
eventually.

This patch relocates virDomainHostdevRemove from the lower level
functions qemuDomainDetachThisHostDevice and
qemuDomainDetachHostPciDevice, to their caller
qemuDomainDetachThisHostDevice, placing it just before the call to
virDomainHostdevDefFree. This makes it easy to verify that either both
operations are done, or neither.

NB: The "dangling pointer" part of this problem was introduced in
commit 13d5a6, so it is not present in libvirt versions prior to
0.9.9. Earlier versions would return failure in certain cases even
though the the device object was removed/deleted, but the removal and
deletion operations would always both happen or neither.
2012-03-08 16:58:22 -05:00
Ansis Atteka
ac8bbdbdfa Attach vm-id to Open vSwitch interfaces.
This patch will allow OpenFlow controllers to identify which interface
belongs to a particular VM by using the Domain UUID.

ovs-vsctl get Interface vnet0 external_ids
{attached-mac="52:54:00:8C:55:2C", iface-id="83ce45d6-3639-096e-ab3c-21f66a05f7fa", iface-status=active, vm-id="142a90a7-0acc-ab92-511c-586f12da8851"}

V2 changes:
Replaced vm-uuid with vm-id. There was a discussion in Open vSwitch
mailinglist that we should stick with the same DB key postfixes for the
sake of consistency (e.g iface-id, vm-id ...).
2012-03-08 14:44:15 -05:00
Michal Privoznik
1e0534a770 qemu: Don't parse device twice in attach/detach
Some members are generated during XML parse (e.g. MAC address of
an interface); However, with current implementation, if we
are plugging a device both to persistent and live config,
we parse given XML twice: first time for live, second for config.
This is wrong then as the second time we are not guaranteed
to generate same values as we did for the first time.
To prevent that we need to create a copy of DeviceDefPtr;
This is done through format/parse process instead of writing
functions for deep copy as it is easier to maintain:
adding new field to any virDomain*DefPtr doesn't require change
of copying function.
2012-03-08 10:20:21 +01:00
Michal Privoznik
b819b3b7cf qemu: Fix startupPolicy for snapshot-revert
Currently, startupPolicy='requisite' was determining cold boot
by migrateFrom != NULL. That means, if domain was started up
with migrateFrom set we didn't require disk source path and allowed
it to be dropped. However, on snapshot-revert domain wasn't migrated
but according to documentation, requisite should drop disk source
as well.
2012-03-08 10:03:08 +01:00
Eric Blake
4888f0fb56 xml: use better types for memory values
Using 'unsigned long' for memory values is risky on 32-bit platforms,
as a PAE guest can have more than 4GiB memory.  Our API is
(unfortunately) locked at 'unsigned long' and a scale of 1024, but
the rest of our system should consistently use 64-bit values,
especially since the previous patch centralized overflow checking.

* src/conf/domain_conf.h (_virDomainDef): Always use 64-bit values
for memory.  Change hugepage_backed to a bool.
* src/conf/domain_conf.c (virDomainDefParseXML)
(virDomainDefCheckABIStability, virDomainDefFormatInternal): Fix
clients.
* src/vmx/vmx.c (virVMXFormatConfig): Likewise.
* src/xenxs/xen_sxpr.c (xenParseSxpr, xenFormatSxpr): Likewise.
* src/xenxs/xen_xm.c (xenXMConfigGetULongLong): New function.
(xenXMConfigGetULong, xenXMConfigSetInt): Avoid truncation.
(xenParseXM, xenFormatXM): Fix clients.
* src/phyp/phyp_driver.c (phypBuildLpar): Likewise.
* src/openvz/openvz_driver.c (openvzDomainSetMemoryInternal):
Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainDefineXML): Likewise.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Likewise.
* src/qemu/qemu_process.c (qemuProcessStart): Likewise.
* src/qemu/qemu_monitor.h (qemuMonitorGetBalloonInfo): Likewise.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONGetBalloonInfo):
Likewise.
* src/qemu/qemu_driver.c (qemudDomainGetInfo)
(qemuDomainGetXMLDesc): Likewise.
* src/uml/uml_conf.c (umlBuildCommandLine): Likewise.
2012-03-07 18:24:44 -07:00
Eric Blake
73b9977140 xml: use long long internally, to centralize overflow checks
On 64-bit platforms, unsigned long and unsigned long long are
identical, so we don't have to worry about overflow checks.
On 32-bit platforms, anywhere we narrow unsigned long long back
to unsigned long, we have to worry about overflow; it's easier
to do this in one place by having most of the code use the same
or wider types, and only doing the narrowing at the last minute.
Therefore, the memory set commands remain unsigned long, and
the memory get command now centralizes the overflow check into
libvirt.c, so that drivers don't have to repeat the work.

This also fixes a bug where xen returned the wrong value on
failure (most APIs return -1 on failure, but getMaxMemory
must return 0 on failure).

* src/driver.h (virDrvDomainGetMaxMemory): Use long long.
* src/libvirt.c (virDomainGetMaxMemory): Raise overflow.
* src/test/test_driver.c (testGetMaxMemory): Fix driver.
* src/rpc/gendispatch.pl (name_to_ProcName): Likewise.
* src/xen/xen_hypervisor.c (xenHypervisorGetMaxMemory): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainGetMaxMemory): Likewise.
* src/xen/xend_internal.c (xenDaemonDomainGetMaxMemory):
Likewise.
* src/xen/xend_internal.h (xenDaemonDomainGetMaxMemory):
Likewise.
* src/xen/xm_internal.c (xenXMDomainGetMaxMemory): Likewise.
* src/xen/xm_internal.h (xenXMDomainGetMaxMemory): Likewise.
* src/xen/xs_internal.c (xenStoreDomainGetMaxMemory): Likewise.
* src/xen/xs_internal.h (xenStoreDomainGetMaxMemory): Likewise.
* src/xenapi/xenapi_driver.c (xenapiDomainGetMaxMemory):
Likewise.
* src/esx/esx_driver.c (esxDomainGetMaxMemory): Likewise.
* src/libxl/libxl_driver.c (libxlDomainGetMaxMemory): Likewise.
* src/qemu/qemu_driver.c (qemudDomainGetMaxMemory): Likewise.
* src/lxc/lxc_driver.c (lxcDomainGetMaxMemory): Likewise.
* src/uml/uml_driver.c (umlDomainGetMaxMemory): Likewise.
2012-03-07 18:24:43 -07:00
Eric Blake
239fb8c46b api: add overflow error
Overflow can be user-induced, so it deserves more than being called
an internal error.  Note that in general, 32-bit platforms have
far more places to trigger this error (anywhere the public API
used 'unsigned long' but the other side of the connection is a
64-bit server); but some are possible on 64-bit platforms (where
the public API computes the product of two numbers).

* include/libvirt/virterror.h (VIR_ERR_OVERFLOW): New error.
* src/util/virterror.c (virErrorMsg): Translate it.
* src/libvirt.c (virDomainSetVcpusFlags, virDomainGetVcpuPinInfo)
(virDomainGetVcpus, virDomainGetCPUStats): Use it.
* daemon/remote.c (HYPER_TO_TYPE): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockResize): Likewise.
2012-03-07 18:24:43 -07:00
Eric Blake
462dc569de rpc: allow truncated return for virDomainGetCPUStats
The RPC code assumed that the array returned by the driver would be
fully populated; that is, ncpus on entry resulted in ncpus * return
value on exit.  However, while we don't support holes in the middle
of ncpus, we do want to permit the case of ncpus on entry being
longer than the array returned by the driver (that is, it should be
safe for the caller to pass ncpus=128 on entry, and the driver will
stop populating the array when it hits max_id).

Additionally, a successful return implies that the caller will then
use virTypedParamArrayClear on the entire array; for this to not
free uninitialized memory, the driver must ensure that all skipped
entries are explicitly zeroed (the RPC driver did this, but not
the qemu driver).

There are now three cases:
server 0.9.10 and client 0.9.10 or newer: No impact - there were no
hypervisor drivers that supported cpu stats

server 0.9.11 or newer and client 0.9.10: if the client calls with
ncpus beyond the max, then the rpc call will fail on the client side
and disconnect the client, but the server is no worse for the wear

server 0.9.11 or newer and client 0.9.11: the server can return a
truncated array and the client will do just fine

I reproduced the problem by using a host with 2 CPUs, and doing:
virsh cpu-stats $dom --start 1 --count 2

* daemon/remote.c (remoteDispatchDomainGetCPUStats): Allow driver
to omit tail of array.
* src/remote/remote_driver.c (remoteDomainGetCPUStats):
Accommodate driver that omits tail of array.
* src/libvirt.c (virDomainGetCPUStats): Document this.
* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Clear all
unpopulated entries.
2012-03-07 07:14:11 -07:00
KAMEZAWA Hiroyuki
44b0a53a7c qemu driver for virDomainGetCPUstats using cpuacct cgroup.
* For now, only "cpu_time" is supported.
* cpuacct cgroup is used for providing percpu cputime information.

* src/qemu/qemu.conf     - take care of cpuacct cgroup.
* src/qemu/qemu_conf.c   - take care of cpuacct cgroup.
* src/qemu/qemu_driver.c - added an interface
* src/util/cgroup.c/h    - added interface for getting percpu cputime

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
2012-03-06 21:54:48 -07:00
Roopa Prabhu
ce43483caf qemu: install port profile and mac address on netdev hostdevs
These changes are applied only if the hostdev has a parent net device
(i.e. if it was defined as "<interface type='hostdev'>" rather than
just "<hostdev>").  If the parent netdevice has virtual port
information, the original virtualport associate functions are called
(these set and restore both mac and port profile on an
interface). Otherwise, only mac address is set on the device.

Note that This is only supported for SR-IOV Virtual Functions (not for
standard PCI or USB netdevs), and virtualport association is only
supported for 802.1Qbh. For all other types of cards and types of
virtualport, a "Config Unsupported" error is returned and the
operation fails.

Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
2012-03-06 06:04:04 -05:00
Roopa Prabhu
15bbfd8390 util: Changes to support portprofiles for hostdevs
This patch includes the following changes to virnetdevmacvlan.c and
virnetdevvportprofile.c:

 - removes some netlink functions which are now available in
   virnetdev.c

 - Adds a vf argument to all port profile functions.

For 802.1Qbh devices, the port profile calls can use a vf argument if
passed by the caller. If the vf argument is -1 it will try to derive the vf
if the device passed is a virtual function.

For 802.1Qbg devices, This patch introduces a null check for the device
argument because during port profile assignment on a hostdev, this argument
can be null.

Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
2012-03-06 06:03:57 -05:00
Laine Stump
cf90342be0 qemu: support type=hostdev network device live hotplug attach/detach
qemuDomainAttachNetDevice

  - re-ordered some things at start of function because
    networkAllocateActualDevice should always be run and a slot
    in def->nets always allocated, but host_net_add isn't needed
    if the actual type is hostdev.

  - if actual type is hostdev, defer to
    qemuDomainAttachHostDevice (which will reach up to the NetDef
    for things like MAC address when necessary). After return
    from qemuDomainAttachHostDevice, slip directly to cleanup,
    since the rest of the function is specific to emulated net
    devices.

  - put assignment of new NetDef into expanded def->nets down
    below cleanup: (but only on success) since it is also needed
    for emulated and hostdev net devices.

qemuDomainDetachHostDevice

  - after locating the exact device to detach, check if it's a
    network device and, if so, use toplevel
    qemuDomainDetachNetDevice instead so that the def->nets list
    is properly updated, and 'actual device' properly returned to
    network pool if appropriate. Otherwise, for normal hostdevs,
    call the lower level qemuDomainDetachThisDevice.

qemuDomainDetachNetDevice

  - This is where it gets a bit tricky. After locating the device
    on the def->nets list, if the network device type == hostdev,
    call the *lower level* qemuDomainDetachThisDevice (which will
    reach back up to the parent net device for MAC address /
    virtualport when appropriate, then clear the device out of
    def->hostdevs) before skipping past all the emulated
    net-device-specific code to cleanup:, where the network
    device is removed from def->nets, and the network device
    object is freed.

In short, any time a hostdev-type network device is detached, we must
go through the toplevel virDomaineDetachNetDevice function first and
last, to make sure 1) the def->nnets list is properly managed, and 2)
any device allocated with networkAllocateActualDevice is properly
freed. At the same time, in the middle we need to go through the
lower-level vidDomainDetach*This*HostDevice to be sure that 1) the
def->hostdevs list is properly managed, 2) the PCI device is properly
detached from the guest and reattached to the host (if appropriate),
and 3) any higher level teardown is called at the appropriate time, by
reaching back up to the NetDef config (part (3) will be covered in a
separate patch).
2012-03-05 23:24:50 -05:00
Laine Stump
16520d6555 qemu: use virDomainNetRemove instead of inline code
The code being replaced is exactly identical to the newly global
function, right down to the comment.
2012-03-05 23:24:44 -05:00
Laine Stump
8639a42059 qemu: support type='hostdev' network devices at domain start
This patch makes sure that each network device ("interface") of
type='hostdev' appears on both the hostdevs list and the nets list of
the virDomainDef, and it modifies the qemu driver startup code so that
these devices will be presented to qemu on the commandline as hostdevs
rather than as network devices.

It does not add support for hotplug of these type of devices, or code
to honor the <mac address> or <virtualport> given in the config (both
of those will be done in separate patches).

Once each device is placed on both lists, much of what this patch does
is modify places in the code that traverse all the device lists so
that these hybrid devices are only acted on once - either along with
the other hostdevs, or along with the other network interfaces. (In
many cases, only one of the lists is traversed / a specific operation
is performed on only one type of device. In those instances, the code
can remain unchanged.)

There is one special case - when building the commandline, interfaces
are allowed to proceed all the way through
networkAllocateActualDevice() before deciding to skip the rest of
netdev-specific processing - this is so that (once we have support for
networks with pools of hostdev devices) we can get the actual device
allocated, then rely on the loop processing all hostdevs to generate
the correct commandline.

(NB: <interface type='hostdev'> is only supported for PCI network
devices that are SR-IOV Virtual Functions (VF). Standard PCI[e] and
USB devices, and even the Physical Functions (PF) of SR-IOV devices
can only be assigned to a guest using the more basic <hostdev> device
entry. This limitation is mostly due to the fact that non-SR-IOV
ethernet devices tend to lose mac address configuration whenever the
card is reset, which happens when a card is assigned to a guest;
SR-IOV VFs fortunately don't suffer the same problem.)
2012-03-05 23:24:34 -05:00
Laine Stump
3b1c191fe7 conf: parse/format type='hostdev' network interfaces
This is the new interface type that sets up an SR-IOV PCI network
device to be assigned to the guest with PCI passthrough after
initializing some network device-specific things from the config
(e.g. MAC address, virtualport profile parameters). Here is an example
of the syntax:

  <interface type='hostdev' managed='yes'>
    <source>
      <address type='pci' domain='0' bus='0' slot='4' function='3'/>
    </source>
    <mac address='00:11:22:33:44:55'/>
    <address type='pci' domain='0' bus='0' slot='7' function='0'/>
  </interface>

This would assign the PCI card from bus 0 slot 4 function 3 on the
host, to bus 0 slot 7 function 0 on the guest, but would first set the
MAC address of the card to 00:11:22:33:44:55.

NB: The parser and formatter don't care if the PCI card being
specified is a standard single function network adapter, or a virtual
function (VF) of an SR-IOV capable network adapter, but the upcoming
code that implements the back end of this config will work *only* with
SR-IOV VFs. This is because modifying the mac address of a standard
network adapter prior to assigning it to a guest is pointless - part
of the device reset that occurs during that process will reset the MAC
address to the value programmed into the card's firmware.

Although it's not supported by any of libvirt's hypervisor drivers,
usb network hostdevs are also supported in the parser and formatter
for completeness and consistency. <source> syntax is identical to that
for plain <hostdev> devices, except that the <address> element should
have "type='usb'" added if bus/device are specified:

  <interface type='hostdev'>
    <source>
      <address type='usb' bus='0' device='4'/>
    </source>
    <mac address='00:11:22:33:44:55'/>
  </interface>

If the vendor/product form of usb specification is used, type='usb'
is implied:

  <interface type='hostdev'>
    <source>
      <vendor id='0x0012'/>
      <product id='0x24dd'/>
    </source>
    <mac address='00:11:22:33:44:55'/>
  </interface>

Again, the upcoming patch to fill in the backend of this functionality
will log an error and fail with "Unsupported Config" if you actually
try to assign a USB network adapter to a guest using <interface
type='hostdev'> - just use a standard <hostdev> entry in that case
(and also for single-port PCI adapters).
2012-03-05 23:24:28 -05:00
Laine Stump
93870c4ef7 qemu: refactor hotplug detach of hostdevs
This refactoring is necessary to support hotplug detach of
type=hostdev network devices, but needs to be in a separate patch to
make potential debugging of regressions more practical.

Rather than the lowest level functions searching for a matching
device, the search is now done in the toplevel function, and an
intermediate-level function (qemuDomainDetachThisHostDevice()), which
expects that the device's entry is already found, is called (this
intermediate function will be called by qemuDomainDetachNetDevice() in
order to support detach of type=hostdev net devices)

This patch should result in 0 differences in functionality.
2012-03-05 23:24:22 -05:00
Laine Stump
6fbb957d91 qemu: re-order functions in qemu_hotplug.c
Code movement only, no functional change. This is necessary to prevent
a forward reference in an upcoming patch.
2012-03-05 23:24:17 -05:00
Laine Stump
29293930a9 conf: make hostdev info a separate object
In order to allow for a virDomainHostdevDef that uses the
virDomainDeviceInfo of a "higher level" device (such as a
virDomainNetDef), this patch changes the virDomainDeviceInfo in the
HostdevDef into a virDomainDeviceInfoPtr. Rather than adding checks
all over the code to check for a null info, we just guarantee that it
is always valid. The new function virDomainHostdevDefAlloc() allocates
a virDomainDeviceInfo and plugs it in, and virDomainHostdevDefFree()
makes sure it is freed.

There were 4 places allocating virDomainHostdevDefs, all of them
parsers of one sort or another, and those have all had their
VIR_ALLOC(hostdev) changed to virDomainHostdevDefAlloc(). Other than
that, and the new functions, all the rest of the changes are just
mechanical removals of "&" or changing "." to "->".
2012-03-05 23:23:44 -05:00
Laine Stump
2f925c650c conf: add device pointer to args of virDomainDeviceInfoIterate callback
There will be cases where the iterator callback will need to know the
type of the device whose info is being operated on, and possibly even
need to use some of the device's config. This patch adds a
virDomainDeviceDefPtr to the args of every callback, and fills it in
appropriately as the devices are iterated through.
2012-03-05 23:23:38 -05:00
Laine Stump
37038d5c0b qemu: rename virDomainDeviceInfoPtr variables to avoid confusion
The virDomainDeviceInfoPtrs in qemuCollectPCIAddress and
qemuComparePCIDevice are named "dev" and "dev1", but those functions
will be changed (in order to match a change in the args sent to
virDomainDeviceInfoIterate() callback args) to contain a
virDomainDeviceDefPtr device.

This patch renames "dev" to "info" (and "dev[n]" to "info[n]") to
avoid later confusion.
2012-03-05 23:23:31 -05:00
Eric Blake
877fd769b9 blockResize: add flag for bytes
Qemu supports sizing by bytes; we shouldn't force the user to
round up if they really wanted an unaligned total size.

* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_RESIZE_BYTES):
New flag.
* src/libvirt.c (virDomainBlockResize): Document it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockResize): Take
size in bytes.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextBlockResize):
Likewise.  Pass bytes, not megabytes, to monitor.
* src/qemu/qemu_driver.c (qemuDomainBlockResize): Implement new
flag.
2012-03-05 10:06:52 -07:00
Jiri Denemark
07dd6fb610 qemu: Shared or readonly disks are always safe wrt migration
No matter what cache mode is used, readonly disks are always safe wrt
migration. Shared disks are required to be readonly or to disable
host-side cache, which makes them safe as well.
2012-03-05 15:24:00 +01:00
Osier Yang
1f77472d5b qemu: Fix indention 2012-03-05 18:32:53 +08:00
Laine Stump
d1c310231d util: combine bools in virNetDevTapCreateInBridgePort into flags
With an additional new bool added to determine whether or not to
discourage the use of the supplied MAC address by the bridge itself,
virNetDevTapCreateInBridgePort had three booleans (well, 2 bools and
an int used as a bool) in the arg list, which made it increasingly
difficult to follow what was going on. This patch combines those three
into a single flags arg, which not only shortens the arg list, but
makes it more self-documenting.
2012-03-02 16:04:06 -05:00
Ansis Atteka
c1b164d70c util: centralize tap device MAC address 1st byte "0xFE" modification
When a tap device for a domain is created and attached to a bridge,
the first byte of the tap device MAC address is set to 0xFE, while the
rest is set to match the MAC address that will be presented to the
guest as its network device MAC address. Setting this high value in
the tap's MAC address discourages the bridge from using the tap
device's MAC address as the bridge's own MAC address (Linux bridges
always take on the lowest numbered MAC address of all attached devices
as their own).

In one case within libvirt, a tap device is created and attached to
the bridge with the intent that its MAC address be taken on by the
bridge as its own (this is used to assure that the bridge has a fixed
MAC address to prevent network outages created by the bridge MAC
address "flapping" as guests are started and stopped). In this case,
the first byte of the mac address is *not* altered to 0xFE.

In the current code, callers to virNetDevTapCreateInBridgePort each
make the MAC address modification themselves before calling, which
leads to code duplication, and also prevents lower level functions
from knowing the real MAC address being used by the guest. The problem
here is that openvswitch bridges must be informed about this MAC
address, or they will be unable to pass traffic to/from the guest.

This patch centralizes the location of the MAC address "0xFE fixup"
into virNetDevTapCreateInBridgePort(), meaning 1) callers of this
function no longer need the extra strange bit of code, and 2)
bitNetDevTapCreateBridgeInPort itself now is called with the guest's
unaltered MAC address, and can pass it on, unmodified, to
virNetDevOpenvswitchAddPort.

There is no other behavioral change created by this patch.
2012-03-02 16:04:00 -05:00
Eric Blake
3e2c3d8f6d build: use correct type for pid and similar types
No thanks to 64-bit windows, with 64-bit pid_t, we have to avoid
constructs like 'int pid'.  Our API in libvirt-qemu cannot be
changed without breaking ABI; but then again, libvirt-qemu can
only be used on systems that support UNIX sockets, which rules
out Windows (even if qemu could be compiled there) - so for all
points on the call chain that interact with this API decision,
we require a different variable name to make it clear that we
audited the use for safety.

Adding a syntax-check rule only solves half the battle; anywhere
that uses printf on a pid_t still needs to be converted, but that
will be a separate patch.

* cfg.mk (sc_correct_id_types): New syntax check.
* src/libvirt-qemu.c (virDomainQemuAttach): Document why we didn't
use pid_t for pid, and validate for overflow.
* include/libvirt/libvirt-qemu.h (virDomainQemuAttach): Tweak name
for syntax check.
* src/vmware/vmware_conf.c (vmwareExtractPid): Likewise.
* src/driver.h (virDrvDomainQemuAttach): Likewise.
* tools/virsh.c (cmdQemuAttach): Likewise.
* src/remote/qemu_protocol.x (qemu_domain_attach_args): Likewise.
* src/qemu_protocol-structs (qemu_domain_attach_args): Likewise.
* src/util/cgroup.c (virCgroupPidCode, virCgroupKillInternal):
Likewise.
* src/qemu/qemu_command.c(qemuParseProcFileStrings): Likewise.
(qemuParseCommandLinePid): Use pid_t for pid.
* daemon/libvirtd.c (daemonForkIntoBackground): Likewise.
* src/conf/domain_conf.h (_virDomainObj): Likewise.
* src/probes.d (rpc_socket_new): Likewise.
* src/qemu/qemu_command.h (qemuParseCommandLinePid): Likewise.
* src/qemu/qemu_driver.c (qemudGetProcessInfo, qemuDomainAttach):
Likewise.
* src/qemu/qemu_process.c (qemuProcessAttach): Likewise.
* src/qemu/qemu_process.h (qemuProcessAttach): Likewise.
* src/uml/uml_driver.c (umlGetProcessInfo): Likewise.
* src/util/virnetdev.h (virNetDevSetNamespace): Likewise.
* src/util/virnetdev.c (virNetDevSetNamespace): Likewise.
* tests/testutils.c (virtTestCaptureProgramOutput): Likewise.
* src/conf/storage_conf.h (_virStoragePerms): Use mode_t, uid_t,
and gid_t rather than int.
* src/security/security_dac.c (virSecurityDACSetOwnership): Likewise.
* src/conf/storage_conf.c (virStorageDefParsePerms): Avoid
compiler warning.
2012-03-02 06:57:43 -07:00
Eric Blake
10ec36e2e7 qemu: pass block pull backing file to monitor
This actually wires up the new optional parameter to block_stream:
http://wiki.qemu.org/Features/LiveBlockMigration/ImageStreamingAPI

The error checking is still sparse, since libvirt must not use
qemu-img or header probing on a qcow2 file in use by qemu to
check if the backing file name is valid; so for now, libvirt is
relying on qemu to diagnose an incorrect backing name.  Fixing this
will require libvirt to track the entire backing file chain at the
time qemu is started and keeps it updated with snapshot and pull
operations.

* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Add
parameter, and update callers.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob): Update
signature.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Update caller.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Likewise.
2012-02-29 13:44:20 -07:00
Eric Blake
68a1300556 qemu: require json for block jobs
Block job commands are not part of upstream qemu until 1.1; and
proper support of job completion and cancellation depends on being
able to receive QMP events, which implies the JSON monitor.
Additionally, some early versions of block job commands were
backported to RHEL qemu, but these versions lacked asynchronous
job cancellation and partial block pull, so there are several
patches that will still be needed in this area of libvirt code
to support both flavors of block job commands.

Due to earlier patches in libvirt, we are guaranteed that all versions
of qemu that support block job commands already require libvirt to
use the JSON monitor.  That means that the text version of block jobs
will not be used, and having to refactor two copies of the block job
handlers makes no sense.  So instead, we delete the text handlers.

* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Drop text monitor
support.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextBlockJob): Delete.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextParseBlockJobOne)
(qemuMonitorTextParseBlockJob, qemuMonitorTextBlockJob):
Likewise.
2012-02-29 13:44:20 -07:00
D. Herrendoerfer
723d5c50c0 Add de-association handling to macvlan code
Add de-association handling for 802.1qbg (vepa) via lldpad
netlink messages. Also adds the possibility to perform an
association request without waiting for a confirmation.

Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
2012-02-29 10:37:32 -05:00
Jiri Denemark
04dec5826d qemu: Add pre-migration hook
This hook is called during the Prepare phase on destination host and may
be used for changing domain XML.
2012-02-29 12:27:12 +01:00
Jiri Denemark
8ab785783f hooks: Add support for capturing hook output
Hooks may now be used as filters.
2012-02-29 12:27:12 +01:00
Jiri Denemark
238a5a4c3d qemu: Don't emit tls-port spice option if port is -1
Bug introduced by commit eda0fc7a.
2012-02-29 11:12:54 +01:00
Osier Yang
c56fe7f1d6 qemu: Build command line for the new address format
For any disk controller model which is not "lsilogic", the command
line will be like:

  -drive file=/dev/sda,if=none,id=drive-scsi0-0-3-0,format=raw \
  -device scsi-disk,bus=scsi0.0,channel=0,scsi-id=3,lun=0,i\
  drive=drive-scsi0-0-3-0,id=scsi0-0-3-0

The relationship between the libvirt address attrs and the qdev
properties are (controller model is not "lsilogic"; strings
inside <> represent libvirt adress attrs):
  bus=scsi<controller>.0
  channel=<bus>
  scsi-id=<target>
  lun=<unit>

* src/qemu/qemu_command.h: (New param "virDomainDefPtr def"
  for function qemuBuildDriveDevStr; new param "virDomainDefPtr
  vmdef" for function qemuAssignDeviceDiskAlias. Both for
  virDomainDiskFindControllerModel's use).

* src/qemu/qemu_command.c:
  - New param "virDomainDefPtr def" for qemuAssignDeviceDiskAliasCustom.
    For virDomainDiskFindControllerModel's use, if the disk bus is "scsi"
    and the controller model is not "lsilogic", "target" is one part of
    the alias name.
  - According change on qemuAssignDeviceDiskAlias and qemuBuildDriveDevStr

* src/qemu/qemu_hotplug.c:
  - Changes to be consistent with declarations of qemuAssignDeviceDiskAlias
    qemuBuildDriveDevStr, and qemuBuildControllerDevStr.

* tests/qemuxml2argvdata/qemuxml2argv-pseries-vio-user-assigned.args,
  tests/qemuxml2argvdata/qemuxml2argv-pseries-vio.args: Update the
  generated command line.
2012-02-28 14:27:17 +08:00
Osier Yang
05fbe728ee qemu: New cap flag to indicate if channel is supported by scsi-disk 2012-02-28 14:27:13 +08:00
Paolo Bonzini
8dcac770f1 qemu: add virtio-scsi controller model
Adding a new model for virtio-scsi roughly follows the same scheme
as the previous patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-02-28 14:27:03 +08:00
Paolo Bonzini
3482191d12 qemu: add ibmvscsi controller model
KVM will be able to use a PCI SCSI controller even on POWER.  Let
the user specify the vSCSI controller by other means than a default.

After this patch, the QEMU driver will actually look at the model
and reject anything but auto, lsilogic and ibmvscsi.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Osier Yang <jyang@redhat.com>
2012-02-28 14:27:00 +08:00
Laine Stump
4cc4b62e30 qemu: fix cleanup of bridge during failure of qemuDomainAttachNetDevice
In qemuDomainAttachNetDevice, the guest's tap interface has only been
attached to the bridge if iface_connected is true. It's possible for
an error to occur prior to that happening, and previously we would
attempt to remove the tap interface from the bridge even if it hadn't
been attached.
2012-02-27 22:44:22 -05:00
Josh Durgin
f27f616ff8 qemu: unescape HMP commands before converting them to json
QMP commands don't need to be escaped since converting them to json
also escapes special characters. When a QMP command fails, however,
libvirt falls back to HMP commands. These fallback functions
(qemuMonitorText*) do their own escaping, and pass the result directly
to qemuMonitorHMPCommandWithFd. If the monitor is in json mode, these
pre-escaped commands will be escaped again when converted to json,
which can result in the wrong arguments being sent.

For example, a filename test\file would be sent in json as
test\\file.

This prevented attaching an image file with a " or \ in its name in
qemu 1.0.50, and also broke rbd attachment (which uses backslashes to
escape some internal arguments.)

Reported-by: Masuko Tomoya <tomoya.masuko@gmail.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2012-02-27 16:06:02 -07:00
Peter Krempa
4716138229 qemu: Add ability to abort existing console while creating new one
This patch fixes console corruption, that happens if two concurrent
sessions are opened for a single console on a domain. Result of this
corruption was that each of the console streams recieved just a part
of the data written to the pipe so every console rendered unusable.

New helper function for safe console handling is used to establish the
console stream connection. This function ensures that no other libvirt
client is using the console (with the ability to disconnect consoles of
libvirt clients) and that no UUCP style lockfile is placed on the PTY
device.

* src/qemu/qemu_domain.h
        - add data structure to domain's private data dealing with
          console connections
* src/qemu/qemu_domain.c:
        - allocate/free domain's console data structure
* src/qemu/qemu_driver.c
        - use the new helper function for console handling
2012-02-27 15:05:17 +01:00
Michal Privoznik
9bf1bcc59d qemu: Implement virDomainPMWakeup API
using 'system-wakeup' monitor command. It is supported only in JSON,
as we are enabling it if possible. Moreover, this command is available
in qemu-1.1+ which definitely has JSON.
2012-02-27 11:47:02 +01:00
Martin Kletzander
9f748277bb Fixed URI parsing
Function xmlParseURI does not remove square brackets around IPv6
address when parsing. One of the solutions is making wrappers around
functions working with xmlURI*. This assures that uri->server will be
always properly assigned and it doesn't have to be changed when used
on some new place in the code.
For this purpose, functions virParseURI and virSaveURI were
added. These function are wrappers around xmlParseURI and xmlSaveUri
respectively.
Also there is one new syntax check function to prohibit these functions
anywhere else.

File changes:
 - src/util/viruri.h        -- declaration
 - src/util/viruri.c        -- definition
 - src/libvirt_private.syms -- symbol export
 - src/Makefile.am          -- added source and header files
 - cfg.mk                   -- added sc_prohibit_xmlURI
 - all others               -- ID name and include fixes
2012-02-24 16:49:21 -07:00
Christophe Fergeau
eda0fc7a82 Error out when using SPICE TLS with spice_tls=0
It's possible to disable SPICE TLS in qemu.conf. When this happens,
libvirt ignores any SPICE TLS port or x509 directory that may have
been set when it builds the qemu command line to use. However, it's
not ignoring the secure channels that may have been set and adds
tls-channel arguments to qemu command line.
Current qemu versions don't report an error when this happens, and try to use
TLS for the specified channels.

Before this patch

<domain type='kvm'>
  <name>auto-tls-port</name>
  <memory>65536</memory>
  <os>
    <type arch='x86_64' machine='pc'>hvm</type>
  </os>
  <devices>
    <graphics type='spice' port='5900' tlsPort='-1' autoport='yes' listen='0' ke
      <listen type='address' address='0'/>
      <channel name='main' mode='secure'/>
      <channel name='inputs' mode='secure'/>
    </graphics>
  </devices>
</domain>

generates

-spice port=5900,addr=0,disable-ticketing,tls-channel=main,tls-channel=inputs

and starts QEMU.

After this patch, an error is reported if a TLS port is set in the XML
or if secure channels are specified but TLS is disabled in qemu.conf.
This is the behaviour the oVirt people (where I spotted this issue) said
they would expect.

This fixes bug #790436
2012-02-24 09:25:44 -07:00
Eric Blake
d2dc5057fd qemu: nicer error message on failed graceful destroy
https://bugzilla.redhat.com/show_bug.cgi?id=795656 mentions
that a graceful destroy request can time out, meaning that the
error message is user-visible and should be more appropriate
than just internal error.

* src/qemu/qemu_driver.c (qemuDomainDestroyFlags): Swap error type.
2012-02-23 08:47:06 -07:00
Jiri Denemark
d57485f73a qemu: Forbid migration with cache != none
Migrating domains with disks using cache != none is unsafe unless the
disk images are stored on coherent clustered filesystem. Thus we forbid
migrating such domains unless VIR_MIGRATE_UNSAFE flags is used.
2012-02-23 14:34:56 +01:00
Alex Jia
18942b9bea qemu: Prevent crash of libvirtd without guest agent
* src/qemu/qemu_process.c (qemuFindAgentConfig): avoid crash libvirtd due to
deref a NULL pointer.

* How to reproduce?
1. virsh edit the following xml into guest configuration:
    <channel type='pty'>
      <target type='virtio'/>
    </channel>
2. virsh start <domain>

or
% virt-install -n foo -r 1024 --disk path=/var/lib/libvirt/images/foo.img,size=1 \
--channel pty,target_type=virtio -l <installation tree>

Signed-off-by: Alex Jia <ajia@redhat.com>
2012-02-16 23:26:41 +08:00
Jiri Denemark
e0d4b0db9e qemu: Unlock monitor when connecting to dest qemu fails
When migrating a qemu domain, we enter the monitor, send some commands,
try to connect to destination qemu, send other commands, end exit the
monitor. However, if we couldn't connect to destination qemu we forgot
to exit the monitor.

Bug introduced by commit d9d518b1c8.
2012-02-16 10:58:35 +01:00
Jiri Denemark
2ccc4a607f qemu: Fix segfault when host CPU is empty
In case libvirtd cannot detect host CPU model (which may happen if it
runs inside a virtual machine), the daemon is likely to segfault when
starting a new qemu domain. It segfaults when domain XML asks for host
(either model or passthrough) CPU or does not ask for any specific CPU
model at all.
2012-02-16 10:41:13 +01:00
Ansis Atteka
df81004632 network: support Open vSwitch
This patch allows libvirt to add interfaces to already
existing Open vSwitch bridges. The following syntax in
domain XML file can be used:

    <interface type='bridge'>
      <mac address='52:54:00:d0:3f:f2'/>
      <source bridge='ovsbr'/>
      <virtualport type='openvswitch'>
        <parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'/>
      </virtualport>
      <address type='pci' domain='0x0000' bus='0x00'
                          slot='0x03' function='0x0'/>
    </interface>

or if libvirt should auto-generate the interfaceid use
following syntax:

    <interface type='bridge'>
      <mac address='52:54:00:d0:3f:f2'/>
      <source bridge='ovsbr'/>
      <virtualport type='openvswitch'>
      </virtualport>
      <address type='pci' domain='0x0000' bus='0x00'
                          slot='0x03' function='0x0'/>
    </interface>

It is also possible to pass an optional profileid. To do that
use following syntax:

   <interface type='bridge'>
     <source bridge='ovsbr'/>
     <mac address='00:55:1a:65:a2:8d'/>
     <virtualport type='openvswitch'>
       <parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'
                   profileid='test-profile'/>
     </virtualport>
   </interface>

To create Open vSwitch bridge install Open vSwitch and
run the following command:

    ovs-vsctl add-br ovsbr
2012-02-15 16:04:54 -05:00
Laine Stump
9368465f75 conf: rename virDomainNetGetActualDirectVirtPortProfile
An upcoming patch will add a <virtualport> element to interfaces of
type='bridge', so it makes sense to give this function a more generic
name.
2012-02-15 16:04:53 -05:00
Laine Stump
f367cd1388 qemu: increase the timeout before sending SIGKILL to qemu process
The current default method of terminating the qemu process is to send
a SIGTERM, wait for up to 1.6 seconds for it to cleanly shutdown, then
send a SIGKILL and wait for up to 1.4 seconds more for the process to
terminate. This is problematic because occasionally 1.6 seconds is not
long enough for the qemu process to flush its disk buffers, so the
guest's disk ends up in an inconsistent state.

Since this only occasionally happens when the timeout prior to SIGKILL
is 1.6 seconds, this patch increases that timeout to 10 seconds. At
the very least, this should reduce the occurrence from "occasionally"
to "extremely rarely". (Once SIGKILL is sent, it waits another 5
seconds for the process to die before returning).

Note that in the cases where it takes less than this for qemu to
shutdown cleanly, libvirt will *not* wait for any longer than it would
without this patch - qemuProcessKill polls the process and returns as
soon as it is gone.
2012-02-15 13:57:15 -05:00
Laine Stump
595e26c086 qemu: drop driver lock while trying to terminate qemu process
This patch is based on an earlier patch by Eric Blake which was never
committed:

https://www.redhat.com/archives/libvir-list/2011-November/msg00243.html

Aside from rebasing, this patch only drops the driver lock once (prior
to the first time the function sleeps), then leaves it dropped until
it returns (Eric's patch would drop and re-acquire the lock around
each call to sleep).

At the time Eric sent his patch, the response (from Dan Berrange) was
that, while it wasn't a good thing to be holding the driver lock while
sleeping, we really need to rethink locking wrt the driver object,
switching to a finer-grained approach that locks individual items
within the driver object separately to allow for greater concurrency.

This is a good plan, and at the time it made sense to not apply the
patch because there was no known bug related to the driver lock being
held in this function.

However, we now know that the length of the wait in qemuProcessKill is
sometimes too short to allow the qemu process to fully flush its disk
cache before SIGKILL is sent, so we need to lengthen the timeout (in
order to improve the situation with management applications until they
can be updated to use the new VIR_DOMAIN_DESTROY_GRACEFUL flag added
in commit 72f8a7f197). But, if we
lengthen the timeout, we also lengthen the amount of time that all
other threads in libvirtd are essentially blocked from doing anything
(since just about everything needs to acquire the driver lock, if only
for long enough to get a pointer to a domain).

The solution is to modify qemuProcessKill to drop the driver lock
while sleeping, as proposed in Eric's patch. Then we can increase the
timeout with a clear conscience, and thus at least lower the chances
that someone running with existing management software will suffer the
consequence's of qemu's disk cache not being flushed.

In the meantime, we still should work on Dan's proposal to make
locking within the driver object more fine grained.

(NB: although I couldn't find any instance where qemuProcessKill() was
called with no jobs active for the domain (or some other guarantee
that the current thread had at least one refcount on the domain
object), this patch still follows Eric's method of temporarily adding
a ref prior to unlocking the domain object, because I couldn't
convince myself 100% that this was the case.)
2012-02-15 13:57:10 -05:00
Michal Privoznik
82f47fde6c qemu: Implement DomainPMSuspendForDuration
via user agent. Allow targets mem & hybrid iff system_wakeup
monitor command is available.
2012-02-15 11:45:45 +01:00
Michal Privoznik
2f1e003939 qemu: Set capabilities based on supported monitor commands
In the future (my next patch in fact) we may want to make
decisions depending on qemu having a monitor command or not.
Therefore, we want to set qemuCaps flag instead of querying
on the monitor each time we are about to make that decision.
2012-02-15 11:37:39 +01:00
Eric Blake
172d34298f qemu: make block io tuning smarter
When blkdeviotune was first committed in 0.9.8, we had the limitation
that setting one value reset all others.  But bytes and iops should
be relatively independent.  Furthermore, setting tuning values on
a live domain followed by dumpxml did not output the new settings.

* src/qemu/qemu_driver.c (qemuDiskPathToAlias): Add parameter, and
update callers.
(qemuDomainSetBlockIoTune): Don't lose previous unrelated
settings.  Make live changes reflect to dumpxml output.
* tools/virsh.pod (blkdeviotune): Update documentation.
2012-02-13 10:34:25 -07:00
Daniel Veillard
ded8e894dd Revert "qemu: add ibmvscsi controller model"
This reverts commit 7b345b69f2.

Conflicts:

	tests/qemuxml2argvdata/qemuxml2argv-disk-scsi-vscsi.xml
2012-02-13 21:37:03 +08:00
Daniel Veillard
3d224ae669 Revert "qemu: add virtio-scsi controller model"
This reverts commit c9abfadf37.

Conflicts:

	tests/qemuxml2argvdata/qemuxml2argv-disk-scsi-virtio-scsi.xml
2012-02-13 21:36:02 +08:00
Osier Yang
7c90026db9 npiv: Auto-generate WWN if it's not specified
The auto-generated WWN comply with the new addressing schema of WWN:

<quote>
the first nibble is either hex 5 or 6 followed by a 3-byte vendor
identifier and 36 bits for a vendor-specified serial number.
</quote>

We choose hex 5 for the first nibble. And for the 3-bytes vendor ID,
we uses the OUI according to underlying hypervisor type, (invoking
virConnectGetType to get the virt type). e.g. If virConnectGetType
returns "QEMU", we use Qumranet's OUI (00:1A:4A), if returns
ESX|VMWARE, we use VMWARE's OUI (00:05:69). Currently it only
supports qemu|xen|libxl|xenapi|hyperv|esx|vmware drivers. The last
36 bits are auto-generated.
2012-02-10 12:53:25 +08:00
Marc-André Lureau
42043afcdc domain: add implicit USB controller
Some tools, such as virt-manager, prefers having the default USB
controller explicit in the XML document. This patch makes sure there
is one. With this patch, it is now possible to switch from USB1 to
USB2 from the release 0.9.1 of virt-manager.

Fix tests to pass with this change.
2012-02-09 16:44:57 -07:00
Eric Blake
c8c239a439 qemu: fix persistent setting of blkiodevice weights
virsh blkiotune dom --device-weights /dev/sda,400 --config

wasn't working correctly.

* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Use
correct definition.
2012-02-08 16:53:39 -07:00
Eric Blake
b0bfbd82d1 qemu: make blkiodevice weights easier to read
The merge code had too many indirections to easily analyze.

* src/qemu/qemu_driver.c (qemuDomainMergeDeviceWeights): Pick
better variable names.
2012-02-08 15:41:11 -07:00
Jiri Denemark
91ca45f9dc qemu: Fix memory leak when building -cpu argument
Reported by Alex Jia:

==21503== 112 (32 direct, 80 indirect) bytes in 1 blocks are
definitely lost in loss record 37 of 40
==21503==    at 0x4A04A28: calloc (vg_replace_malloc.c:467)
==21503==    by 0x4A8991: virAlloc (memory.c:101)
==21503==    by 0x505A6C: x86DataCopy (cpu_x86.c:247)
==21503==    by 0x507B34: x86Compute (cpu_x86.c:1225)
==21503==    by 0x43103C: qemuBuildCommandLine (qemu_command.c:3561)
==21503==    by 0x41C9F7: testCompareXMLToArgvHelper
(qemuxml2argvtest.c:183)
==21503==    by 0x41E10D: virtTestRun (testutils.c:141)
==21503==    by 0x41B942: mymain (qemuxml2argvtest.c:705)
==21503==    by 0x41D7E7: virtTestMain (testutils.c:696)
2012-02-08 14:35:12 +01:00
Jiri Denemark
c4caab538e qemu: Always use iohelper for domain save
This is probably not strictly needed as save operation is not live but
we may have other reasons to avoid blocking qemu's main loop.
2012-02-08 14:08:54 +01:00
Jiri Denemark
c8683f231d qemu: Always use iohelper for dumping domain core
Qemu uses non-blocking I/O which doesn't play nice with regular file
descriptors. We need to pass a pipe to qemu instead, which can easily be
done using iohelper.
2012-02-08 11:26:20 +01:00
Jiri Denemark
afe6e58aed util: Generalize virFileDirectFd
virFileDirectFd was used for accessing files opened with O_DIRECT using
libvirt_iohelper. We will want to use the helper for accessing files
regardless on O_DIRECT and thus virFileDirectFd was generalized and
renamed to virFileWrapperFd.
2012-02-08 11:26:20 +01:00
Jiri Denemark
d9d518b1c8 qemu: Fix seamless spice migration
Calling qemuDomainMigrateGraphicsRelocate notifies spice clients to
connect to destination qemu so that they can seamlessly switch streams
once migration is done. Unfortunately, current qemu is not able to
accept any connections while incoming migration connection is open.
Thus, we need to delay opening the migration connection to the point
spice client is already connected to the destination qemu.
2012-02-06 09:41:52 +01:00
Laine Stump
c18a88ac48 qemu: eliminate "Ignoring open failure" when using root-squash NFS
This eliminates the warning message reported in:

 https://bugzilla.redhat.com/show_bug.cgi?id=624447

It was caused by a failure to open an image file that is not
accessible by root (the uid libvirtd is running as) because it's on a
root-squash NFS share, owned by a different user, with permissions of
660 (or maybe 600).

The solution is to use virFileOpenAs() rather than open(). The
codepath that generates the error is during qemuSetupDiskCGroup(), but
the actual open() is in a lower-level generic function called from
many places (virDomainDiskDefForeachPath), so some other pieces of the
code were touched just to add dummy (or possibly useful) uid and gid
arguments.

Eliminating this warning message has the nice side effect that the
requested operation may even succeed (which in this case isn't
necessary, but shouldn't hurt anything either).
2012-02-03 16:47:43 -05:00
Laine Stump
90e4d681bc util: refactor virFileOpenAs
virFileOpenAs previously would only try opening a file as the current
user, or as a different user, but wouldn't try both methods in a
single call. This made it cumbersome to use as a replacement for
open(2). Additionally, it had a lot of historical baggage that led to
it being difficult to understand.

This patch refactors virFileOpenAs in the following ways:

* reorganize the code so that everything dealing with both the parent
  and child sides of the "fork+setuid+setgid+open" method are in a
  separate function. This makes the public function easier to understand.

* Allow a single call to virFileOpenAs() to first attempt the open as
  the current user, and if that fails to automatically re-try after
  doing fork+setuid (if deemed appropriate, i.e. errno indicates it
  would now be successful, and the file is on a networkFS). This makes
  it possible (in many, but possibly not all, cases) to drop-in
  virFileOpenAs() as a replacement for open(2).

  (NB: currently qemuOpenFile() calls virFileOpenAs() twice, once
  without forking, then again with forking. That unfortunately can't
  be changed without at least some discussion of the ramifications,
  because the requested file permissions are different in each case,
  which is something that a single call to virFileOpenAs() can't deal
  with.)

* Add a flag so that any fchown() of the file to a different uid:gid
  is explicitly requested when the function is called, rather than it
  being implied by the presence of the O_CREAT flag. This just makes
  for less subtle surprises to consumers. (Commit
  b1643dc15c added the check for O_CREAT
  before forcing ownership. This patch just makes that restriction
  more explicit.)

* If either the uid or gid is specified as "-1", virFileOpenAs will
  interpret this to mean "the current [gu]id".

All current consumers of virFileOpenAs should retain their present
behavior (after a few minor changes to their setup code and
arguments).
2012-02-03 16:47:39 -05:00
Laine Stump
72f8a7f197 qemu: new GRACEFUL flag for virDomainDestroy w/ QEMU support
When libvirt's virDomainDestroy API is shutting down the qemu process,
it first sends SIGTERM, then waits for 1.6 seconds and, if it sees the
process still there, sends a SIGKILL.

There have been reports that this behavior can lead to data loss
because the guest running in qemu doesn't have time to flush its disk
cache buffers before it's unceremoniously whacked.

This patch maintains that default behavior, but provides a new flag
VIR_DOMAIN_DESTROY_GRACEFUL to alter the behavior. If this flag is set
in the call to virDomainDestroyFlags, SIGKILL will never be sent to
the qemu process; instead, if the timeout is reached and the qemu
process still exists, virDomainDestroy will return an error.

Once this patch is in, the recommended method for applications to call
virDomainDestroyFlags will be with VIR_DOMAIN_DESTROY_GRACEFUL
included. If that fails, then the application can decide if and when
to call virDomainDestroyFlags again without
VIR_DOMAIN_DESTROY_GRACEFUL (to force the issue with SIGKILL).

(Note that this does not address the issue of existing applications
that have not yet been modified to use VIR_DOMAIN_DESTROY_GRACEFUL.
That is a separate patch.)
2012-02-03 14:21:17 -05:00
Philipp Hahn
99d24ab2e0 virterror.c: Fix several spelling mistakes
compat{a->i}bility
erron{->e}ous
nec{c->}essary.
Either "the" or "a".

Signed-off-by: Philipp Hahn <hahn@univention.de>
2012-02-03 11:32:51 -07:00
Martin Kletzander
3d93706d0d Added RSS reporting
Added RSS information gathering into qemuMemoryStats into qemu driver
and the reporting into virsh dommemstat.
2012-02-03 20:54:58 +08:00