Commit Graph

1742 Commits

Author SHA1 Message Date
Laine Stump
6fbb957d91 qemu: re-order functions in qemu_hotplug.c
Code movement only, no functional change. This is necessary to prevent
a forward reference in an upcoming patch.
2012-03-05 23:24:17 -05:00
Laine Stump
29293930a9 conf: make hostdev info a separate object
In order to allow for a virDomainHostdevDef that uses the
virDomainDeviceInfo of a "higher level" device (such as a
virDomainNetDef), this patch changes the virDomainDeviceInfo in the
HostdevDef into a virDomainDeviceInfoPtr. Rather than adding checks
all over the code to check for a null info, we just guarantee that it
is always valid. The new function virDomainHostdevDefAlloc() allocates
a virDomainDeviceInfo and plugs it in, and virDomainHostdevDefFree()
makes sure it is freed.

There were 4 places allocating virDomainHostdevDefs, all of them
parsers of one sort or another, and those have all had their
VIR_ALLOC(hostdev) changed to virDomainHostdevDefAlloc(). Other than
that, and the new functions, all the rest of the changes are just
mechanical removals of "&" or changing "." to "->".
2012-03-05 23:23:44 -05:00
Laine Stump
2f925c650c conf: add device pointer to args of virDomainDeviceInfoIterate callback
There will be cases where the iterator callback will need to know the
type of the device whose info is being operated on, and possibly even
need to use some of the device's config. This patch adds a
virDomainDeviceDefPtr to the args of every callback, and fills it in
appropriately as the devices are iterated through.
2012-03-05 23:23:38 -05:00
Laine Stump
37038d5c0b qemu: rename virDomainDeviceInfoPtr variables to avoid confusion
The virDomainDeviceInfoPtrs in qemuCollectPCIAddress and
qemuComparePCIDevice are named "dev" and "dev1", but those functions
will be changed (in order to match a change in the args sent to
virDomainDeviceInfoIterate() callback args) to contain a
virDomainDeviceDefPtr device.

This patch renames "dev" to "info" (and "dev[n]" to "info[n]") to
avoid later confusion.
2012-03-05 23:23:31 -05:00
Eric Blake
877fd769b9 blockResize: add flag for bytes
Qemu supports sizing by bytes; we shouldn't force the user to
round up if they really wanted an unaligned total size.

* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_RESIZE_BYTES):
New flag.
* src/libvirt.c (virDomainBlockResize): Document it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockResize): Take
size in bytes.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextBlockResize):
Likewise.  Pass bytes, not megabytes, to monitor.
* src/qemu/qemu_driver.c (qemuDomainBlockResize): Implement new
flag.
2012-03-05 10:06:52 -07:00
Jiri Denemark
07dd6fb610 qemu: Shared or readonly disks are always safe wrt migration
No matter what cache mode is used, readonly disks are always safe wrt
migration. Shared disks are required to be readonly or to disable
host-side cache, which makes them safe as well.
2012-03-05 15:24:00 +01:00
Osier Yang
1f77472d5b qemu: Fix indention 2012-03-05 18:32:53 +08:00
Laine Stump
d1c310231d util: combine bools in virNetDevTapCreateInBridgePort into flags
With an additional new bool added to determine whether or not to
discourage the use of the supplied MAC address by the bridge itself,
virNetDevTapCreateInBridgePort had three booleans (well, 2 bools and
an int used as a bool) in the arg list, which made it increasingly
difficult to follow what was going on. This patch combines those three
into a single flags arg, which not only shortens the arg list, but
makes it more self-documenting.
2012-03-02 16:04:06 -05:00
Ansis Atteka
c1b164d70c util: centralize tap device MAC address 1st byte "0xFE" modification
When a tap device for a domain is created and attached to a bridge,
the first byte of the tap device MAC address is set to 0xFE, while the
rest is set to match the MAC address that will be presented to the
guest as its network device MAC address. Setting this high value in
the tap's MAC address discourages the bridge from using the tap
device's MAC address as the bridge's own MAC address (Linux bridges
always take on the lowest numbered MAC address of all attached devices
as their own).

In one case within libvirt, a tap device is created and attached to
the bridge with the intent that its MAC address be taken on by the
bridge as its own (this is used to assure that the bridge has a fixed
MAC address to prevent network outages created by the bridge MAC
address "flapping" as guests are started and stopped). In this case,
the first byte of the mac address is *not* altered to 0xFE.

In the current code, callers to virNetDevTapCreateInBridgePort each
make the MAC address modification themselves before calling, which
leads to code duplication, and also prevents lower level functions
from knowing the real MAC address being used by the guest. The problem
here is that openvswitch bridges must be informed about this MAC
address, or they will be unable to pass traffic to/from the guest.

This patch centralizes the location of the MAC address "0xFE fixup"
into virNetDevTapCreateInBridgePort(), meaning 1) callers of this
function no longer need the extra strange bit of code, and 2)
bitNetDevTapCreateBridgeInPort itself now is called with the guest's
unaltered MAC address, and can pass it on, unmodified, to
virNetDevOpenvswitchAddPort.

There is no other behavioral change created by this patch.
2012-03-02 16:04:00 -05:00
Eric Blake
3e2c3d8f6d build: use correct type for pid and similar types
No thanks to 64-bit windows, with 64-bit pid_t, we have to avoid
constructs like 'int pid'.  Our API in libvirt-qemu cannot be
changed without breaking ABI; but then again, libvirt-qemu can
only be used on systems that support UNIX sockets, which rules
out Windows (even if qemu could be compiled there) - so for all
points on the call chain that interact with this API decision,
we require a different variable name to make it clear that we
audited the use for safety.

Adding a syntax-check rule only solves half the battle; anywhere
that uses printf on a pid_t still needs to be converted, but that
will be a separate patch.

* cfg.mk (sc_correct_id_types): New syntax check.
* src/libvirt-qemu.c (virDomainQemuAttach): Document why we didn't
use pid_t for pid, and validate for overflow.
* include/libvirt/libvirt-qemu.h (virDomainQemuAttach): Tweak name
for syntax check.
* src/vmware/vmware_conf.c (vmwareExtractPid): Likewise.
* src/driver.h (virDrvDomainQemuAttach): Likewise.
* tools/virsh.c (cmdQemuAttach): Likewise.
* src/remote/qemu_protocol.x (qemu_domain_attach_args): Likewise.
* src/qemu_protocol-structs (qemu_domain_attach_args): Likewise.
* src/util/cgroup.c (virCgroupPidCode, virCgroupKillInternal):
Likewise.
* src/qemu/qemu_command.c(qemuParseProcFileStrings): Likewise.
(qemuParseCommandLinePid): Use pid_t for pid.
* daemon/libvirtd.c (daemonForkIntoBackground): Likewise.
* src/conf/domain_conf.h (_virDomainObj): Likewise.
* src/probes.d (rpc_socket_new): Likewise.
* src/qemu/qemu_command.h (qemuParseCommandLinePid): Likewise.
* src/qemu/qemu_driver.c (qemudGetProcessInfo, qemuDomainAttach):
Likewise.
* src/qemu/qemu_process.c (qemuProcessAttach): Likewise.
* src/qemu/qemu_process.h (qemuProcessAttach): Likewise.
* src/uml/uml_driver.c (umlGetProcessInfo): Likewise.
* src/util/virnetdev.h (virNetDevSetNamespace): Likewise.
* src/util/virnetdev.c (virNetDevSetNamespace): Likewise.
* tests/testutils.c (virtTestCaptureProgramOutput): Likewise.
* src/conf/storage_conf.h (_virStoragePerms): Use mode_t, uid_t,
and gid_t rather than int.
* src/security/security_dac.c (virSecurityDACSetOwnership): Likewise.
* src/conf/storage_conf.c (virStorageDefParsePerms): Avoid
compiler warning.
2012-03-02 06:57:43 -07:00
Eric Blake
10ec36e2e7 qemu: pass block pull backing file to monitor
This actually wires up the new optional parameter to block_stream:
http://wiki.qemu.org/Features/LiveBlockMigration/ImageStreamingAPI

The error checking is still sparse, since libvirt must not use
qemu-img or header probing on a qcow2 file in use by qemu to
check if the backing file name is valid; so for now, libvirt is
relying on qemu to diagnose an incorrect backing name.  Fixing this
will require libvirt to track the entire backing file chain at the
time qemu is started and keeps it updated with snapshot and pull
operations.

* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Add
parameter, and update callers.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob): Update
signature.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Update caller.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Likewise.
2012-02-29 13:44:20 -07:00
Eric Blake
68a1300556 qemu: require json for block jobs
Block job commands are not part of upstream qemu until 1.1; and
proper support of job completion and cancellation depends on being
able to receive QMP events, which implies the JSON monitor.
Additionally, some early versions of block job commands were
backported to RHEL qemu, but these versions lacked asynchronous
job cancellation and partial block pull, so there are several
patches that will still be needed in this area of libvirt code
to support both flavors of block job commands.

Due to earlier patches in libvirt, we are guaranteed that all versions
of qemu that support block job commands already require libvirt to
use the JSON monitor.  That means that the text version of block jobs
will not be used, and having to refactor two copies of the block job
handlers makes no sense.  So instead, we delete the text handlers.

* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Drop text monitor
support.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextBlockJob): Delete.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextParseBlockJobOne)
(qemuMonitorTextParseBlockJob, qemuMonitorTextBlockJob):
Likewise.
2012-02-29 13:44:20 -07:00
D. Herrendoerfer
723d5c50c0 Add de-association handling to macvlan code
Add de-association handling for 802.1qbg (vepa) via lldpad
netlink messages. Also adds the possibility to perform an
association request without waiting for a confirmation.

Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
2012-02-29 10:37:32 -05:00
Jiri Denemark
04dec5826d qemu: Add pre-migration hook
This hook is called during the Prepare phase on destination host and may
be used for changing domain XML.
2012-02-29 12:27:12 +01:00
Jiri Denemark
8ab785783f hooks: Add support for capturing hook output
Hooks may now be used as filters.
2012-02-29 12:27:12 +01:00
Jiri Denemark
238a5a4c3d qemu: Don't emit tls-port spice option if port is -1
Bug introduced by commit eda0fc7a.
2012-02-29 11:12:54 +01:00
Osier Yang
c56fe7f1d6 qemu: Build command line for the new address format
For any disk controller model which is not "lsilogic", the command
line will be like:

  -drive file=/dev/sda,if=none,id=drive-scsi0-0-3-0,format=raw \
  -device scsi-disk,bus=scsi0.0,channel=0,scsi-id=3,lun=0,i\
  drive=drive-scsi0-0-3-0,id=scsi0-0-3-0

The relationship between the libvirt address attrs and the qdev
properties are (controller model is not "lsilogic"; strings
inside <> represent libvirt adress attrs):
  bus=scsi<controller>.0
  channel=<bus>
  scsi-id=<target>
  lun=<unit>

* src/qemu/qemu_command.h: (New param "virDomainDefPtr def"
  for function qemuBuildDriveDevStr; new param "virDomainDefPtr
  vmdef" for function qemuAssignDeviceDiskAlias. Both for
  virDomainDiskFindControllerModel's use).

* src/qemu/qemu_command.c:
  - New param "virDomainDefPtr def" for qemuAssignDeviceDiskAliasCustom.
    For virDomainDiskFindControllerModel's use, if the disk bus is "scsi"
    and the controller model is not "lsilogic", "target" is one part of
    the alias name.
  - According change on qemuAssignDeviceDiskAlias and qemuBuildDriveDevStr

* src/qemu/qemu_hotplug.c:
  - Changes to be consistent with declarations of qemuAssignDeviceDiskAlias
    qemuBuildDriveDevStr, and qemuBuildControllerDevStr.

* tests/qemuxml2argvdata/qemuxml2argv-pseries-vio-user-assigned.args,
  tests/qemuxml2argvdata/qemuxml2argv-pseries-vio.args: Update the
  generated command line.
2012-02-28 14:27:17 +08:00
Osier Yang
05fbe728ee qemu: New cap flag to indicate if channel is supported by scsi-disk 2012-02-28 14:27:13 +08:00
Paolo Bonzini
8dcac770f1 qemu: add virtio-scsi controller model
Adding a new model for virtio-scsi roughly follows the same scheme
as the previous patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-02-28 14:27:03 +08:00
Paolo Bonzini
3482191d12 qemu: add ibmvscsi controller model
KVM will be able to use a PCI SCSI controller even on POWER.  Let
the user specify the vSCSI controller by other means than a default.

After this patch, the QEMU driver will actually look at the model
and reject anything but auto, lsilogic and ibmvscsi.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Osier Yang <jyang@redhat.com>
2012-02-28 14:27:00 +08:00
Laine Stump
4cc4b62e30 qemu: fix cleanup of bridge during failure of qemuDomainAttachNetDevice
In qemuDomainAttachNetDevice, the guest's tap interface has only been
attached to the bridge if iface_connected is true. It's possible for
an error to occur prior to that happening, and previously we would
attempt to remove the tap interface from the bridge even if it hadn't
been attached.
2012-02-27 22:44:22 -05:00
Josh Durgin
f27f616ff8 qemu: unescape HMP commands before converting them to json
QMP commands don't need to be escaped since converting them to json
also escapes special characters. When a QMP command fails, however,
libvirt falls back to HMP commands. These fallback functions
(qemuMonitorText*) do their own escaping, and pass the result directly
to qemuMonitorHMPCommandWithFd. If the monitor is in json mode, these
pre-escaped commands will be escaped again when converted to json,
which can result in the wrong arguments being sent.

For example, a filename test\file would be sent in json as
test\\file.

This prevented attaching an image file with a " or \ in its name in
qemu 1.0.50, and also broke rbd attachment (which uses backslashes to
escape some internal arguments.)

Reported-by: Masuko Tomoya <tomoya.masuko@gmail.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2012-02-27 16:06:02 -07:00
Peter Krempa
4716138229 qemu: Add ability to abort existing console while creating new one
This patch fixes console corruption, that happens if two concurrent
sessions are opened for a single console on a domain. Result of this
corruption was that each of the console streams recieved just a part
of the data written to the pipe so every console rendered unusable.

New helper function for safe console handling is used to establish the
console stream connection. This function ensures that no other libvirt
client is using the console (with the ability to disconnect consoles of
libvirt clients) and that no UUCP style lockfile is placed on the PTY
device.

* src/qemu/qemu_domain.h
        - add data structure to domain's private data dealing with
          console connections
* src/qemu/qemu_domain.c:
        - allocate/free domain's console data structure
* src/qemu/qemu_driver.c
        - use the new helper function for console handling
2012-02-27 15:05:17 +01:00
Michal Privoznik
9bf1bcc59d qemu: Implement virDomainPMWakeup API
using 'system-wakeup' monitor command. It is supported only in JSON,
as we are enabling it if possible. Moreover, this command is available
in qemu-1.1+ which definitely has JSON.
2012-02-27 11:47:02 +01:00
Martin Kletzander
9f748277bb Fixed URI parsing
Function xmlParseURI does not remove square brackets around IPv6
address when parsing. One of the solutions is making wrappers around
functions working with xmlURI*. This assures that uri->server will be
always properly assigned and it doesn't have to be changed when used
on some new place in the code.
For this purpose, functions virParseURI and virSaveURI were
added. These function are wrappers around xmlParseURI and xmlSaveUri
respectively.
Also there is one new syntax check function to prohibit these functions
anywhere else.

File changes:
 - src/util/viruri.h        -- declaration
 - src/util/viruri.c        -- definition
 - src/libvirt_private.syms -- symbol export
 - src/Makefile.am          -- added source and header files
 - cfg.mk                   -- added sc_prohibit_xmlURI
 - all others               -- ID name and include fixes
2012-02-24 16:49:21 -07:00
Christophe Fergeau
eda0fc7a82 Error out when using SPICE TLS with spice_tls=0
It's possible to disable SPICE TLS in qemu.conf. When this happens,
libvirt ignores any SPICE TLS port or x509 directory that may have
been set when it builds the qemu command line to use. However, it's
not ignoring the secure channels that may have been set and adds
tls-channel arguments to qemu command line.
Current qemu versions don't report an error when this happens, and try to use
TLS for the specified channels.

Before this patch

<domain type='kvm'>
  <name>auto-tls-port</name>
  <memory>65536</memory>
  <os>
    <type arch='x86_64' machine='pc'>hvm</type>
  </os>
  <devices>
    <graphics type='spice' port='5900' tlsPort='-1' autoport='yes' listen='0' ke
      <listen type='address' address='0'/>
      <channel name='main' mode='secure'/>
      <channel name='inputs' mode='secure'/>
    </graphics>
  </devices>
</domain>

generates

-spice port=5900,addr=0,disable-ticketing,tls-channel=main,tls-channel=inputs

and starts QEMU.

After this patch, an error is reported if a TLS port is set in the XML
or if secure channels are specified but TLS is disabled in qemu.conf.
This is the behaviour the oVirt people (where I spotted this issue) said
they would expect.

This fixes bug #790436
2012-02-24 09:25:44 -07:00
Eric Blake
d2dc5057fd qemu: nicer error message on failed graceful destroy
https://bugzilla.redhat.com/show_bug.cgi?id=795656 mentions
that a graceful destroy request can time out, meaning that the
error message is user-visible and should be more appropriate
than just internal error.

* src/qemu/qemu_driver.c (qemuDomainDestroyFlags): Swap error type.
2012-02-23 08:47:06 -07:00
Jiri Denemark
d57485f73a qemu: Forbid migration with cache != none
Migrating domains with disks using cache != none is unsafe unless the
disk images are stored on coherent clustered filesystem. Thus we forbid
migrating such domains unless VIR_MIGRATE_UNSAFE flags is used.
2012-02-23 14:34:56 +01:00
Alex Jia
18942b9bea qemu: Prevent crash of libvirtd without guest agent
* src/qemu/qemu_process.c (qemuFindAgentConfig): avoid crash libvirtd due to
deref a NULL pointer.

* How to reproduce?
1. virsh edit the following xml into guest configuration:
    <channel type='pty'>
      <target type='virtio'/>
    </channel>
2. virsh start <domain>

or
% virt-install -n foo -r 1024 --disk path=/var/lib/libvirt/images/foo.img,size=1 \
--channel pty,target_type=virtio -l <installation tree>

Signed-off-by: Alex Jia <ajia@redhat.com>
2012-02-16 23:26:41 +08:00
Jiri Denemark
e0d4b0db9e qemu: Unlock monitor when connecting to dest qemu fails
When migrating a qemu domain, we enter the monitor, send some commands,
try to connect to destination qemu, send other commands, end exit the
monitor. However, if we couldn't connect to destination qemu we forgot
to exit the monitor.

Bug introduced by commit d9d518b1c8.
2012-02-16 10:58:35 +01:00
Jiri Denemark
2ccc4a607f qemu: Fix segfault when host CPU is empty
In case libvirtd cannot detect host CPU model (which may happen if it
runs inside a virtual machine), the daemon is likely to segfault when
starting a new qemu domain. It segfaults when domain XML asks for host
(either model or passthrough) CPU or does not ask for any specific CPU
model at all.
2012-02-16 10:41:13 +01:00
Ansis Atteka
df81004632 network: support Open vSwitch
This patch allows libvirt to add interfaces to already
existing Open vSwitch bridges. The following syntax in
domain XML file can be used:

    <interface type='bridge'>
      <mac address='52:54:00:d0:3f:f2'/>
      <source bridge='ovsbr'/>
      <virtualport type='openvswitch'>
        <parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'/>
      </virtualport>
      <address type='pci' domain='0x0000' bus='0x00'
                          slot='0x03' function='0x0'/>
    </interface>

or if libvirt should auto-generate the interfaceid use
following syntax:

    <interface type='bridge'>
      <mac address='52:54:00:d0:3f:f2'/>
      <source bridge='ovsbr'/>
      <virtualport type='openvswitch'>
      </virtualport>
      <address type='pci' domain='0x0000' bus='0x00'
                          slot='0x03' function='0x0'/>
    </interface>

It is also possible to pass an optional profileid. To do that
use following syntax:

   <interface type='bridge'>
     <source bridge='ovsbr'/>
     <mac address='00:55:1a:65:a2:8d'/>
     <virtualport type='openvswitch'>
       <parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'
                   profileid='test-profile'/>
     </virtualport>
   </interface>

To create Open vSwitch bridge install Open vSwitch and
run the following command:

    ovs-vsctl add-br ovsbr
2012-02-15 16:04:54 -05:00
Laine Stump
9368465f75 conf: rename virDomainNetGetActualDirectVirtPortProfile
An upcoming patch will add a <virtualport> element to interfaces of
type='bridge', so it makes sense to give this function a more generic
name.
2012-02-15 16:04:53 -05:00
Laine Stump
f367cd1388 qemu: increase the timeout before sending SIGKILL to qemu process
The current default method of terminating the qemu process is to send
a SIGTERM, wait for up to 1.6 seconds for it to cleanly shutdown, then
send a SIGKILL and wait for up to 1.4 seconds more for the process to
terminate. This is problematic because occasionally 1.6 seconds is not
long enough for the qemu process to flush its disk buffers, so the
guest's disk ends up in an inconsistent state.

Since this only occasionally happens when the timeout prior to SIGKILL
is 1.6 seconds, this patch increases that timeout to 10 seconds. At
the very least, this should reduce the occurrence from "occasionally"
to "extremely rarely". (Once SIGKILL is sent, it waits another 5
seconds for the process to die before returning).

Note that in the cases where it takes less than this for qemu to
shutdown cleanly, libvirt will *not* wait for any longer than it would
without this patch - qemuProcessKill polls the process and returns as
soon as it is gone.
2012-02-15 13:57:15 -05:00
Laine Stump
595e26c086 qemu: drop driver lock while trying to terminate qemu process
This patch is based on an earlier patch by Eric Blake which was never
committed:

https://www.redhat.com/archives/libvir-list/2011-November/msg00243.html

Aside from rebasing, this patch only drops the driver lock once (prior
to the first time the function sleeps), then leaves it dropped until
it returns (Eric's patch would drop and re-acquire the lock around
each call to sleep).

At the time Eric sent his patch, the response (from Dan Berrange) was
that, while it wasn't a good thing to be holding the driver lock while
sleeping, we really need to rethink locking wrt the driver object,
switching to a finer-grained approach that locks individual items
within the driver object separately to allow for greater concurrency.

This is a good plan, and at the time it made sense to not apply the
patch because there was no known bug related to the driver lock being
held in this function.

However, we now know that the length of the wait in qemuProcessKill is
sometimes too short to allow the qemu process to fully flush its disk
cache before SIGKILL is sent, so we need to lengthen the timeout (in
order to improve the situation with management applications until they
can be updated to use the new VIR_DOMAIN_DESTROY_GRACEFUL flag added
in commit 72f8a7f197). But, if we
lengthen the timeout, we also lengthen the amount of time that all
other threads in libvirtd are essentially blocked from doing anything
(since just about everything needs to acquire the driver lock, if only
for long enough to get a pointer to a domain).

The solution is to modify qemuProcessKill to drop the driver lock
while sleeping, as proposed in Eric's patch. Then we can increase the
timeout with a clear conscience, and thus at least lower the chances
that someone running with existing management software will suffer the
consequence's of qemu's disk cache not being flushed.

In the meantime, we still should work on Dan's proposal to make
locking within the driver object more fine grained.

(NB: although I couldn't find any instance where qemuProcessKill() was
called with no jobs active for the domain (or some other guarantee
that the current thread had at least one refcount on the domain
object), this patch still follows Eric's method of temporarily adding
a ref prior to unlocking the domain object, because I couldn't
convince myself 100% that this was the case.)
2012-02-15 13:57:10 -05:00
Michal Privoznik
82f47fde6c qemu: Implement DomainPMSuspendForDuration
via user agent. Allow targets mem & hybrid iff system_wakeup
monitor command is available.
2012-02-15 11:45:45 +01:00
Michal Privoznik
2f1e003939 qemu: Set capabilities based on supported monitor commands
In the future (my next patch in fact) we may want to make
decisions depending on qemu having a monitor command or not.
Therefore, we want to set qemuCaps flag instead of querying
on the monitor each time we are about to make that decision.
2012-02-15 11:37:39 +01:00
Eric Blake
172d34298f qemu: make block io tuning smarter
When blkdeviotune was first committed in 0.9.8, we had the limitation
that setting one value reset all others.  But bytes and iops should
be relatively independent.  Furthermore, setting tuning values on
a live domain followed by dumpxml did not output the new settings.

* src/qemu/qemu_driver.c (qemuDiskPathToAlias): Add parameter, and
update callers.
(qemuDomainSetBlockIoTune): Don't lose previous unrelated
settings.  Make live changes reflect to dumpxml output.
* tools/virsh.pod (blkdeviotune): Update documentation.
2012-02-13 10:34:25 -07:00
Daniel Veillard
ded8e894dd Revert "qemu: add ibmvscsi controller model"
This reverts commit 7b345b69f2.

Conflicts:

	tests/qemuxml2argvdata/qemuxml2argv-disk-scsi-vscsi.xml
2012-02-13 21:37:03 +08:00
Daniel Veillard
3d224ae669 Revert "qemu: add virtio-scsi controller model"
This reverts commit c9abfadf37.

Conflicts:

	tests/qemuxml2argvdata/qemuxml2argv-disk-scsi-virtio-scsi.xml
2012-02-13 21:36:02 +08:00
Osier Yang
7c90026db9 npiv: Auto-generate WWN if it's not specified
The auto-generated WWN comply with the new addressing schema of WWN:

<quote>
the first nibble is either hex 5 or 6 followed by a 3-byte vendor
identifier and 36 bits for a vendor-specified serial number.
</quote>

We choose hex 5 for the first nibble. And for the 3-bytes vendor ID,
we uses the OUI according to underlying hypervisor type, (invoking
virConnectGetType to get the virt type). e.g. If virConnectGetType
returns "QEMU", we use Qumranet's OUI (00:1A:4A), if returns
ESX|VMWARE, we use VMWARE's OUI (00:05:69). Currently it only
supports qemu|xen|libxl|xenapi|hyperv|esx|vmware drivers. The last
36 bits are auto-generated.
2012-02-10 12:53:25 +08:00
Marc-André Lureau
42043afcdc domain: add implicit USB controller
Some tools, such as virt-manager, prefers having the default USB
controller explicit in the XML document. This patch makes sure there
is one. With this patch, it is now possible to switch from USB1 to
USB2 from the release 0.9.1 of virt-manager.

Fix tests to pass with this change.
2012-02-09 16:44:57 -07:00
Eric Blake
c8c239a439 qemu: fix persistent setting of blkiodevice weights
virsh blkiotune dom --device-weights /dev/sda,400 --config

wasn't working correctly.

* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Use
correct definition.
2012-02-08 16:53:39 -07:00
Eric Blake
b0bfbd82d1 qemu: make blkiodevice weights easier to read
The merge code had too many indirections to easily analyze.

* src/qemu/qemu_driver.c (qemuDomainMergeDeviceWeights): Pick
better variable names.
2012-02-08 15:41:11 -07:00
Jiri Denemark
91ca45f9dc qemu: Fix memory leak when building -cpu argument
Reported by Alex Jia:

==21503== 112 (32 direct, 80 indirect) bytes in 1 blocks are
definitely lost in loss record 37 of 40
==21503==    at 0x4A04A28: calloc (vg_replace_malloc.c:467)
==21503==    by 0x4A8991: virAlloc (memory.c:101)
==21503==    by 0x505A6C: x86DataCopy (cpu_x86.c:247)
==21503==    by 0x507B34: x86Compute (cpu_x86.c:1225)
==21503==    by 0x43103C: qemuBuildCommandLine (qemu_command.c:3561)
==21503==    by 0x41C9F7: testCompareXMLToArgvHelper
(qemuxml2argvtest.c:183)
==21503==    by 0x41E10D: virtTestRun (testutils.c:141)
==21503==    by 0x41B942: mymain (qemuxml2argvtest.c:705)
==21503==    by 0x41D7E7: virtTestMain (testutils.c:696)
2012-02-08 14:35:12 +01:00
Jiri Denemark
c4caab538e qemu: Always use iohelper for domain save
This is probably not strictly needed as save operation is not live but
we may have other reasons to avoid blocking qemu's main loop.
2012-02-08 14:08:54 +01:00
Jiri Denemark
c8683f231d qemu: Always use iohelper for dumping domain core
Qemu uses non-blocking I/O which doesn't play nice with regular file
descriptors. We need to pass a pipe to qemu instead, which can easily be
done using iohelper.
2012-02-08 11:26:20 +01:00
Jiri Denemark
afe6e58aed util: Generalize virFileDirectFd
virFileDirectFd was used for accessing files opened with O_DIRECT using
libvirt_iohelper. We will want to use the helper for accessing files
regardless on O_DIRECT and thus virFileDirectFd was generalized and
renamed to virFileWrapperFd.
2012-02-08 11:26:20 +01:00
Jiri Denemark
d9d518b1c8 qemu: Fix seamless spice migration
Calling qemuDomainMigrateGraphicsRelocate notifies spice clients to
connect to destination qemu so that they can seamlessly switch streams
once migration is done. Unfortunately, current qemu is not able to
accept any connections while incoming migration connection is open.
Thus, we need to delay opening the migration connection to the point
spice client is already connected to the destination qemu.
2012-02-06 09:41:52 +01:00
Laine Stump
c18a88ac48 qemu: eliminate "Ignoring open failure" when using root-squash NFS
This eliminates the warning message reported in:

 https://bugzilla.redhat.com/show_bug.cgi?id=624447

It was caused by a failure to open an image file that is not
accessible by root (the uid libvirtd is running as) because it's on a
root-squash NFS share, owned by a different user, with permissions of
660 (or maybe 600).

The solution is to use virFileOpenAs() rather than open(). The
codepath that generates the error is during qemuSetupDiskCGroup(), but
the actual open() is in a lower-level generic function called from
many places (virDomainDiskDefForeachPath), so some other pieces of the
code were touched just to add dummy (or possibly useful) uid and gid
arguments.

Eliminating this warning message has the nice side effect that the
requested operation may even succeed (which in this case isn't
necessary, but shouldn't hurt anything either).
2012-02-03 16:47:43 -05:00
Laine Stump
90e4d681bc util: refactor virFileOpenAs
virFileOpenAs previously would only try opening a file as the current
user, or as a different user, but wouldn't try both methods in a
single call. This made it cumbersome to use as a replacement for
open(2). Additionally, it had a lot of historical baggage that led to
it being difficult to understand.

This patch refactors virFileOpenAs in the following ways:

* reorganize the code so that everything dealing with both the parent
  and child sides of the "fork+setuid+setgid+open" method are in a
  separate function. This makes the public function easier to understand.

* Allow a single call to virFileOpenAs() to first attempt the open as
  the current user, and if that fails to automatically re-try after
  doing fork+setuid (if deemed appropriate, i.e. errno indicates it
  would now be successful, and the file is on a networkFS). This makes
  it possible (in many, but possibly not all, cases) to drop-in
  virFileOpenAs() as a replacement for open(2).

  (NB: currently qemuOpenFile() calls virFileOpenAs() twice, once
  without forking, then again with forking. That unfortunately can't
  be changed without at least some discussion of the ramifications,
  because the requested file permissions are different in each case,
  which is something that a single call to virFileOpenAs() can't deal
  with.)

* Add a flag so that any fchown() of the file to a different uid:gid
  is explicitly requested when the function is called, rather than it
  being implied by the presence of the O_CREAT flag. This just makes
  for less subtle surprises to consumers. (Commit
  b1643dc15c added the check for O_CREAT
  before forcing ownership. This patch just makes that restriction
  more explicit.)

* If either the uid or gid is specified as "-1", virFileOpenAs will
  interpret this to mean "the current [gu]id".

All current consumers of virFileOpenAs should retain their present
behavior (after a few minor changes to their setup code and
arguments).
2012-02-03 16:47:39 -05:00
Laine Stump
72f8a7f197 qemu: new GRACEFUL flag for virDomainDestroy w/ QEMU support
When libvirt's virDomainDestroy API is shutting down the qemu process,
it first sends SIGTERM, then waits for 1.6 seconds and, if it sees the
process still there, sends a SIGKILL.

There have been reports that this behavior can lead to data loss
because the guest running in qemu doesn't have time to flush its disk
cache buffers before it's unceremoniously whacked.

This patch maintains that default behavior, but provides a new flag
VIR_DOMAIN_DESTROY_GRACEFUL to alter the behavior. If this flag is set
in the call to virDomainDestroyFlags, SIGKILL will never be sent to
the qemu process; instead, if the timeout is reached and the qemu
process still exists, virDomainDestroy will return an error.

Once this patch is in, the recommended method for applications to call
virDomainDestroyFlags will be with VIR_DOMAIN_DESTROY_GRACEFUL
included. If that fails, then the application can decide if and when
to call virDomainDestroyFlags again without
VIR_DOMAIN_DESTROY_GRACEFUL (to force the issue with SIGKILL).

(Note that this does not address the issue of existing applications
that have not yet been modified to use VIR_DOMAIN_DESTROY_GRACEFUL.
That is a separate patch.)
2012-02-03 14:21:17 -05:00
Philipp Hahn
99d24ab2e0 virterror.c: Fix several spelling mistakes
compat{a->i}bility
erron{->e}ous
nec{c->}essary.
Either "the" or "a".

Signed-off-by: Philipp Hahn <hahn@univention.de>
2012-02-03 11:32:51 -07:00
Martin Kletzander
3d93706d0d Added RSS reporting
Added RSS information gathering into qemuMemoryStats into qemu driver
and the reporting into virsh dommemstat.
2012-02-03 20:54:58 +08:00
Martin Kletzander
350d6ccb91 Added RSS information gathering into qemudGetProcessInfo
One more parameter added into the function parsing /proc/<pid>/stat
and the call of the function is fixed as well.
2012-02-03 20:33:57 +08:00
Daniel P. Berrange
b170eb99f5 Add two new security label types
Curently security labels can be of type 'dynamic' or 'static'.
If no security label is given, then 'dynamic' is assumed. The
current code takes advantage of this default, and avoids even
saving <seclabel> elements with type='dynamic' to disk. This
means if you temporarily change security driver, the guests
can all still start.

With the introduction of sVirt to LXC though, there needs to be
a new default of 'none' to allow unconfined LXC containers.

This patch introduces two new security label types

 - default:  the host configuration decides whether to run the
             guest with type 'none' or 'dynamic' at guest start
 - none:     the guest will run unconfined by security policy

The 'none' label type will obviously be undesirable for some
deployments, so a new qemu.conf option allows a host admin to
mandate confined guests. It is also possible to turn off default
confinement

  security_default_confined = 1|0  (default == 1)
  security_require_confined = 1|0  (default == 0)

* src/conf/domain_conf.c, src/conf/domain_conf.h: Add new
  seclabel types
* src/security/security_manager.c, src/security/security_manager.h:
  Set default sec label types
* src/security/security_selinux.c: Handle 'none' seclabel type
* src/qemu/qemu.conf, src/qemu/qemu_conf.c, src/qemu/qemu_conf.h,
  src/qemu/libvirtd_qemu.aug: New security config options
* src/qemu/qemu_driver.c: Tell security driver about default
  config
2012-02-02 17:44:37 -07:00
Eric Blake
9f902a2ed5 block rebase: initial qemu implementation
This is a trivial implementation, which works with the current
released qemu 1.0 with backports of preliminary block pull but
no partial rebase.  Future patches will update the monitor handling
to support an optional parameter for partial rebase; but as qemu
1.1 is unreleased, it can be in later patches, designed to be
backported on top of the supported API.

* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Add parameter,
and adjust callers.  Drop redundant check.
(qemuDomainBlockPull): Move guts...
(qemuDomainBlockRebase): ...to new function.
2012-02-01 15:31:44 -07:00
Peter Krempa
21d13ddc5d qemu: Add support for virDomainGetMetadata and virDomainSetMetadata
This patch adds support for the new api into the qemu driver to support
modification and retrieval of domain description and title. This patch
does not add support for modifying the <metadata> element.
2012-02-01 15:19:28 -07:00
Jiri Denemark
e17e3ed6aa qemu: Implement virDomainGetDiskErrors 2012-02-01 10:54:15 +01:00
Michal Privoznik
50e9b38930 qemu: Clenup qemuDomainSetInterfaceParameters
which contained some useless lines, copied code, NULL
dereference.
2012-02-01 08:56:54 +01:00
Michal Privoznik
bb311b3458 qemu: Don't jump to endjob if no job was even started
In qemuDomainShutdownFlags if we try to use guest agent,
which has error or is not configured, we jump go endjob
label even if we haven't started any job yet. This may
lead to the daemon crash:
1) virsh shutdown --mode agent on a domain without agent configured
2) wait until domain quits
3) virsh edit
2012-02-01 08:42:47 +01:00
Taku Izumi
53e23e99a9 qemu: fix my typo at commit 74e034964c
Fix my typo at
  commit 74e034964c

"disk->rawio == -1" indicates that this value is not
specified. So in case of this, domain must not
be tainted.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
2012-01-31 20:21:06 -07:00
Taku Izumi
74e034964c qemu: make qemu processes to retain rawio capability
This patch revises qemuProcessStart() function for qemu
processes to retain CAP_SYS_RAWIO if needed.
And in case of that, add taint flag to domain.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Shota Hirae <m11g1401@hibikino.ne.jp>
2012-01-31 13:36:38 -05:00
Laine Stump
3801831cdf qemu: add "romfile" support to specify device boot ROM
This patch addresses: https://bugzilla.redhat.com/show_bug.cgi?id=781562

Along with the "rombar" option that controls whether or not a boot rom
is made visible to the guest, qemu also has a "romfile" option that
allows specifying a binary file to present as the ROM BIOS of any
emulated or passthrough PCI device. This patch adds support for
specifying romfile to both passthrough PCI devices, and emulated
network devices that attach to the guest's PCI bus (just about
everything other than ne2k_isa).

One example of the usefulness of this option is described in the
bugzilla report: 82576 sriov network adapters don't provide a ROM BIOS
for the cards virtual functions (VF), but an image of such a ROM is
available, and with this ROM visible to the guest, it can PXE boot.

In libvirt's xml, the new option is configured like this:

   <hostdev>
     ...
     <rom file='/etc/fake/boot.bin'/>
     ...
   </hostdev

(similarly for <interface>).
2012-01-30 12:30:35 -05:00
Laine Stump
3284ac046f qemu: (and conf) support rombar for network devices
When support for the rombar option was added, it was only added for
PCI passthrough devices, configured with <hostdev>. The same option is
available for any network device that is attached to the guest's PCI
bus. This patch allows setting rombar for any PCI network device type.

After adding cases to test this to qemuxml2argv-hostdev-pci-rombar.*,
I decided to rename those files (to qemuxml2argv-pci-rom.*) to more
accurately reflect the additional tests, and also noticed that up to
now we've only been performing a domainschematest for that case, so I
added the "pci-rom" test to both qemuxml2argv and qemuxml2xml (and in
the process found some bugs whose fixes I squashed into previous
commits of this series).
2012-01-30 12:25:32 -05:00
Laine Stump
159f4d0b30 conf: put all guest-related HostdevDef data in one object
To help consolidate the commonality between virDomainHostdevDef and
virDomainNetDef into as few members as possible (and because I
think it makes sense), this patch moves the rombar and bootIndex
members into the "info" member that is common to both (and to all the
other structs that use them).

It's a bit problematic that this gives rombar and bootIndex to many
device types that don't use them, but this is already the case for the
master and mastertype members of virDomainDeviceInfo, and is properly
commented as such in the definition.

Note that this opens the door to supporting rombar for other devices
that are attached to the guest PCI bus - virtio-blk-pci,
virtio-net-pci, various other network adapters - which which have that
capability in qemu, but previously had no support in libvirt.
2012-01-30 12:25:20 -05:00
Hendrik Schwartke
484a0bab39 qemu: Fix segfault in qemuMonitorTextGetBlockInfo
If some error occurs then the cleanup code calls VIR_FREE(info)
without ensuring that info is initialized.
2012-01-30 13:48:34 +01:00
Eric Blake
ab6f1c9814 qemu: avoid double free of qemu help output
If yajl was not compiled in, we end up freeing an incoming
parameter, which leads to a bogus free later on.  Regression
introduced in commit 6e769eb.

* src/qemu/qemu_capabilities.c (qemuCapsParseHelpStr): Avoid alloc
on failure path, which in turn fixes bogus free.
Reported by Cole Robinson.
2012-01-27 13:53:11 -07:00
Daniel P. Berrange
4ce98dadcc Rename virXXXXMacAddr to virMacAddrXXX
Rename virFormatMacAddr, virGenerateMacAddr and virParseMacAddr
to virMacAddrFormat, virMacAddrGenerate and virMacAddrParse
respectively
2012-01-27 17:53:44 +00:00
Paolo Bonzini
b66d1bef14 qemu: parse and create -cpu ...,-kvmclock
QEMU supports a bunch of CPUID features that are tied to the kvm CPUID
nodes rather than the processor's.  They are "kvmclock",
"kvm_nopiodelay", "kvm_mmu", "kvm_asyncpf".  These are not known to
libvirt and their CPUID leaf might move if (for example) the Hyper-V
extensions are enabled. Hence their handling would anyway require some
special-casing.

However, among these the most useful is kvmclock; an additional
"property" of this feature is that a <timer> element is a better model
than a CPUID feature.  Although, creating part of the -cpu command-line
from something other than the <cpu> XML element introduces some
ugliness.

Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-27 16:51:50 +01:00
Paolo Bonzini
df8e6918b3 qemu: do not create useless <cpu> element
Avoid creating an empty <cpu> element when the QEMU command-line simply
specifies the default "-cpu qemu32" or "-cpu qemu64".

This requires the previous patch, which lets us represent "-cpu qemu32"
as <os arch='i686'> in the generated XML.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-27 16:51:50 +01:00
Paolo Bonzini
d5e88b2c33 qemu: get arch name from <cpu> element
The qemu32 CPU model is chosen based on the <os arch=...> name when
creating the QEMU command line for a 64-bit host.  For the opposite
transformation we can test the guest CPU model for the "lm" feature.
If it is absent, def->os.arch needs to be corrected.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-27 16:51:50 +01:00
Paolo Bonzini
4be541a6d9 qemu: detect arch correctly for KVM
When running under KVM, the arch is usually set to i686 because
the name of the emulator is not qemu-system-x86_64.  Use the host
arch instead.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-27 16:51:49 +01:00
Paolo Bonzini
4a00c099ab qemu: parse -enable-kvm
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-27 16:51:49 +01:00
Eric Blake
6e769ebadb qemu: require qmp on new enough qemu
The qemu developers have made it clear that modern qemu will no
longer guarantee human monitor command stability; furthermore,
some features, such as async events, are only supported via qmp.
If we are compiled without support for handling JSON, we cannot
expect to sanely interact with modern qemu.

However, things must continue to build on RHEL 5, where qemu
is stuck at 0.10, and where yajl is not available.

Another benefit of this patch: future additions of new monitor
commands need only focus on qemu_monitor_json.c, instead of
also wasting time with qemu_monitor_text.c.

* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags): Report
error if yajl is missing but qemu requires qmp.
(qemuCapsParseHelpStr): Propagate error.
(qemuCapsExtractVersionInfo): Update caller.
* tests/qemuhelptest.c (testHelpStrParsing): Likewise.
2012-01-27 08:45:50 -07:00
Eric Blake
ff88cd5905 qemu: support qmp on RHEL/CentOS qemu
I'm getting tired of remembering to backport RHEL-specific
patches when building upstream libvirt on RHEL 6.x or CentOS.
All the affected versions of RHEL qemu-kvm have backported
enough patches to a) make JSON useful, and b) modify the
-help text to mention libvirt as the preferred interface;
which means this string in the help output is a reliable
indicator that we can outsmart a strict version check,
even when upstream qemu 0.12 lacked the needed features.

* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags):
Recognize particular help string present when enough features were
backported to be worth using JSON.
* tests/qemuhelptest.c (mymain): Update tests accordingly.
2012-01-27 08:11:19 -07:00
Jiri Denemark
65c27e2935 qemu: Refactor qemuMonitorGetBlockInfo
QEMU always sends details about all available block devices as an answer
for "info block"/"query-block" command. On the other hand, our
qemuMonitorGetBlockInfo was made for a single block devices queries
only. Thus, when asking for multiple devices, we asked qemu multiple
times to always get the same answer from which different parts were
filtered. This patch makes qemuMonitorGetBlockInfo return a hash table
of all block devices, which may later be used for getting details about
specific devices.
2012-01-27 13:07:56 +01:00
Daniel P. Berrange
1d5c7a9fdf Rename hash.h and hash.c to virhash.h and virhash.c
In preparation for the patch to include Murmurhash3, which
introduces a virhashcode.h and virhashcode.c files, rename
the existing hash.h and hash.c to virhash.h and virhash.c
respectively.
2012-01-26 14:11:13 +00:00
Michal Privoznik
109593ecb0 snapshots: Introduce VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE flag
With this flag, virDomainSnapshotCreate will use fs-freeze and
fs-thaw guest agent commands to quiesce guest's disks.
2012-01-25 10:59:41 +01:00
Michal Privoznik
29bce12ada qemu_agent: Create file system freeze and thaw functions
These functions simply issue command to guest agent which
should freeze or unfreeze all file systems within guest.
2012-01-25 10:59:41 +01:00
Jiri Denemark
24a001493a qemu: Emit bootindex even for direct boot
Direct boot (using kernel, initrd, and command line) is used by
virt-install/virt-manager for network install. While any bootindex has
no direct effect since -kernel is always first, we need it as a hint for
SeaBIOS to present disks in the same order as they will be presented
during normal boot.
2012-01-25 10:38:01 +01:00
Daniel P. Berrange
fb52a39928 Wire up QEMU agent to reboot/shutdown APIs
This makes use of the QEMU guest agent to implement the
virDomainShutdownFlags and virDomainReboot APIs. With
no flags specified, it will prefer to use the agent, but
fallback to ACPI. Explicit choice can be made by using
a suitable flag

* src/qemu/qemu_driver.c: Wire up use of agent
2012-01-24 12:19:51 +01:00
Daniel P. Berrange
c160ce3316 QEMU guest agent support
There is now a standard QEMU guest agent that can be installed
and given a virtio serial channel

    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/f16x86_64.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
    </channel>

The protocol that runs over the guest agent is JSON based and
very similar to the JSON monitor. We can't use exactly the same
code because there are some odd differences in the way messages
and errors are structured. The qemu_agent.c file is based on
a combination and simplification of qemu_monitor.c and
qemu_monitor_json.c

* src/qemu/qemu_agent.c, src/qemu/qemu_agent.h: Support for
  talking to the agent for shutdown
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Add thread
  helpers for talking to the agent
* src/qemu/qemu_process.c: Connect to agent whenever starting
  a guest
* src/qemu/qemu_monitor_json.c: Make variable static
2012-01-24 12:19:51 +01:00
Eric Blake
32b57a72de maint: cleanup qemu capabilities
Fix inconsistent whitespace and long lines.

* src/qemu/qemu_capabilities.h (qemuCapsFlags): Improve formatting.
2012-01-20 16:34:29 -07:00
Eric Blake
bb69630b6c maint: enforce use of _LAST marker
When converting a linear enum to a string, we have checks in
place in the VIR_ENUM_IMPL macro to ensure that there is one
string for every value, which lets us quickly flag if a user
added a value but forgot to add a counterpart string.  However,
this only works if we use the _LAST marker.

* cfg.mk (sc_require_enum_last_marker): New syntax check.
* src/conf/domain_conf.h (virDomainSnapshotState): Add new marker.
* src/conf/domain_conf.c (virDomainSnapshotState): Fix offender.
* src/qemu/qemu_monitor_json.c (qemuMonitorWatchdogAction)
(qemuMonitorIOErrorAction, qemuMonitorGraphicsAddressFamily):
Likewise.
* src/util/virtypedparam.c (virTypedParameter): Likewise.
2012-01-20 16:16:04 -07:00
Eric Blake
9e48c22534 util: use new virTypedParameter helpers
Reusing common code makes things smaller; it also buys us some
additional safety, such as now rejecting duplicate parameters
during a set operation.

* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters)
(qemuDomainSetMemoryParameters, qemuDomainSetNumaParameters)
(qemuSetSchedulerParametersFlags)
(qemuDomainSetInterfaceParameters, qemuDomainSetBlockIoTune)
(qemuDomainGetBlkioParameters, qemuDomainGetMemoryParameters)
(qemuDomainGetNumaParameters, qemuGetSchedulerParametersFlags)
(qemuDomainBlockStatsFlags, qemuDomainGetInterfaceParameters)
(qemuDomainGetBlockIoTune): Use new helpers.
* src/esx/esx_driver.c (esxDomainSetSchedulerParametersFlags)
(esxDomainSetMemoryParameters)
(esxDomainGetSchedulerParametersFlags)
(esxDomainGetMemoryParameters): Likewise.
* src/libxl/libxl_driver.c
(libxlDomainSetSchedulerParametersFlags)
(libxlDomainGetSchedulerParametersFlags): Likewise.
* src/lxc/lxc_driver.c (lxcDomainSetMemoryParameters)
(lxcSetSchedulerParametersFlags, lxcDomainSetBlkioParameters)
(lxcDomainGetMemoryParameters, lxcGetSchedulerParametersFlags)
(lxcDomainGetBlkioParameters): Likewise.
* src/test/test_driver.c (testDomainSetSchedulerParamsFlags)
(testDomainGetSchedulerParamsFlags): Likewise.
* src/xen/xen_hypervisor.c (xenHypervisorSetSchedulerParameters)
(xenHypervisorGetSchedulerParameters): Likewise.
2012-01-19 13:20:30 -07:00
Martin Kletzander
4c82f09ef0 Added capability checking for block <iotune> setting.
There was missing capability for blkiotune and thus specifying these
settings caused libvirt to run qemu with invalid parameters and then
reporting qemu error instead of the standard libvirt one. The support
for blkiotune setting was added in upstream qemu repo under commit
0563e191516289c9d2f282a8c50f2eecef2fa773.
2012-01-18 09:56:00 -07:00
Osier Yang
7aeb9794d2 qemu: Prohibit reattaching node device if it is in use
It doesn't make sense to reattach a device to host while it's
still in use, e.g, by a domain.
2012-01-17 17:15:22 -07:00
Osier Yang
6be610bfaa qemu: Introduce inactive PCI device list
pciTrySecondaryBusReset checks if there is active device on the
same bus, however, qemu driver doesn't maintain an effective
list for the inactive devices, and it passes meaningless argument
for parameter "inactiveDevs". e.g. (qemuPrepareHostdevPCIDevices)

if (!(pcidevs = qemuGetPciHostDeviceList(hostdevs, nhostdevs)))
    return -1;

..skipped...

if (pciResetDevice(dev, driver->activePciHostdevs, pcidevs) < 0)
    goto reattachdevs;

NB, the "pcidevs" used above are extracted from domain def, and
thus one won't be able to attach a device of which bus has other
device even detached from host (nodedev-detach). To see more
details of the problem:

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=773667

This patch is to resolve the problem by introducing an inactive
PCI device list (just like qemu_driver->activePciHostdevs), and
the whole logic is:

  * Add the device to inactive list during nodedev-dettach
  * Remove the device from inactive list during nodedev-reattach
  * Remove the device from inactive list during attach-device
    (for non-managed device)
  * Add the device to inactive list after detach-device, only
    if the device is not managed

With the above, we have a sufficient inactive PCI device list, and thus
we can use it for pciResetDevice. e.g.(qemuPrepareHostdevPCIDevices)

if (pciResetDevice(dev, driver->activePciHostdevs,
                   driver->inactivePciHostdevs) < 0)
    goto reattachdevs;
2012-01-17 17:05:32 -07:00
Deepak C Shetty
d9e0d8204b Add new attribute wrpolicy to <driver> element
This introduces new attribute wrpolicy with only supported
value as immediate. This will be an optional
attribute with no defaults. This helps specify whether
to skip the host page cache.

When wrpolicy is specified, meaning when wrpolicy=immediate
a writeback is explicitly initiated for the dirty pages in
the host page cache as part of the guest file write operation.

Usage:
<filesystem type='mount' accessmode='passthrough'>
  <driver type='path' wrpolicy='immediate'/>
  <source dir='/export/to/guest'/>
  <target dir='mount_tag'/>
</filesystem>

Currently this only works with type='mount' for the QEMU/KVM driver.

Signed-off-by: Deepak C Shetty <deepakcs@linux.vnet.ibm.com>
2012-01-17 15:37:42 -07:00
Jiri Denemark
9619d8a62e qemu: Don't break domain with 0:0:2.0 assigned to anything but VGA
In the past we didn't reserve 0:0:2.0 PCI address if there was no video
device assigned to a domain, which made it impossible to add a video
device later on. So we fixed it (commit v0.9.0-37-g7b2cac1) by always
reserving that address. However, that breaks existing domains without
video devices that already have another device assigned to the
problematic address.

This patch reserves address 0:0:2.0 only in case it was not explicitly
assigned to another device, which means libvirt will try to keep this
address free and will not automatically assign it new devices. But
existing domains for which older libvirt already assigned the address to
a non-video device will keep working as they used to work before 0.9.1.
Moreover, users who want to create a domain without a video device and
use its address for another device may do so by explicitly configuring
the PCI address in domain XML.
2012-01-17 21:01:23 +01:00
Jiri Denemark
e7201afdf7 qemu: Add support for host CPU modes
This adds support for host-model and host-passthrough CPU modes to qemu
driver. The host-passthrough mode is mapped to -cpu host.
2012-01-17 12:22:19 +01:00
Jiri Denemark
c8506d6662 Taint domains configured with cpu mode=host-passthrough
There are several reasons for doing this:

- the CPU specification is out of libvirt's control so we cannot
  guarantee stable guest ABI
- not every feature of a CPU may actually work as expected when
  advertised directly to a guest
- migration between two machines with exactly the same CPU may work but
  no guarantees can be made
- this mode is not supported and its use is at one's own risk
2012-01-17 11:49:42 +01:00
Jiri Denemark
277bc0dcb8 cpu: Update guest CPU in host-* mode
VIR_DOMAIN_XML_UPDATE_CPU flag for virDomainGetXMLDesc may be used to
get updated custom mode guest CPU definition in case it depends on host
CPU. This patch implements the same behavior for host-model and
host-passthrough CPU modes.
2012-01-17 11:42:56 +01:00
Jiri Denemark
a6f88cbd2d cpu: Optionally forbid fallback CPU models
In case a hypervisor doesn't support the exact CPU model requested by a
domain XML, we automatically fallback to a closest CPU model the
hypervisor supports (and make sure we add/remove any additional features
if needed). This patch adds 'fallback' attribute to model element, which
can be used to disable this automatic fallback.
2012-01-17 11:39:19 +01:00
Jiri Denemark
5e31e71365 Clarify semantics of virDomainMigrate{,ToURI}2
Commit 5d784bd6d7 was a nice attempt to
clarify the semantics by requiring domain name from dxml to either match
original name or dname. However, setting dxml domain name to dname
doesn't really work since destination host needs to know the original
domain name to be able to use it in migration cookies. This patch
requires domain name in dxml to match the original domain name. The
change should be safe and backward compatible since migration would fail
just a bit later in the process.
2012-01-17 10:31:24 +01:00
Michael Ellerman
69dde2e653 tests: Teach qemuxml2argvtest about spapr-vio addresses
We can't call qemuCapsExtractVersionInfo() from test code, because it
expects to be able to call the emulator, and for testing we have fake
emulators that can't be executed. For that reason qemuxml2argvtest.c
doesn't call qemuDomainAssignPCIAddresses(), instead it open codes its
own version.

That means we can't call qemuDomainAssignAddresses() from the test code,
instead we need to manually call qemuDomainAssignSpaprVioAddresses().

Also add logic to cope with qemuDomainAssignSpaprVioAddresses() failing,
so that we can write a test that checks for a known failure in there.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2012-01-13 16:08:22 -07:00
Paolo Bonzini
c9abfadf37 qemu: add virtio-scsi controller model
Adding a new model for virtio-scsi roughly follows the same scheme
as the previous patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-13 14:54:48 -07:00
Paolo Bonzini
7b345b69f2 qemu: add ibmvscsi controller model
KVM will be able to use a PCI SCSI controller even on POWER.  Let
the user specify the vSCSI controller by other means than a default.

After this patch, the QEMU driver will actually look at the model
and reject anything but auto, lsilogic and ibmvscsi.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-13 14:13:30 -07:00
Osier Yang
5edfcaae6f qemu: Support copy on read for disk
The new introduced optional attribute "copy_on_read</code> controls
whether to copy read backing file into the image file. The value can
be either "on" or "off". Copy-on-read avoids accessing the same backing
file sectors repeatedly and is useful when the backing file is over a
slow network. By default copy-on-read is off.
2012-01-13 10:08:15 +08:00
Deepak C Shetty
99fbb3866c Do not generate security_model when fs driver is anything but 'path'
QEMU does not support security_model for anything but 'path' fs driver type.
Currently in libvirt, when security_model ( accessmode attribute) is not
specified it auto-generates it irrespective of the fs driver type, which
can result in a qemu error for drivers other than path. This patch ensures
that the qemu cmdline is correctly generated by taking into account the
fs driver type.

Signed-off-by: Deepak C Shetty <deepakcs@linux.vnet.ibm.com>
2012-01-11 13:48:52 -07:00
Daniel P. Berrange
99be754ada Change security driver APIs to use virDomainDefPtr instead of virDomainObjPtr
When sVirt is integrated with the LXC driver, it will be neccessary
to invoke the security driver APIs using only a virDomainDefPtr
since the lxc_container.c code has no virDomainObjPtr available.
Aside from two functions which want obj->pid, every bit of the
security driver code only touches obj->def. So we don't need to
pass a virDomainObjPtr into the security drivers, a virDomainDefPtr
is sufficient. Two functions also gain a 'pid_t pid' argument.

* src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c,
  src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
  src/security/security_apparmor.c,
  src/security/security_dac.c,
  src/security/security_driver.h,
  src/security/security_manager.c,
  src/security/security_manager.h,
  src/security/security_nop.c,
  src/security/security_selinux.c,
  src/security/security_stack.c: Change all security APIs to use a
  virDomainDefPtr instead of virDomainObjPtr
2012-01-11 09:52:18 +00:00
Eric Blake
4e9953a426 snapshot: allow reuse of existing files in disk snapshot
When disk snapshots were first implemented, libvirt blindly refused
to allow an external snapshot destination that already exists, since
qemu will blindly overwrite the contents of that file during the
snapshot_blkdev monitor command, and we don't like a default of
data loss by default.  But VDSM has a scenario where NFS permissions
are intentionally set so that the destination file can only be
created by the management machine, and not the machine where the
guest is running, so that libvirt will necessarily see the destination
file already existing; adding a flag will allow VDSM to force the file
reuse without libvirt complaining of possible data loss.

https://bugzilla.redhat.com/show_bug.cgi?id=767104

* include/libvirt/libvirt.h.in (virDomainSnapshotCreateFlags): Add
VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT.
* src/libvirt.c (virDomainSnapshotCreateXML): Document it.  Add
note about partial failure.
* tools/virsh.c (cmdSnapshotCreate, cmdSnapshotCreateAs): Add new
flag.
* tools/virsh.pod (snapshot-create, snapshot-create-as): Document
it.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotCreateXML): Implement the new flag.
2012-01-10 11:53:23 -07:00
Laine Stump
32f63e912d qemu: check for kvm availability before starting kvm guests
This *kind of* addresses:

  https://bugzilla.redhat.com/show_bug.cgi?id=772395

(it doesn't eliminate the failure to start, but causes libvirt to give
a better idea about the cause of the failure).

If a guest uses a kvm emulator (e.g. /usr/bin/qemu-kvm) and the guest
is started when kvm isn't available (either because virtualization is
unavailable / has been disabled in the BIOS, or the kvm modules
haven't been loaded for some reason), a semi-cryptic error message is
logged:

  libvirtError: internal error Child process (LC_ALL=C
  PATH=/sbin:/usr/sbin:/bin:/usr/bin /usr/bin/qemu-kvm -device ? -device
  pci-assign,? -device virtio-blk-pci,? -device virtio-net-pci,?) status
  unexpected: exit status 1

This patch notices at process start that a guest needs kvm, and checks
for the presence of /dev/kvm (a reasonable indicator that kvm is
available) before trying to execute the qemu binary. If kvm isn't
available, a more useful (too verbose??) error is logged.
2012-01-10 13:42:59 -05:00
Alex Jia
d8d9b0e058 qemu: fix a typo on qemuDomainSetBlkioParameters
It should be a copy-paste error, the result is programming will result in an
infinite loop again due to without iterating 'j' variable.

* src/qemu/qemu_driver.c: fix a typo on qemuDomainSetBlkioParameters.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=770520

Signed-off-by: Alex Jia <ajia@redhat.com>
2012-01-10 11:41:27 +01:00
Laine Stump
177db08775 qemu: add new disk device='lun' for bus='virtio' & type='block'
In the past, generic SCSI commands issued from a guest to a virtio
disk were always passed through to the underlying disk by qemu, and
the kernel would also pass them on.

As a result of CVE-2011-4127 (see:
http://seclists.org/oss-sec/2011/q4/536), qemu now honors its
scsi=on|off device option for virtio-blk-pci (which enables/disables
passthrough of generic SCSI commands), and the kernel will only allow
the commands for physical devices (not for partitions or logical
volumes). The default behavior of qemu is still to allow sending
generic SCSI commands to physical disks that are presented to a guest
as virtio-blk-pci devices, but libvirt prefers to disable those
commands in the standard virtio block devices, enabling it only when
specifically requested (hopefully indicating that the requester
understands what they're asking for). For this purpose, a new libvirt
disk device type (device='lun') has been created.

device='lun' is identical to the default device='disk', except that:

1) It is only allowed if bus='virtio', type='block', and the qemu
   version is "new enough" to support it ("new enough" == qemu 0.11 or
   better), otherwise the domain will fail to start and a
   CONFIG_UNSUPPORTED error will be logged).

2) The option "scsi=on" will be added to the -device arg to allow
   SG_IO commands (if device !='lun', "scsi=off" will be added to the
   -device arg so that SG_IO commands are specifically forbidden).

Guests which continue to use disk device='disk' (the default) will no
longer be able to use SG_IO commands on the disk; those that have
their disk device changed to device='lun' will still be able to use SG_IO
commands.

*docs/formatdomain.html.in - document the new device attribute value.
*docs/schemas/domaincommon.rng - allow it in the RNG
*tests/* - update the args of several existing tests to add scsi=off, and
 add one new test that will test scsi=on.
*src/conf/domain_conf.c - update domain XML parser and formatter

*src/qemu/qemu_(command|driver|hotplug).c - treat
 VIR_DOMAIN_DISK_DEVICE_LUN *almost* identically to
 VIR_DOMAIN_DISK_DEVICE_DISK, except as indicated above.

Note that no support for this new device value was added to any
hypervisor drivers other than qemu, because it's unclear what it might
mean (if anything) to those drivers.
2012-01-09 10:55:53 -05:00
Laine Stump
e8daeeb136 qemu: add capabilities flags related to SG_IO
This patch adds two capabilities flags to deal with various aspects
of supporting SG_IO commands on virtio-blk-pci devices:

  QEMU_CAPS_VIRTIO_BLK_SCSI
    set if -device virtio-blk-pci accepts the scsi="on|off" option
    When present, this is on by default, but can be set to off to disable
    SG_IO functions.

  QEMU_CAPS_VIRTIO_BLK_SG_IO
    set if SG_IO commands are supported in the virtio-blk-pci driver
    (present since qemu 0.11 according to a qemu developer, if I
     understood correctly)
2012-01-09 10:55:44 -05:00
Laine Stump
1734cdb995 config: report error when script given for inappropriate interface type
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=638633

Although scripts are not used by interfaces of type other than
"ethernet" in qemu, due to the fact that the parser stores the script
name in a union that is only valid when type is ethernet or bridge,
there is no way for anyone except the parser itself to catch the
problem of specifying an interface script for an inappropriate
interface type (by the time the parsed data gets back to the code that
called the parser, all evidence that a script was specified is
forgotten).

Since the parser itself should be agnostic to which type of interface
allows scripts (an example of why: a script specified for an interface
of type bridge is valid for xen domains, but not for qemu domains),
the solution here is to move the script out of the union(s) in the
DomainNetDef, always populate it when specified (regardless of
interface type), and let the driver decide whether or not it is
appropriate.

Currently the qemu, xen, libxml, and uml drivers recognize the script
parameter and do something with it (the uml driver only to report that
it isn't supported). Those drivers have been updated to log a
CONFIG_UNSUPPORTED error when a script is specified for an interface
type that's inappropriate for that particular hypervisor.

(NB: There was earlier discussion of solving this problem by adding a
VALIDATE flag to all libvirt APIs that accept XML, which would cause
the XML to be validated against the RNG files. One statement during
that discussion was that the RNG shouldn't contain hypervisor-specific
things, though, and a proper solution to this problem would require
that (again, because a script for an interface of type "bridge" is
accepted by xen, but not by qemu).
2012-01-08 10:52:24 -05:00
Eric Blake
13a776ca0d qemu: one more client to live/config helper
Commit ae523427 missed one pair of functions that could use
the helper routine.

* src/qemu/qemu_driver.c (qemuSetSchedulerParametersFlags)
(qemuGetSchedulerParametersFlags): Simplify.
2012-01-07 05:08:01 -07:00
Alex Jia
b41d440e61 qemu: Avoid memory leaks on qemuParseRBDString
Detected by valgrind. Leak introduced in commit 5745dc1.

* src/qemu/qemu_command.c: fix memory leak on failure and successful path.

* How to reproduce?
% valgrind -v --leak-check=full ./qemuargv2xmltest

* Actual result:

==2196== 80 bytes in 1 blocks are definitely lost in loss record 3 of 4
==2196==    at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==2196==    by 0x39CF07F6E1: strdup (in /lib64/libc-2.12.so)
==2196==    by 0x419823: qemuParseRBDString (qemu_command.c:1657)
==2196==    by 0x4221ED: qemuParseCommandLine (qemu_command.c:5934)
==2196==    by 0x422AFB: qemuParseCommandLineString (qemu_command.c:7561)
==2196==    by 0x416864: testCompareXMLToArgvHelper (qemuargv2xmltest.c:48)
==2196==    by 0x417DB1: virtTestRun (testutils.c:141)
==2196==    by 0x415CAF: mymain (qemuargv2xmltest.c:175)
==2196==    by 0x4174A7: virtTestMain (testutils.c:696)
==2196==    by 0x39CF01ECDC: (below main) (in /lib64/libc-2.12.so)
==2196==
==2196== LEAK SUMMARY:
==2196==    definitely lost: 80 bytes in 1 blocks

Signed-off-by: Alex Jia <ajia@redhat.com>
2012-01-06 14:51:26 +08:00
Hu Tao
6b780f744b qemu: fix a bug in numatune
When setting numa nodeset for a domain which has no nodeset set
before, libvirtd crashes by dereferencing the pointer to the old
nodemask which is null in that case.
2012-01-05 13:04:02 -07:00
Eric Blake
820a2159e9 qemu: fix use-after-free regression
Commit baade4d fixed a memory leak on failure, but in the process,
introduced a use-after-free on success, which can be triggered with:

1. set bandwidth with --live
2. query bandwidth
3. set bandwidth with --live

* src/qemu/qemu_driver.c (qemuDomainSetInterfaceParameters): Don't
free newBandwidth on success.
Reported by Hu Tao.
2012-01-05 10:21:34 -07:00
Yuri Chornoivan
524ba58bb9 Fix typos in messages.
https://bugzilla.redhat.com/show_bug.cgi?id=770954
2012-01-03 20:30:33 -07:00
Eric Blake
851fc8139f qemu: fix block stat naming
Typo has existed since API introduction in commit ee0d8c3.

* src/qemu/qemu_driver.c (qemuDomainBlockStatsFlags): Use correct
name.
2012-01-02 20:43:07 -07:00
Eric Blake
269ce467fc domiftune: clean up previous patches
Most severe here is a latent (but currently untriggered) memory leak
if any hypervisor ever adds a string interface property; the
remainder are mainly cosmetic.

* include/libvirt/libvirt.h.in (VIR_DOMAIN_BANDWIDTH_*): Move
macros closer to interface that uses them, and document type.
* src/libvirt.c (virDomainSetInterfaceParameters)
(virDomainGetInterfaceParameters): Formatting tweaks.
* daemon/remote.c (remoteDispatchDomainGetInterfaceParameters):
Avoid memory leak.
* src/libvirt_public.syms (LIBVIRT_0.9.9): Sort lines.
* src/libvirt_private.syms (domain_conf.h): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSetInterfaceParameters): Fix
comments, break long lines.
2012-01-02 14:35:12 -07:00
Alex Jia
baade4cd2b qemu: Fix bandwidth memory leak on failure
Detected by Coverity. Leaks introduced in commit e8d6b29.

Signed-off-by: Alex Jia <ajia@redhat.com>
2011-12-31 16:42:23 -07:00
Eric Blake
8267aea5a6 qemu: fix blkio memory leak on failure
Leak detected by Coverity, and introduced in commit 93ab585.
Reported by Alex Jia.

* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Free
devices array on error.
2011-12-31 16:32:35 -07:00
Hu Tao
e8d6b293d8 domiftune: Add virDomain{S,G}etInterfaceParameters support to qemu driver
* src/qemu/qemu_driver.c: implement the qemu driver support
2011-12-29 18:28:47 +08:00
Eric Blake
1a3f6608aa qemu: fix inf-loop in blkio parameters
https://bugzilla.redhat.com/show_bug.cgi?id=770520

We had two nested loops both trying to use 'i' as the iteration
variable, which can result in an infinite loop when the inner
loop interferes with the outer loop.  Introduced in commit 93ab585.

* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Don't
reuse iteration variable across two loops.
2011-12-28 06:57:42 -07:00
Michal Privoznik
8a34f822e6 qemu: Keep list of USB devices attached to domains
In order to avoid situation where a USB device is
in use by two domains, we must keep a list of already
attached devices like we do for PCI.
2011-12-24 18:12:04 +01:00
Michal Privoznik
d8db0f9690 qemu: Support for overriding NOFILE limit
This patch adds max_files option to qemu.conf which can be used to
override system default limit on number of opened files that are
allowed for qemu user.
2011-12-22 17:49:04 +01:00
Osier Yang
a1a83c5874 qemu: Support readonly filesystem passthrough
Upstream QEMU starts to support it from commit 2c74c2cb.
2011-12-22 12:29:58 +08:00
Osier Yang
33eca17f6a qemu: Release the lock on domobj if fails on finding the disk path 2011-12-21 10:22:08 +08:00
Michael Ellerman
d64955a91a qemu: Add spapr-vio address assignment
Add logic to assign addresses for devices with spapr-vio addresses.

We also do validation of addresses specified by the user, ie. ensuring
that there are not duplicate addresses on the bus.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2011-12-20 16:09:21 -07:00
Michael Ellerman
7e4d896b5e Add address type for SPAPR VIO devices
For QEMU PPC64 we have a machine type ("pseries") which has a virtual
bus called "spapr-vio". We need to be able to create devices on this
bus, and as such need a way to specify the address for those devices.

This patch adds a new address type "spapr-vio", which achieves this.

The addressing is specified with a "reg" property in the address
definition. The reg is optional, if it is not specified QEMU will
auto-assign an address for the device.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2011-12-20 15:39:16 -07:00
Michael Ellerman
5abbe04d68 qemu: Add a capability flag for -no-acpi
Currently non-x86 guests must have <acpi/> defined in <features> to
prevent libvirt from running qemu with -no-acpi. Although it works, it
is a hack.

Instead add a capability flag which indicates whether qemu understands
the -no-acpi option. Use it to control whether libvirt emits -no-acpi.

Current versions of qemu always display -no-acpi in their help output,
so this patch has no effect. However the development version of qemu
has been modified such that -no-acpi is only displayed when it is
actually supported.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2011-12-20 12:33:55 -07:00
Hu Tao
6758a01b18 Implement virDomain{G, S}etNumaParameters for the qemu driver 2011-12-20 11:01:27 -07:00
Hu Tao
9d3a721ad5 use cpuset to manage numa
This patch also sets cgroup cpuset parameters for numatune.
2011-12-20 09:32:23 -07:00
Daniel P. Berrange
707781fe12 Only add the timer when a callback is registered
The lifetime of the virDomainEventState object is tied to
the lifetime of the driver, which in stateless drivers is
tied to the lifetime of the virConnectPtr.

If we add & remove a timer when allocating/freeing the
virDomainEventState object, we can get a situation where
the timer still triggers once after virDomainEventState
has been freed. The timeout callback can't keep a ref
on the event state though, since that would be a circular
reference.

The trick is to only register the timer when a callback
is registered with the event state & remove the timer
when the callback is unregistered.

The demo for the bug is to run

  while true ; do date ; ../tools/virsh -q -c test:///default 'shutdown test; undefine test; dominfo test' ; done

prior to this fix, it will frequently hang and / or
crash, or corrupt memory
2011-12-19 11:08:25 +00:00
Daniel P. Berrange
34ad13536e Hide use of timers for domain event dispatch
Currently all drivers using domain events need to provide a callback
for handling a timer to dispatch events in a clean stack. There is
no technical reason for dispatch to go via driver specific code. It
could trivially be dispatched directly from the domain event code,
thus removing tedious boilerplate code from all drivers

Also fix the libxl & xen drivers to pass 'true' when creating the
virDomainEventState, since they run inside the daemon & thus always
expect events to be present.

* src/conf/domain_event.c, src/conf/domain_event.h: Internalize
  dispatch of events from timer callback
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
  src/qemu/qemu_domain.c, src/qemu/qemu_driver.c,
  src/remote/remote_driver.c, src/test/test_driver.c,
  src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
  src/xen/xen_driver.c: Remove all timer dispatch functions
2011-12-19 11:08:24 +00:00
Daniel P. Berrange
7b87a30f15 Convert drivers to thread safe APIs for adding callbacks
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
  src/qemu/qemu_driver.c, src/remote/remote_driver.c,
  src/test/test_driver.c, src/uml/uml_driver.c,
  src/vbox/vbox_tmpl.c, src/xen/xen_driver.c: Convert
  to threadsafe APIs
2011-12-19 11:08:10 +00:00
Daniel P. Berrange
d09f6ba5fe Return count of callbacks when registering callbacks
When registering a callback for a particular event some callers
need to know how many callbacks already exist for that event.
While it is possible to ask for a count, this is not free from
race conditions when threaded. Thus the API for registering
callbacks should return the count of callbacks. Also rename
virDomainEventStateDeregisterAny to virDomainEventStateDeregisterID

* src/conf/domain_event.c, src/conf/domain_event.h,
  src/libvirt_private.syms: Return count of callbacks when
  registering callbacks
* src/libxl/libxl_driver.c, src/libxl/libxl_driver.c,
  src/qemu/qemu_driver.c, src/remote/remote_driver.c,
  src/remote/remote_driver.c, src/uml/uml_driver.c,
  src/vbox/vbox_tmpl.c, src/xen/xen_driver.c: Update
  for change in APIs
2011-12-19 11:08:10 +00:00
Peter Krempa
8fb2aeb662 migration: Add more specific error code/message on migration abort
A generic error code was returned, if the user aborted a migration job.
This made it hard to distinguish between a user requested abort and an
error that might have occured. This patch introduces a new error code,
which is returned in the specific case of a user abort, while leaving
all other failures with their existing code. This makes it easier to
distinguish between failure while mirgrating and an user requested
abort.

 * include/libvirt/virterror.h: - add new error code
 * src/util/virterror.c: - add message for the new error code
 * src/qemu/qemu_migration.h: - Emit operation aborted error instead of
                                operation failed, on migration abort
2011-12-16 16:38:26 +01:00
Eric Blake
d99fe011a2 qemu: detect truncated file as invalid save image
If managed save fails at the right point in time, then the save
image can end up with 0 bytes in length (no valid header), and
our attempts in commit 55d88def to detect and skip invalid save
files missed this case.

* src/qemu/qemu_driver.c (qemuDomainSaveImageOpen): Also unlink
empty file as corrupt.  Reported by Dennis Householder.
2011-12-16 08:29:31 -07:00
Michal Privoznik
13d5a6b83d qemu: Don't drop hostdev config until security label restore
Currently, on device detach, we parse given XML, find the device
in domain object, free it and try to restore security labels.
However, in some cases (e.g. usb hostdev) parsed XML contains
less information than freed device. In usb case it is bus & device
IDs. These are needed during label restoring as a symlink into
/dev/bus is generated from them. Therefore don't drop device
configuration until security labels are restored.
2011-12-16 11:53:03 +01:00
Jim Fehlig
d8916dc8e2 Fix default migration speed in qemu driver
In commit 6f84e110 I mistakenly set default migration speed to
33554432 Mb!  The units of migMaxBandwidth is Mb, with conversion
handled in qemuMonitor{JSON,Text}SetMigrationSpeed().

Also, remove definition of QEMU_DOMAIN_FILE_MIG_BANDWIDTH_MAX since
it is no longer used after reverting commit ef1065cf.
2011-12-15 11:25:07 -07:00
Jiri Denemark
6948b725e7 qemu: Fix race between async and query jobs
If an async job run on a domain will stop the domain at the end of the
job, a concurrently run query job can hang in qemu monitor and nothing
can be done with that domain from this point on. An attempt to start
such domain results in "Timed out during operation: cannot acquire state
change lock" error.

However, quite a few things have to happen at the right time... There
must be an async job running which stops a domain at the end. This race
was reported with dump --crash but other similar jobs, such as
(managed)save and migration, should be able to trigger this bug as well.
While this async job is processing its last monitor command, that is a
query-migrate to which qemu replies with status "completed", a new
libvirt API that results in a query job must arrive and stay waiting
until the query-migrate command finishes. Once query-migrate is done but
before the async job closes qemu monitor while stopping the domain, the
other thread needs to wake up and call qemuMonitorSend to send its
command to qemu. Before qemu gets a chance to respond to this command,
the async job needs to close the monitor. At this point, the query job
thread is waiting for a condition that no-one will ever signal so it
never finishes the job.
2011-12-15 11:53:20 +01:00
Osier Yang
3f29d6c91f qemu: Do not free the device from activePciHostdevs if it's in use
* src/qemu/qemu_hostdev.c (qemuDomainReAttachHostdevDevices):
pciDeviceListFree(pcidevs) in the end free()s the device even if
it's in use by other domain, which can cause a race.

How to reproduce:

<script>

virsh nodedev-dettach pci_0000_00_19_0
virsh start test
virsh attach-device test hostdev.xml
virsh start test2

for i in {1..5}; do
        echo "[ -- ${i}th time --]"
        virsh nodedev-reattach pci_0000_00_19_0
done

echo "clean up"
virsh destroy test
virsh nodedev-reattach pci_0000_00_19_0
</script>

Device pci_0000_00_19_0 dettached

Domain test started

Device attached successfully

error: Failed to start domain test2
error: Requested operation is not valid: PCI device 0000:00:19.0 is in use by domain test

[ -- 1th time --]
Device pci_0000_00_19_0 re-attached

[ -- 2th time --]
Device pci_0000_00_19_0 re-attached

[ -- 3th time --]
Device pci_0000_00_19_0 re-attached

[ -- 4th time --]
Device pci_0000_00_19_0 re-attached

[ -- 5th time --]
Device pci_0000_00_19_0 re-attached

clean up
Domain test destroyed

Device pci_0000_00_19_0 re-attached

The patch also fixes another problem, there won't be error like
"qemuDomainReAttachHostdevDevices: Not reattaching active
device 0000:00:19.0" in daemon log if some device is in active.
As pciResetDevice and pciReattachDevice won't be called for
the device anymore. This is sensible as we already reported
error when preparing the device if it's active. Blindly trying
to pciResetDevice & pciReattachDevice on the device and getting
an error is just redundant.
2011-12-15 10:18:20 +08:00
Osier Yang
a0aec362e8 qemu: Honor the original properties of PCI device when detaching
This patch fixes two problems:
    1) The device will be reattached to host even if it's not
       managed, as there is a "pciDeviceSetManaged".
    2) The device won't be reattached to host with original
       driver properly. As it doesn't honor the device original
       properties which are maintained by driver->activePciHostdevs.
2011-12-15 10:14:11 +08:00
Lei Li
ae52342754 Provide a helper method virDomainLiveConfigHelperMethod
This chunk of code below repeated in several functions, factor it into
a helper method virDomainLiveConfigHelperMethod to eliminate duplicated code
based on Eric and Adam's suggestion. I have tested it for all the
relevant APIs changed.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
2011-12-13 15:10:42 -07:00
Jiri Denemark
5547d2b81c qemu: Disable EOF processing during qemuDomainDestroy
When destroying a domain qemuDomainDestroy kills its qemu process and
starts a new job, which means it unlocks the domain object and locks it
again after some time. Although the object is usually unlocked for a
pretty short time, chances are another thread processing an EOF event on
qemu monitor is able to lock the object first and does all the cleanup
by itself. This leads to wrong shutoff reason and lifecycle event detail
and virDomainDestroy API incorrectly reporting failure to destroy an
inactive domain.

Reported by Charlie Smurthwaite.
2011-12-12 16:31:19 +01:00
Michael Ellerman
9f406c5838 qemu: Prepare to cater for more general address assignment
Currently qemuDomainAssignPCIAddresses() is called to assign addresses
to PCI devices.

We need to do something similar for devices with spapr-vio addresses.
So create one place where address assignment will be done, that is
qemuDomainAssignAddresses().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2011-12-09 15:01:52 -07:00
Michael Ellerman
2a994a3b1e qemu: Add address in qemuBuildChrDeviceStr() on pseries
For the PPC64 pseries machine type we need to add address information
for the spapr-vty device.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2011-12-09 13:27:57 -07:00
Michael Ellerman
e1636f47ae qemu: Use spapr-vscsi on pseries machine type
On the PPC64 pseries machine type we need to use the spapr-vscsi device
rather than an lsi.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2011-12-09 13:03:33 -07:00
Stefan Berger
84f5633312 fix error when parsing ppc64 models on x86 host
When parsing ppc64 models on an x86 host an out-of-memory error message is displayed due
to it checking for retcpus being NULL. Fix this by removing the check whether retcpus is NULL
since we will realloc into this variable.
Also in the X86 model parser display the OOM error at the location where it happens.
2011-12-09 12:18:58 -05:00
Stefan Berger
33eb3567dd Pass the VM's UUID into the nwfilter subsystem
A preparatory patch for DHCP snooping where we want to be able to
differentiate between a VM's interface using the tuple of
<VM UUID, Interface MAC address>. We assume that MAC addresses could
possibly be re-used between different networks (VLANs) thus do not only
want to rely on the MAC address to identify an interface.

At the current 'final destination' in virNWFilterInstantiate I am leaving
the vmuuid parameter as ATTRIBUTE_UNUSED until the DHCP snooping patches arrive.
(we may not post the DHCP snooping patches for 0.9.9, though)

Mostly this is a pretty trivial patch. On the lowest layers, in lxc_driver
and uml_conf, I am passing the virDomainDefPtr around until I am passing
only the VM's uuid into the NWFilter calls.
2011-12-08 21:35:20 -05:00
Stefan Berger
95ff5899b9 nwfilter: cleanup return codes in nwfilter subsystem
This patch cleans up return codes in the nwfilter subsystem.

Some functions in nwfilter_conf.c (validators and formatters) are
keeping their bool return for now and I am converting their return
code to true/false.

All other functions now have failure return codes of -1 and success
of 0.

[I searched for all occurences of ' 1;' and checked all 'if ' and
adapted where needed. After that I did a grep for 'NWFilter' in the source
tree.]
2011-12-08 21:26:34 -05:00
Prerna Saxena
5e6ce1c936 Clean up qemuBuildCommandLine to remove x86-specific
assumptions from generic code.

This implements the minimal set of changes needed in libvirt to launch a
PowerPC-KVM based guest.
It removes x86-specific assumptions about choice of serial driver backend
from generic qemu guest commandline generation code.
It also restricts the ACPI capability to be available for an x86 or
x86_64 domain.
This is not a complete solution -- it still does not guarantee libvirt
the capability to flag non-supported options in guest XML. (Eg, an ACPI
specification in a PowerPC guest XML will still get processed, even
though qemu-system-ppc64 does not support it while qemu-system-x86_64 does.)
This drawback exists because libvirt falls back on qemu to query supported
features, and qemu '-h' blindly lists all capabilities -- irrespective
of whether they are available while emulating a given architecture or not.
The long-term solution would be for qemu to list out capabilities based
on architecture and platform -- so that libvirt can cleanly make out what
devices are supported on an arch (say 'ppc64') and platform (say, 'mac99').

Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
2011-12-08 08:39:26 -05:00
Prerna Saxena
9bb8064dff Add support for ppc64 qemu
This enables libvirt to select the correct qemu binary (qemu-system-ppc64)
for a guest vm based on arch 'ppc64'.
Also, libvirt is enabled to correctly parse the list of supported PowerPC
CPUs, generated by running 'qemu-system-ppc64 -cpu ?'

Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Acked-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2011-12-08 08:39:26 -05:00
Jim Fehlig
284230199a Prevent crash of libvirtd when attaching to existing qemu process
With security_driver set to "none" in /etc/libvirt/qemu.conf,
libvirtd would crash when attempted to attach to an existing
qemu process.  Only copy the security model if it actually exists.
2011-12-07 11:23:03 -07:00
Jiri Denemark
97652044af qemu: Ignore shutdown event from destroyed domain
During virDomainDestroy, QEMU may emit SHUTDOWN event as a response to
SIGTERM and since domain object is still locked, the event is processed
after the domain is destroyed. We need to ignore this event in such case
to avoid changing domain state from shutoff to shutdown.
2011-12-07 14:45:22 +01:00
Jiri Denemark
38527c9ae0 qemu: Rework handling of shutdown event
When QEMU guest finishes its shutdown sequence, qemu stops virtual CPUs
and when started with -no-shutdown waits for us to kill it using
SGITERM. Since QEMU is flushing its internal buffers, some time may pass
before QEMU actually dies. We mistakenly used "paused" state (and
events) for this which is quite confusing since users may see a domain
going to pause while they expect it to shutdown. Since we already have
"shutdown" state with "the domain is being shut down" semantics, we
should use it for this state.

However, the state didn't have a corresponding event so I created one
and called its detail as VIR_DOMAIN_EVENT_SHUTDOWN_FINISHED (guest OS
finished its shutdown sequence) with the intent to add
VIR_DOMAIN_EVENT_SHUTDOWN_STARTED in the future if we have a
sufficiently capable guest agent that can notify us when guest OS starts
to shutdown.
2011-12-05 14:14:31 +01:00
Jiri Denemark
dd8e895606 Add support for QEMU 1.0 2011-12-05 13:02:54 +01:00
Lei Li
ac6b368d8a Fix a logic error for setting block I/O
Fix a logic error, the initial value of ret = -1, if just set --config,
it will goto endjob directly without doing its really job here.

Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
2011-12-01 08:01:16 -07:00
Alex Jia
7b811a74c6 qemu: Plug memory leak onqemuProcessWaitForMonitor() error path
Detected by Coverity. Leak introduced in commit 109efd7.

Signed-off-by: Alex Jia <ajia@redhat.com>
2011-11-30 14:39:36 -07:00
Hu Tao
25a5f07c69 qemu: filter blkio 0-device-weight at two other places
filter 0-device-weight when:

  - getting blkio parameters with --config
  - starting up a domain

When testing with blkio, I found these issues:

  (dom is down)
  virsh blkiotune dom --device-weights /dev/sda,300,/dev/sdb,500
  virsh blkiotune dom --device-weights /dev/sda,300,/dev/sdb,0
  virsh blkiotune dom
  weight         : 800
  device_weight  : /dev/sda,200,/dev/sdb,0

  # issue 1: shows 0 device weight of /dev/sdb that may confuse user

  (continued)
  virsh start dom

  # issue 2: If /dev/sdb doesn't exist, libvirt refuses to bring the
  # dom up because it wants to set the device weight to 0 of a
  # non-existing device. Since 0 means no weight-limit, we really don't
  # have to set it.
2011-11-30 12:34:30 -07:00
Eric Blake
22cf6d46f4 qemu: amend existing table of device weights
Prior to this patch, for a running dom, the commands:

$ virsh blkiotune dom --device-weights /dev/sda,502,/dev/sdb,498
$ virsh blkiotune dom --device-weights /dev/sda,503
$ virsh blkiotune dom
weight         : 500
device_weight  : /dev/sda,503

claim that /dev/sdb no longer has a non-default weight, but
directly querying cgroups says otherwise:

$ cat /cgroup/blkio/libvirt/qemu/dom/blkio.weight_device
8:0     503
8:16    498

After this patch, an explicit 0 is required to remove a device path
from the XML, and omitting a device path that was previously
specified leaves that device path untouched in the XML, to match
cgroups behavior.

* src/qemu/qemu_driver.c (parseBlkioWeightDeviceStr): Rename...
(qemuDomainParseDeviceWeightStr): ...and use correct type.
(qemuDomainSetBlkioParameters): After parsing string, modify
rather than replacing existing table.
* tools/virsh.pod (blkiotune): Tweak wording.
2011-11-30 12:18:18 -07:00
Lei Li
eca96694a7 Implement virDomain{Set, Get}BlockIoTune for the qemu driver
Implement the block I/O throttle setting and getting support to qemu
driver.

Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2011-11-30 11:36:10 -07:00
Daniel P. Berrange
a8bb75a3e6 Remove time APIs from src/util/util.h
The virTimestamp and virTimeMs functions in src/util/util.h
duplicate functionality from virtime.h, in a non-async signal
safe manner. Remove them, and convert all code over to the new
APIs.

* src/util/util.c, src/util/util.h: Delete virTimeMs and virTimestamp
* src/lxc/lxc_driver.c, src/qemu/qemu_domain.c,
  src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
  src/qemu/qemu_process.c, src/util/event_poll.c: Convert to use
  virtime APIs
2011-11-30 11:43:50 +00:00
Daniel P. Berrange
f1f28611f1 Remove powerMgmt_valid field from capabilities struct
If we ensure that virNodeSuspendGetTargetMask always resets
*bitmask to zero upon failure, there is no need for the
powerMgmt_valid field.

* src/util/virnodesuspend.c: Ensure *bitmask is zero upon
  failure
* src/conf/capabilities.c, src/conf/capabilities.h: Remove
  powerMgmt_valid field
* src/qemu/qemu_capabilities.c: Remove powerMgmt_valid
2011-11-30 10:12:30 +00:00
Daniel P. Berrange
c92653f4dd Move suspend capabilities APIs out of util.h into virnodesuspend.c
The node suspend capabilities APIs should not have been put into
util.[ch]. Instead move them into virnodesuspend.[ch]

* src/util/util.c, src/util/util.h: Remove suspend capabilities APIs
* src/util/virnodesuspend.c, src/util/virnodesuspend.h: Add
  suspend capabilities APIs
* src/qemu/qemu_capabilities.c: Include virnodesuspend.h
2011-11-30 10:12:29 +00:00
Daniel P. Berrange
53c2aad88b Rename suspend capabilities APIs
Rename virGetPMCapabilities to virNodeSuspendGetTargetMask and
virDiscoverHostPMFeature to virNodeSuspendSupportsTarget.

* src/util/util.c, src/util/util.h: Rename APIs
* src/qemu/qemu_capabilities.c, src/util/virnodesuspend.c: Adjust
  for new names
2011-11-30 10:12:29 +00:00
Hu Tao
93ab58595d blkiotune: add qemu support for blkiotune.device_weight
Implement setting/getting per-device blkio weights in qemu,
using the cgroups blkio.weight_device tunable.
2011-11-29 12:26:21 -07:00
Eric Blake
659ded58ed qemu: fix blkiotune --live --config
Without this,  'virsh blkiotune --live --config --weight=n'
only affected live.

* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Allow
setting both configurations at once.
2011-11-29 10:54:29 -07:00
Eric Blake
51727c1dc0 qemu, lxc: drop redundant checks
After the previous patch, there are now some redundant checks.

* src/qemu/qemu_driver.c (qemudDomainGetVcpuPinInfo)
(qemuGetSchedulerParametersFlags): Drop checks now guaranteed by
libvirt.c.
* src/lxc/lxc_driver.c (lxcGetSchedulerParametersFlags):
Likewise.
2011-11-29 10:54:29 -07:00
Osier Yang
d1a6c77aca block_resize: Implement qemu driver method
It requires the domain is running, otherwise fails. Resize to a lower
size is supported, but should be used with extreme caution.

In order to prohibit the "size" overflowing after multiplied by
1024. We do checking in the codes. For QMP mode, the default units
is Bytes, the passed size needs to be multiplied by 1024, however,
for HMP mode, the default units is "Megabytes", the passed "size"
needs to be divided by 1024 then.
2011-11-29 21:45:18 +08:00
Osier Yang
4fa36f1392 block_resize: Implement qemu monitor functions
Implements functions for both HMP and QMP mode.

For HMP mode, qemu uses "M" as the units by default, so the passed "sized"
is divided by 1024.

For QMP mode, qemu uses "Bytes" as the units by default, the passed "sized"
is multiplied by 1024.

All of the monitor functions return -1 on failure, 0 on success, or -2 if
not supported.
2011-11-29 21:45:11 +08:00
Srivatsa S. Bhat
4ddb37c395 Implement the core API to suspend/resume the host
Add the core functions that implement the functionality of the API.
Suspend is done by using an asynchronous mechanism so that we can return
the status to the caller before the host gets suspended. This asynchronous
operation is achieved by suspending the host in a separate thread of
execution. However, returning the status to the caller is only best-effort,
but not guaranteed.

To resume the host, an RTC alarm is set up (based on how long we want to
suspend) before suspending the host. When this alarm fires, the host
gets woken up.

Suspend-to-RAM operation on a host running Linux can take upto more than 20
seconds, depending on the load of the system. (Freezing of tasks, an operation
preceding any suspend operation, is given up after a 20 second timeout).
And Suspend-to-Disk can take even more time, considering the time required
for compaction, creating the memory image and writing it to disk etc.
So, we do not allow the user to specify a suspend duration of less than 60
seconds, to be on the safer side, since we don't want to prematurely declare
failure when we only had to wait for some more time.
2011-11-29 17:29:17 +08:00
Jiri Denemark
2c4cdb736c Fix version numbers for isAlive and setKeepAlive driver APIs 2011-11-24 14:44:59 +01:00
Jiri Denemark
3a6a262428 qemu: Cancel p2p migration when connection breaks
If a connection to destination host is lost during peer-to-peer
migration (because keepalive protocol timed out), we won't be able to
finish the migration and it doesn't make sense to wait for qemu to
transmit all data. This patch automatically cancels such migration
without waiting for virDomainAbortJob to be called.
2011-11-24 12:00:10 +01:00
Jiri Denemark
1e62643719 qemu: Add support for keepalive messages during p2p migration 2011-11-24 12:00:10 +01:00
Jiri Denemark
e401b0cd02 Implement virConnectIsAlive in all drivers 2011-11-24 12:00:10 +01:00
Peter Krempa
c4b32641f1 qemu: Avoid dereference of NULL pointer
If something fails while initializing qemu job object in
qemuDomainObjPrivateAlloc(), memory to the private pointer is freed, but
after that, the pointer is still dereferenced, which may result in a
segfault.

* qemuDomainObjPrivateAlloc() - Don't dereference NULL pointer.
2011-11-23 16:19:48 +01:00
Eric Blake
db2f680775 qemu: fix a const-correctness issue
Generally, functions which return malloc'd strings should be typed
as 'char *', not 'const char *', to make it obvious that the caller
is responsible to free things.  free(const char *) fails to compile,
and although we have a cast embedded in VIR_FREE to work around poor
code that frees const char *, it's better to not rely on that hack.

* src/qemu/qemu_driver.c (qemuDiskPathToAlias): Change return type.
(qemuDomainBlockJobImpl): Update caller.
2011-11-23 07:29:45 -07:00
Eric Blake
c725e2dc5a blockstats: support lookup by path in blockstats
Commit 89b6284f made it possible to pass either a source name or
the target device to most API demanding a disk designation, but
forgot to update the documentation.  It also failed to update
virDomainBlockStats to take both forms. This patch fixes both the
documentation and the remaining function.

Xen continues to use just device shorthand (that is, I did not
implement path lookup there, since xen does not track a domain_conf
to quickly tie a path back to the device shorthand).

* src/libvirt.c (virDomainBlockStats, virDomainBlockStatsFlags)
(virDomainGetBlockInfo, virDomainBlockPeek)
(virDomainBlockJobAbort, virDomainGetBlockJobInfo)
(virDomainBlockJobSetSpeed, virDomainBlockPull): Document
acceptable disk naming conventions.
* src/qemu/qemu_driver.c (qemuDomainBlockStats)
(qemuDomainBlockStatsFlags): Allow lookup by source name.
* src/test/test_driver.c (testDomainBlockStats): Likewise.
2011-11-23 06:10:30 -07:00
Srivatsa S. Bhat
e352b16400 Export KVM Host Power Management capabilities
This patch exports KVM Host Power Management capabilities as XML so that
higher-level systems management software can make use of these features
available in the host.

The script "pm-is-supported" (from pm-utils package) is run to discover if
Suspend-to-RAM (S3) or Suspend-to-Disk (S4) is supported by the host.
If either of them are supported, then a new tag "<power_management>" is
introduced in the XML under the <host> tag.

However in case the query to check for power management features succeeded,
but the host does not support any such feature, then the XML will contain
an empty <power_management/> tag. In the event that the PM query itself
failed, the XML will not contain any "power_management" tag.

To use this, new APIs could be implemented in libvirt to exploit power
management features such as S3/S4.
2011-11-22 11:31:22 +08:00
Roopa Prabhu
334c539ba0 qemu: don't release network actual device twice
For direct attach devices, in qemuBuildCommandLine, we seem to be freeing
actual device on error path (with networkReleaseActualDevice). But the actual
device is not deleted.

qemuProcessStop eventually deletes the direct attach device and releases
actual device. But by the time qemuProcessStop is called qemuBuildCommandLine
has already freed actual device, leaving stray macvtap devices behind on error.
So the simplest fix is to remove the networkReleaseActualDevice in
qemuBuildCommandLine. This patch does just that.

Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
2011-11-21 14:42:33 -07:00
Michal Privoznik
2e37bf42d2 qemu: Copy console definition from serial
Now, when we support multiple consoles per domain,
the vm->def->console[0] can still remain an alias
for vm->def->serial[0]; However, we need to copy
it's source definition as well otherwise we'll regress
on virDomainOpenConsole.
2011-11-21 14:53:13 +01:00
Hu Tao
41a2636aa3 enable cgroup cpuset by default
This prepares for subsequent patches which introduce dependence
on cgroup cpuset. Enable cgroup cpuset by default so users don't
have to modify configuration file before encountering a cpuset
error.
2011-11-18 11:38:19 -07:00
Daniel P. Berrange
6ec8288a96 Allow creation of plain macvlan devices
Update virNetDevMacVLanCreateWithVPortProfile to allow creation
of plain macvlan devices, as well as macvtap devices. The former
is useful for LXC containers

* src/qemu/qemu_command.c: Explicitly request a macvtap device
* src/util/virnetdevmacvlan.c, src/util/virnetdevmacvlan.h: Add
  new flag to allow switching between macvlan and macvtap
  creation
2011-11-18 16:10:37 +00:00
Daniel P. Berrange
191090ae27 Rename high level macvlan creation APIs
Rename virNetDevMacVLanCreate to virNetDevMacVLanCreateWithVPortProfile
and virNetDevMacVLanDelete to virNetDevMacVLanDeleteWithVPortProfile

To make way for renaming the other macvlan creation APIs in
interface.c

* util/virnetdevmacvlan.c, util/virnetdevmacvlan.h,
  qemu/qemu_command.c, qemu/qemu_hotplug.c, qemu/qemu_process.c:
  Rename APIs
2011-11-18 16:10:02 +00:00
Daniel P. Berrange
896104c9f0 Rename and split the macvtap.c file
Rename the macvtap.c file to virnetdevmacvlan.c to reflect its
functionality. Move the port profile association code out into
virnetdevvportprofile.c. Make the APIs available unconditionally
to callers

* src/util/macvtap.h: rename to src/util/virnetdevmacvlan.h,
* src/util/macvtap.c: rename to src/util/virnetdevmacvlan.c
* src/util/virnetdevvportprofile.c, src/util/virnetdevvportprofile.h:
  Pull in vport association code
* src/Makefile.am, src/conf/domain_conf.h, src/qemu/qemu_conf.c,
  src/qemu/qemu_conf.h, src/qemu/qemu_driver.c: Update include
  paths & remove conditional compilation
2011-11-18 16:10:01 +00:00
Daniel P. Berrange
43925db7ca Rename Macvtap management APIs
In preparation for code re-organization, rename the Macvtap
management APIs to have the following patterns

  virNetDevMacVLanXXXXX     - macvlan/macvtap interface management
  virNetDevVPortProfileXXXX - virtual port profile management

* src/util/macvtap.c, src/util/macvtap.h: Rename APIs
* src/conf/domain_conf.c, src/network/bridge_driver.c,
  src/qemu/qemu_command.c, src/qemu/qemu_command.h,
  src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c,
  src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
  src/qemu/qemu_process.h: Update for renamed APIs
2011-11-18 16:10:01 +00:00
Daniel P. Berrange
a7c6ce0d52 Fix use of uninitialized variable in QEMU driver 2011-11-18 16:09:35 +00:00
Bharata B Rao
9b6bb0fef6 qemu: Generate -numa option
Add routines to generate -numa QEMU command line option based on
<numa> ... </numa> XML specifications.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
2011-11-17 13:47:11 -07:00
Sage Weil
5745dc123a qemu/rbd: improve rbd device specification
This improves the support for qemu rbd devices by adding support for a few
key features (e.g., authentication) and cleaning up the way in which
rbd configuration options are passed to qemu.

An <auth> member of the disk source xml specifies how librbd should
authenticate. The username attribute is the Ceph/RBD user to authenticate as.
The usage or uuid attributes specify which secret to use. Usage is an
arbitrary identifier local to libvirt.

The old RBD support relied on setting an environment variable to
communicate information to qemu/librbd.  Instead, pass those options
explicitly to qemu.  Update the qemu argument parsing and tests
accordingly.

Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-11-15 17:06:42 -07:00
Daniel P. Berrange
d3406045fd Split src/util/network.{c,h} into 5 pieces
The src/util/network.c file is a dumping ground for many different
APIs. Split it up into 5 pieces, along functional lines

 - src/util/virnetdevbandwidth.c: virNetDevBandwidth type & helper APIs
 - src/util/virnetdevvportprofile.c: virNetDevVPortProfile type & helper APIs
 - src/util/virsocketaddr.c: virSocketAddr and APIs
 - src/conf/netdev_bandwidth_conf.c: XML parsing / formatting
   for virNetDevBandwidth
 - src/conf/netdev_vport_profile_conf.c: XML parsing / formatting
   for virNetDevVPortProfile

* src/util/network.c, src/util/network.h: Split into 5 pieces
* src/conf/netdev_bandwidth_conf.c, src/conf/netdev_bandwidth_conf.h,
  src/conf/netdev_vport_profile_conf.c, src/conf/netdev_vport_profile_conf.h,
  src/util/virnetdevbandwidth.c, src/util/virnetdevbandwidth.h,
  src/util/virnetdevvportprofile.c, src/util/virnetdevvportprofile.h,
  src/util/virsocketaddr.c, src/util/virsocketaddr.h: New pieces
* daemon/libvirtd.h, daemon/remote.c, src/conf/domain_conf.c,
  src/conf/domain_conf.h, src/conf/network_conf.c,
  src/conf/network_conf.h, src/conf/nwfilter_conf.h,
  src/esx/esx_util.h, src/network/bridge_driver.c,
  src/qemu/qemu_conf.c, src/rpc/virnetsocket.c,
  src/rpc/virnetsocket.h, src/util/dnsmasq.h, src/util/interface.h,
  src/util/iptables.h, src/util/macvtap.c, src/util/macvtap.h,
  src/util/virnetdev.h, src/util/virnetdevtap.c,
  tools/virsh.c: Update include files
2011-11-15 10:27:54 +00:00
Daniel P. Berrange
767e01ceb1 Rename virVirtualPortProfileParams & APIs
Rename the virVirtualPortProfileParams struct to be
virNetDevVPortProfile, and rename the APIs to match
this prefix.

* src/util/network.c, src/util/network.h: Rename port profile
  APIs
* src/conf/domain_conf.c, src/conf/domain_conf.h,
  src/conf/network_conf.c, src/conf/network_conf.h,
  src/network/bridge_driver.c, src/qemu/qemu_hotplug.c,
  src/util/macvtap.c, src/util/macvtap.h: Update for
  renamed APIs/structs
2011-11-15 10:10:05 +00:00
Michael Wood
be622a63cd PATCH: Fix build without MACVTAP
Hi

Commit c31d23a787 removed the "conn"
parameter from qemuPhysIfaceConnect(), but it's still used if
WITH_MACVTAP is false.  Also, it's still mentioned in the comment
above the function:

/**
 * qemuPhysIfaceConnect:
 * @def: the definition of the VM (needed by 802.1Qbh and audit)
 * @conn: pointer to virConnect object
 * @driver: pointer to the qemud_driver
 * @net: pointer to he VM's interface description with direct device type
 * @qemuCaps: flags for qemu
 *
 * Returns a filedescriptor on success or -1 in case of error.
 */
int
qemuPhysIfaceConnect(virDomainDefPtr def,
                     struct qemud_driver *driver,
                     virDomainNetDefPtr net,
                     virBitmapPtr qemuCaps,
                     enum virVMOperationType vmop)
{
    int rc;
#if WITH_MACVTAP
[...]
#else
    (void)def;
    (void)conn;
    (void)net;
    (void)qemuCaps;
    (void)driver;
    (void)vmop;
    qemuReportError(VIR_ERR_INTERNAL_ERROR,
                    "%s", _("No support for macvtap device"));
    rc = -1;
#endif
    return rc;
}

--
Michael Wood <esiotrot@gmail.com>

From f4fc43b4111a4c099395c55902e497b8965e2b53 Mon Sep 17 00:00:00 2001
From: Michael Wood <esiotrot@gmail.com>
Date: Sat, 12 Nov 2011 13:37:53 +0200
Subject: [PATCH] Fix build without MACVTAP.
2011-11-14 15:25:33 -05:00
Eric Blake
342c09578a API: add trivial qemu support for VIR_TYPED_PARAM_STRING
Qemu will be the first driver to make use of a typed string in the
next round of additions.  Separate out the trivial addition.

* src/qemu/qemu_driver.c (qemudSupportsFeature): Advertise feature.
(qemuDomainGetBlkioParameters, qemuDomainGetMemoryParameters)
(qemuGetSchedulerParametersFlags, qemudDomainBlockStatsFlags):
Allow typed strings flag where trivially supported.
2011-11-11 17:27:04 -07:00
Eric Blake
61f2b6ba5f qemu: fix domjobabort regression
This reverts commit ef1065cf5ac; see also this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=751900

In qemu 0.15.1 and earlier, during migration to file, the
qemu_savevm_state_begin and qemu_savevm_state_iterate methods
will both process as much migration data as possible until either

  1. The file descriptor returns EAGAIN
  2. The bandwidth rate limit is reached

If we set the rate limit to ULONG_MAX, test 2 never becomes true. We're
passing a plain file descriptor to QEMU and POSIX does not support EAGAIN on
regular files / block devices, so test 1 never becomes true either.

In the 'virsh save --bypass-cache' case, we pass a pipe instead of a
regular fd, but using a pipe adds I/O overhead, so always passing a
pipe just so qemu can see EAGAIN doesn't seem nice.

The ultimate fix needs to come from qemu - background migration must
respect asynchronous abort requests, or else periodically return
control to the main handling loop without an EAGAIN and without
waiting to hit an insanely large amount of data.  But until a
version of qemu is fixed to support "unlimited" data rates while
still allowing cancellation, the best we can do is avoid the
automatic use of unlimited rates from within libvirt (users can
still explicitly change the migration rates, if they are aware that
they are giving up the ability to cancel a job).

Reverting the lone use of QEMU_DOMAIN_FILE_MIG_BANDWIDTH_MAX is
the simplest patch; this slows migration back down to a default
32M/sec cap, but also ensures that the main qemu processing loop
will still be responsive to cancellation requests.  Hopefully
upstream qemu will provide us a means of safely using unlimited
speed, including a runtime probe of that capability.

* src/qemu/qemu_migration.c (qemuMigrationToFile): Revert attempt
to use unlimited migration bandwidth when migrating to file.

Signed-off-by: Daniel Veillard <veillard@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2011-11-11 16:43:24 -07:00
Eric Blake
e55ec69de6 build: drop useless dirent.h includes
* .gnulib: Update to latest, for improved syntax-check.
* src/lxc/lxc_container.c (includes): Drop unused include.
* src/network/bridge_driver.c: Likewise.
* src/node_device/node_device_linux_sysfs.c: Likewise.
* src/openvz/openvz_driver.c: Likewise.
* src/qemu/qemu_conf.c: Likewise.
* src/storage/storage_backend_iscsi.c: Likewise.
* src/storage/storage_backend_mpath.c: Likewise.
* src/uml/uml_conf.c: Likewise.
* src/uml/uml_driver.c: Likewise.
2011-11-11 14:12:37 -07:00
Stefan Berger
c31d23a787 Remove code instantiating filters on direct interfaces
Remove the code that instantiates network filters on direct type
of interfaces. The parser already does not accept it.
2011-11-10 11:16:22 -05:00
Daniel P. Berrange
0eee075dc7 Adjust naming of network device bandwidth management APIs
Rename virBandwidth to virNetDevBandwidth, and virRate to
virNetDevBandwidthRate.

* src/util/network.c, src/util/network.h: Rename bandwidth
  structs and APIs
* src/conf/domain_conf.c, src/conf/domain_conf.h,
  src/conf/network_conf.c, src/conf/network_conf.h,
  src/lxc/lxc_driver.c, src/network/bridge_driver.c,
  src/qemu/qemu_command.c, src/util/macvtap.c,
  src/util/macvtap.h, tools/virsh.c: Update for API changes.
2011-11-09 17:10:28 +00:00
Daniel P. Berrange
4c544e6c61 Santize naming of socket address APIs
The socket address APIs in src/util/network.h either take the
form  virSocketAddrXXX, virSocketXXX or virSocketXXXAddr.

Sanitize this so everything is virSocketAddrXXXX, and ensure
that the virSocketAddr parameter is always the first one.

* src/util/network.c, src/util/network.h: Santize socket
  address API naming
* src/conf/domain_conf.c, src/conf/network_conf.c,
  src/conf/nwfilter_conf.c, src/network/bridge_driver.c,
  src/nwfilter/nwfilter_ebiptables_driver.c,
  src/nwfilter/nwfilter_learnipaddr.c,
  src/qemu/qemu_command.c, src/rpc/virnetsocket.c,
  src/util/dnsmasq.c, src/util/iptables.c,
  src/util/virnetdev.c, src/vbox/vbox_tmpl.c: Update for
  API renaming
2011-11-09 17:10:23 +00:00
Daniel P. Berrange
e49c9bf25c Split bridge.h into three separate files
Following the renaming of the bridge management APIs, we can now
split the source file into 3 corresponding pieces

 * src/util/virnetdev.c: APIs for any type of network interface
 * src/util/virnetdevbridge.c: APIs for bridge interfaces
 * src/util/virnetdevtap.c: APIs for TAP interfaces

* src/util/virnetdev.c, src/util/virnetdev.h,
  src/util/virnetdevbridge.c, src/util/virnetdevbridge.h,
  src/util/virnetdevtap.c, src/util/virnetdevtap.h: Copied
  from bridge.{c,h}
* src/util/bridge.c, src/util/bridge.h: Split into 3 pieces
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
  src/openvz/openvz_driver.c, src/qemu/qemu_command.c,
  src/qemu/qemu_conf.h, src/uml/uml_conf.c, src/uml/uml_conf.h,
  src/uml/uml_driver.c: Update #include directives
2011-11-09 16:34:25 +00:00
Daniel P. Berrange
dced27c89e Rename all brXXXX APIs to follow new convention
The existing brXXX APIs in src/util/bridge.h are renamed to
follow one of three different conventions

 - virNetDevXXX       - operations for any type of interface
 - virNetDevBridgeXXX - operations for bridge interfaces
 - virNetDevTapXXX    - operations for tap interfaces

* src/util/bridge.h, src/util/bridge.c: Rename all APIs
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
  src/qemu/qemu_command.c, src/uml/uml_conf.c,
  src/uml/uml_driver.c: Update for API renaming
2011-11-09 16:33:28 +00:00
Daniel P. Berrange
4f4fd8f7ad Make all brXXX APIs raise errors, instead of returning errnos
Currently every caller of the brXXX APIs has to store the returned
errno value and then raise an error message. This results in
inconsistent error messages across drivers, additional burden on
the callers and makes the error reporting inaccurate since it is
hard to distinguish different scenarios from 1 errno value.

* src/util/bridge.c: Raise errors instead of returning errnos
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
  src/qemu/qemu_command.c, src/uml/uml_conf.c,
  src/uml/uml_driver.c: Remove error reporting code
2011-11-09 16:33:19 +00:00
Daniel P. Berrange
6cfeb9a766 Remove 'brControl' object
The bridge management APIs in src/util/bridge.c require a brControl
object to be passed around. This holds the file descriptor for the
control socket. This extra object complicates use of the API for
only a minor efficiency gain, which is in turn entirely offset by
the need to fork/exec the brctl command for STP configuration.

This patch removes the 'brControl' object entirely, instead opening
the control socket & closing it again within the scope of each method.

The parameter names for the APIs are also made to consistently use
'brname' for bridge device name, and 'ifname' for an interface
device name. Finally annotations are added for non-NULL parameters
and return check validation

* src/util/bridge.c, src/util/bridge.h: Remove brControl object
  and update API parameter names & annotations.
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
  src/uml/uml_conf.h, src/uml/uml_conf.c, src/uml/uml_driver.c,
  src/qemu/qemu_command.c, src/qemu/qemu_conf.h,
  src/qemu/qemu_driver.c: Remove reference to 'brControl' object
2011-11-09 16:33:14 +00:00
Osier Yang
5ab243b64f qemu: Fix improper error message for disk detaching
s/virDomainDeviceTypeToString/virDomainDiskDeviceTypeToString/

Report by Xu He Jie <xuhj@linux.vnet.ibm.
2011-11-09 13:59:31 +08:00