Commit Graph

368 Commits

Author SHA1 Message Date
Laine Stump
8cd40e7e0d qemu: allocate network connections sooner during domain startup
VFIO device assignment requires a cgroup ACL to be setup for access to
the /dev/vfio/nn "group" device for any devices that will be assigned
to a guest. In the case of a host device that is allocated from a
pool, it was being allocated during qemuBuildCommandLine(), which is
called by qemuProcessStart() *after* the all-encompassing
qemuSetupCgroup() was called, meaning that the standard Cgroup ACL
setup wasn't creating ACLs for these devices allocated from pools.

One possible solution was to manually add a single ACL down inside
qemuBuildCommandLine() when networkAllocateActualDevice() is called,
but that has two problems: 1) the function that adds the cgroup ACL
requires a virDomainObjPtr, which isn't available in
qemuBuildCommandLine(), and 2) we really shouldn't be doing network
device setup inside qemuBuildCommandLine() anyway.

Instead, I've created a new function called
qemuNetworkPrepareDevices() which is called just before
qemuPrepareHostDevices() during qemuProcessStart() (explanation of
ordering in the comments), i.e. well before the call to
qemuSetupCgroup(). To minimize code churn in a patch that will be
backported to 1.0.5-maint, qemuNetworkPrepareDevices only does
networkAllocateActualDevice() and the bare amount of setup required
for type='hostdev network devices, but it eventually should do *all*
device setup for guest network devices.

Note that some of the code that was previously needed in
qemuBuildCommandLine() is no longer required when
networkAllocateActualDevice() is called earlier:

 * qemuAssignDeviceHostdevAlias() is already done further down in
   qemuProcessStart().

 * qemuPrepareHostdevPCIDevices() is called by
   qemuPrepareHostDevices() which is called after
   qemuNetworkPrepareDevices() in qemuProcessStart().

As hinted above, this new function should be moved into a separate
qemu_network.c (or similarly named) file along with
qemuPhysIfaceConnect(), qemuNetworkIfaceConnect(), and
qemuOpenVhostNet(), and expanded to call those functions as well, then
the nnets loop in qemuBuildCommandLine() should be reduced to only
build the commandline string (which itself can be in a separate
qemuInterfaceBuilldCommandLine() function as suggested by
Michal). However, this will require storing away an array of tapfd and
vhostfd that are needed for the commandline, so I would rather do that
in a separate patch and leave this patch at the minimum to fix the
bug.
2013-05-07 11:36:43 -04:00
Michal Privoznik
7c9a2d88cd virutil: Move string related functions to virstring.c
The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
2013-05-02 16:56:55 +02:00
Laine Stump
f6966b6277 qemu: fix failure to start with spice graphics and no tls
Commit eca3fdf inadvertantly caused a failure to start for any domain
with the following in its config:

    <graphics type='spice' autoport='yes'/>

The problem is that when tlsPort == 0 and defaultMode == "any" (which
is the default for defaultMode), this would be flagged in the code as
"needTLSPort", and if there was then no spice tls config, the new
error+fail would happen.

This patch checks for the case of defaultMode == "any", and in that
case simply doesn't allocate a TLS port (since that's probably not
what the user wanted, and it would have failed later anyway.). It does
leave the error in place for cases when the user specifically asked to
use tls in one way or another, though.
2013-04-30 18:20:53 -04:00
Peter Krempa
eca3fdf738 qemu: Error out if spice port autoallocation is requested, but disabled
When a user requests auto-allocation of the spice TLS port but spice TLS
is disabled in qemu.conf, we start the machine and let qemu fail instead
of erroring out sooner.

Add an error message so that this doesn't happen.
2013-04-30 09:43:12 +02:00
Laine Stump
7bdf459d2c qemu: use new virCommandSetMax(Processes|Files)
These were previously being set in a custom hook function, but now
that virCommand directly supports setting them, we can eliminate that
part of the hook and call the APIs directly.
2013-04-26 10:23:46 -04:00
Peter Krempa
7b4a630484 qemu: Do sensible auto allocation of SPICE port numbers
With this patch, if the autoport attribute is used, the code will
sensibly auto allocate the ports only if needed.
2013-04-24 14:37:20 +02:00
Daniel P. Berrange
abe038cfc0 Extend previous check to validate driver struct field names
Ensure that the driver struct field names match the public
API names. For an API virXXXX we must have a driver struct
field xXXXX. ie strip the leading 'vir' and lowercase any
leading uppercase letters.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-24 10:59:53 +01:00
Peter Krempa
23090823f1 qemu: Split out SPICE port allocation into a separate function
Later on this function will be used to do more sophisticated checks and
determination if port allocation is needed.
2013-04-23 21:30:56 +02:00
Jiri Denemark
6d1b3edc6e qemu: Ignore libvirt logs when reading QEMU error output
When QEMU fails to start, libvirt read its error output and reports it
back in an error message. However, when libvirtd is configured to log
debug messages, one would get the following unhelpful garbage:

    virsh # start cd
    error: Failed to start domain cd
    error: internal error process exited while connecting to monitor: \
      2013-04-22 14:24:54.214+0000: 2194219: debug : virFileClose:72 : \
      Closed fd 21
    2013-04-22 14:24:54.214+0000: 2194219: debug : virFileClose:72 : \
      Closed fd 27
    2013-04-22 14:24:54.215+0000: 2194219: debug : virFileClose:72 : \
      Closed fd 3
    2013-04-22 14:24:54.215+0000: 2194220: debug : virExec:602 : Run \
      hook 0x7feb8f600bf0 0x7feb86ef9300
    2013-04-22 14:24:54.215+0000: 2194220: debug : qemuProcessHook:2507 \
      : Obtaining domain lock
    2013-04-22 14:24:54.216+0000: 2194220: debug : \
      virDomainLockProcessStart:170 : plugin=0x7feb780261f0 \
      dom=0x7feb7802a360 paused=1 fd=0x7feb86ef8ec4
    2013-04-22 14:24:54.216+0000: 2194220: debug : \
      virDomainLockManagerNew:128 : plugin=0x7feb780261f0 \
      dom=0x7feb7802a360 withResources=1
    2013-04-22 14:24:54.216+0000: 2194220: debug : \
      virLockManagerPluginGetDriver:297 : plugin=0x7feb780261f0
    2013-04-22 14:24:54.216+0000: 2194220: debug : \
      virLockManagerNew:321 : driver=0x7feb8ef08640 type=0 nparams=5 \
      params=0x7feb86ef8d60 flags=0
    2013-04-22 14:24:54.216+000

instead of (the output with this patch applied):

    virsh # start cd
    error: Reconnected to the hypervisor
    error: Failed to start domain cd
    error: internal error process exited while connecting to monitor: \
      char device redirected to /dev/pts/33 (label charserial0)
    qemu-system-x86_64: -drive file=/home/vm/systemrescuecd-x86-1.2.0.\
      iso,if=none,id=drive-ide0-1-0,readonly=on,format=raw,cache=none: \
      could not open disk image /home/vm/systemrescuecd-x86-1.2.0.iso: \
      Permission denied
2013-04-22 20:13:40 +02:00
Jiri Denemark
e4bdba8d7f qemu: Move QEMU log reading into a separate function 2013-04-22 20:13:40 +02:00
Daniel P. Berrange
db44eb1b5f Change default cgroup layout for QEMU/LXC and honour XML config
Historically QEMU/LXC guests have been placed in a cgroup layout
that is

   $LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$VMNAME

This is bad for a number of reasons

 - The cgroup hierarchy gets very deep which seriously
   impacts kernel performance due to cgroups scalability
   limitations.

 - It is hard to setup cgroup policies which apply across
   services and virtual machines, since all VMs are underneath
   the libvirtd service.

To address this the default cgroup location is changed to
be

    /system/$VMNAME.{lxc,qemu}.libvirt

This puts virtual machines at the same level in the hierarchy
as system services, allowing consistent policy to be setup
across all of them.

This also honours the new resource partition location from the
XML configuration, for example

  <resource>
    <partition>/virtualmachines/production</partitions>
  </resource>

will result in the VM being placed at

    /virtualmachines/production/$VMNAME.{lxc,qemu}.libvirt

NB, with the exception of the default, /system, path which
is intended to always exist, libvirt will not attempt to
auto-create the partitions in the XML. It is the responsibility
of the admin/app to configure the partitions. Later libvirt
APIs will provide a way todo this.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
632f78caaf Store a virCgroupPtr instance in qemuDomainObjPrivatePtr
Instead of calling virCgroupForDomain every time we need
the virCgrouPtr instance, just do it once at Vm startup
and cache a reference to the object in qemuDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the QEMU driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Osier Yang
a9762b730b qemu: Support sgio setting for volume type disk 2013-04-08 19:10:12 +08:00
Osier Yang
60b78b33e1 qemu: Translate the pool disk source earlier
To support "shareable" for volume type disk, we have to translate
the source before trying to add the shared disk entry. To achieve
the goal, this moves the helper qemuTranslateDiskSourcePool into
src/qemu/qemu_conf.c, and introduce an internal only member (voltype)
for struct _virDomainDiskSourcePoolDef, to record the underlying
volume type for use when building the drive string.

Later patch will support "shareable" volume type disk.
2013-04-08 19:02:34 +08:00
Peter Krempa
e84b19316a maint: Rename xmlconf to xmlopt and virDomainXMLConfig to virDomainXMLOption
This patch is the result of running:

for i in $(git ls-files | grep -v html | grep -v \.po$ ); do
  sed -i -e "s/virDomainXMLConf/virDomainXMLOption/g" -e "s/xmlconf/xmlopt/g" $i
done

and a few manual tweaks.
2013-04-04 22:18:56 +02:00
Peter Krempa
a584eaa5ff qemu: Un-mark volume as mirrored/copied if blockjob copy fails
When the blockjob fails for some reason an event is emitted but the disk
wasn't unmarked as being part of a active block copy operation.
2013-03-21 12:32:03 +01:00
Michal Privoznik
cb86e9d39b qemu: s/VIR_ERR_NO_SUPPORT/VIR_ERR_OPERATION_UNSUPPORTED
The VIR_ERR_NO_SUPPORT error code is reserved for cases where an
API is not implemented in a driver. It definitely should not be
used when an API execution fails due to unsupported operation.
2013-03-21 09:26:15 +01:00
Gao feng
45e9d27ad8 NUMA: cleanup for numa related codes
Intend to reduce the redundant code,use virNumaSetupMemoryPolicy
to replace virLXCControllerSetupNUMAPolicy and
qemuProcessInitNumaMemoryPolicy.

This patch also moves the numa related codes to the
file virnuma.c and virnuma.h

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-03-20 19:37:00 +08:00
Gao feng
763edb5ebe rename qemuGetNumadAdvice to virNumaGetAutoPlacementAdvice
qemuGetNumadAdvice will be used by LXC driver, rename
it to virNumaGetAutoPlacementAdvice and move it to virnuma.c

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-03-19 15:55:40 -06:00
Jiri Denemark
ef3cd6473f qemu: Fix startupPolicy regression
Commit 82d5fe5437

    qemu: check backing chains even when cgroup is omitted

added backing file checks just before the code that removes optional
disks if they are not present. However, the backing chain code fails in
case the disk file does not exist, which makes qemuProcessStart fail
regardless on configured startupPolicy.

Note that startupPolicy implementation is still wrong after this patch
since it only check the first file in a possible chain. It should rather
check the complete backing chain. But this is an existing limitation
that can be solved later. After all, startupPolicy is most useful for
CDROM images and they won't make use of backing files in most cases.
2013-03-18 14:11:58 +01:00
Viktor Mihajlovski
608512b24a S390: QEMU driver support for CCW addresses
This commit adds the QEMU driver support for CCW addresses. The
current QEMU only allows virtio devices to be attached to the
CCW bus. We named the new capability indicating that support
QEMU_CAPS_VIRTIO_CCW accordingly.

The fact that CCW devices can only be assigned to domains with a
machine type of s390-ccw-virtio requires a few extra checks for
machine type in qemu_command.c on top of querying
QEMU_CAPS_VIRTIO_{CCW|S390}.

The majority of the new functions deals with CCW address generation
and management.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2013-03-13 17:14:38 -06:00
Peter Krempa
27cf98e2d1 virCaps: conf: start splitting out irrelevat data
The virCaps structure gathered a ton of irrelevant data over time that.
The original reason is that it was propagated to the XML parser
functions.

This patch aims to create a new data structure virDomainXMLConf that
will contain immutable data that are used by the XML parser. This will
allow two things we need:

1) Get rid of the stuff from virCaps

2) Allow us to add callbacks to check and add driver specific stuff
after domain XML is parsed.

This first attempt removes pointers to private data allocation functions
to this new structure and update all callers and function that require
them.
2013-03-13 09:27:14 +01:00
Daniel P. Berrange
82793a2a55 Convert QEMU driver to use virLogProbablyLogMessage
The current QEMU code for skipping log messages only skips over
'debug' message, switch to virLogProbablyLogMessage to make sure
it skips over all of them
2013-03-07 18:56:52 +00:00
Daniel P. Berrange
9c4ecb3e8e Revert hack for autodestroy in qemuProcessStop
This reverts the hack done in

commit 568a6cda27
Author: Jiri Denemark <jdenemar@redhat.com>
Date:   Fri Feb 15 15:11:47 2013 +0100

    qemu: Avoid deadlock in autodestroy

since we now have a fix which avoids the deadlock scenario
entirely
2013-03-01 10:18:27 +00:00
Daniel P. Berrange
7ccad0b16d Fix crash in QEMU auto-destroy with transient guests
When the auto-destroy callback runs it is supposed to return
NULL if the virDomainObjPtr is no longer valid. It was not
doing this for transient guests, so we tried to virObjectUnlock
a mutex which had been freed. This often led to a crash.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-01 10:16:29 +00:00
Daniel P. Berrange
d0b3ee55ec Fix typo in internal VIR_QEMU_PROCESS_START_AUTODESROY constant
s/VIR_QEMU_PROCESS_START_AUTODESROY/VIR_QEMU_PROCESS_START_AUTODESTROY/

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-02-27 22:51:24 +00:00
Paolo Bonzini
45dc3f1703 qemu: do not set unpriv_sgio if neither supported nor requested
Currently we call virSetDeviceUnprivSGIO with val == 0 if a block device
has an sgio attribute.  But for sgio='filtered', we know that a
kernel with no unpriv_sgio support will always behave as the user
wanted.  In this case, there is no need to call the function and
report a (bogus) error.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-02-26 13:46:52 +01:00
Michal Privoznik
86d90b3abd qemu_migration: Introduce qemuMigrationStartNBDServer()
We need to start NBD server and feed it with all non-<shared/>,
RW and source-full disks. Moreover, with new virPortAllocator we
must ensure the borrowed port for NBD server will be returned if
either migration completes or qemu process is torn down.
2013-02-23 08:25:09 +01:00
Eric Blake
82d5fe5437 qemu: check backing chains even when cgroup is omitted
https://bugzilla.redhat.com/show_bug.cgi?id=896685 points out
a regression caused by commit 38c4a9c - libvirt only labels
the backing chain if the backing chain cache is populated, but
the code to populate the cache was only conditionally performed
if cgroup labeling was necessary.

* src/qemu/qemu_cgroup.c (qemuSetupCgroup): Hoist cache setup...
* src/qemu/qemu_process.c (qemuProcessStart): ...earlier into
caller, where it is now unconditional.
2013-02-21 12:32:56 -07:00
Jiri Denemark
568a6cda27 qemu: Avoid deadlock in autodestroy
Since closeCallbacks were turned into virObjectLockable, we can no
longer call virQEMUCloseCallbacks APIs from within a registered close
callback.
2013-02-21 10:38:28 +01:00
Jiri Denemark
3898ba7f2c qemu: Turn closeCallbacks into virObjectLockable
To avoid having to hold the qemu driver lock while iterating through
close callbacks and calling them. This fixes a real deadlock when a
domain which is being migrated from another host gets autodestoyed as a
result of broken connection to the other host.
2013-02-21 10:27:24 +01:00
Osier Yang
d0e4b76204 qemu: Update shared disk table when reconnecting qemu process 2013-02-21 00:31:24 +08:00
Osier Yang
a4504ac184 qemu: Record names of domain which uses the shared disk in hash table
The hash entry is changed from "ref" to {ref, @domains}. With this, the
caller can simply call qemuRemoveSharedDisk, without afraid of removing
the entry belongs to other domains. qemuProcessStart will obviously
benifit from it on error codepath (which calls qemuProcessStop to do
the cleanup).
2013-02-21 00:31:24 +08:00
Osier Yang
371df778eb qemu: Merge qemuCheckSharedDisk into qemuAddSharedDisk
Based on moving various checking into qemuAddSharedDisk, this
avoids the caller using it in wrong ways. Also this adds two
new checking for qemuCheckSharedDisk (disk device not 'lun'
and kernel doesn't support unpriv_sgio simply returns 0).
2013-02-21 00:31:24 +08:00
Osier Yang
dab878a861 qemu: Add checking in helpers for sgio setting
This moves the various checking into the helpers, to avoid the
callers missing the checking.
2013-02-21 00:31:24 +08:00
Jiri Denemark
5d6f636764 qemu: Use atomic ops for driver->nactive 2013-02-19 19:11:23 +01:00
Laine Stump
0345c7281b qemu: let virCommand set child process security labels/uid/gid
The qemu driver had been calling virSecurityManagerSetProcessLabel()
from a "pre-exec hook" function that is run after the child is forked,
but before exec'ing qemu. This is problematic because the uid and gid
of the child are set by the security driver, but capabilities are
dropped by virCommand - such separation doesn't work; the two
operations must be done together or the capabilities do not transfer
properly to the child process.

This patch switches to using virSecurityManagerSetChildProcessLabel(),
which is called prior to virCommandRun() (rather than being called
*during* virCommandrun() by the hook function), and doesn't set the
UID/GID/security label directly, but instead merely informs virCommand
what it should set them all to when the time is appropriate.

This lets virCommand choose to do the uid/gid and caps dropping all at
the same time if it wants (it does *want* to, but isn't doing so yet;
that's for an upcoming patch).
2013-02-13 16:11:16 -05:00
Daniel P. Berrange
a9e97e0c30 Remove qemuDriverLock from almost everywhere
With the majority of fields in the virQEMUDriverPtr struct
now immutable or self-locking, there is no need for practically
any methods to be using the QEMU driver lock. Only a handful
of helper APIs in qemu_conf.c now need it
2013-02-13 11:10:30 +00:00
Daniel P. Berrange
61b52d2e38 Fix potential deadlock across fork() in QEMU driver
The hook scripts used by virCommand must be careful wrt
accessing any mutexes that may have been held by other
threads in the parent process. With the recent refactoring
there are 2 potential flaws lurking, which will become real
deadlock bugs once the global QEMU driver lock is removed.

Remove use of the QEMU driver lock from the hook function
by passing in the 'virQEMUDriverConfigPtr' instance directly.

Add functions to the virSecurityManager to be invoked before
and after fork, to ensure the mutex is held by the current
thread. This allows it to be safely used in the hook script
in the child process.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-02-12 11:05:31 +00:00
Daniel P. Berrange
8cdd5faf46 Pass virQEMUDriverPtr into APIs managed shared disk list
Currently the APIs for managing the shared disk list take
a virHashTablePtr as the primary argument. This is bad
because it requires the caller to deal with locking of
the QEMU driver. Switch the APIs to take the full
virQEMUDriverPtr instance

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-02-11 12:48:22 +00:00
Daniel P. Berrange
020a030786 Stop accessing driver->caps directly in QEMU driver
The 'driver->caps' pointer can be changed on the fly. Accessing
it currently requires the global driver lock. Isolate this
access in a single helper, so a future patch can relax the
locking constraints.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-02-08 11:49:16 +00:00
Daniel P. Berrange
32803ba409 Rename 'qemuCapsXXX' to 'virQEMUCapsXXX'
To avoid confusion between 'virCapsPtr' and 'qemuCapsPtr'
do some renaming of various fucntions/variables. All
instances of 'qemuCapsPtr' are renamed to 'qemuCaps'. To
avoid that clashing with the 'qemuCaps' typedef though,
rename the latter to virQEMUCaps.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-02-08 11:49:14 +00:00
Daniel P. Berrange
6ffcab65c9 Use atomic ops to increment nextvmid
Use atomic ops to increment nextvmid and encapsulate it in a
method to prevent accidental non-atomic access
2013-02-05 19:22:25 +00:00
Daniel P. Berrange
37abd47165 Turn virDomainObjList into an opaque virObject
As a step towards making virDomainObjList thread-safe turn it
into an opaque virObject, preventing any direct access to its
internals.

As part of this a new method virDomainObjListForEach is
introduced to replace all existing usage of virHashForEach
2013-02-05 15:49:25 +00:00
Daniel P. Berrange
4f6ed6c33a Rename all domain list APIs to have virDomainObjList prefix
The APIs names for accessing the domain list object are
very inconsistent. Rename them all to have a standard
virDomainObjList prefix.
2013-02-05 15:49:25 +00:00
Daniel P. Berrange
b090aa7d55 Introduce a virQEMUDriverConfigPtr object
Currently the virQEMUDriverPtr struct contains an wide variety
of data with varying access needs. Move all the static config
data into a dedicated virQEMUDriverConfigPtr object. The only
locking requirement is to hold the driver lock, while obtaining
an instance of virQEMUDriverConfigPtr. Once a reference is held
on the config object, it can be used completely lockless since
it is immutable.

NB, not all APIs correctly hold the driver lock while getting
a reference to the config object in this patch. This is safe
for now since the config is never updated on the fly. Later
patches will address this fully.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-02-05 15:49:25 +00:00
Peter Krempa
87b4c10c6c capabilities: Switch CPU data in NUMA topology to a struct
This will allow storing additional topology data in the NUMA topology
definition.

This patch changes the storage type and fixes fallout of the change
across the drivers using it.

This patch also changes semantics of adding new NUMA cell information.
Until now the data were re-allocated and copied to the topology
definition. This patch changes the addition function to steal the
pointer to a pre-allocated structure to simplify the code.
2013-01-24 10:53:00 +01:00
Michal Privoznik
d960d06fc0 qemu_agent: Ignore expected EOFs
https://bugzilla.redhat.com/show_bug.cgi?id=892079

One of my previous patches (f2a4e5f176) tried to fix crashing
libvirtd on domain detroy. However, we need to copy pattern from
qemuProcessHandleMonitorEOF() instead of decrementing reference
counter. The rationale for this is, if qemu process is dying due
to domain being destroyed, we obtain EOF on both the monitor and
agent sockets. However, if the exit is expected, qemuProcessStop
is called, which cleans both agent and monitor sockets up. We
want qemuAgentClose() to be called iff the EOF is not expected,
so we don't leak an FD and memory. Moreover, there could be race
with qemuProcessHandleMonitorEOF() which could have already
closed the agent socket, in which case we don't want to do
anything.
2013-01-23 15:35:44 +01:00
Daniel P. Berrange
dfb1022c72 Convert QEMU driver over to use virPortAllocator APIs
Replace the current QEMU driver code for managing port
reservations with the new virPortAllocator APIs.
2013-01-16 11:02:58 +00:00
Daniel P. Berrange
325b02b5a3 Convert virDomainObj, qemuAgent, qemuMonitor, lxcMonitor to virObjectLockable
The  virDomainObj, qemuAgent, qemuMonitor, lxcMonitor classes
all require a mutex, so can be switched to use virObjectLockable

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-16 11:02:58 +00:00
Daniel P. Berrange
6f736c83e5 Convert HAVE_NUMACTL to WITH_NUMACTL
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-14 13:25:06 +00:00
Michal Privoznik
f2a4e5f176 qemu_agent: Remove agent reference only when disposing it
https://bugzilla.redhat.com/show_bug.cgi?id=892079

With current code, if user calls virDomainPMSuspendForDuration()
followed by virDomainDestroy(), the former API checks for qemu agent
presence, which will evaluate as true (if agent is configured). While
talking to qemu agent, the qemu driver is unlocked, so the latter API
starts executing.  However, if machine dies meanwhile, libvirtd gets
EOF on the agent socket and qemuProcessHandleAgentEOF() is called. The
handler clears reference to qemu agent while the destroy API already
holding a reference to it. This leads to NULL dereferencing later in
the code. Therefore, the agent pointer should be set to NULL only if
we are the exclusive owner of it.
2013-01-10 10:32:54 +01:00
Andres Lagar-Cavilla
aedfcce33e Add RESUME event listener to qemu monitor.
Perform all the appropriate plumbing.

When qemu/KVM VMs are paused manually through a monitor not-owned by libvirt,
libvirt will think of them as "paused" event after they are resumed and
effectively running. With this patch the discrepancy goes away.

This is meant to address bug 892791.

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
2013-01-09 10:17:40 +01:00
Osier Yang
1279e421b2 qemu: Check if the shared disk's cdbfilter conflicts with others
This prevents domain starting and disk attaching if the shared disk's
setting conflicts with other active domain(s), E.g. A domain with
"sgio" set as "filtered", however, another active domain is using
it set as "unfiltered".
2013-01-07 21:39:20 +08:00
Osier Yang
278f87c4b5 qemu: set unpriv_sgio when starting domain and attaching disk
This ignores the default "filtered" if unpriv_sgio is not supported
by kernel, but for explicit request "filtered", it error out for
domain starting.
2013-01-07 21:39:06 +08:00
Osier Yang
d7ead3e19a qemu: Add a hash table for the shared disks
This introduces a hash table for qemu driver, to store the shared
disk's info as (@major:minor, @ref_count). @ref_count is the number
of domains which shares the disk.

Since we only care about if the disk support unprivileged SG_IO
commands, and the SG_IO commands only make sense for block disk,
this patch only manages (add/remove hash entry) the shared disk for
block disk.

* src/qemu/qemu_conf.h: (Add member 'sharedDisks' of type
                         virHashTablePtr; Declare helpers
                         qemuGetSharedDiskKey, qemuAddSharedDisk
                         and qemuRemoveSharedDisk)
* src/qemu/qemu_conf.c (Implement the 3 helpers)
* src/qemu/qemu_process.c (Update 'sharedDisks' when domain
                           starting and shutdown)
* src/qemu/qemu_driver.c (Update 'sharedDisks' when attaching
                          or detaching disk).
2013-01-07 21:35:19 +08:00
Ján Tomko
b7a443fcbb qemu: fix a segfault in qemuProcessWaitForMonitor
Commit b3f2b4ca5c left buf unallocated in
the case of QMP capability probing being used, leading to a segfault in
strlen in the cleanup path.

This patch opens the log and allocates the buffer if QMP probing was
used, so we can display the helpful error message.
2013-01-04 11:00:43 +01:00
Michal Privoznik
b3f2b4ca5c qemu: Don't parse log output when starting up a domain
Despite our great effort we still parsed qemu log output.
We wouldn't notice unless upcoming qemu 1.4 changed the
format of the logs slightly. Anyway, now we should gather
all interesting knobs like pty paths from monitor. Moreover,
since for historical reasons the first console can be just
an alias to the first serial port, we need to check this and
copy the pty path if that's the case to the first console.
2013-01-03 09:56:51 +01:00
Michal Privoznik
fe915278c1 Revert "qemu: Adapt to new log format"
This reverts commit 28224c4d2a
which shouldn't be needed at all because with current qemu
we obtain all paths from 'query-chardev' output. We ought
not parse log output at all anymore.
2013-01-02 11:52:18 +01:00
Michal Privoznik
28224c4d2a qemu: Adapt to new log format
Since 586502189edf9fd0f89a83de96717a2ea826fdb0 qemu commit, the log
lines reporting chardev's path has changed from:

$ ./x86_64-softmmu/qemu-system-x86_64 -serial pty -serial pty -monitor pty
char device redirected to /dev/pts/5
char device redirected to /dev/pts/6
char device redirected to /dev/pts/7

to:

$ ./x86_64-softmmu/qemu-system-x86_64 -serial pty -serial pty -monitor pty
char device compat_monitor0 redirected to /dev/pts/5
char device serial0 redirected to /dev/pts/6
char device serial1 redirected to /dev/pts/7

However, with current code we are not prepared for such change, which
results in us being unable to start any domain.
2012-12-30 12:12:21 +01:00
Daniel P. Berrange
f24404a324 Rename virterror.c virterror_internal.h to virerror.{c,h} 2012-12-21 11:19:50 +00:00
Daniel P. Berrange
e861b31275 Rename uuid.{c,h} to viruuid.{c,h} 2012-12-21 11:19:49 +00:00
Daniel P. Berrange
44f6ae27fe Rename util.{c,h} to virutil.{c,h} 2012-12-21 11:19:49 +00:00
Daniel P. Berrange
f56c773bf8 Merge processinfo.{c,h} into virprocess.{c,h} 2012-12-21 11:19:45 +00:00
Daniel P. Berrange
ab9b7ec2f6 Rename memory.{c,h} to viralloc.{c,h} 2012-12-21 11:17:14 +00:00
Daniel P. Berrange
936d95d347 Rename logging.{c,h} to virlog.{c,h} 2012-12-21 11:17:14 +00:00
Daniel P. Berrange
30f3a005ff Rename hooks.{c,h} to virhook.{c,h} 2012-12-21 11:17:13 +00:00
Daniel P. Berrange
a27e4fbb72 Rename bitmap.{c,h} to virbitmap.{c,h}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-21 11:17:12 +00:00
Martin Kletzander
b72c97e732 fix typo in the word affinities
This patch fixes just the word Affinites to Affinities (it's really
painful to search in TAGS without being able to find the right
function).
2012-12-19 02:17:38 +01:00
Roman Bogorodskiy
9a2f36ec04 Qemu FreeBSD: fix compilation
* Autotools changes:
  - Don't assume Qemu is Linux-only
  - Check Linux headers only on Linux
  - Disable firewalld on FreeBSD
* Initctl:
  Initctl seem to present only on Linux, so stub it on other platforms
* Raw I/O: Linux-only as well
* Headers cleanup
2012-12-12 11:59:53 -07:00
Serge Hallyn
88bd1a644b add security hook for permitting hugetlbfs access
When a qemu domain is backed by huge pages, apparmor needs to grant the domain
rw access to files under the hugetlbfs mount point.  Add a hook, called in
qemu_process.c, which ends up adding the read-write access through
virt-aa-helper.  Qemu will be creating a randomly named file under the
mountpoint and unlinking it as soon as it has mmap()d it, therefore we
cannot predict the full pathname, but for the same reason it is generally
safe to provide access to $path/**.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2012-12-11 14:27:20 -07:00
Daniel P. Berrange
79b8a56995 Replace polling for active VMs with signalling by drivers
Currently to deal with auto-shutdown libvirtd must periodically
poll all stateful drivers. Thus sucks because it requires
acquiring both the driver lock and locks on every single virtual
machine. Instead pass in a "inhibit" callback to virStateInitialize
which drivers can invoke whenever they want to inhibit shutdown
due to existance of active VMs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-04 12:14:04 +00:00
Daniel P. Berrange
4738c2a7e7 Replace 'struct qemud_driver *' with virQEMUDriverPtr
Remove the obsolete 'qemud' naming prefix and underscore
based type name. Introduce virQEMUDriverPtr as the replacement,
in common with LXC driver naming style
2012-11-28 18:17:25 +00:00
Daniel P. Berrange
7492276317 s/qemud/qemu/ in QEMU driver sources
Change some legacy function names to use 'qemu' as their
prefix instead of 'qemud' which was a hang over from when
the QEMU driver ran inside a separate daemon

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 19:36:36 +00:00
Eric Blake
7e5aa78d0f build: avoid C99 for loop
Although we require various C99 features, we don't yet require a
complete C99 compiler.  On RHEL 5, compilation complained:

qemu/qemu_command.c: In function 'qemuBuildGraphicsCommandLine':
qemu/qemu_command.c:4688: error: 'for' loop initial declaration used outside C99 mode

* src/qemu/qemu_command.c (qemuBuildGraphicsCommandLine): Declare
variable sooner.
* src/qemu/qemu_process.c (qemuProcessInitPasswords): Likewise.
2012-11-26 15:28:25 -07:00
Alon Levy
23e8b5d8e7 qemu: refactor graphics code to not hardcode a single display
The check for a single display remains so no new functionality is added.
2012-11-20 19:57:39 +01:00
Viktor Mihajlovski
a2b3d7cff8 qemu, lxc: Change host CPU number detection logic.
The drivers for QEMU and LXC use virNodeGetInfo only to determine
the number of host CPUs. On Linux hosts nodeGetCPUCount has less
overhead.
2012-11-15 08:48:19 -07:00
Peter Krempa
02cf57c0d0 qemu: Fix domain ID numbering race condition
When the libvirt daemon is restarted it tries to reconnect to running
qemu domains. Since commit d38897a5d4 the
re-connection code runs in separate threads. In the original
implementation the maximum of domain ID's (that is used as an
initializer for numbering guests created next) while libvirt was
reconnecting to the guest.

With the threaded implementation this opens a possibility for race
conditions with the thread that is autostarting guests. When there's a
guest running with id 1 and the daemon is restarted. The autostart code
is reached first and spawns the first guest that should be autostarted
as id 1. This results into the following unwanted situation:

 # virsh list
   Id    Name                           State
  ----------------------------------------------------
   1     guest1                         running
   1     guest2                         running

This patch extracts the detection code before the re-connection threads
are started so that the maximum id of the guests being reconnected to is
known.

The only semantic change created by this is if the guest with greatest ID
quits before we are able to reconnect it's ID is used anyway as the
greatest one as without this patch the greatest ID of a process we could
successfuly reconnect to would be used.
2012-11-09 00:12:38 +01:00
Peter Krempa
2a59a3d597 snapshot: qemu: Add async job type for snapshots
The new external system checkpoints will require an async job while the
snapshot is taken. This patch adds QEMU_ASYNC_JOB_SNAPSHOT to track this
job type.
2012-11-03 14:57:43 +01:00
Eric Blake
dd0a7040f7 build: typo fix for qemu cpu affinity
Introduced in commit 0039a32f.

* src/qemu/qemu_process.c (qemuPrepareCpumap): s/covert/convert/
2012-10-27 08:09:51 -06:00
Eric Blake
edecd45c78 blockjob: return appropriate event and info
Handle the new type of block copy event and info.  Of course,
this patch does nothing until a later patch actually allows the
creation/abort of a block copy job.

* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_JOB_READY): New
block job status.
* src/libvirt.c (virDomainBlockRebase): Document the event.
* src/qemu/qemu_monitor_json.c (eventHandlers): New event.
(qemuMonitorJSONHandleBlockJobReady): New function.
(qemuMonitorJSONGetBlockJobInfoOne): Translate new job type.
(qemuMonitorJSONHandleBlockJobImpl): Handle new event and job type.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Recognize
the event to minimize snooping.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Snoop a successful
info query to save effort on a pivot request.
2012-10-27 07:43:38 -06:00
Osier Yang
bb81021bfe qemu: Keep the affinity when creating cgroup for emulator thread
When the cpu placement model is "auto", it sets the affinity for
domain process with the advisory nodeset from numad, however,
creating cgroup for the domain process (called emulator thread
in some contexts) later overrides that with pinning it to all
available pCPUs.

How to reproduce:

  * Configure the domain with "auto" placement for <vcpu>, e.g.
    <vcpu placement='auto'>4</vcpu>
  * % virsh start dom
  * % cat /proc/$dompid/status

Though the emulator cgroup cause conflicts, but we can't simply
prohibit creating it, as other tunables are still useful, such
as "emulator_period", which is used by API
virDomainSetSchedulerParameter. So this patch doesn't prohibit
creating the emulator cgroup, but inherit the nodeset from numad,
and reset the affinity for domain process.

* src/qemu/qemu_cgroup.h: Modify definition of qemuSetupCgroupForEmulator
                          to accept the passed nodenet
* src/qemu/qemu_cgroup.c: Set the affinity with the passed nodeset
2012-10-24 21:46:24 +08:00
Osier Yang
0039a32fca qemu: Add helper to prepare cpumap for affinity setting
Abstract the codes to prepare cpumap into a helper a function,
which can be used later.

* src/qemu/qemu_process.h: Declare qemuPrepareCpumap
* src/qemu/qemu_process.c: Implement qemuPrepareCpumap, and use it.
2012-10-24 21:24:10 +08:00
Osier Yang
b0f1ba47dd qemu: Fix the unused parameter which causes the build failure 2012-10-22 15:51:13 +08:00
Osier Yang
5828080f71 qemu: Cleanup the unused 'nodeinfo'
"nodeinfo" is not used in these two functions, and it's waste
of goto in qemuProcessSetEmulatorAffinites
2012-10-22 15:12:57 +08:00
Eric Blake
3f38c7e3a9 blockjob: manage qemu block-commit monitor command
qemu 1.3 will be adding a 'block-commit' monitor command, per
qemu.git commit ed61fc1.  It matches nicely to the libvirt API
virDomainBlockCommit.

* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCK_COMMIT): New bit.
* src/qemu/qemu_capabilities.c (qemuCapsProbeQMPCommands): Set it.
* src/qemu/qemu_monitor.h (qemuMonitorBlockCommit): New prototype.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockCommit): Implement it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise.
(qemuMonitorJSONHandleBlockJobImpl)
(qemuMonitorJSONGetBlockJobInfoOne): Handle new event type.
2012-10-19 17:35:11 -06:00
Eric Blake
4d34c92947 storage: cache backing chain while qemu domain is live
Technically, we should not be re-probing any file that qemu might
be currently writing to.  As such, we should cache the backing
file chain prior to starting qemu.  This patch adds the cache,
but does not use it until the next patch.

Ultimately, we want to also store the chain in domain XML, so that
it is remembered across libvirtd restarts, and so that the only
kosher way to modify the backing chain of an offline domain will be
through libvirt API calls, but we aren't there yet.  So for now, we
merely invalidate the cache any time we do a live operation that
alters the chain (block-pull, block-commit, external disk snapshot),
as well as tear down the cache when the domain is not running.

* src/conf/domain_conf.h (_virDomainDiskDef): New field.
* src/conf/domain_conf.c (virDomainDiskDefFree): Clean new field.
* src/qemu/qemu_domain.h (qemuDomainDetermineDiskChain): New
prototype.
* src/qemu/qemu_domain.c (qemuDomainDetermineDiskChain): New
function.
* src/qemu/qemu_driver.c (qemuDomainAttachDeviceDiskLive)
(qemuDomainChangeDiskMediaLive): Pre-populate chain.
(qemuDomainSnapshotCreateSingleDiskActive): Uncache chain before
snapshot.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Update
chain after block pull.
2012-10-19 17:35:10 -06:00
Guido Günther
a605594f8e qemu: Don't fail without emulatorpin or cpumask
This unbreaks qemu:///session that got broken by
ba63d8f7d8.
2012-10-19 01:25:19 +02:00
Martin Kletzander
ba63d8f7d8 qemu: Pin the emulator when only cpuset is specified
According to our recent changes (clarifications), we should be pinning
qemu's emulator processes using the <vcpu> 'cpuset' attribute in case
there is no <emulatorpin> specified.  This however doesn't work
entirely as expected and this patch should resolve all the remaining
issues.
2012-10-17 17:37:10 +02:00
Martin Kletzander
7ba5defb5a Add support for SUSPEND_DISK event
This patch adds support for SUSPEND_DISK event; both lifecycle and
separated.  The support is added for QEMU, machines are changed to
PMSUSPENDED, but as QEMU sends SHUTDOWN afterwards, the state changes
to shut-off.  This and much more needs to be done in order for libvirt
to work with transient devices, wake-ups etc.  This patch is not
aiming for that functionality.
2012-10-15 12:09:10 +02:00
Osier Yang
3635b41e15 qemu: Ignore def->cpumask if emulatorpin is specified
If the vcpu placement is "static", it's just fine to ignore the
def->cpumask if emulatorpin is specified.
2012-10-15 12:20:37 +08:00
Ján Tomko
149c87b49d Various typos and misspellings 2012-10-12 00:03:43 +02:00
Jiri Denemark
28f8dfdccc Add MIGRATABLE flag for virDomainGetXMLDesc
Using VIR_DOMAIN_XML_MIGRATABLE flag, one can request domain's XML
configuration that is suitable for migration or save/restore. Such XML
may contain extra run-time stuff internal to libvirt and some default
configuration may be removed for better compatibility of the XML with
older libvirt releases.

This flag may serve as an easy way to get the XML that can be passed
(after desired modifications) to APIs that accept custom XMLs, such as
virDomainMigrate{,ToURI}2 or virDomainSaveFlags.
2012-10-11 15:11:42 +02:00
Jiri Denemark
edc9269a2a qemu: Implement startupPolicy for USB passed through devices 2012-10-11 15:11:42 +02:00
Jiri Denemark
d236f3fc38 locking: Pass hypervisor driver name when acquiring locks
This is required in case a lock manager needs to contact libvirtd in
case of an unexpected event.
2012-10-11 14:41:42 +02:00
Daniel P. Berrange
1b21351b93 Move command/event capabilities detection out of QEMU monitor code
The qemuMonitorSetCapabilities() API is used to initialize the QMP
protocol capabilities. It has since been abused to initialize some
libvirt internal capabilities based on command/event existance too.
Move the latter code out into qemuCapsProbeQMP() in the QEMU
capabilities source file instead

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-27 11:06:04 +01:00
Daniel P. Berrange
15ee6614f7 Remove probing of flags when launching QEMU guests
Remove all use of the existing APIs for querying QEMU
capability flags. Instead obtain a qemuCapsPtr object
from the global cache. This avoids the execution of
'qemu -help' (and related commands) when launching new
guests.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-27 10:24:52 +01:00
Daniel P. Berrange
362d04779c Fix potential deadlock when agent is closed
If the qemuAgentClose method is called from a place which holds
the domain lock, it is theoretically possible to get a deadlock
in the agent destroy callback. This has not been observed, but
the equivalent code in the QEMU monitor destroy callback has seen
a deadlock.

Remove the redundant locking while unrefing the object and the
bogus assignment

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-27 10:11:44 +01:00
Daniel P. Berrange
25f582e36a Fix (rare) deadlock in QEMU monitor callbacks
Some users report (very rarely) seeing a deadlock in the QEMU
monitor callbacks

 Thread 10 (Thread 0x7fcd11e20700 (LWP 26753)):
 #0  0x00000030d0e0de4d in __lll_lock_wait () from /lib64/libpthread.so.0
 #1  0x00000030d0e09ca6 in _L_lock_840 () from /lib64/libpthread.so.0
 #2  0x00000030d0e09ba8 in pthread_mutex_lock () from /lib64/libpthread.so.0
 #3  0x00007fcd162f416d in virMutexLock (m=<optimized out>)
     at util/threads-pthread.c:85
 #4  0x00007fcd1632c651 in virDomainObjLock (obj=<optimized out>)
     at conf/domain_conf.c:14256
 #5  0x00007fcd0daf05cc in qemuProcessHandleMonitorDestroy (mon=0x7fcccc0029e0,
     vm=0x7fcccc00a850) at qemu/qemu_process.c:1026
 #6  0x00007fcd0db01710 in qemuMonitorDispose (obj=0x7fcccc0029e0)
     at qemu/qemu_monitor.c:249
 #7  0x00007fcd162fd4e3 in virObjectUnref (anyobj=<optimized out>)
     at util/virobject.c:139
 #8  0x00007fcd0db027a9 in qemuMonitorClose (mon=<optimized out>)
     at qemu/qemu_monitor.c:860
 #9  0x00007fcd0daf61ad in qemuProcessStop (driver=driver@entry=0x7fcd04079d50,
     vm=vm@entry=0x7fcccc00a850,
     reason=reason@entry=VIR_DOMAIN_SHUTOFF_DESTROYED, flags=flags@entry=0)
     at qemu/qemu_process.c:4057
 #10 0x00007fcd0db323cf in qemuDomainDestroyFlags (dom=<optimized out>,
     flags=<optimized out>) at qemu/qemu_driver.c:1977
 #11 0x00007fcd1637ff51 in virDomainDestroyFlags (
     domain=domain@entry=0x7fccf00c1830, flags=1) at libvirt.c:2256

At frame #10 we are holding the domain lock, we call into
qemuProcessStop() to cleanup QEMU, which triggers the monitor
to close, which invokes qemuProcessHandleMonitorDestroy() which
tries to obtain the domain lock again. This is a non-recursive
lock, hence hang.

Since qemuMonitorPtr is a virObject, the unref call in
qemuProcessHandleMonitorDestroy no longer needs mutex
protection. The assignment of priv->mon = NULL, can be
instead done by the caller of qemuMonitorClose(), thus
removing all need for locking.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-27 10:11:44 +01:00
Daniel P. Berrange
0b62c0736a Don't skip over socket label cleanup
If QEMU quits immediately after we opened the monitor it was
possible we would skip the clearing of the SELinux process
socket context

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-09-27 10:11:44 +01:00