Commit Graph

7987 Commits

Author SHA1 Message Date
Eric Blake
246143b69f build: fix type-punning bug
With older gcc and 64-bit size_t, the compiler issues a real warning:
rpc/virnetserverservice.c:277: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]

Introduced in commit 0cc79255.  Depending on machine endianness,
this warning represents a real bug that could mis-interpret the
value by a factor of 2^32.  I don't know why I couldn't get newer
gcc to report the same warning message.

* src/rpc/virnetserverservice.c
(virNetServerServiceNewPostExecRestart): Use temporary instead.
2012-10-26 13:00:27 -06:00
Laine Stump
73ebd86d73 parallels: fix build for some older compilers
Found this when building on RHEL5:

parallels/parallels_storage.c: In function 'parallelsStorageOpen':
parallels/parallels_storage.c:180: error: 'for' loop initial declaration used outside C99 mode

(and similar error in parallels_driver.c). This was in spite of
configuring with "-Wno-error".
2012-10-26 13:23:56 -04:00
Cole Robinson
eba36a3878 daemon: Fix LIBVIRT_DEBUG=1 default output
This commit changes the behavior of LIBVIRT_DEBUG=1 libvirtd:

$ git show 7022b09111
commit 7022b09111
Author: Daniel P. Berrange <berrange@redhat.com>
Date:   Thu Sep 27 13:13:09 2012 +0100

    Automatically enable systemd journal logging

    Probe to see if the systemd journal is accessible, and if
    so enable logging to the journal by default, rather than
    stderr (current default under systemd).

Previously  'LIBVIRT_DEBUG=1 /usr/sbin/libvirtd' would show all debug
output to stderr, now it send debug output to the journal.

Only use the journal by default if running in daemon mode, or
if stdin is _not_ a tty. This should make libvirtd launched from
systemd use the journal, but preserve the old behavior in most
situations.
2012-10-25 16:46:23 -04:00
Laine Stump
d8aae15aa1 network: fix networkValidate check for default portgroup and vlan
This was found during testing of the fix for:

   https://bugzilla.redhat.com/show_bug.cgi?id=868483

networkValidate was supposed to check for the existence of multiple
portgroups and report an error if this was encountered. It did, but
there were two problems:

1) even though it logged an error, it still returned success, allowing
the operation to continue.

2) It could exit the portgroup checking loop early (or possibly not
even do it once) if a vlan tag was supplied in the base network config
or one of the portgroups.

This patch fixes networkValidate to return failure in addition to
logging the error, and also changes it to not exit the portgroup
checking loop early. The logic was a bit off in the checking for vlan
anyway, and it's intertwined with fixing the early loop exit, so I
fixed that as well. Now it correctly checks for combinations where a
<virtualport> is specified in the base network def and <vlan> is given
in a portgroup, as well as the opposite (<vlan> in base network def
and <virtualport> in portgroup), and ignores the case of a disallowed
vlan when using *no* portgroup if there is a default portgroup (since
in that case there is no way to not use any portgroup).
2012-10-25 16:32:04 -04:00
Viktor Mihajlovski
e3ba67037b virNodeGetCPUMap: Implement driver support
Driver support added for:
- test: pretending 8 host CPUS, 3 being online
- qemu, lxc, openvz, uml: using nodeGetCPUMap

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-10-25 11:20:15 -06:00
Viktor Mihajlovski
d34439c9e4 virNodeGetCPUMap: Implement support function in nodeinfo
Added an implemention of virNodeGetCPUMap to nodeinfo.c,
(nodeGetCPUMap) which can be used by all drivers for a Linux
hypervisor host.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-10-25 11:20:08 -06:00
Eric Blake
2f4c5338a6 nodeinfo: improve probing node cpu bitmap
Callers should not need to know what the name of the file to
be read in the Linux-specific version of nodeGetCPUmap;
furthermore, qemu cares about online cpus, not present cpus,
when determining which cpus to skip.

While at it, I fixed the fact that we were computing the maximum
online cpu id by doing a slow iteration, when what we really want
to know is the max available cpu.

* src/nodeinfo.h (nodeGetCPUmap): Rename...
(nodeGetCPUBitmap): ...and simplify signature.
* src/nodeinfo.c (linuxParseCPUmax): New function.
(linuxParseCPUmap): Simplify and alter signature.
(nodeGetCPUBitmap): Change implementation.
* src/libvirt_private.syms (nodeinfo.h): Reflect rename.
* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Update
caller.
2012-10-25 11:20:08 -06:00
Eric Blake
0711c4b74d bitmap: add virBitmapCountBits
Sometimes it's handy to know how many bits are set.

* src/util/bitmap.h (virBitmapCountBits): New prototype.
(virBitmapNextSetBit): Use correct type.
* src/util/bitmap.c (virBitmapNextSetBit): Likewise.
(virBitmapSetAll): Maintain invariant of clear tail bits.
(virBitmapCountBits): New function.
* src/libvirt_private.syms (bitmap.h): Export it.
* tests/virbitmaptest.c (test2): Test it.
2012-10-25 11:19:23 -06:00
Jiri Denemark
0111b409a3 Fix build with apparmor
Recent storage patches changed signature of virStorageFileGetMetadata
and replaced chain with backingChain in virDomainDiskDef.
2012-10-25 10:21:57 +02:00
Matthias Bolte
1e7cd39511 esx: Update version checks for vSphere 5.1
Also remove warnings for upcoming versions. There hadn't been any
compatibility problems with new ESX version over the whole lifetime
of the ESX driver, so I don't expect any in the future.

Update documentation to mention vSphere 5.x support.
2012-10-24 19:50:28 +02:00
Peter Krempa
012f9b19ef cpu: Add recently added cpu feature flags.
Qemu has added some new feature flags. This patch adds them to libvirt.

The new features are for the cpuid function 0x7 that takes an argument
in the ecx register. Currently only 0x0 is used as the argument so I was
lazy and I just clear the registers to 0 before calling cpuid. In future
when there maybe will be some other possible arguments, we will need to
improve the cpu detection code to take this into account.
2012-10-24 17:36:03 +02:00
Osier Yang
a6bd7c22ea qemu: Prohibit chaning affinity of domain process if placement is 'auto'
On one hand, numad probably will manage the affinity of domain process
dynamically in future. On the other hand, even numad won't manage it,
it still could confusion. Let's make things simpler enough to avoid
the lair for now.
2012-10-24 22:26:11 +08:00
Osier Yang
bb81021bfe qemu: Keep the affinity when creating cgroup for emulator thread
When the cpu placement model is "auto", it sets the affinity for
domain process with the advisory nodeset from numad, however,
creating cgroup for the domain process (called emulator thread
in some contexts) later overrides that with pinning it to all
available pCPUs.

How to reproduce:

  * Configure the domain with "auto" placement for <vcpu>, e.g.
    <vcpu placement='auto'>4</vcpu>
  * % virsh start dom
  * % cat /proc/$dompid/status

Though the emulator cgroup cause conflicts, but we can't simply
prohibit creating it, as other tunables are still useful, such
as "emulator_period", which is used by API
virDomainSetSchedulerParameter. So this patch doesn't prohibit
creating the emulator cgroup, but inherit the nodeset from numad,
and reset the affinity for domain process.

* src/qemu/qemu_cgroup.h: Modify definition of qemuSetupCgroupForEmulator
                          to accept the passed nodenet
* src/qemu/qemu_cgroup.c: Set the affinity with the passed nodeset
2012-10-24 21:46:24 +08:00
Osier Yang
0039a32fca qemu: Add helper to prepare cpumap for affinity setting
Abstract the codes to prepare cpumap into a helper a function,
which can be used later.

* src/qemu/qemu_process.h: Declare qemuPrepareCpumap
* src/qemu/qemu_process.c: Implement qemuPrepareCpumap, and use it.
2012-10-24 21:24:10 +08:00
Viktor Mihajlovski
d804d35fac virNodeGetCPUMap: Implement wire protocol.
- Defined the wire protocol format for virNodeGetCPUMap and its
  arguments
- Implemented remote method invocation (remoteNodeGetCPUMap)
- Implemented method dispatcher (remoteDispatchNodeGetCPUMap)

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2012-10-23 18:46:48 -06:00
Viktor Mihajlovski
7ecc1d814a virNodeGetCPUMap: Define public API.
Adding a new API to obtain information about the
host node's present, online and offline CPUs.

int virNodeGetCPUMap(virConnectPtr conn,
                     unsigned char **cpumap,
                     unsigned int *online,
                     unsigned int flags);

The function will return the number of CPUs present on the host
or -1 on failure;
If cpumap is non-NULL virNodeGetCPUMap will allocate an array
containing a bit map representation of the online CPUs. It's
the callers responsibility to deallocate cpumap using free().
If online is non-NULL, the variable pointed to will contain
the number of online host node CPUs.
The variable flags has been added to support future extensions
and must be set to 0.

Extend the driver structure by nodeGetCPUMap entry in support of the
new API virNodeGetCPUMap.
Added implementation of virNodeGetCPUMap to libvirt.c

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2012-10-23 18:46:47 -06:00
Kyle Mestery
2f3e2c0c43 qemu_migration: Transport OVS per-port data during live migration
Transport Open vSwitch per-port data during live
migration by using the utility functions
virNetDevOpenvswitchGetMigrateData() and
virNetDevOpenvswitchSetMigrateData().

Signed-off-by: Kyle Mestery <kmestery@cisco.com>
2012-10-23 15:26:04 -04:00
Kyle Mestery
f6a2f97eb9 openvswitch: Add utility functions for getting and setting Open vSwitch per-port data
Add utility functions for Open vSwitch to both save
per-port data before a live migration, and restore the
per-port data after a live migration.

Signed-off-by: Kyle Mestery <kmestery@cisco.com>
2012-10-23 15:26:04 -04:00
Kyle Mestery
694d0c520b qemu_migration: Add hooks to transport network data during migration
Add the ability for the Qemu V3 migration protocol to
include transporting network configuration. A generic
framework is proposed with this patch to allow for the
transfer of opaque data.

Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Laine Stump <laine@laine.org>
2012-10-23 15:26:04 -04:00
Jim Fehlig
9785f2b6f2 Fix detection of Xen sysctl version 9
In commit 371ddc98, I mistakenly added the check for sysctl
version 9 after setting the hypercall version to 1, which will
fail with

error : xenHypervisorDoV1Op:967 : Unable to issue hypervisor
ioctl 3166208: Function not implemented

This check should be included along with the others that use
hypercall version 2.
2012-10-23 11:18:20 -06:00
Cole Robinson
767be8be72 selinux: Don't fail RestoreAll if file doesn't have a default label
When restoring selinux labels after a VM is stopped, any non-standard
path that doesn't have a default selinux label causes the process
to stop and exit early. This isn't really an error condition IMO.

Of course the selinux API could be erroring for some other reason
but hopefully that's rare enough to not need explicit handling.

Common example here is storing disk images in a non-standard location
like under /mnt.
2012-10-23 11:45:24 -04:00
Eric Blake
add633bdf9 build: print uids as unsigned
Reported by Michal Privoznik.

* src/security/security_dac.c (virSecurityDACGenLabel): Use
correct format.
2012-10-23 08:38:33 -06:00
Ján Tomko
9b704ab823 xml: omit domain name from comment if it contains double hyphen
We put a comment containing "virsh edit <domain_name>" at the start of
the XML. W3C recommendation forbids the use of "--" in comments [1] and
libvirt can't parse it either. This patch omits the domain name if it
contains a double hyphen.

[1] http://www.w3.org/TR/REC-xml/#sec-comments
2012-10-23 14:24:31 +02:00
Ján Tomko
b326765c80 storage: don't shadow global 'wait' declaration
Rename the 'wait' parameter to 'loop'.
This silences the warning:
storage/storage_backend.c:1348:34: error: declaration of 'wait' shadows
a global declaration [-Werror=shadow]
and fixes the build with -Werror.
--
Note: loop is pool backwards.
2012-10-23 13:56:59 +02:00
Eric Blake
33eaebe48e snapshot: sanity check when reusing file for snapshot
The snapshot code when reusing an existing file had hard-to-read
logic, as well as a missing sanity check: REUSE_EXT should require
the destination to already be present.

* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare): Require
destination on REUSE_EXT, rename variable for legibility.
2012-10-22 15:10:16 -06:00
Eric Blake
23a4df886d build: use correct printf types for uid/gid
Fixes a build failure on cygwin:
cc1: warnings being treated as errors
security/security_dac.c: In function 'virSecurityDACSetProcessLabel':
security/security_dac.c:862:5: error: format '%u' expects type 'unsigned int', but argument 7 has type 'uid_t' [-Wformat]
security/security_dac.c:862:5: error: format '%u' expects type 'unsigned int', but argument 8 has type 'gid_t' [-Wformat]

* src/security/security_dac.c (virSecurityDACSetProcessLabel)
(virSecurityDACGenLabel): Use proper casts.
2012-10-22 14:41:00 -06:00
Cole Robinson
77eff5eeb2 storage: Don't do wait loops from VolLookupByPath
virStorageVolLookupByPath is an API call that virt-manager uses
quite a bit when dealing with storage. This call use BackendStablePath
which has several usleep() heuristics that can be tripped up
and hang virt-manager for a while.

Current example: an empty mpath pool pointing to /dev/mapper makes
_any_ calls to virStorageVolLookupByPath take 5 seconds.

The sleep heuristics are actually only needed in certain cases
when we are waiting for new storage to appear, so let's skip the
timeout steps when calling from LookupByPath.
2012-10-22 16:15:12 -04:00
Cole Robinson
e58dfad4a4 qemu: Don't use -enable-nesting with qemu 1.2.0+
Since the option doesn't exist. Fixes booting with
cpu mode='host-model' and qemu 1.2.0
2012-10-22 16:15:12 -04:00
Doug Goldstein
2da776b1d6 qemu: Don't blindly assume VNC is supported
Currently it's assumed that qemu always supports VNC, however it is
definitely possible to compile qemu without VNC support so we should at
the very least check for it and handle that correctly.
2012-10-22 23:16:17 +08:00
Eric Blake
d9d77bfa80 storage: let format probing work on root-squash NFS
Yet another instance of where using plain open() mishandles files
that live on root-squash NFS, and where improving the API can
improve the chance of a successful probe.

* src/util/storage_file.h (virStorageFileProbeFormat): Alter
signature.
* src/util/storage_file.c (virStorageFileProbeFormat): Use better
method for opening file.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Update caller.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise.
2012-10-22 09:04:57 -06:00
Ján Tomko
b6ab7a067f migrate: v2: use VIR_DOMAIN_XML_MIGRATABLE when available
In v2 migration protocol, XML is obtained by calling domainGetXMLDesc.
This includes the default USB controller in XML, which breaks migration
to older libvirt (before 0.9.2).

Commit 409b5f5495
    qemu: Emit compatible XML when migrating a domain
only fixed this for v3 migration.

This patch uses the new VIR_DOMAIN_XML_MIGRATABLE flag (detected by
VIR_DRV_FEATURE_XML_MIGRATABLE) to obtain XML without the default controller,
enabling backward v2 migration.
2012-10-22 10:48:50 +02:00
Michal Privoznik
508451e4ad qemu: set seamless migration capability
As we switched to setting capabilities based on QMP communication,
qemu seamless-migration capability was not set. In the -help output
this knob is called seamless-migration=[on|off]. The equivalent in
QMP world is SPICE_MIGRATE_COMPLETED event (qemu upstream commit
2fdd16e2).
2012-10-22 10:09:47 +02:00
Osier Yang
b0f1ba47dd qemu: Fix the unused parameter which causes the build failure 2012-10-22 15:51:13 +08:00
Osier Yang
5828080f71 qemu: Cleanup the unused 'nodeinfo'
"nodeinfo" is not used in these two functions, and it's waste
of goto in qemuProcessSetEmulatorAffinites
2012-10-22 15:12:57 +08:00
Cole Robinson
b62f9b99dd Log parameters passed to virFileMakePath 2012-10-21 13:21:50 -04:00
Cole Robinson
7fcf8d9d69 Log file name passed to virConfReadFile 2012-10-21 13:21:50 -04:00
Laine Stump
6f8a8b30c9 network: don't allow multiple default portgroups
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=868483

virNetworkUpdate, virNetworkDefine, and virNetworkCreate all three
allow network definitions to contain multiple <portgroup> elements
with default='yes'. Only a single default portgroup should be allowed
for each network.

This patch updates networkValidate() (called by both
virNetworkCreate() and virNetworkDefine()) and
virNetworkDefUpdatePortGroup (called by virNetworkUpdate() to not
allow multiple default portgroups.
2012-10-20 21:29:19 -04:00
Laine Stump
1cb1f9dabf network: always create dnsmasq hosts and addnhosts files, even if empty
This fixes the problem reported in:

  https://bugzilla.redhat.com/show_bug.cgi?id=868389

Previously, the dnsmasq hosts file (used for static dhcp entries, and
addnhosts file (used for additional dns host entries) were only
created/referenced on the dnsmasq commandline if there was something
to put in them at the time the network was started. Once we can update
a network definition while it's active (which is now possible with
virNetworkUpdate), this is no longer a valid strategy - if there were
0 dhcp static hosts (resulting in no reference to the hosts file on the
commandline), then one was later added, the commandline wouldn't have
linked dnsmasq up to the file, so even though we create it, dnsmasq
doesn't pay any attention.

The solution is to just always create these files and reference them
on the dnsmasq commandline (almost always, anyway). That way dnsmasq
can notice when a new entry is added at runtime (a SIGHUP is sent to
dnsmasq by virNetworkUdpate whenever a host entry is added or removed)

The exception to this is that the dhcp static hosts file isn't created
if there are no lease ranges *and* no static hosts. This is because in
this case dnsmasq won't be setup to listen for dhcp requests anyway -
in that case, if the count of dhcp hosts goes from 0 to 1, dnsmasq
will need to be restarted anyway (to get it listening on the dhcp
port). Likewise, if the dhcp hosts count goes from 1 to 0 (and there
are no dhcp ranges) we need to restart dnsmasq so that it will stop
listening on port 67. These special situations are handled in the
bridge driver's networkUpdate() by checking for ((bool)
nranges||nhosts) both before and after the update, and triggering a
dnsmasq restart if the before and after don't match.
2012-10-20 21:29:19 -04:00
Laine Stump
78fab2770b network: free/null newDef if network fails to start
https://bugzilla.redhat.com/show_bug.cgi?id=866364

pointed out a crash due to virNetworkObjAssignDef free'ing
network->newDef without NULLing it afterward. A fix for this is in
upstream commit b7e9202401. While the
NULLing of newDef was a legitimate fix, newDef should have already
been empty (NULL) anyway (as indicated in the comment that was deleted
by that commit).

The reason that newDef had a non-NULL value (i.e. the root cause) was
that networkStartNetwork() had failed after populating
network->newDef, but then neglected to free/NULL newDef in the
cleanup.

(A bit of background here: network->newDef should contain the
persistent config of a network when a network is active (and of course
only when it is persisten), and NULL at all other times. There is also
a network->def which should contain the persistent definition of the
network when it is inactive, and the current live state at all other
times. The idea is that you can make changes to network->newDef which
will take effect the next time the network is restarted, but won't
mess with the current state of the network (virDomainObj has a similar
pair of virDomainDefs that behave in the same fashion). Personally I
think there should be a network->live and network->config, and the
location of the persistent config should *always* be in
network->config, but that's for a later cleanup).

Since I love things to be symmetric, I created a new function called
virNetworkObjUnsetDefTransient(), which reverses the effects of
virNetworkObjSetDefTransient(). I don't really like the name of the
new function, but then I also didn't really like the name of the old
one either (it's just named that way to match a similar function in
the domain conf code).
2012-10-20 02:43:16 -04:00
Eric Blake
a172dfbe2e blockjob: avoid segv on early error
Gcc with optimization warns:
../../src/qemu/qemu_driver.c: In function 'qemuDomainBlockCommit':
../../src/qemu/qemu_driver.c:12813:46: error: 'disk' may be used uninitialized in this function [-Werror=maybe-uninitialized]
../../src/qemu/qemu_driver.c:12698:25: note: 'disk' was declared here
cc1: all warnings being treated as errors

so obviously I had only been testing with optimization off.

* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Guard cleanup.
2012-10-19 21:17:00 -06:00
Eric Blake
2e43cb8e90 blockjob: properly label disks for qemu block-commit
I finally have all the pieces in place to perform a block-commit with
SELinux enforcing.  There's still missing cleanup work when the commit
completes, but doing that requires tracking both the backing chain and
the base and top files within that chain in domain XML across libvirtd
restarts.  Furthermore, from a security standpoint, once you have
granted access, you must assume any damage that can be done will be
done; later revoking access is nice to minimize the window of damage,
but less important as it does not affect the fact that damage can be
done in the first place.  Therefore, deferring the revoke efforts until
we have better XML tracking of what chain operations are in effect,
including across a libvirtd restart, is reasonable.

* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Label disks as
needed.
(qemuDomainPrepareDiskChainElement): Cast away const.
2012-10-19 17:56:39 -06:00
Eric Blake
35a2f5bc52 blockjob: refactor qemu disk chain permission grants
Previously, snapshot code did its own permission granting (lock
manager, cgroup device controller, and security manager labeling)
inline.  But now that we are adding block-commit and block-copy
which also have to change permissions, it's better to reuse
common code for the task.  While snapshot should fall back to
no access if read-write access failed, block-commit will want to
fall back to read-only access.  The common code doesn't know
whether failure to grant read-write access should revert to no
access (snapshot, block-copy) or read-only access (block-commit).
This code can also be used to revoke access to unused files after
block-pull.

It might be nice to clean things up in a future patch by adding
new functions to the lock manager, cgroup manager, and security
manager that takes a single file name and applies context of a
disk to that file, rather than the current semantics of applying
context to the entire chain already associated to a disk.  That
way, we could avoid the games this patch plays of temporarily
swapping out the disk->src and related fields of the disk.  But
that would involve more code changes, so this patch really is
the smallest hack for doing the necessary work; besides, this
patch is more or less code motion (the hack was already employed
by the snapshot creation code, we are just making it reusable).

* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Refactor labeling hacks...
(qemuDomainPrepareDiskChainElement): ...into new function.
2012-10-19 17:49:06 -06:00
Eric Blake
0a220e2225 blockjob: implement shallow commit flag in qemu
Now that we can crawl the chain of backing files, we can do
argument validation and implement the 'shallow' flag.  In
testing this, I discovered that it can be handy to pass the
shallow flag and an explicit base, as a means of validating
that the base is indeed the file we expected.

* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Crawl through
chain to implement shallow flag.
* src/libvirt.c (virDomainBlockCommit): Relax API.
2012-10-19 17:35:11 -06:00
Eric Blake
2cbc1fd892 blockjob: wire up online qemu block-commit
This is the bare minimum to kick off a block commit.  In particular,
flags support is missing (shallow requires us to crawl the backing
chain to determine the file name to pass to the qemu monitor command;
delete requires us to track what needs to be deleted at the time
the completion event fires).  Also, we are relying on qemu to do
error checking (such as validating 'top' and 'base' as being members
of the backing chain), including the fact that the current qemu code
does not support committing the active layer (although it is still
planned to add that before qemu 1.3).  Since the active layer won't
change, we have it easy and do not have to alter the domain XML.
Additionally, this will fail if SELinux is enforcing, because we fail
to grant qemu proper read/write access to the files it will modify.

* src/qemu/qemu_driver.c (qemuDomainBlockCommit): New function.
(qemuDriver): Register it.
2012-10-19 17:35:11 -06:00
Eric Blake
3f38c7e3a9 blockjob: manage qemu block-commit monitor command
qemu 1.3 will be adding a 'block-commit' monitor command, per
qemu.git commit ed61fc1.  It matches nicely to the libvirt API
virDomainBlockCommit.

* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCK_COMMIT): New bit.
* src/qemu/qemu_capabilities.c (qemuCapsProbeQMPCommands): Set it.
* src/qemu/qemu_monitor.h (qemuMonitorBlockCommit): New prototype.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockCommit): Implement it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise.
(qemuMonitorJSONHandleBlockJobImpl)
(qemuMonitorJSONGetBlockJobInfoOne): Handle new event type.
2012-10-19 17:35:11 -06:00
Eric Blake
67aea3fb78 blockjob: remove unused parameters after previous patch
Minor cleanup made possible by previous simplifications.

* src/qemu/qemu_cgroup.h (qemuSetupDiskCgroup)
(qemuTeardownDiskCgroup): Alter signature.
* src/qemu/qemu_cgroup.c (qemuSetupDiskCgroup)
(qemuTeardownDiskCgroup, qemuSetupCgroup): Update all uses.
* src/qemu/qemu_hotplug.c (qemuDomainDetachPciDiskDevice)
(qemuDomainDetachDiskDevice): Likewise.
* src/qemu/qemu_driver.c (qemuDomainAttachDeviceDiskLive)
(qemuDomainChangeDiskMediaLive)
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Likewise.
2012-10-19 17:35:11 -06:00
Eric Blake
38c4a9cc40 storage: use cache to walk backing chain
We used to walk the backing file chain at least twice per disk,
once to set up cgroup device whitelisting, and once to set up
security labeling.  Rather than walk the chain every iteration,
which possibly includes calls to fork() in order to open root-squashed
NFS files, we can exploit the cache of the previous patch.

* src/conf/domain_conf.h (virDomainDiskDefForeachPath): Alter
signature.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Require caller
to supply backing chain via disk, if recursion is desired.
* src/security/security_dac.c
(virSecurityDACSetSecurityImageLabel): Adjust caller.
* src/security/security_selinux.c
(virSecuritySELinuxSetSecurityImageLabel): Likewise.
* src/security/virt-aa-helper.c (get_files): Likewise.
* src/qemu/qemu_cgroup.c (qemuSetupDiskCgroup)
(qemuTeardownDiskCgroup): Likewise.
(qemuSetupCgroup): Pre-populate chain.
2012-10-19 17:35:11 -06:00
Eric Blake
4d34c92947 storage: cache backing chain while qemu domain is live
Technically, we should not be re-probing any file that qemu might
be currently writing to.  As such, we should cache the backing
file chain prior to starting qemu.  This patch adds the cache,
but does not use it until the next patch.

Ultimately, we want to also store the chain in domain XML, so that
it is remembered across libvirtd restarts, and so that the only
kosher way to modify the backing chain of an offline domain will be
through libvirt API calls, but we aren't there yet.  So for now, we
merely invalidate the cache any time we do a live operation that
alters the chain (block-pull, block-commit, external disk snapshot),
as well as tear down the cache when the domain is not running.

* src/conf/domain_conf.h (_virDomainDiskDef): New field.
* src/conf/domain_conf.c (virDomainDiskDefFree): Clean new field.
* src/qemu/qemu_domain.h (qemuDomainDetermineDiskChain): New
prototype.
* src/qemu/qemu_domain.c (qemuDomainDetermineDiskChain): New
function.
* src/qemu/qemu_driver.c (qemuDomainAttachDeviceDiskLive)
(qemuDomainChangeDiskMediaLive): Pre-populate chain.
(qemuDomainSnapshotCreateSingleDiskActive): Uncache chain before
snapshot.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Update
chain after block pull.
2012-10-19 17:35:10 -06:00
Eric Blake
5eaf605447 storage: make it easier to find file within chain
In order to temporarily label files read/write during a commit
operation, we need to crawl the backing chain and find the absolute
file name that needs labeling in the first place, as well as the
name of the file that owns the backing file.

* src/util/storage_file.c (virStorageFileChainLookup): New
function.
* src/util/storage_file.h: Declare it.
* src/libvirt_private.syms (storage_file.h): Export it.
2012-10-19 17:35:10 -06:00
Eric Blake
82507838e0 storage: remember relative names in backing chain
In order to search for a backing file name as literally present
in a chain, we need to remember if the chain had relative names.
Also, searching for absolute names is easier if we only have
to canonicalize once, rather than on every iteration.

* src/util/storage_file.h (_virStorageFileMetadata): Add field.
* src/util/storage_file.c (virStorageFileGetMetadataFromBuf):
(virStorageFileFreeMetadata): Manage it
(absolutePathFromBaseFile): Store absolute names in canonical form.
2012-10-19 17:35:10 -06:00
Eric Blake
1fc9593271 storage: don't require caller to pre-allocate metadata struct
Requiring pre-allocation was an unusual idiom.  It allowed iteration
over the backing chain to use fewer mallocs, but made one-shot
clients harder to read.  Also, this makes it easier for a future
patch to move away from opening fds on every iteration over the chain.

* src/util/storage_file.h (virStorageFileGetMetadataFromFD): Alter
signature.
* src/util/storage_file.c (virStorageFileGetMetadataFromFD): Allocate
return value.
 (virStorageFileGetMetadata): Update clients.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Likewise.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Likewise.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise.
2012-10-19 17:35:10 -06:00
Eric Blake
35c74c1733 storage: get entire metadata chain in one call
Previously, no one was using virStorageFileGetMetadata, and for good
reason - it couldn't support root-squash NFS.  Change the signature
and make it useful to future patches, including enhancing the metadata
to recursively track the entire chain.

* src/util/storage_file.h (_virStorageFileMetadata): Add field.
(virStorageFileGetMetadata): Alter signature.
* src/util/storage_file.c (virStorageFileGetMetadata): Rewrite.
(virStorageFileGetMetadataRecurse): New function.
(virStorageFileFreeMetadata): Handle recursion.
2012-10-19 17:35:10 -06:00
Eric Blake
eac74c1f47 storage: don't probe non-files
Backing chains can end on a network protocol, such as nbd:xxx; we
should not attempt to probe the file system in this case.

* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Only probe files.
2012-10-19 17:35:10 -06:00
Eric Blake
1246640b3d storage: use enum for snapshot driver type
This is the last use of raw strings for disk formats throughout
the src/conf directory.

* src/conf/snapshot_conf.h (_virDomainSnapshotDiskDef): Store enum
rather than string for disk type.
* src/conf/snapshot_conf.c (virDomainSnapshotDiskDefClear)
(virDomainSnapshotDiskDefParseXML, virDomainSnapshotDefFormat):
Adjust users.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotCreateSingleDiskActive): Likewise.
2012-10-19 17:35:10 -06:00
Eric Blake
e5e8d5d082 storage: use enum for disk driver type
Actually use the enum in the domain conf structure.

* src/conf/domain_conf.h (_virDomainDiskDef): Store enum rather
than string for disk type.
* src/conf/domain_conf.c (virDomainDiskDefFree)
(virDomainDiskDefParseXML, virDomainDiskDefFormat)
(virDomainDiskDefForeachPath): Adjust users.
* src/xenxs/xen_sxpr.c (xenParseSxprDisks, xenFormatSxprDisk):
Likewise.
* src/xenxs/xen_xm.c (xenParseXM, xenFormatXMDisk): Likewise.
* src/vbox/vbox_tmpl.c (vboxAttachDrives): Likewise.
* src/libxl/libxl_conf.c (libxlMakeDisk): Likewise.
2012-10-19 17:35:09 -06:00
Eric Blake
09e7fb5e1f storage: use enum for default driver type
Express the default disk type as an enum, for easier handling.

* src/conf/capabilities.h (_virCaps): Store enum rather than
string for disk type.
* src/conf/domain_conf.c (virDomainDiskDefParseXML): Adjust
clients.
* src/qemu/qemu_driver.c (qemuCreateCapabilities): Likewise.
2012-10-19 17:35:09 -06:00
Eric Blake
41e0edaf84 storage: treat 'aio' like 'raw' at parse time
We have historically allowed 'aio' as a synonym for 'raw' for
back-compat to xen, but since a future patch will move to using
an enum value, we have to pick one to be our preferred output
name.  This is a slight change in the output XML, but the sexpr
and xm outputs should still be identical, and the input XML can
still use either form.

* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Move aio
back-compat...
(virDomainDiskDefParseXML): ...to parse time.
* src/xenxs/xen_sxpr.c (xenParseSxprDisks, xenFormatSxprDisk): ...and
to output time.
* src/xenxs/xen_xm.c (xenParseXM, xenFormatXMDisk): Likewise.
* tests/sexpr2xmldata/sexpr2xml-*.xml: Update tests.
2012-10-19 17:35:09 -06:00
Eric Blake
f772b3d91f storage: list more file types
When an image has no backing file, using VIR_STORAGE_FILE_AUTO
for its type is a bit confusing.  Additionally, a future patch
would like to reserve a default value for the case of no file
type specified in the XML, but different from the current use
of -1 to imply probing, since probing is not always safe.

Also, a couple of file types were missing compared to supported
code: libxl supports 'vhd', and qemu supports 'fat' for directories
passed through as a file system.

* src/util/storage_file.h (virStorageFileFormat): Add
VIR_STORAGE_FILE_NONE, VIR_STORAGE_FILE_FAT, VIR_STORAGE_FILE_VHD.
* src/util/storage_file.c (virStorageFileMatchesVersion): Match
documentation when version probing not supported.
(cowGetBackingStore, qcowXGetBackingStore, qcow1GetBackingStore)
(qcow2GetBackingStoreFormat, qedGetBackingStore)
(virStorageFileGetMetadataFromBuf)
(virStorageFileGetMetadataFromFD): Take NONE into account.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Likewise.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Likewise.
* src/conf/storage_conf.c (virStorageVolumeFormatFromString): New
function.
(poolTypeInfo): Use it.
2012-10-19 17:35:09 -06:00
Guannan Ren
4492ef7f48 selinux: relabel tapfd in qemuPhysIfaceConnect
Relabeling tapfd right after the tap device is created.
qemuPhysIfaceConnect is common function called both for static
netdevs and for hotplug netdevs.
2012-10-20 00:01:03 +08:00
Jiri Denemark
8d75e47ede qemu: Do not require hostuuid in migration cookie
Having hostuuid in migration cookie is a nice bonus since it provides an
easy way of detecting migration to the same host. However, requiring it
breaks backward compatibility with older libvirt releases.
2012-10-19 15:08:29 +02:00
Jiri Denemark
9fcc5436d3 qemu: Allow migration with host USB devices
Recently, patches were added support for (managed)saving, restoring, and
migrating domains with host USB devices. However, qemu driver would
still forbid migration of such domains because qemuMigrationIsAllowed
was not updated.
2012-10-19 14:18:26 +02:00
Guido Günther
c324bad93a qemu: Set arch to i686 if qemu-system-i386 is found
If we can't probe the architecture from QMP we parse the architecture
from the qemu binaries name. This results in the architecture being i386
instead of i686 which then results in QEMU_CAPS_PCI_MULTIBUS being unset
which gives a broken qemu command line.

This probably didn't show up earlier since most of the time there's also
a /usr/bin/qemu around which results in i686 capabilities.
2012-10-19 08:12:21 +02:00
Guido Günther
a605594f8e qemu: Don't fail without emulatorpin or cpumask
This unbreaks qemu:///session that got broken by
ba63d8f7d8.
2012-10-19 01:25:19 +02:00
Michal Privoznik
b7e9202401 network: Set to NULL after virNetworkDefFree()
which frees all allocated memory but doesn't set the passed pointer to
NULL.  Therefore, we must do it ourselves. This is causing actual
libvirtd crash: Basically, when doing 'virsh net-edit' the newDef should
be dropped.  And the memory is freed, indeed. However, the pointer is
not set to NULL but kept instead. And the next duo of calls 'virsh
net-start' and 'virsh net-destroy' starts the disaster. The latter one
does the same as 'virsh destroy'; it sees that newDef is nonNULL so it
replaces def with newDef (which has been freed already as said a few
lines above). Therefore any subsequent call accessing def will hit the ground.
2012-10-18 17:02:48 +02:00
Viktor Mihajlovski
47a7b93584 dist: added cpu/cpu_ppc_data.h to Makefile.am
Missing entry for cpu_ppc_data.h added to fix RPM build.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-10-18 16:50:47 +02:00
Jiri Denemark
f1c7010040 qemu: Always format CPU topology
When libvirt cannot find a suitable CPU model for host CPU (easily
reproducible by running libvirt in a guest), it would not provide CPU
topology in capabilities XML either. Even though CPU topology is known
and can be queried by virNodeGetInfo. With this patch, CPU topology will
always be provided in capabilities XML regardless on the presence of CPU
model.
2012-10-18 14:57:08 +02:00
Peter Krempa
09f10a12be qemu: Add support for HyperV Enlightenment feature "relaxed"
This patch adds QEMU support for the "relaxed" feature implemented by
previous patch.
2012-10-18 12:22:50 +02:00
Peter Krempa
cc922fddc3 conf: Add support for HyperV Enlightenment features
Hypervisors are starting to support HyperV Enlightenment features that
improve behavior of guests running Microsoft Windows operating systems.

This patch adds support for the "relaxed" feature that improves timer
behavior and also establishes a framework to add these features in
future.
2012-10-18 12:22:50 +02:00
Peter Krempa
88cac66d92 conf: Make tri-state feature options more universal
The apic-eoi feature enum and implementation can be made more universal
to allow re-use of the enum for other features.
2012-10-18 12:22:49 +02:00
Michal Privoznik
998dc17da3 qemu: Correctly wait for spice to migrate
Currently we query-spice after the main migration has completed
before moving to next state. Qemu reports this as boolean (not
enclosed within quotes). Therefore it is not correct to use
virJSONValueObjectGetString but virJSONValueObjectGetBoolean instead.
2012-10-18 10:31:56 +02:00
Viktor Mihajlovski
1916679506 qemu: Fixed default machine detection in qemuCapsParseMachineTypesStr
The machine in the last output line of <qemu-binary> -M ?
was always reported as default machine even if this wasn't the
actual default. Trivial fix.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-10-17 17:24:41 -06:00
Martin Kletzander
ba63d8f7d8 qemu: Pin the emulator when only cpuset is specified
According to our recent changes (clarifications), we should be pinning
qemu's emulator processes using the <vcpu> 'cpuset' attribute in case
there is no <emulatorpin> specified.  This however doesn't work
entirely as expected and this patch should resolve all the remaining
issues.
2012-10-17 17:37:10 +02:00
Jiri Denemark
837993d845 qemu: Clear async job when p2p migration fails early
When p2p migration fails early because qemuMigrationIsAllowed or
qemuMigrationIsSafe say migration should be cancelled, we fail to clear
the migration-out async job. As a result of that, further APIs called
for the same domain may fail with Timed out during operation: cannot
acquire state change lock.

Reported by Guido Winkelmann.
2012-10-17 15:43:38 +02:00
Doug Goldstein
1e7ec88d9a interface: add virInterfaceGetXMLDesc() in udev
Added support for retrieving the XML defining a specific interface via
the udev based backend to virInterface. Implement the following APIs
for the udev based backend:
* virInterfaceGetXMLDesc()

Note: Does not support bond devices.
2012-10-17 13:59:16 +02:00
Li Zhang
40f58ca75d Doc-fix for PowerPC CPU model driver
There are some descriptions not right in PowerPC CPU model driver.
This patch is to fix them.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
2012-10-17 10:03:34 +02:00
Li Zhang
9943a7341c Implement CPU model driver for PowerPC
Currently, the CPU model driver is not implemented for PowerPC.
Host's CPU information is needed to exposed to guests' XML file some
time.

This patch is to implement the callback functions of CPU model driver.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
2012-10-17 10:03:34 +02:00
Li Zhang
309f03db40 Add one file cpu_ppc_data.h to define CPU data for PPC
CPU version can be got by PVR on PowerPC. So this PVR is defined in
the CPU data in cpuData structure.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
2012-10-17 10:03:34 +02:00
Guannan Ren
d37a3a1d6c selinux: remove unused variables in socket labelling 2012-10-17 13:13:17 +08:00
Guannan Ren
89b63f0ad4 selinux: fix wrong tapfd relablling
It should relabel tapfd of virtual network of type VIR_DOMAIN_NET_TYPE_DIRECT
rather than VIR_DOMAIN_NET_TYPE_NETWORK and VIR_DOMAIN_NET_TYPE_BRIDGE
(commit ae368ebfcc introduced this bug)

Caution: The context of the two hunks is identical other than indentation.
Please be extremely cautious of where the patch gets applied.
2012-10-17 13:13:14 +08:00
Cole Robinson
9f0e9cba27 storage: lvm: lvcreate fails with allocation=0, don't do that
On F17 at least, this command fails:

$ sudo /usr/sbin/lvcreate --name sparsetest -L 0K --virtualsize 16384K vgvirt
  Unable to create new logical volume with no extents

Which is unfortunate since allocation=0 is what virt-manager tries to use
by default.

Rather than telling the user 'don't do that', let's just give them the
smallest allocation possible if alloc=0 is requested.

https://bugzilla.redhat.com/show_bug.cgi?id=866481
2012-10-16 21:16:44 -04:00
Cole Robinson
01df6f2bff storage: lvm: Don't overwrite lvcreate errors
Before:
$ sudo virsh vol-create-as --pool vgvirt sparsetest --capacity 16M --allocation 0
error: Failed to create vol sparsetest
error: internal error Child process (/usr/sbin/lvchange -aln vgvirt/sparsetest) unexpected exit status 5:   One or more specified logical volume(s) not found.

After:
$ sudo virsh vol-create-as --pool vgvirt sparsetest --capacity 16M --allocation 0
error: Failed to create vol sparsetest
error: internal error Child process (/usr/sbin/lvcreate --name sparsetest -L 0K --virtualsize 16384K vgvirt) unexpected exit status 5:   Unable to create new logical volume with no extents
2012-10-16 21:16:44 -04:00
Jiri Denemark
5ce6d95eed locking: Fix build with sanlock < 2.4
libvirt started using sanlock_killpath to implement on_lockfailure
action. Since sanlock_killpath was introduced in sanlock 2.4, libvirt
fails to build with older sanlock.
2012-10-16 21:32:05 +02:00
Daniel P. Berrange
7bd744c401 Fix typo in previous commit s/lik/like/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 16:37:50 +01:00
Daniel P. Berrange
d507f8f9b9 Make virInitialize thread safe
Currently there is a restriction that multi-threaded applications
must manually call virInitialize, before threads start using
libvirt, because it is not thread-safe. By switching it to use
a virOnceControl initializer we gain thread safety, and thus
applications no longer need to manually call it. They can rely
on virConnectOpen invoking it for them.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 16:33:38 +01:00
Daniel P. Berrange
84912e9c91 Fix virProcessKillPainfully on Win32
Win32 platforms don't have SIGKILL defined, but they do have
SIGABRT. Since our virProcess wrapper treats anything which
isn't SIGTERM/SIGINT as equivalent to SIGKILL, just use
SIGABRT on Win32.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:47:14 +01:00
Daniel P. Berrange
381a339e98 Add JSON serialization of virNetServerPtr objects for process re-exec()
Add two new APIs virNetServerNewPostExecRestart and
virNetServerPreExecRestart which allow a virNetServerPtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.

This includes serialization of all registered services
and clients

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Daniel P. Berrange
3cfc3d7d2c Add JSON serialization of virNetServerClientPtr objects for process re-exec()
Add two new APIs virNetServerClientNewPostExecRestart and
virNetServerClientPreExecRestart which allow a virNetServerClientPtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.

This includes serialization of the connected socket associated
with the client

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Daniel P. Berrange
0cc7925520 Add JSON serialization of virNetServerServicePtr objects for process re-exec()
Add two new APIs virNetServerServiceNewPostExecRestart and
virNetServerServicePreExecRestart which allow a virNetServerServicePtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.

This includes serialization of the listening sockets associated
with the service

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Daniel P. Berrange
c298145344 Add JSON serialization of virNetSocketPtr objects for process re-exec()
Add two new APIs virNetSocketNewPostExecRestart and
virNetSocketPreExecRestart which allow a virNetSocketPtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.

As well as saving the state in JSON format, the second
method will disable the O_CLOEXEC flag so that the open
file descriptors are preserved across the process re-exec()

Since it is not possible to serialize SASL or TLS encryption
state, an error will be raised if attempting to perform
serialization on non-raw sockets

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Daniel P. Berrange
8057c04e8d Add JSON serialization of virLockSpacePtr objects for process re-exec()
Add two new APIs virLockSpaceNewPostExecRestart and
virLockSpacePreExecRestart which allow a virLockSpacePtr
object to be created from a JSON object and saved to a
JSON object, for the purposes of re-exec'ing a process.

As well as saving the state in JSON format, the second
method will disable the O_CLOEXEC flag so that the open
file descriptors are preserved across the process re-exec()

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Daniel P. Berrange
eca72d4759 Introduce an internal API for handling file based lockspaces
The previously introduced virFile{Lock,Unlock} APIs provide a
way to acquire/release fcntl() locks on individual files. For
unknown reason though, the POSIX spec says that fcntl() locks
are released when *any* file handle referring to the same path
is closed. In the following sequence

  threadA: fd1 = open("foo")
  threadB: fd2 = open("foo")
  threadA: virFileLock(fd1)
  threadB: virFileLock(fd2)
  threadB: close(fd2)

you'd expect threadA to come out holding a lock on 'foo', and
indeed it does hold a lock for a very short time. Unfortunately
when threadB does close(fd2) this releases the lock associated
with fd1. For the current libvirt use case for virFileLock -
pidfiles - this doesn't matter since the lock is acquired
at startup while single threaded an never released until
exit.

To provide a more generally useful API though, it is necessary
to introduce a slightly higher level abstraction, which is to
be referred to as a "lockspace".  This is to be provided by
a virLockSpacePtr object in src/util/virlockspace.{c,h}. The
core idea is that the lockspace keeps track of what files are
already open+locked. This means that when a 2nd thread comes
along and tries to acquire a lock, it doesn't end up opening
and closing a new FD. The lockspace just checks the current
list of held locks and immediately returns VIR_ERR_RESOURCE_BUSY.

NB, the API as it stands is designed on the basis that the
files being locked are not being otherwise opened and used
by the application code. One approach to using this API is to
acquire locks based on a hash of the filepath.

eg to lock /var/lib/libvirt/images/foo.img the application
might do

   virLockSpacePtr lockspace = virLockSpaceNew("/var/lib/libvirt/imagelocks");
   lockname = md5sum("/var/lib/libvirt/images/foo.img");
   virLockSpaceAcquireLock(lockspace, lockname);

NB, in this example, the caller should ensure that the path
is canonicalized before calculating the checksum.

It is also possible to do locks directly on resources by
using a NULL lockspace directory and then using the file
path as the lock name eg

   virLockSpacePtr lockspace = virLockSpaceNew(NULL);
   virLockSpaceAcquireLock(lockspace, "/var/lib/libvirt/images/foo.img");

This is only safe to do though if no other part of the process
will be opening the files. This will be the case when this
code is used inside the soon-to-be-reposted virlockd daemon

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Eric Blake
819c8ce043 maint: prepare for next release number
Given Daniel's announcement[1], code targetting the next release will
be in 1.0.0, not 0.10.3.  Changed mechanically with:

for f in $(git grep -l '0\(.\)10\13\b') ; do
   sed -i -e 's/0\(.\)10\13/1\10\10/g' $f
done

[1]https://www.redhat.com/archives/libvir-list/2012-October/msg00403.html

* docs/formatdomain.html.in: Use 1.0.0 for next release.
* src/interface/interface_backend_udev.c: Likewise.
2012-10-16 08:09:01 -06:00
Martin Kletzander
280b8c9e7c conf: Fix crash with cleanup
There was a crash possible when both <boot dev... and <boot
order... were specified due to virDomainDefParseBootXML() erroring out
before setting *tmp (which was free'd in cleanup).  As a fix, I
created this cleanup that uses one pointer for all the temporary
stored XPath strings and values, plus this pointer is correctly
initialized to NULL.
2012-10-16 11:15:04 +02:00
Martin Kletzander
6676c1fc8f selinux: Use raw contexts 2
In commit 9674f2c637, I forgot to change
selabel_lookup with the other functions, so this one-liner does exactly
that.
2012-10-16 10:30:18 +02:00
Eric Blake
2cfa14bc8a maint: drop spurious semicolons
Detected with:
git grep ';;$' -- '**/*.[ch]'

* src/network/bridge_driver.c (networkRadvdConfContents): Fix
harmless typo.
* src/phyp/phyp_driver.c (phypUUIDTable_Pull): Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDriveDel):
Likewise.
2012-10-15 09:08:19 -06:00
Guannan Ren
ae368ebfcc selinux: add security selinux function to label tapfd
BZ:https://bugzilla.redhat.com/show_bug.cgi?id=851981
When using macvtap, a character device gets first created by
kernel with name /dev/tapN, its selinux context is:
system_u:object_r:device_t:s0

Shortly, when udev gets notification when new file is created
in /dev, it will then jump in and relabel this file back to the
expected default context:
system_u:object_r:tun_tap_device_t:s0

There is a time gap happened.
Sometimes, it will have migration failed, AVC error message:
type=AVC msg=audit(1349858424.233:42507): avc:  denied  { read write } for
pid=19926 comm="qemu-kvm" path="/dev/tap33" dev=devtmpfs ino=131524
scontext=unconfined_u:system_r:svirt_t:s0:c598,c908
tcontext=system_u:object_r:device_t:s0 tclass=chr_file

This patch will label the tapfd device before qemu process starts:
system_u:object_r:tun_tap_device_t:MCS(MCS from seclabel->label)
2012-10-15 21:01:07 +08:00
Martin Kletzander
7ba5defb5a Add support for SUSPEND_DISK event
This patch adds support for SUSPEND_DISK event; both lifecycle and
separated.  The support is added for QEMU, machines are changed to
PMSUSPENDED, but as QEMU sends SHUTDOWN afterwards, the state changes
to shut-off.  This and much more needs to be done in order for libvirt
to work with transient devices, wake-ups etc.  This patch is not
aiming for that functionality.
2012-10-15 12:09:10 +02:00
Ján Tomko
a9e3b4f78e util: switch virLogEatParams to virLogSource
Commit e8fd8757c8 changed 'const char *'
category to virLogSource enum. This changes it in virLogEatParams as
well, thus fixing the build with --disable-debug.
--
Hopefully moving the enum declarations is less ugly than using int.
2012-10-15 11:13:43 +02:00
Osier Yang
f81f0f2f1d node_memory: Add new parameter field to tune the new sysfs knob
Upstream kernel introduced new sysfs knob "merge_across_nodes" to
specify if pages from different numa nodes can be merged. When set
to 0, only pages which physically reside in the memory area of
same NUMA node can be merged. When set to 1, pages from all nodes
can be merged.

This patch supports the tuning by adding new param field
"shm_merge_across_nodes".
2012-10-15 17:35:54 +08:00
Laine Stump
6bde0a1a37 qemu: reorganize qemuDomainChangeNet and qemuDomainChangeNetBridge
This patch resolves:

  https://bugzilla.redhat.com/show_bug.cgi?id=805071

to the extent that it can be resolved with current qemu functionality.
It attempts to detect as many situations as possible when the simple
operation of disconnecting an existing tap device from one bridge and
attaching it to another will satisfy the change requested in
virDomainUpdateDeviceFlags() for a network device. Before this patch,
that situation could only be detected if the pre-change interface
*and* the post-change interface definition were both "type='bridge'".
After this patch, it can also be detected if the before or after
interfaces are any combination of type='bridge' and type='network'
(the networks can be <forward mode='nat|route|bridge'>, as long as
they use a Linux host bridge and not macvtap connections).

This extra effort is especially useful since the recent discovery that
a netdev_del+netdev_add combo (to reconnect the network device with
completely different hostside configuration) doesn't work properly
with current qemu (1.2) unless it is accompanied by the matching
device_del+device_add - see this mailing list message for details:

  http://lists.nongnu.org/archive/html/qemu-devel/2012-10/msg02355.html

(A slight modification of the patch referenced there has been prepared
to apply on top of this patch, but won't be pushed until qemu can be
made to work with it.)

* qemuDomainChangeNet needs access to the virDomainDeviceDef that
holds the new netdef (so that it can clear out the virDomainDeviceDef
if it ends up using the NetDef to replace the original), so the
virDomainNetDefPtr arg is replaced with a virDomainDeviceDefPtr.

* qemuDomainChangeNet previously checked for *some* changes to the
interface config, but this check was by no means complete. It was also
a bit disorganized.

This refactoring of the code is (I believe) complete in its check of
all NetDef attributes that might be changed, and either returns a
failure (for changes that are simply impossible), or sets one of three
flags:

  needLinkStateChange - if the device link state needs to go up/down
  needBridgeChange    - if everything else is the same, but it needs
                        to be connected to a difference linux host
                        bridge
  needReconnect       - if the entire host side of the device needs
                        to be torn down and reconstructed (currently
                        non-working, as mentioned above)

Note that this function will refuse to make any change that requires
the *guest* side of the device to be detached (e.g. changing the PCI
address or mac address). Those would be disruptive enough to the guest
that it's reasonable to require an explicit detach/attach sequence
from the management application.

* As mentioned above, qemuDomainChangeNet also does its best to
understand when a simple change in attached bridge for the existing
tap device will work vs. the need to completely tear down/reconstruct
the host side of the device (including tap device).

This patch *does not* implement the "reconnect" code anyway - there is
a placeholder that turns that into an error. Rather, the purpose of
this patch is to replicate existing behavior with code that is ready
to have that functionality plugged in in a later patch.

* The expanded uses for qemuDomainChangeNetBridge meant that it needed
to be enhanced as well - it no longer replaces the original brname
string in olddev with the new brname; instead, it relies on the
caller to replace the *entire* olddev with newdev (since we've gone
to great lengths to assure they are functionally identical other
than the name of the bridge, this is now not only safe, but more
correct). Additionally, qemuDomainNetChangeBridge can now set the
bridge for type='network' interfaces as well as plain type='bridge'
interfaces. (Note that I had to make this change simultaneous to the
reorganization of qemuDomainChangeNet because the two are too
closely intertwined to separate).
2012-10-15 04:36:39 -04:00