Commit Graph

6596 Commits

Author SHA1 Message Date
Michal Privoznik
50e9b38930 qemu: Clenup qemuDomainSetInterfaceParameters
which contained some useless lines, copied code, NULL
dereference.
2012-02-01 08:56:54 +01:00
Michal Privoznik
bb311b3458 qemu: Don't jump to endjob if no job was even started
In qemuDomainShutdownFlags if we try to use guest agent,
which has error or is not configured, we jump go endjob
label even if we haven't started any job yet. This may
lead to the daemon crash:
1) virsh shutdown --mode agent on a domain without agent configured
2) wait until domain quits
3) virsh edit
2012-02-01 08:42:47 +01:00
Taku Izumi
53e23e99a9 qemu: fix my typo at commit 74e034964c
Fix my typo at
  commit 74e034964c

"disk->rawio == -1" indicates that this value is not
specified. So in case of this, domain must not
be tainted.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
2012-01-31 20:21:06 -07:00
Alex Jia
bfdbae0694 simplify block of codes
Using new function 'virTypedParameterArrayClear' to simplify block of codes.

* daemon/remote.c, src/remote/remote_driver.c: simplify codes.

Signed-off-by: Alex Jia <ajia@redhat.com>
2012-02-01 10:57:56 +08:00
Taku Izumi
74e034964c qemu: make qemu processes to retain rawio capability
This patch revises qemuProcessStart() function for qemu
processes to retain CAP_SYS_RAWIO if needed.
And in case of that, add taint flag to domain.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Shota Hirae <m11g1401@hibikino.ne.jp>
2012-01-31 13:36:38 -05:00
Taku Izumi
c2e146bfb0 util: extend virExecWithHook()
This patch extends virExecWithHook() to receive
capability information.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Shota Hirae <m11g1401@hibikino.ne.jp>
2012-01-31 13:36:33 -05:00
Taku Izumi
53bd0cebd3 util: add functions to keep capabilities
This patch introduces virSetCapabilities() function and implements
virCommandAllowCap() function.

Existing virClearCapabilities() is function to clear all capabilities.
Instead virSetCapabilities() is function to set arbitrary capabilities.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Shota Hirae <m11g1401@hibikino.ne.jp>
2012-01-31 13:36:28 -05:00
Taku Izumi
397e6a705b conf: add rawio attribute to disk element of domain XML
This patch adds a new attribute "rawio" to the "disk" element
 of domain XML. Valid values of "rawio" attribute are "yes"
 and "no".
 rawio='yes' indicates the disk is desirous of CAP_SYS_RAWIO.

 If you specify the following XML:

 <disk type='block' device='lun' rawio='yes'>
  ...
 </disk>

 the domain will be granted CAP_SYS_RAWIO.
 (of course, the domain have to be executed with root privilege)

NOTE:
   - "rawio" attribute is only valid when device='lun'
   - At the moment, any other disks you won't use rawio can use rawio.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
2012-01-31 13:36:23 -05:00
Zeeshan Ali (Khattak)
e545dd4ffe Implement virStorageVolResize() for FS backend
Currently only VIR_STORAGE_VOL_RESIZE_DELTA flag is supported.
2012-01-31 11:58:11 -05:00
Eric Blake
055bbf45e4 resize: slightly alter signature
Our existing virDomainBlockResize takes an unsigned long long
argument; if that command is later taught a DELTA and SHRINK flag,
we cannot change its type without breaking API (but at least such
a change would be ABI compatible).  Meanwhile, the only time a
negative size makes sense is if both DELTA and SHRINK are used
together, but if we keep the argument unsigned, applications can
pass the positive delta amount by which they would like to shrink
the system, and have the flags imply the negative value.  So,
since this API has not yet been released, and in the interest of
consistency with existing API, we swap virStorageVolResize to
always pass an unsigned value.

* include/libvirt/libvirt.h.in (virStorageVolResize): Use unsigned
argument.
* src/libvirt.c (virStorageVolResize): Likewise.
* src/driver.h (virDrvStorageVolUpload): Adjust clients.
* src/remote/remote_protocol.x (remote_storage_vol_resize_args):
Likewise.
* src/remote_protocol-structs: Regenerate.
Suggested by Daniel P. Berrange.
2012-01-31 11:58:06 -05:00
Philipp Hahn
098a987b98 XenXs: Update documentation
Fix several references to now renamed functions and parameters when the
functions were moved from src/xen/ to src/xenxs/.

Signed-off-by: Philipp Hahn <hahn@univention.de>
2012-01-30 13:13:23 -07:00
Laine Stump
3801831cdf qemu: add "romfile" support to specify device boot ROM
This patch addresses: https://bugzilla.redhat.com/show_bug.cgi?id=781562

Along with the "rombar" option that controls whether or not a boot rom
is made visible to the guest, qemu also has a "romfile" option that
allows specifying a binary file to present as the ROM BIOS of any
emulated or passthrough PCI device. This patch adds support for
specifying romfile to both passthrough PCI devices, and emulated
network devices that attach to the guest's PCI bus (just about
everything other than ne2k_isa).

One example of the usefulness of this option is described in the
bugzilla report: 82576 sriov network adapters don't provide a ROM BIOS
for the cards virtual functions (VF), but an image of such a ROM is
available, and with this ROM visible to the guest, it can PXE boot.

In libvirt's xml, the new option is configured like this:

   <hostdev>
     ...
     <rom file='/etc/fake/boot.bin'/>
     ...
   </hostdev

(similarly for <interface>).
2012-01-30 12:30:35 -05:00
Laine Stump
3284ac046f qemu: (and conf) support rombar for network devices
When support for the rombar option was added, it was only added for
PCI passthrough devices, configured with <hostdev>. The same option is
available for any network device that is attached to the guest's PCI
bus. This patch allows setting rombar for any PCI network device type.

After adding cases to test this to qemuxml2argv-hostdev-pci-rombar.*,
I decided to rename those files (to qemuxml2argv-pci-rom.*) to more
accurately reflect the additional tests, and also noticed that up to
now we've only been performing a domainschematest for that case, so I
added the "pci-rom" test to both qemuxml2argv and qemuxml2xml (and in
the process found some bugs whose fixes I squashed into previous
commits of this series).
2012-01-30 12:25:32 -05:00
Laine Stump
c01ba1a48f conf: relocate rombar and boot order parse/format
Since these two items are now in the virDomainDeviceInfo struct, it
makes sense to parse/format them in the functions written to
parse/format that structure. Not all types of devices allow them, so
two internal flags are added to indicate when it is appropriate to do
so.

I was lucky - only one test case needed to be re-ordered!
2012-01-30 12:25:25 -05:00
Laine Stump
159f4d0b30 conf: put all guest-related HostdevDef data in one object
To help consolidate the commonality between virDomainHostdevDef and
virDomainNetDef into as few members as possible (and because I
think it makes sense), this patch moves the rombar and bootIndex
members into the "info" member that is common to both (and to all the
other structs that use them).

It's a bit problematic that this gives rombar and bootIndex to many
device types that don't use them, but this is already the case for the
master and mastertype members of virDomainDeviceInfo, and is properly
commented as such in the definition.

Note that this opens the door to supporting rombar for other devices
that are attached to the guest PCI bus - virtio-blk-pci,
virtio-net-pci, various other network adapters - which which have that
capability in qemu, but previously had no support in libvirt.
2012-01-30 12:25:20 -05:00
Laine Stump
aaa6210f81 conf: remove duplicate call to VIR_FREE(info->alias)
There is another identical call 4 lines up in the same function.
2012-01-30 11:38:39 -05:00
Hendrik Schwartke
484a0bab39 qemu: Fix segfault in qemuMonitorTextGetBlockInfo
If some error occurs then the cleanup code calls VIR_FREE(info)
without ensuring that info is initialized.
2012-01-30 13:48:34 +01:00
Cole Robinson
efb0839c1d xen: Don't add <console> to xml for dom0
It just doesn't really make sense and confuses virt-manager
2012-01-30 07:17:36 -05:00
KAMEZAWA Hiroyuki
c6ec021b3c remote handler for virDomainGetCPUStats()
Unlike other users of virTypedParameter with RPC, this interface
can return zero-filled entries because the interface assumes
2 dimensional array. We compress these entries out from the
server when generating the over-the-wire contents, then reconstitute
them in the client.

Signed-off-by: Eric Blake <eblake@redhat.com>
2012-01-28 11:09:31 -07:00
Eric Blake
f0b22ebea4 docs: tweak recent suspend API additions
* src/libvirt.c (virDomainPMSuspendForDuration): Clarify usage.
2012-01-28 07:29:10 -07:00
KAMEZAWA Hiroyuki
e1eea7470b Add new public API virDomainGetCPUStats()
add new API virDomainGetCPUStats() for getting cpu accounting information
per real cpus which is used by a domain.  The API is designed to allow
future extensions for additional statistics.

based on ideas by Lai Jiangshan and Eric Blake.

* src/libvirt_public.syms: add API for LIBVIRT_0.9.10
* src/libvirt.c: define virDomainGetCPUStats()
* include/libvirt/libvirt.h.in: add virDomainGetCPUStats() header
* src/driver.h: add driver API
* python/generator.py: add python API (as not implemented)

Signed-off-by: Eric Blake <eblake@redhat.com>
2012-01-28 07:18:27 -07:00
Michal Privoznik
8f8b080263 Introduce virDomainPMSuspendForDuration API
This API allows a domain to be put into one of S# ACPI states.
Currently, S3 and S4 are supported. These states are shared
with virNodeSuspendForDuration.
However, for now we don't support any duration other than zero.
The same apply for flags.
2012-01-28 10:20:46 +01:00
Zeeshan Ali (Khattak)
835817806e resize: implement remote protocol for virStorageVolResize()
Autogeneration saves the day.

Signed-off-by: Eric Blake <eblake@redhat.com>
2012-01-27 19:56:21 -07:00
Zeeshan Ali (Khattak)
6714fd04d2 resize: add virStorageVolResize() API
Add a new function to allow changing of capacity of storage volumes.
Plan out several flags, even if not all of them will be implemented
up front.

Expose the new command via 'virsh vol-resize'.

Signed-off-by: Eric Blake <eblake@redhat.com>
2012-01-27 19:56:18 -07:00
Cole Robinson
bb2eddc6cf Add new error code VIR_ERROR_AUTH_CANCELLED
And hook it up for policykit auth. This allows virt-manager to detect
that the user clicked the policykit 'cancel' button and not throw
an 'authentication failed' error message at the user.
2012-01-27 16:53:27 -05:00
Eric Blake
ab6f1c9814 qemu: avoid double free of qemu help output
If yajl was not compiled in, we end up freeing an incoming
parameter, which leads to a bogus free later on.  Regression
introduced in commit 6e769eb.

* src/qemu/qemu_capabilities.c (qemuCapsParseHelpStr): Avoid alloc
on failure path, which in turn fixes bogus free.
Reported by Cole Robinson.
2012-01-27 13:53:11 -07:00
Eric Blake
83ed03010b xml: fix struct typos
Noticed this while reviewing Dan's patches.

* src/util/xml.c (virXMLRewritFileData): Rename to
virXMLRewriteFileData.
2012-01-27 11:08:58 -07:00
Daniel P. Berrange
9b516aa31b Move virEmitXMLWarning into xml.h
The virEmitXMLWarning function should always have been in
the xml.[hc] files, and should use virXML as its name
prefix

* src/util/util.c, src/util/util.h: Remove virEmitXMLWarning
* src/util/xml.c, src/util/xml.h: Add virXMLEmitWarning
2012-01-27 18:03:30 +00:00
Daniel P. Berrange
510fa47c2a Move virMacAddrXXX functions to src/util/virmacaddr.[ch]
Move the virMacAddrXXX functions out of util.[ch] and into a
new dedicate file virmacaddr.[ch]
2012-01-27 17:56:10 +00:00
Daniel P. Berrange
4ce98dadcc Rename virXXXXMacAddr to virMacAddrXXX
Rename virFormatMacAddr, virGenerateMacAddr and virParseMacAddr
to virMacAddrFormat, virMacAddrGenerate and virMacAddrParse
respectively
2012-01-27 17:53:44 +00:00
Paolo Bonzini
b66d1bef14 qemu: parse and create -cpu ...,-kvmclock
QEMU supports a bunch of CPUID features that are tied to the kvm CPUID
nodes rather than the processor's.  They are "kvmclock",
"kvm_nopiodelay", "kvm_mmu", "kvm_asyncpf".  These are not known to
libvirt and their CPUID leaf might move if (for example) the Hyper-V
extensions are enabled. Hence their handling would anyway require some
special-casing.

However, among these the most useful is kvmclock; an additional
"property" of this feature is that a <timer> element is a better model
than a CPUID feature.  Although, creating part of the -cpu command-line
from something other than the <cpu> XML element introduces some
ugliness.

Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-27 16:51:50 +01:00
Paolo Bonzini
5a137f3620 conf: add kvmclock timer
Add kvmclock timer to documentation, schema and parsers.  Keep the
platform timer first since it is kind of special, and alphabetize
the others when possible (i.e. when it does not change the ABI).

Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-27 16:51:50 +01:00
Paolo Bonzini
df8e6918b3 qemu: do not create useless <cpu> element
Avoid creating an empty <cpu> element when the QEMU command-line simply
specifies the default "-cpu qemu32" or "-cpu qemu64".

This requires the previous patch, which lets us represent "-cpu qemu32"
as <os arch='i686'> in the generated XML.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-27 16:51:50 +01:00
Paolo Bonzini
d5e88b2c33 qemu: get arch name from <cpu> element
The qemu32 CPU model is chosen based on the <os arch=...> name when
creating the QEMU command line for a 64-bit host.  For the opposite
transformation we can test the guest CPU model for the "lm" feature.
If it is absent, def->os.arch needs to be corrected.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-27 16:51:50 +01:00
Paolo Bonzini
4be541a6d9 qemu: detect arch correctly for KVM
When running under KVM, the arch is usually set to i686 because
the name of the emulator is not qemu-system-x86_64.  Use the host
arch instead.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-27 16:51:49 +01:00
Paolo Bonzini
ef00a05e51 x86: add kvm32 and kvm64, update qemu64
Recently (or not so recently) QEMU added the kvm32 and kvm64
architectures, representing a least common denominator of all
hosts that can run KVM.  Add them to the machine map.

Also, some features that TCG supports were added to qemu64.
Add them to the cpu_map.xml whenever KVM is guaranteed to support
those.  We still have to leave some out, because they would not
be available to guests running on older hosts.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-27 16:51:49 +01:00
Paolo Bonzini
4a00c099ab qemu: parse -enable-kvm
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-27 16:51:49 +01:00
Eric Blake
6e769ebadb qemu: require qmp on new enough qemu
The qemu developers have made it clear that modern qemu will no
longer guarantee human monitor command stability; furthermore,
some features, such as async events, are only supported via qmp.
If we are compiled without support for handling JSON, we cannot
expect to sanely interact with modern qemu.

However, things must continue to build on RHEL 5, where qemu
is stuck at 0.10, and where yajl is not available.

Another benefit of this patch: future additions of new monitor
commands need only focus on qemu_monitor_json.c, instead of
also wasting time with qemu_monitor_text.c.

* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags): Report
error if yajl is missing but qemu requires qmp.
(qemuCapsParseHelpStr): Propagate error.
(qemuCapsExtractVersionInfo): Update caller.
* tests/qemuhelptest.c (testHelpStrParsing): Likewise.
2012-01-27 08:45:50 -07:00
Eric Blake
ff88cd5905 qemu: support qmp on RHEL/CentOS qemu
I'm getting tired of remembering to backport RHEL-specific
patches when building upstream libvirt on RHEL 6.x or CentOS.
All the affected versions of RHEL qemu-kvm have backported
enough patches to a) make JSON useful, and b) modify the
-help text to mention libvirt as the preferred interface;
which means this string in the help output is a reliable
indicator that we can outsmart a strict version check,
even when upstream qemu 0.12 lacked the needed features.

* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags):
Recognize particular help string present when enough features were
backported to be worth using JSON.
* tests/qemuhelptest.c (mymain): Update tests accordingly.
2012-01-27 08:11:19 -07:00
Stefan Berger
823b90339f nwfilter: Rebuild filters only if new filter is different than current
Compare two filters' XML for equality and only rebuild/instantiate the new
filter if the new and current filters are found to be different. This
improves performance during an update of a filter with no obvious change
or the reloading of filters during a 'kill -SIGHUP'
2012-01-27 08:19:58 -05:00
Stefan Berger
8fa78dd49c nwfilter: Force instantiation of filters upon driver reload
Introduce a function that rebuilds all running VMs' filters. Call
this function when reloading the nwfilter driver.

This addresses a problem introduced by the 2nd patch that typically
causes no filters to be reinstantiate anymore upon driver reload
since their XML has not changed. Yet the current behavior is that
upon a SIGHUP all filters get reinstantiated.
2012-01-27 08:19:58 -05:00
Jiri Denemark
65c27e2935 qemu: Refactor qemuMonitorGetBlockInfo
QEMU always sends details about all available block devices as an answer
for "info block"/"query-block" command. On the other hand, our
qemuMonitorGetBlockInfo was made for a single block devices queries
only. Thus, when asking for multiple devices, we asked qemu multiple
times to always get the same answer from which different parts were
filtered. This patch makes qemuMonitorGetBlockInfo return a hash table
of all block devices, which may later be used for getting details about
specific devices.
2012-01-27 13:07:56 +01:00
Jiri Denemark
bc1edeb611 apparmor: Fix use of uninitialized random_data
Without this, virt-aa-helper would segfault in -c or -r commands.
2012-01-27 11:14:21 +01:00
Marcelo Cerri
98b01e8f2b Update VIRT_CONTROL audit record with pid.
Added a new field "vm-pid" to the VIRT_CONTROL audit record. This information
is useful to correlated another audit events to the events generated by
libvirt.
2012-01-26 16:49:02 -07:00
Eric Blake
19896423f7 hash: minor touchups
On RHEL5, I got:
util/virrandom.c:66: warning: nested extern declaration of '_gl_verify_function66' [-Wnested-externs]

The fix is to hoist the verify earlier.  Also some other hodge-podge
fixes I noticed while reviewing Dan's recent series.

* .gitignore: Ignore new test.
* src/util/cgroup.c: Bump copyright year.
* src/util/virhash.c: Fix typo in description.
* src/util/virrandom.c (virRandomBits): Mark doc comment, and
hoist assert to silence older gcc.
2012-01-26 15:27:10 -07:00
Michal Privoznik
8973190735 util: Include stdint.h because of uint32_t
Some files are using uint32_t or int64_t without including
stdint.h which defines them. Fix this.
2012-01-26 19:14:01 +01:00
Daniel P. Berrange
1f7aa0ac56 Remove tabs from libvirt_public.syms & enforce it
* src/libvirt_public.syms: Death to tabs
* cfg.mk: Check .syms files for tabs
2012-01-26 15:03:43 +00:00
Daniel P. Berrange
72b4139700 Replace hashing algorithm with murmurhash
Recent discussions have illustrated the potential for DOS attacks
with the hash table implementations used by most languages and
libraries.

   https://lwn.net/Articles/474912/

libvirt has an internal hash table impl, and uses hash tables for
a variety of purposes. The hash key generation code is pretty
simple and thus not strongly collision resistant.

This patch replaces the current libvirt hash key generator with
the (public domain) Murmurhash3 code. In addition every hash
table now gets a random seed value which is used to perturb the
hashing code. This should make it impossible to mount any
practical attack against libvirt hashing code.

* bootstrap.conf: Import bitrotate module
* src/Makefile.am: Add virhashcode.[ch]
* src/util/util.c: Make virRandom() return a fixed 32 bit
  integer value.
* src/util/hash.c, src/util/hash.h, src/util/cgroup.c: Replace
  hash code generation with a call to virHashCodeGen()
* src/util/virhashcode.h, src/util/virhashcode.c: Add a new
  virHashCodeGen() API using the Murmurhash3 algorithm.
2012-01-26 14:18:53 +00:00
Daniel P. Berrange
1d5c7a9fdf Rename hash.h and hash.c to virhash.h and virhash.c
In preparation for the patch to include Murmurhash3, which
introduces a virhashcode.h and virhashcode.c files, rename
the existing hash.h and hash.c to virhash.h and virhash.c
respectively.
2012-01-26 14:11:13 +00:00
Daniel P. Berrange
9f2bf8fd03 Convert various virHash functions to use size_t / uint32
In preparation for conversion over to use the Murmurhash3
algorithm, convert various virHash APIs to use size_t or
uint32 for their return values/parameters, instead of the
variable size 'unsigned long' or 'int' types
2012-01-26 14:09:21 +00:00
Daniel P. Berrange
e95ef67b35 Introduce new API for generating random numbers
The old virRandom() API was not generating good random numbers.
Replace it with a new API virRandomBits which instead of being
told the upper limit, gets told the number of bits of randomness
required.

* src/util/virrandom.c, src/util/virrandom.h: Add virRandomBits,
  and move virRandomInitialize
* src/util/util.h, src/util/util.c: Delete virRandom and
  virRandomInitialize
* src/libvirt.c, src/security/security_selinux.c,
  src/test/test_driver.c, src/util/iohelper.c: Update for
  changes from virRandom to virRandomBits
* src/storage/storage_backend_iscsi.c: Remove bogus call
  to virRandomInitialize & convert to virRandomBits
2012-01-26 14:03:14 +00:00
Michal Privoznik
adb99a05b1 storage: Support different wiping algorithms
Currently, we support only filling a volume with zeroes on wiping.
However, it is not enough as data might still be readable by
experienced and equipped attacker. Many technical papers have been
written, therefore we should support other wiping algorithms.
2012-01-26 13:59:30 +01:00
Marc-André Lureau
d553554b75 Cast pointer to int using intptr_t
Fix a few warnings with mingw64 x86_64.
2012-01-25 18:00:47 -07:00
Eric Blake
3d5c139c49 build: fix header order on mingw
In file included from ../gnulib/lib/unistd.h:51:0,
                 from ../src/util/util.h:30,
                 from rpc/virkeepalive.c:29:
/usr/x86_64-w64-mingw32/sys-root/mingw/include/winsock2.h:15:2: warning: #warning Please include winsock2.h before windows.h [-Wcpp]

Reported by Marc-André Lureau.

* src/util/threads-win32.h (includes): Pick up winsock2.h before
windows.h, as required by mingw64.
2012-01-25 15:05:45 -07:00
Marc-André Lureau
75d3612ef8 errcode is typedef by mingw, rename an argument name
Fixes the following warning:
util/virterror.c:1242:31: warning: declaration of 'errcode' shadows a global declaration [-Wshadow]
2012-01-25 14:49:24 -07:00
Marc-André Lureau
5f1767e845 Add missing virGetGroupName()
Add missing function if !HAVE_GETPWUID_R.
2012-01-25 12:27:11 -07:00
Cole Robinson
275155f664 storage: Fix any VolLookupByPath if we have an empty logical pool
On F16 at least, empty volume groups don't have a directory under /dev.
The directory only appears once a logical volume is created.

This tickles some behavior in BackendStablePath which ends with
libvirt sleeping for 5 seconds while waiting for the directory to appear.
This causes all sorts of problems for the virStorageVolLookupByPath API
which virtinst uses, even if trying to resolve a path that is independent
of the logical pool.

In reality we don't even need to do that checking since logical pools
always have a stable target path. Short circuit the polling in that
case.

Fixes bug 782261
2012-01-25 13:15:35 -05:00
Eric Blake
16dc4ade7a lxc: export container=lxc-libvirt for systemd
Systemd detects containers based on whether they have
an environment variable starting with 'container=lxc';
using a longer name fits the expectations, while also
allowing detection of who created the container.

Requested by Lennart Poettering, in response to
https://bugs.freedesktop.org/show_bug.cgi?id=45175

* src/lxc/lxc_container.c (lxcContainerBuildInitCmd): Add another
env-var.
2012-01-25 08:25:37 -07:00
Daniel P. Berrange
c30a78c398 Don't bind mount onto a char device for /dev/ptmx in LXC
The current setup code for LXC is bind mounting /dev/pts/ptmx
on top of a character device /dev/ptmx. This is denied by SELinux
policy and is just wrong. The target of a bind mount should just
be a plain file

* src/lxc/lxc_container.c: Don't bind /dev/pts/ptmx onto
  a char device
2012-01-25 14:11:08 +00:00
Daniel P. Berrange
ef7efbc6ef Add virFileTouch for creating empty files
Add a virFileTouch API which ensures that a file will always
exist, even if zero length

* src/util/virfile.c, src/util/virfile.h,
  src/libvirt_private.syms: Introduce virFileTouch
2012-01-25 14:11:03 +00:00
Michal Privoznik
109593ecb0 snapshots: Introduce VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE flag
With this flag, virDomainSnapshotCreate will use fs-freeze and
fs-thaw guest agent commands to quiesce guest's disks.
2012-01-25 10:59:41 +01:00
Michal Privoznik
29bce12ada qemu_agent: Create file system freeze and thaw functions
These functions simply issue command to guest agent which
should freeze or unfreeze all file systems within guest.
2012-01-25 10:59:41 +01:00
Jiri Denemark
24a001493a qemu: Emit bootindex even for direct boot
Direct boot (using kernel, initrd, and command line) is used by
virt-install/virt-manager for network install. While any bootindex has
no direct effect since -kernel is always first, we need it as a hint for
SeaBIOS to present disks in the same order as they will be presented
during normal boot.
2012-01-25 10:38:01 +01:00
Eric Blake
4d71ff450f metadata: group metadata next to description
It's better to group all the metadata together.  This is a
cosmetic output change; since the RNG allows interleave, it
doesn't matter where the user stuck it on input, and an XPath
query will find the same information when parsing the output.

* src/conf/domain_conf.c (virDomainDefFormatInternal): Output
metadata earlier.
* docs/formatdomain.html.in: Update documentation.
* tests/domainsnapshotxml2xmlout/metadata.xml: Update test.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-metadata.xml: Likewise.
2012-01-24 17:40:23 -07:00
Zeeshan Ali (Khattak)
fa981fc945 Allow custom metadata in domain configuration XML
Applications can now insert custom nodes and hierarchies into domain
configuration XML. Although currently not enforced, applications are
required to use their own namespaces on every custom node they insert,
with only one top-level element per namespace.
2012-01-24 17:06:34 -07:00
Laszlo Ersek
d19149dda8 virCommandProcessIO(): make poll() usage more robust
POLLIN and POLLHUP are not mutually exclusive. Currently the following
seems possible: the child writes 3K to its stdout or stderr pipe, and
immediately closes it. We get POLLIN|POLLHUP (I'm not sure that's possible
on Linux, but SUSv4 seems to allow it). We read 1K and throw away the
rest.

When poll() returns and we're about to check the /revents/ member in a
given array element, let's map all the revents bits to two (independent)
ideas: "let's attempt to read()", and "let's attempt to write()". This
should cover all errors, EOFs, and normal conditions; the read()/write()
call should report any pending error.

Under this approach, both POLLHUP and POLLERR are mapped to "needs read()"
if we're otherwise prepared for POLLIN. POLLERR also maps to "needs
write()" if we're otherwise prepared for POLLOUT. The rest of the mappings
(POLLPRI etc.) would be easy, but probably useless for pipes.

Additionally, SUSv4 doesn't appear to forbid POLLIN|POLLERR (or
POLLOUT|POLLERR) set simultaneously. One could argue that the read() or
write() call would return without blocking in these cases (with an error),
so POLLIN / POLLOUT would be justified beside POLLERR.

The code now penalizes POLLIN|POLLERR differently from plain POLLERR. The
former (ie. read() returning -1) is terminal and we jump to cleanup, while
plain POLLERR masks only the affected file descriptor for the future.
Let's unify those.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
2012-01-24 13:50:45 -07:00
Alon Levy
3f0a757e80 src/datatypes.h: fix typo
Signed-off-by: Alon Levy <alevy@redhat.com>
2012-01-24 13:48:43 +01:00
Daniel P. Berrange
fb52a39928 Wire up QEMU agent to reboot/shutdown APIs
This makes use of the QEMU guest agent to implement the
virDomainShutdownFlags and virDomainReboot APIs. With
no flags specified, it will prefer to use the agent, but
fallback to ACPI. Explicit choice can be made by using
a suitable flag

* src/qemu/qemu_driver.c: Wire up use of agent
2012-01-24 12:19:51 +01:00
Daniel P. Berrange
0b7ddf9e77 Add new virDomainShutdownFlags API
Add a new API virDomainShutdownFlags and define:

    VIR_DOMAIN_SHUTDOWN_DEFAULT        = 0,
    VIR_DOMAIN_SHUTDOWN_ACPI_POWER_BTN = (1 << 0),
    VIR_DOMAIN_SHUTDOWN_GUEST_AGENT    = (1 << 1),

Also define some flags for the reboot API

    VIR_DOMAIN_REBOOT_DEFAULT        = 0,
    VIR_DOMAIN_REBOOT_ACPI_POWER_BTN = (1 << 0),
    VIR_DOMAIN_REBOOT_GUEST_AGENT    = (1 << 1),

Although these two APIs currently have the same flags, using
separate enums allows them to expand separately in the future.

Add stub impls of the new API for all existing drivers
2012-01-24 12:19:51 +01:00
Daniel P. Berrange
c160ce3316 QEMU guest agent support
There is now a standard QEMU guest agent that can be installed
and given a virtio serial channel

    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/f16x86_64.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
    </channel>

The protocol that runs over the guest agent is JSON based and
very similar to the JSON monitor. We can't use exactly the same
code because there are some odd differences in the way messages
and errors are structured. The qemu_agent.c file is based on
a combination and simplification of qemu_monitor.c and
qemu_monitor_json.c

* src/qemu/qemu_agent.c, src/qemu/qemu_agent.h: Support for
  talking to the agent for shutdown
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Add thread
  helpers for talking to the agent
* src/qemu/qemu_process.c: Connect to agent whenever starting
  a guest
* src/qemu/qemu_monitor_json.c: Make variable static
2012-01-24 12:19:51 +01:00
Stefan Berger
da094fe201 Compare two hash tables for equality
Add function to compare two hash tables for equality.
2012-01-23 15:35:54 -05:00
Guido Günther
549cedc6a9 xen: Don't crash when we fail to init caps
by dereferencing a NULL pointer in the call to
virNodeSuspendGetTargetMask.
2012-01-23 12:45:06 +01:00
Guido Günther
c76a17b428 xen: properly report out of memory when hvm_type is too small 2012-01-21 16:19:24 +01:00
Eric Blake
32b57a72de maint: cleanup qemu capabilities
Fix inconsistent whitespace and long lines.

* src/qemu/qemu_capabilities.h (qemuCapsFlags): Improve formatting.
2012-01-20 16:34:29 -07:00
Eric Blake
bb69630b6c maint: enforce use of _LAST marker
When converting a linear enum to a string, we have checks in
place in the VIR_ENUM_IMPL macro to ensure that there is one
string for every value, which lets us quickly flag if a user
added a value but forgot to add a counterpart string.  However,
this only works if we use the _LAST marker.

* cfg.mk (sc_require_enum_last_marker): New syntax check.
* src/conf/domain_conf.h (virDomainSnapshotState): Add new marker.
* src/conf/domain_conf.c (virDomainSnapshotState): Fix offender.
* src/qemu/qemu_monitor_json.c (qemuMonitorWatchdogAction)
(qemuMonitorIOErrorAction, qemuMonitorGraphicsAddressFamily):
Likewise.
* src/util/virtypedparam.c (virTypedParameter): Likewise.
2012-01-20 16:16:04 -07:00
Eric Blake
7b4e5693c1 API: make declaration of _LAST enum values conditional
Although this is a public API break, it only affects users that
were compiling against *_LAST values, and can be trivially
worked around without impacting compilation against older
headers, by the user defining VIR_ENUM_SENTINELS before using
libvirt.h.  It is not an ABI break, since enum values do not
appear as .so entry points.  Meanwhile, it prevents users from
using non-stable enum values without explicitly acknowledging
the risk of doing so.

See this list discussion:
https://www.redhat.com/archives/libvir-list/2012-January/msg00804.html

* include/libvirt/libvirt.h.in: Hide all sentinels behind
LIBVIRT_ENUM_SENTINELS, and add missing sentinels.
* src/internal.h (VIR_DEPRECATED): Allow inclusion after
libvirt.h.
(LIBVIRT_ENUM_SENTINELS): Expose sentinels internally.
* daemon/libvirtd.h: Use the sentinels.
* src/remote/remote_protocol.x (includes): Don't expose sentinels.
* python/generator.py (enum): Likewise.
* tests/cputest.c (cpuTestCompResStr): Silence compiler warning.
* tools/virsh.c (vshDomainStateReasonToString)
(vshDomainControlStateToString): Likewise.
2012-01-20 16:05:51 -07:00
Eric Blake
c2551bea56 error: drop old-style error reporting
While we still don't want to enable gcc's new -Wformat-literal
warning, I found a rather easy case where the warning could be
reduced, by getting rid of obsolete error-reporting practices.
This is the last place where we were passing the (unused) net
and conn arguments for constructing an error.

* src/util/virterror_internal.h (virErrorMsg): Delete prototype.
(virReportError): Delete macro.
* src/util/virterror.c (virErrorMsg): Make static.
* src/libvirt_private.syms (virterror_internal.h): Drop export.
* src/util/conf.c (virConfError): Convert to macro.
(virConfErrorHelper): New function, and adjust error calls.
* src/xen/xen_hypervisor.c (virXenErrorFunc): Delete.
(xenHypervisorGetSchedulerType)
(xenHypervisorGetSchedulerParameters)
(xenHypervisorSetSchedulerParameters)
(xenHypervisorDomainBlockStats)
(xenHypervisorDomainInterfaceStats)
(xenHypervisorDomainGetOSType)
(xenHypervisorNodeGetCellsFreeMemory, xenHypervisorGetVcpus):
Update callers.
2012-01-19 13:26:04 -07:00
Eric Blake
9e48c22534 util: use new virTypedParameter helpers
Reusing common code makes things smaller; it also buys us some
additional safety, such as now rejecting duplicate parameters
during a set operation.

* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters)
(qemuDomainSetMemoryParameters, qemuDomainSetNumaParameters)
(qemuSetSchedulerParametersFlags)
(qemuDomainSetInterfaceParameters, qemuDomainSetBlockIoTune)
(qemuDomainGetBlkioParameters, qemuDomainGetMemoryParameters)
(qemuDomainGetNumaParameters, qemuGetSchedulerParametersFlags)
(qemuDomainBlockStatsFlags, qemuDomainGetInterfaceParameters)
(qemuDomainGetBlockIoTune): Use new helpers.
* src/esx/esx_driver.c (esxDomainSetSchedulerParametersFlags)
(esxDomainSetMemoryParameters)
(esxDomainGetSchedulerParametersFlags)
(esxDomainGetMemoryParameters): Likewise.
* src/libxl/libxl_driver.c
(libxlDomainSetSchedulerParametersFlags)
(libxlDomainGetSchedulerParametersFlags): Likewise.
* src/lxc/lxc_driver.c (lxcDomainSetMemoryParameters)
(lxcSetSchedulerParametersFlags, lxcDomainSetBlkioParameters)
(lxcDomainGetMemoryParameters, lxcGetSchedulerParametersFlags)
(lxcDomainGetBlkioParameters): Likewise.
* src/test/test_driver.c (testDomainSetSchedulerParamsFlags)
(testDomainGetSchedulerParamsFlags): Likewise.
* src/xen/xen_hypervisor.c (xenHypervisorSetSchedulerParameters)
(xenHypervisorGetSchedulerParameters): Likewise.
2012-01-19 13:20:30 -07:00
Eric Blake
61ca98b054 util: add new file for virTypedParameter utils
Preparation for another patch that refactors common patterns
into the new file for fewer lines of code overall.

* src/util/util.h (virTypedParameterArrayClear): Move...
* src/util/virtypedparam.h: ...to new file.
(virTypedParameterArrayValidate, virTypedParameterAssign): New
prototypes.
* src/util/util.c (virTypedParameterArrayClear): Likewise.
* src/util/virtypedparam.c: New file.
* po/POTFILES.in: Mark file for translation.
* src/Makefile.am (UTIL_SOURCES): Build it.
* src/libvirt_private.syms (util.h): Split...
(virtypedparam.h): to new section.
(virkeycode.h): Sort.
* daemon/remote.c: Adjust callers.
* tools/virsh.c: Likewise.
2012-01-19 13:14:10 -07:00
Eric Blake
9c3775765e lxc: use live/config helper
Based on qemu changes made in commits ae523427 and 659ded58.

* src/lxc/lxc_driver.c (lxcSetSchedulerParametersFlags)
(lxcGetSchedulerParametersFlags, lxcDomainSetBlkioParameters)
(lxcDomainGetBlkioParameters): Use helpers.
(lxcDomainSetBlkioParameters): Allow setting live and config at
once.
2012-01-19 13:14:10 -07:00
Eric Blake
927cfaf467 threads: check for failure to set thread-local value
We had a memory leak on a very arcane OOM situation (unlikely to ever
hit in practice, but who knows if libvirt.so would ever be linked
into some other program that exhausts all thread-local storage keys?).
I found it by code inspection, while analyzing a valgrind report
generated by Alex Jia.

* src/util/threads.h (virThreadLocalSet): Alter signature.
* src/util/threads-pthread.c (virThreadHelper): Reduce allocation
lifetime.
(virThreadLocalSet): Detect failure.
* src/util/threads-win32.c (virThreadLocalSet): Likewise.
(virCondWait): Fix caller.
* src/util/virterror.c (virLastErrorObject): Likewise.
2012-01-19 13:14:10 -07:00
Daniel P. Berrange
91f79d27cc Fix rpc generator to anchor matches for method names
The RPC generator transforms methods matching certain
patterns like 'id' or 'uuid', etc but does not anchor
its matches to the end of the word. So if a method
contains 'id' in the middle (eg virIdentity) then the
RPC generator munges that.

* src/rpc/gendispatch.pl: Anchor matches
2012-01-19 15:39:54 +00:00
Daniel P. Berrange
2f9dc36d49 Rename APIs for fetching UNIX socket credentials
To avoid a namespace clash with forthcoming identity APIs,
rename the virNet*GetLocalIdentity() APIs to have the form
virNet*GetUNIXIdentity()

* daemon/remote.c, src/libvirt_private.syms: Update
  for renamed APIs
* src/rpc/virnetserverclient.c, src/rpc/virnetserverclient.h,
  src/rpc/virnetsocket.c, src/rpc/virnetsocket.h: s/LocalIdentity/UNIXIdentity/
2012-01-19 15:39:52 +00:00
Daniel P. Berrange
1fff03ef9b Add virGetGroupName to convert from GID to group name 2012-01-19 13:30:04 +00:00
Daniel P. Berrange
59cf039815 Also retrieve GID from SO_PEERCRED
* daemon/remote.c, src/rpc/virnetserverclient.c,
  src/rpc/virnetserverclient.h, src/rpc/virnetsocket.c,
  src/rpc/virnetsocket.h: Add gid parameter
2012-01-19 13:30:03 +00:00
Martin Kletzander
4c82f09ef0 Added capability checking for block <iotune> setting.
There was missing capability for blkiotune and thus specifying these
settings caused libvirt to run qemu with invalid parameters and then
reporting qemu error instead of the standard libvirt one. The support
for blkiotune setting was added in upstream qemu repo under commit
0563e191516289c9d2f282a8c50f2eecef2fa773.
2012-01-18 09:56:00 -07:00
Daniel P. Berrange
c53ba61b21 Fix startup of LXC containers with filesystems containing symlinks
Given an LXC guest with a root filesystem path of

  /export/lxc/roots/helloworld/root

During startup, we will pivot the root filesystem to end up
at

  /.oldroot/export/lxc/roots/helloworld/root

We then try to open

  /.oldroot/export/lxc/roots/helloworld/root/dev/pts

Now consider if '/export/lxc' is an absolute symlink pointing
to '/media/lxc'. The kernel will try to open

  /media/lxc/roots/helloworld/root/dev/pts

whereas it should be trying to open

  /.oldroot//media/lxc/roots/helloworld/root/dev/pts

To deal with the fact that the root filesystem can be moved,
we need to resolve symlinks in *any* part of the filesystem
source path.

* src/libvirt_private.syms, src/util/util.c,
  src/util/util.h: Add virFileResolveAllLinks to resolve
  all symlinks in a path
* src/lxc/lxc_container.c: Resolve all symlinks in filesystem
  paths during startup
2012-01-18 13:34:42 +00:00
Osier Yang
7aeb9794d2 qemu: Prohibit reattaching node device if it is in use
It doesn't make sense to reattach a device to host while it's
still in use, e.g, by a domain.
2012-01-17 17:15:22 -07:00
Osier Yang
6be610bfaa qemu: Introduce inactive PCI device list
pciTrySecondaryBusReset checks if there is active device on the
same bus, however, qemu driver doesn't maintain an effective
list for the inactive devices, and it passes meaningless argument
for parameter "inactiveDevs". e.g. (qemuPrepareHostdevPCIDevices)

if (!(pcidevs = qemuGetPciHostDeviceList(hostdevs, nhostdevs)))
    return -1;

..skipped...

if (pciResetDevice(dev, driver->activePciHostdevs, pcidevs) < 0)
    goto reattachdevs;

NB, the "pcidevs" used above are extracted from domain def, and
thus one won't be able to attach a device of which bus has other
device even detached from host (nodedev-detach). To see more
details of the problem:

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=773667

This patch is to resolve the problem by introducing an inactive
PCI device list (just like qemu_driver->activePciHostdevs), and
the whole logic is:

  * Add the device to inactive list during nodedev-dettach
  * Remove the device from inactive list during nodedev-reattach
  * Remove the device from inactive list during attach-device
    (for non-managed device)
  * Add the device to inactive list after detach-device, only
    if the device is not managed

With the above, we have a sufficient inactive PCI device list, and thus
we can use it for pciResetDevice. e.g.(qemuPrepareHostdevPCIDevices)

if (pciResetDevice(dev, driver->activePciHostdevs,
                   driver->inactivePciHostdevs) < 0)
    goto reattachdevs;
2012-01-17 17:05:32 -07:00
Deepak C Shetty
d9e0d8204b Add new attribute wrpolicy to <driver> element
This introduces new attribute wrpolicy with only supported
value as immediate. This will be an optional
attribute with no defaults. This helps specify whether
to skip the host page cache.

When wrpolicy is specified, meaning when wrpolicy=immediate
a writeback is explicitly initiated for the dirty pages in
the host page cache as part of the guest file write operation.

Usage:
<filesystem type='mount' accessmode='passthrough'>
  <driver type='path' wrpolicy='immediate'/>
  <source dir='/export/to/guest'/>
  <target dir='mount_tag'/>
</filesystem>

Currently this only works with type='mount' for the QEMU/KVM driver.

Signed-off-by: Deepak C Shetty <deepakcs@linux.vnet.ibm.com>
2012-01-17 15:37:42 -07:00
Jiri Denemark
9619d8a62e qemu: Don't break domain with 0:0:2.0 assigned to anything but VGA
In the past we didn't reserve 0:0:2.0 PCI address if there was no video
device assigned to a domain, which made it impossible to add a video
device later on. So we fixed it (commit v0.9.0-37-g7b2cac1) by always
reserving that address. However, that breaks existing domains without
video devices that already have another device assigned to the
problematic address.

This patch reserves address 0:0:2.0 only in case it was not explicitly
assigned to another device, which means libvirt will try to keep this
address free and will not automatically assign it new devices. But
existing domains for which older libvirt already assigned the address to
a non-video device will keep working as they used to work before 0.9.1.
Moreover, users who want to create a domain without a video device and
use its address for another device may do so by explicitly configuring
the PCI address in domain XML.
2012-01-17 21:01:23 +01:00
Martin Kletzander
e1eb93470e Fixed dumpxml of <iotune> parameters
The output of dumpxml for <iotune> settings was misformatted, this
patch just adds missing newlines.
2012-01-17 11:47:30 -07:00
Jiri Denemark
e7201afdf7 qemu: Add support for host CPU modes
This adds support for host-model and host-passthrough CPU modes to qemu
driver. The host-passthrough mode is mapped to -cpu host.
2012-01-17 12:22:19 +01:00
Jiri Denemark
c8506d6662 Taint domains configured with cpu mode=host-passthrough
There are several reasons for doing this:

- the CPU specification is out of libvirt's control so we cannot
  guarantee stable guest ABI
- not every feature of a CPU may actually work as expected when
  advertised directly to a guest
- migration between two machines with exactly the same CPU may work but
  no guarantees can be made
- this mode is not supported and its use is at one's own risk
2012-01-17 11:49:42 +01:00
Jiri Denemark
277bc0dcb8 cpu: Update guest CPU in host-* mode
VIR_DOMAIN_XML_UPDATE_CPU flag for virDomainGetXMLDesc may be used to
get updated custom mode guest CPU definition in case it depends on host
CPU. This patch implements the same behavior for host-model and
host-passthrough CPU modes.
2012-01-17 11:42:56 +01:00
Jiri Denemark
f7dd3a4e62 Add support for cpu mode attribute
The mode can be either of "custom" (default), "host-model",
"host-passthrough". The semantics of each mode is described in the
following examples:

- guest CPU is a default model with specified topology:
    <cpu>
      <topology sockets='1' cores='2' threads='1'/>
    </cpu>

- guest CPU matches selected model:
    <cpu mode='custom' match='exact'>
      <model>core2duo</model>
    </cpu>

- guest CPU should be a copy of host CPU as advertised by capabilities
  XML (this is a short cut for manually copying host CPU specification
  from capabilities to domain XML):
    <cpu mode='host-model'/>

  In case a hypervisor does not support the exact host model, libvirt
  automatically falls back to a closest supported CPU model and
  removes/adds features to match host. This behavior can be disabled by
    <cpu mode='host-model'>
      <model fallback='forbid'/>
    </cpu>

- the same as previous returned by virDomainGetXMLDesc with
  VIR_DOMAIN_XML_UPDATE_CPU flag:
    <cpu mode='host-model' match='exact'>
      <model fallback='allow'>Penryn</model>       --+
      <vendor>Intel</vendor>                         |
      <topology sockets='2' cores='4' threads='1'/>  + copied from
      <feature policy='require' name='dca'/>         | capabilities XML
      <feature policy='require' name='xtpr'/>        |
      ...                                          --+
    </cpu>

- guest CPU should be exactly the same as host CPU even in the aspects
  libvirt doesn't model (such domain cannot be migrated unless both
  hosts contain exactly the same CPUs):
    <cpu mode='host-passthrough'/>

- the same as previous returned by virDomainGetXMLDesc with
  VIR_DOMAIN_XML_UPDATE_CPU flag:
    <cpu mode='host-passthrough' match='minimal'>
      <model>Penryn</model>                        --+ copied from caps
      <vendor>Intel</vendor>                         | XML but doesn't
      <topology sockets='2' cores='4' threads='1'/>  | describe all
      <feature policy='require' name='dca'/>         | aspects of the
      <feature policy='require' name='xtpr'/>        | actual guest CPU
      ...                                          --+
    </cpu>
2012-01-17 11:39:23 +01:00
Jiri Denemark
a6f88cbd2d cpu: Optionally forbid fallback CPU models
In case a hypervisor doesn't support the exact CPU model requested by a
domain XML, we automatically fallback to a closest CPU model the
hypervisor supports (and make sure we add/remove any additional features
if needed). This patch adds 'fallback' attribute to model element, which
can be used to disable this automatic fallback.
2012-01-17 11:39:19 +01:00
Jiri Denemark
5e31e71365 Clarify semantics of virDomainMigrate{,ToURI}2
Commit 5d784bd6d7 was a nice attempt to
clarify the semantics by requiring domain name from dxml to either match
original name or dname. However, setting dxml domain name to dname
doesn't really work since destination host needs to know the original
domain name to be able to use it in migration cookies. This patch
requires domain name in dxml to match the original domain name. The
change should be safe and backward compatible since migration would fail
just a bit later in the process.
2012-01-17 10:31:24 +01:00
Michael Ellerman
bfbbc49638 conf: Remove do-nothing validation functions
There are three address validation routines that do nothing:
  virDomainDeviceDriveAddressIsValid()
  virDomainDeviceUSBAddressIsValid()
  virDomainDeviceVirtioSerialAddressIsValid()

Remove them, and replace their call sites with "1" which is what they
currently return. In some cases this means we can remove an entire
if block.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2012-01-13 16:18:03 -07:00
Michael Ellerman
69dde2e653 tests: Teach qemuxml2argvtest about spapr-vio addresses
We can't call qemuCapsExtractVersionInfo() from test code, because it
expects to be able to call the emulator, and for testing we have fake
emulators that can't be executed. For that reason qemuxml2argvtest.c
doesn't call qemuDomainAssignPCIAddresses(), instead it open codes its
own version.

That means we can't call qemuDomainAssignAddresses() from the test code,
instead we need to manually call qemuDomainAssignSpaprVioAddresses().

Also add logic to cope with qemuDomainAssignSpaprVioAddresses() failing,
so that we can write a test that checks for a known failure in there.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2012-01-13 16:08:22 -07:00
Paolo Bonzini
c9abfadf37 qemu: add virtio-scsi controller model
Adding a new model for virtio-scsi roughly follows the same scheme
as the previous patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-13 14:54:48 -07:00
Paolo Bonzini
7b345b69f2 qemu: add ibmvscsi controller model
KVM will be able to use a PCI SCSI controller even on POWER.  Let
the user specify the vSCSI controller by other means than a default.

After this patch, the QEMU driver will actually look at the model
and reject anything but auto, lsilogic and ibmvscsi.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-13 14:13:30 -07:00
Adam Litke
c972237ee1 events: Return the correct number of registered events
Commit d09f6ba5fe introduced a regression in event
registration.  virDomainEventCallbackListAddID() will only return a positive
integer if the type of event being registered is VIR_DOMAIN_EVENT_ID_LIFECYCLE.
For other event types, 0 is always returned on success.  This has the
unfortunate side effect of not enabling remote event callbacks because
remoteDomainEventRegisterAny() uses the return value from the local call to
determine if an event callback needs to be registered on the remote end.

Make sure virDomainEventCallbackListAddID() returns the callback count for the
eventID being registered.

Signed-off-by: Adam Litke <agl@us.ibm.com>
2012-01-13 13:59:48 -07:00
Paolo Bonzini
ed6bd4bc49 export virNetDevGetVirtualFunctions as a private symbol
This avoids a linking error.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-01-13 08:12:16 -07:00
Osier Yang
5edfcaae6f qemu: Support copy on read for disk
The new introduced optional attribute "copy_on_read</code> controls
whether to copy read backing file into the image file. The value can
be either "on" or "off". Copy-on-read avoids accessing the same backing
file sectors repeatedly and is useful when the backing file is over a
slow network. By default copy-on-read is off.
2012-01-13 10:08:15 +08:00
Martin Kletzander
b54de0830a Added check for maximum number of vcpus exceeding topology limit
Earlier, when the number of vcpus was greater than the topology allowed,
libvirt didn't raise an error and continued, resulting in running qemu
with parameters making no sense. Even though qemu did not report any
error itself, the number of vcpus was set to maximum allowed by the
topology.
2012-01-12 16:02:08 -07:00
Eric Blake
0327ff0798 uuid: fix off-by-one
Detected by Coverity.  Although unlikely, if we are ever started
with stdin closed, we could reach a situation where we open a
uuid file but then fail to close it, making that file the new
stdin for the rest of the process.

* src/util/uuid.c (getDMISystemUUID): Allow for stdin.
2012-01-12 15:18:23 -07:00
Daniel P. Berrange
08272dc8b4 Rsync keymaps.csv file with GTK-VNC 2012-01-12 20:44:55 +00:00
Daniel P. Berrange
9130396214 Re-write LXC controller end-of-file I/O handling yet again
Currently the LXC controller attempts to deal with EOF on a
tty by spawning a thread to do an edge triggered epoll_wait().
This avoids the normal event loop spinning on POLLHUP. There
is a subtle mistake though - even after seeing POLLHUP on a
master PTY, it is still perfectly possible & valid to write
data to the PTY. There is a buffer that can be filled with
data, even when no client is present.

The second mistake is that the epoll_wait() thread was not
looking for the EPOLLOUT condition, so when a new client
connects to the LXC console, it had to explicitly send a
character before any queued output would appear.

Finally, there was in fact no need to spawn a new thread to
deal with epoll_wait(). The epoll file descriptor itself
can be poll()'d on normally.

This patch attempts to deal with all these problems.

 - The blocking epoll_wait() thread is replaced by a poll
   on the epoll file descriptor which then does a non-blocking
   epoll_wait() to handle events
 - Even if POLLHUP is seen, we continue trying to write
   any pending output until getting EAGAIN from write.
 - Once write returns EAGAIN, we modify the epoll event
   mask to also look for EPOLLOUT

* src/lxc/lxc_controller.c: Avoid stalled I/O upon
  connected to an LXC console
2012-01-12 20:42:52 +00:00
Michal Privoznik
833b901cb7 stream: Check for stream EOF
If client stream does not have any data to sink and neither received
EOF, a dummy packet is sent to the daemon signalising client is ready to
sink some data. However, after we added event loop to client a race may
occur:

Thread 1 calls virNetClientStreamRecvPacket and since no data are cached
nor stream has EOF, it decides to send dummy packet to server which will
sent some data in turn. However, during this decision and actual message
exchange with server -

Thread 2 receives last stream data from server. Therefore an EOF is set
on stream and if there is a call waiting (which is not yet) it is woken
up. However, Thread 1 haven't sent anything so far, so there is no call
to be woken up. So this thread sent dummy packet to daemon, which
ignores that as no stream is associated with such packet and therefore
no reply will ever come.

This race causes client to hang indefinitely.
2012-01-12 12:02:40 +01:00
Deepak C Shetty
99fbb3866c Do not generate security_model when fs driver is anything but 'path'
QEMU does not support security_model for anything but 'path' fs driver type.
Currently in libvirt, when security_model ( accessmode attribute) is not
specified it auto-generates it irrespective of the fs driver type, which
can result in a qemu error for drivers other than path. This patch ensures
that the qemu cmdline is correctly generated by taking into account the
fs driver type.

Signed-off-by: Deepak C Shetty <deepakcs@linux.vnet.ibm.com>
2012-01-11 13:48:52 -07:00
Shradha Shah
52d064f42d Added new option to virsh net-dumpxml called --inactive
The above option helps to differentiate between implicit and explicit
interface pools.
2012-01-11 13:15:09 -07:00
Shradha Shah
42c81d18c2 Functionality to implicitly get interface pool from SR-IOV PF.
If a system has 64 or more VF's, it is quite tedious to mention each VF
in the interface pool.
The following modification will implicitly create an interface pool from
the SR-IOV PF.
2012-01-11 13:14:12 -07:00
Shradha Shah
b01b53de3f Adding the element pf to network xml.
This element will help the user to just specify the SR-IOV physical
function in order to access all the Virtual functions attached to it.
2012-01-11 13:10:21 -07:00
Shradha Shah
3a0c717b9e Added Function virNetDevGetVirtualFunctions
This functions enables us to get the Virtual Functions attached to
a Physical function given the name of a SR-IOV physical functio.

In order to accomplish the task, added a getter function pciGetDeviceAddrString
to get the BDF of the Virtual Function in a char array.
2012-01-11 13:01:16 -07:00
Shradha Shah
f19338c66c Added function pciSysfsFile to enable access to the PCI SYSFS files. 2012-01-11 13:01:16 -07:00
Eric Blake
90cd148027 build: fix build on mingw with netcf available
The autobuilder pointed out an odd failure on mingw:
../../src/interface/netcf_driver.c:644:5: error: unknown field 'close_used_without_including_unistd_h' specified in initializer
cc1: warnings being treated as errors

This is because the gnulib headers #define close to different strings,
according to which headers are included, in order to work around some
odd mingw problems with close(), and these defines happen to also
affect field members declared with a name of struct foo.close. As long
as all headers are included before both the definition and use of the
struct, the various #define doesn't matter, but the netcf file hit
an instance where things were included in a different order.  Fix this
for all clients that use a struct member named 'close'.

* src/driver.h: Include <unistd.h> before using 'close'.
2012-01-11 07:54:10 -07:00
Eric Blake
18262b5587 build: avoid spurious compiler warning
For some weird reason, i686-pc-mingw32-gcc version 4.6.1 at -O2 complained:
../../src/conf/nwfilter_params.c: In function 'virNWFilterVarCombIterCreate':
../../src/conf/nwfilter_params.c:346:23: error: 'minValue' may be used uninitialized in this function [-Werror=uninitialized]
../../src/conf/nwfilter_params.c:319:28: note: 'minValue' was declared here
../../src/conf/nwfilter_params.c:344:23: error: 'maxValue' may be used uninitialized in this function [-Werror=uninitialized]
../../src/conf/nwfilter_params.c:319:18: note: 'maxValue' was declared here
cc1: all warnings being treated as errors

even though all paths of the preceding switch statement either
assign the variables or return.

* src/conf/nwfilter_params.c (virNWFilterVarCombIterAddVariable):
Initialize variables.
2012-01-11 06:32:52 -07:00
Stefan Berger
64484d550d Address side effects of accessing vars via index
Address side effect of accessing a variable via an index: Filters
accessing a variable where an element is accessed that is beyond the
size of the list (for example $TEST[10] and only 2 elements are available)
cannot instantiate that filter. Test for this and report proper error
to user.
2012-01-11 06:42:37 -05:00
Stefan Berger
caa6223a9b Add access to elements of variables via index
This patch adds access to single elements of variables via index. Example:

  <rule action='accept' direction='in' priority='500'>
    <tcp srcipaddr='$ADDR[1]' srcportstart='$B[2]'/>
  </rule>
2012-01-11 06:42:37 -05:00
Stefan Berger
80e9a5cd4c Introduce possibility to have an iterator per variable
This patch introduces the capability to use a different iterator per
variable.

The currently supported notation of variables in a filtering rule like

  <rule action='accept' direction='out'>
     <tcp  srcipaddr='$A' srcportstart='$B'/>
  </rule>

processes the two lists 'A' and 'B' in parallel. This means that A and B
must have the same number of 'N' elements and that 'N' rules will be 
instantiated (assuming all tuples from A and B are unique).

In this patch we now introduce the assignment of variables to different
iterators. Therefore a rule like

  <rule action='accept' direction='out'>
     <tcp  srcipaddr='$A[@1]' srcportstart='$B[@2]'/>
  </rule>

will now create every combination of elements in A with elements in B since
A has been assigned to an iterator with Id '1' and B has been assigned to an
iterator with Id '2', thus processing their value independently.

The first rule has an equivalent notation of

  <rule action='accept' direction='out'>
     <tcp  srcipaddr='$A[@0]' srcportstart='$B[@0]'/>
  </rule>
2012-01-11 06:42:37 -05:00
Stefan Berger
134c56764f Optimize the elements the iterator visits.
In this patch we introduce testing whether the iterator points to a
unique set of entries that have not been seen before at one of the previous
iterations. The point is to eliminate duplicates and with that unnecessary
filtering rules by preventing identical filtering rules from being
instantiated.
Example with two lists:

list1 = [1,2,1]
list2 = [1,3,1]

The 1st iteration would take the 1st items of each list -> 1,1
The 2nd iteration would take the 2nd items of each list -> 2,3
The 3rd iteration would take the 3rd items of each list -> 1,1 but
skip them since this same pair has already been encountered in the 1st
iteration

Implementation-wise this is solved by taking the n-th element of list1 and
comparing it against elements 1..n-1. If no equivalent is found, then there
is no possibility of this being a duplicate. In case an equivalent element
is found at position i, then the n-th element in the 2nd list is compared
against the i-th element in the 2nd list and if that is not the same, then
this is a unique pair, otherwise it is not unique and we may need to do
the same comparison on the 3rd list.
2012-01-11 06:42:37 -05:00
Jiri Denemark
d82ef7c39d apparmor: Mark pid parameter as unused 2012-01-11 12:27:47 +01:00
Daniel P. Berrange
99be754ada Change security driver APIs to use virDomainDefPtr instead of virDomainObjPtr
When sVirt is integrated with the LXC driver, it will be neccessary
to invoke the security driver APIs using only a virDomainDefPtr
since the lxc_container.c code has no virDomainObjPtr available.
Aside from two functions which want obj->pid, every bit of the
security driver code only touches obj->def. So we don't need to
pass a virDomainObjPtr into the security drivers, a virDomainDefPtr
is sufficient. Two functions also gain a 'pid_t pid' argument.

* src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c,
  src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
  src/security/security_apparmor.c,
  src/security/security_dac.c,
  src/security/security_driver.h,
  src/security/security_manager.c,
  src/security/security_manager.h,
  src/security/security_nop.c,
  src/security/security_selinux.c,
  src/security/security_stack.c: Change all security APIs to use a
  virDomainDefPtr instead of virDomainObjPtr
2012-01-11 09:52:18 +00:00
Eric Blake
4e9953a426 snapshot: allow reuse of existing files in disk snapshot
When disk snapshots were first implemented, libvirt blindly refused
to allow an external snapshot destination that already exists, since
qemu will blindly overwrite the contents of that file during the
snapshot_blkdev monitor command, and we don't like a default of
data loss by default.  But VDSM has a scenario where NFS permissions
are intentionally set so that the destination file can only be
created by the management machine, and not the machine where the
guest is running, so that libvirt will necessarily see the destination
file already existing; adding a flag will allow VDSM to force the file
reuse without libvirt complaining of possible data loss.

https://bugzilla.redhat.com/show_bug.cgi?id=767104

* include/libvirt/libvirt.h.in (virDomainSnapshotCreateFlags): Add
VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT.
* src/libvirt.c (virDomainSnapshotCreateXML): Document it.  Add
note about partial failure.
* tools/virsh.c (cmdSnapshotCreate, cmdSnapshotCreateAs): Add new
flag.
* tools/virsh.pod (snapshot-create, snapshot-create-as): Document
it.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotCreateXML): Implement the new flag.
2012-01-10 11:53:23 -07:00
Eric Blake
529e4a5006 docs: standardize description of flags
We had loads of different styles in describing the @flags parameter
for various APIs, as well as several APIs that didn't list which
enums provided the bit values valid for the flags.

The end result is one of two formats:
@flags: bitwise-OR of vir...Flags
@flags: extra flags; not used yet, so callers should always pass 0

* src/libvirt.c: Use common sentences for flags.  Also,
(virDomainGetBlockIoTune): Mention virTypedParameterFlags.
(virConnectOpenAuth): Mention virConnectFlags.
(virDomainMigrate, virDomainMigrate2, virDomainMigrateToURI)
(virDomainMigrateToURI2): Mention virDomainMigrateFlags.
(virDomainMemoryPeek): Mention virDomainMemoryFlags.
(virStoragePoolBuild): Mention virStoragePoolBuildFlags.
(virStoragePoolDelete): Mention virStoragePoolDeleteFlags.
(virStreamNew): Mention virStreamFlags.
(virDomainOpenGraphics): Mention virDomainOpenGraphicsFlags.
2012-01-10 11:49:54 -07:00
Laine Stump
32f63e912d qemu: check for kvm availability before starting kvm guests
This *kind of* addresses:

  https://bugzilla.redhat.com/show_bug.cgi?id=772395

(it doesn't eliminate the failure to start, but causes libvirt to give
a better idea about the cause of the failure).

If a guest uses a kvm emulator (e.g. /usr/bin/qemu-kvm) and the guest
is started when kvm isn't available (either because virtualization is
unavailable / has been disabled in the BIOS, or the kvm modules
haven't been loaded for some reason), a semi-cryptic error message is
logged:

  libvirtError: internal error Child process (LC_ALL=C
  PATH=/sbin:/usr/sbin:/bin:/usr/bin /usr/bin/qemu-kvm -device ? -device
  pci-assign,? -device virtio-blk-pci,? -device virtio-net-pci,?) status
  unexpected: exit status 1

This patch notices at process start that a guest needs kvm, and checks
for the presence of /dev/kvm (a reasonable indicator that kvm is
available) before trying to execute the qemu binary. If kvm isn't
available, a more useful (too verbose??) error is logged.
2012-01-10 13:42:59 -05:00
Alex Jia
d8d9b0e058 qemu: fix a typo on qemuDomainSetBlkioParameters
It should be a copy-paste error, the result is programming will result in an
infinite loop again due to without iterating 'j' variable.

* src/qemu/qemu_driver.c: fix a typo on qemuDomainSetBlkioParameters.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=770520

Signed-off-by: Alex Jia <ajia@redhat.com>
2012-01-10 11:41:27 +01:00
Jim Fehlig
9ae4ac7ac0 PolicyKit: Check auth before asking client to obtain it
I previously mentioned [1] a PolicyKit issue where libvirt would
proceed with authentication even though polkit-auth failed:

testusr xen134:~> virsh list --all
Attempting to obtain authorization for org.libvirt.unix.manage.
polkit-grant-helper: given auth type (8 -> yes) is bogus
Failed to obtain authorization for org.libvirt.unix.manage.
 Id Name                 State
----------------------------------
  0 Domain-0             running
  - sles11sp1-pv         shut off

AFAICT, libvirt attempts to obtain a privilege it already has,
causing polkit-auth to fail with above message.  Instead of calling
obtain and then checking auth, IMO the workflow should be for the
server to check auth first, and if that fails ask the client to
obtain it and check again.  This workflow also allows for checking
only successful exit of polkit-auth in virConnectAuthGainPolkit().

[1] https://www.redhat.com/archives/libvir-list/2011-December/msg00837.html
2012-01-09 11:23:13 -07:00
Laine Stump
177db08775 qemu: add new disk device='lun' for bus='virtio' & type='block'
In the past, generic SCSI commands issued from a guest to a virtio
disk were always passed through to the underlying disk by qemu, and
the kernel would also pass them on.

As a result of CVE-2011-4127 (see:
http://seclists.org/oss-sec/2011/q4/536), qemu now honors its
scsi=on|off device option for virtio-blk-pci (which enables/disables
passthrough of generic SCSI commands), and the kernel will only allow
the commands for physical devices (not for partitions or logical
volumes). The default behavior of qemu is still to allow sending
generic SCSI commands to physical disks that are presented to a guest
as virtio-blk-pci devices, but libvirt prefers to disable those
commands in the standard virtio block devices, enabling it only when
specifically requested (hopefully indicating that the requester
understands what they're asking for). For this purpose, a new libvirt
disk device type (device='lun') has been created.

device='lun' is identical to the default device='disk', except that:

1) It is only allowed if bus='virtio', type='block', and the qemu
   version is "new enough" to support it ("new enough" == qemu 0.11 or
   better), otherwise the domain will fail to start and a
   CONFIG_UNSUPPORTED error will be logged).

2) The option "scsi=on" will be added to the -device arg to allow
   SG_IO commands (if device !='lun', "scsi=off" will be added to the
   -device arg so that SG_IO commands are specifically forbidden).

Guests which continue to use disk device='disk' (the default) will no
longer be able to use SG_IO commands on the disk; those that have
their disk device changed to device='lun' will still be able to use SG_IO
commands.

*docs/formatdomain.html.in - document the new device attribute value.
*docs/schemas/domaincommon.rng - allow it in the RNG
*tests/* - update the args of several existing tests to add scsi=off, and
 add one new test that will test scsi=on.
*src/conf/domain_conf.c - update domain XML parser and formatter

*src/qemu/qemu_(command|driver|hotplug).c - treat
 VIR_DOMAIN_DISK_DEVICE_LUN *almost* identically to
 VIR_DOMAIN_DISK_DEVICE_DISK, except as indicated above.

Note that no support for this new device value was added to any
hypervisor drivers other than qemu, because it's unclear what it might
mean (if anything) to those drivers.
2012-01-09 10:55:53 -05:00
Laine Stump
e8daeeb136 qemu: add capabilities flags related to SG_IO
This patch adds two capabilities flags to deal with various aspects
of supporting SG_IO commands on virtio-blk-pci devices:

  QEMU_CAPS_VIRTIO_BLK_SCSI
    set if -device virtio-blk-pci accepts the scsi="on|off" option
    When present, this is on by default, but can be set to off to disable
    SG_IO functions.

  QEMU_CAPS_VIRTIO_BLK_SG_IO
    set if SG_IO commands are supported in the virtio-blk-pci driver
    (present since qemu 0.11 according to a qemu developer, if I
     understood correctly)
2012-01-09 10:55:44 -05:00
Laine Stump
1734cdb995 config: report error when script given for inappropriate interface type
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=638633

Although scripts are not used by interfaces of type other than
"ethernet" in qemu, due to the fact that the parser stores the script
name in a union that is only valid when type is ethernet or bridge,
there is no way for anyone except the parser itself to catch the
problem of specifying an interface script for an inappropriate
interface type (by the time the parsed data gets back to the code that
called the parser, all evidence that a script was specified is
forgotten).

Since the parser itself should be agnostic to which type of interface
allows scripts (an example of why: a script specified for an interface
of type bridge is valid for xen domains, but not for qemu domains),
the solution here is to move the script out of the union(s) in the
DomainNetDef, always populate it when specified (regardless of
interface type), and let the driver decide whether or not it is
appropriate.

Currently the qemu, xen, libxml, and uml drivers recognize the script
parameter and do something with it (the uml driver only to report that
it isn't supported). Those drivers have been updated to log a
CONFIG_UNSUPPORTED error when a script is specified for an interface
type that's inappropriate for that particular hypervisor.

(NB: There was earlier discussion of solving this problem by adding a
VALIDATE flag to all libvirt APIs that accept XML, which would cause
the XML to be validated against the RNG files. One statement during
that discussion was that the RNG shouldn't contain hypervisor-specific
things, though, and a proper solution to this problem would require
that (again, because a script for an interface of type "bridge" is
accepted by xen, but not by qemu).
2012-01-08 10:52:24 -05:00
Eric Blake
13a776ca0d qemu: one more client to live/config helper
Commit ae523427 missed one pair of functions that could use
the helper routine.

* src/qemu/qemu_driver.c (qemuSetSchedulerParametersFlags)
(qemuGetSchedulerParametersFlags): Simplify.
2012-01-07 05:08:01 -07:00
Eric Blake
cf6d36257b tests: work around pdwtags 1.9 failure
On rawhide, gcc is new enough to output new DWARF information that
pdwtags has not yet learned, but the resulting 'make check' output
was rather confusing:

$ make -C src check
...
  GEN    virkeepaliveprotocol-structs
die__process_function: DW_TAG_INVALID (0x4109) @ <0x58c> not handled!
WARNING: your pdwtags program is too old
WARNING: skipping the virkeepaliveprotocol-structs test
WARNING: install dwarves-1.3 or newer
...
$ pdwtags --version
v1.9

I've filed the pdwtags deficiency as
https://bugzilla.redhat.com/show_bug.cgi?id=772358

* src/Makefile.am (PDWTAGS): Don't leave -t file behind on version
mismatch.  Soften warning message, since 1.9 is newer than 1.3.
Don't leak stderr from broken version.
2012-01-07 12:02:54 +08:00
Eric Blake
03ea567327 build: fix mingw virCommand build
Commit db371a2 mistakenly added new functions inside a #ifndef WIN32
guard, even though they are needed on all platforms.

* src/util/command.c (virCommandFDSet): Move outside WIN32
conditional.
2012-01-06 17:34:05 -07:00
Alex Jia
b41d440e61 qemu: Avoid memory leaks on qemuParseRBDString
Detected by valgrind. Leak introduced in commit 5745dc1.

* src/qemu/qemu_command.c: fix memory leak on failure and successful path.

* How to reproduce?
% valgrind -v --leak-check=full ./qemuargv2xmltest

* Actual result:

==2196== 80 bytes in 1 blocks are definitely lost in loss record 3 of 4
==2196==    at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==2196==    by 0x39CF07F6E1: strdup (in /lib64/libc-2.12.so)
==2196==    by 0x419823: qemuParseRBDString (qemu_command.c:1657)
==2196==    by 0x4221ED: qemuParseCommandLine (qemu_command.c:5934)
==2196==    by 0x422AFB: qemuParseCommandLineString (qemu_command.c:7561)
==2196==    by 0x416864: testCompareXMLToArgvHelper (qemuargv2xmltest.c:48)
==2196==    by 0x417DB1: virtTestRun (testutils.c:141)
==2196==    by 0x415CAF: mymain (qemuargv2xmltest.c:175)
==2196==    by 0x4174A7: virtTestMain (testutils.c:696)
==2196==    by 0x39CF01ECDC: (below main) (in /lib64/libc-2.12.so)
==2196==
==2196== LEAK SUMMARY:
==2196==    definitely lost: 80 bytes in 1 blocks

Signed-off-by: Alex Jia <ajia@redhat.com>
2012-01-06 14:51:26 +08:00
Hu Tao
6b780f744b qemu: fix a bug in numatune
When setting numa nodeset for a domain which has no nodeset set
before, libvirtd crashes by dereferencing the pointer to the old
nodemask which is null in that case.
2012-01-05 13:04:02 -07:00
Eric Blake
820a2159e9 qemu: fix use-after-free regression
Commit baade4d fixed a memory leak on failure, but in the process,
introduced a use-after-free on success, which can be triggered with:

1. set bandwidth with --live
2. query bandwidth
3. set bandwidth with --live

* src/qemu/qemu_driver.c (qemuDomainSetInterfaceParameters): Don't
free newBandwidth on success.
Reported by Hu Tao.
2012-01-05 10:21:34 -07:00
Eric Blake
302fe95ffa seclabel: fix regression in libvirtd restart
Commit b434329 has a logic bug: seclabel overrides don't set
def->type, but the default value is 0 (aka static).  Restarting
libvirtd would thus reject the XML for any domain with an
override of <seclabel relabel='no'/> (which happens quite
easily if a disk image lives on NFS), with a message:

2012-01-04 22:29:40.949+0000: 6769: error : virSecurityLabelDefParseXMLHelper:2593 : XML error: security label is missing

Fix the logic to never read from an override's def->type, and
to allow a missing <label> subelement when relabel is no.  There's
a lot of stupid double-negatives in the code (!norelabel) because
of the way that we want the zero-initialized defaults to behave.

* src/conf/domain_conf.c (virSecurityLabelDefParseXMLHelper): Use
type field from correct location.
2012-01-05 17:05:02 +08:00
Michal Privoznik
db371a217d command: Discard FD_SETSIZE limit for opened files
Currently, virCommand implementation uses FD_ macros from
sys/select.h. However, those cannot handle more opened files
than FD_SETSIZE. Therefore switch to generalized implementation
based on array of integers.
2012-01-05 09:50:07 +01:00
Jim Fehlig
49d8c8bc0c Support Xen domctl v8
xen-unstable c/s 23874:651aed73b39c added another member to
xen_domctl_getdomaininfo struct and bumped domctl version to 8.
Add a corresponding domctl v8 struct in xen hypervisor sub-driver
and detect domctl v8 during initialization.
2012-01-04 10:17:01 -07:00
Jim Fehlig
beeea90a37 Fix xenstore serial console path for HVM guests
The console path in xenstore is /local/domain/<id>/console/tty
for PV guests (PV console) and /local/domain/<id>/serial/0/tty
(serial console) for HVM guests.  Similar to Xen's in-tree console
client, read the correct path for PV vs HVM.
2012-01-04 10:15:13 -07:00
Michal Privoznik
06b9c5b923 virCommand: Properly handle POLLHUP
It is a good practise to set revents to zero before doing any poll().
Moreover, we should check if event we waited for really occurred or
if any of fds we were polling on didn't encountered hangup.
2012-01-04 10:40:23 +01:00
Yuri Chornoivan
524ba58bb9 Fix typos in messages.
https://bugzilla.redhat.com/show_bug.cgi?id=770954
2012-01-03 20:30:33 -07:00
Jiri Denemark
66ca7ce573 virCPUDefCopy forgot to copy NUMA topology
As a result of it, guest NUMA topology would be lost during migration.
2012-01-03 21:05:54 +01:00
Eric Blake
851fc8139f qemu: fix block stat naming
Typo has existed since API introduction in commit ee0d8c3.

* src/qemu/qemu_driver.c (qemuDomainBlockStatsFlags): Use correct
name.
2012-01-02 20:43:07 -07:00
Eric Blake
269ce467fc domiftune: clean up previous patches
Most severe here is a latent (but currently untriggered) memory leak
if any hypervisor ever adds a string interface property; the
remainder are mainly cosmetic.

* include/libvirt/libvirt.h.in (VIR_DOMAIN_BANDWIDTH_*): Move
macros closer to interface that uses them, and document type.
* src/libvirt.c (virDomainSetInterfaceParameters)
(virDomainGetInterfaceParameters): Formatting tweaks.
* daemon/remote.c (remoteDispatchDomainGetInterfaceParameters):
Avoid memory leak.
* src/libvirt_public.syms (LIBVIRT_0.9.9): Sort lines.
* src/libvirt_private.syms (domain_conf.h): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSetInterfaceParameters): Fix
comments, break long lines.
2012-01-02 14:35:12 -07:00
Peter Krempa
f4384b8439 network_conf: Fix whitespace to pass syntax-check 2012-01-02 17:59:05 +01:00
Michal Novotny
973af2362c Implement DNS SRV record into the bridge driver
Hi,
this is the fifth version of my SRV record for DNSMasq patch rebased
for the current codebase to the bridge driver and libvirt XML file to
include support for the SRV records in the DNS. The syntax is based on
DNSMasq man page and tests for both xml2xml and xml2argv were added as
well. There are some things written a better way in comparison with
version 4, mainly there's no hack in tests/networkxml2argvtest.c and
also the xPath context is changed to use a simpler query using the
virXPathInt() function relative to the current node.

Also, the patch is also fixing the networkxml2argv test to pass both
checks, i.e. both unit tests and also syntax check.

Please review,
Michal

Signed-off-by: Michal Novotny <minovotn@redhat.com>
2012-01-02 23:05:55 +08:00
Alex Jia
baade4cd2b qemu: Fix bandwidth memory leak on failure
Detected by Coverity. Leaks introduced in commit e8d6b29.

Signed-off-by: Alex Jia <ajia@redhat.com>
2011-12-31 16:42:23 -07:00
Eric Blake
8267aea5a6 qemu: fix blkio memory leak on failure
Leak detected by Coverity, and introduced in commit 93ab585.
Reported by Alex Jia.

* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Free
devices array on error.
2011-12-31 16:32:35 -07:00
Daniel Veillard
c4ac050fcb Fix build on s390(x) and other stange arches
The blocks to extract node information on a per-arch
basis wasn't well balanced leading to a compilation
failure if not on one of the handled arches (PCs and PPCs)
2011-12-30 14:15:26 +08:00
Eric Blake
904e05a292 seclabel: honor device override in selinux
This wires up the XML changes in the previous patch to let SELinux
labeling honor user overrides, as well as affecting the live XML
configuration in one case where the user didn't specify anything
in the offline XML.

I noticed that the logs contained messages like this:

2011-12-05 23:32:40.382+0000: 26569: warning : SELinuxRestoreSecurityFileLabel:533 : cannot lookup default selinux label for /nfs/libvirt/images/dom.img

for all my domain images living on NFS.  But if we would just remember
that on domain creation that we were unable to set a SELinux label (due to
NFSv3 lacking labels, or NFSv4 not being configured to expose attributes),
then we could avoid wasting the time trying to clear the label on
domain shutdown.  This in turn is one less point of NFS failure,
especially since there have been documented cases of virDomainDestroy
hanging during an attempted operation on a failed NFS connection.

* src/security/security_selinux.c (SELinuxSetFilecon): Move guts...
(SELinuxSetFileconHelper): ...to new function.
(SELinuxSetFileconOptional): New function.
(SELinuxSetSecurityFileLabel): Honor override label, and remember
if labeling failed.
(SELinuxRestoreSecurityImageLabelInt): Skip relabeling based on
override.
2011-12-30 10:57:59 +08:00
Eric Blake
b43432931a seclabel: allow a seclabel override on a disk src
Implement the parsing and formatting of the XML addition of
the previous commit.  The new XML doesn't affect qemu command
line, so we can now test round-trip XML->memory->XML handling.

I chose to reuse the existing structure, even though per-device
override doesn't use all of those fields, rather than create a
new structure, in order to reuse more code.

* src/conf/domain_conf.h (_virDomainDiskDef): Add seclabel member.
* src/conf/domain_conf.c (virDomainDiskDefFree): Free it.
(virSecurityLabelDefFree): New function.
(virDomainDiskDefFormat): Print it.
(virSecurityLabelDefFormat): Reduce output if model not present.
(virDomainDiskDefParseXML): Alter signature, and parse seclabel.
(virSecurityLabelDefParseXML): Split...
(virSecurityLabelDefParseXMLHelper): ...into new helper.
(virDomainDeviceDefParse, virDomainDefParseXML): Update callers.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-dynamic-override.args:
New file.
* tests/qemuxml2xmltest.c (mymain): Enhance test.
* tests/qemuxml2argvtest.c (mymain): Likewise.
2011-12-30 10:57:59 +08:00
Eric Blake
e83837945c seclabel: move seclabel stuff earlier
Pure code motion; no semantic change.

* src/conf/domain_conf.h (virDomainSeclabelType)
(virSecurityLabelDefPtr): Declare earlier.
* src/conf/domain_conf.c (virSecurityLabelDefClear)
(virSecurityLabelDefParseXML): Move earlier.
(virDomainDefParseXML): Move seclabel parsing earlier.
2011-12-30 10:38:37 +08:00
Eric Blake
336df7966b seclabel: refactor existing domain_conf usage
A future patch will parse and output <seclabel> in more than one
location in a <domain> xml; make it easier to reuse code.

* src/conf/domain_conf.c (virSecurityLabelDefFree): Rename...
(virSecurityLabelDefClear): ...and make static.
(virSecurityLabelDefParseXML): Alter signature.
(virDomainDefParseXML, virDomainDefFree): Adjust callers.
(virDomainDefFormatInternal): Split output...
(virSecurityLabelDefFormat): ...into new helper.
2011-12-30 10:38:37 +08:00
Hu Tao
e8d6b293d8 domiftune: Add virDomain{S,G}etInterfaceParameters support to qemu driver
* src/qemu/qemu_driver.c: implement the qemu driver support
2011-12-29 18:28:47 +08:00
Hu Tao
ee3de186b3 domiftune: Add a util function virDomainNetFind
Add a util function virDomainNetFind to find a domain's net def.
2011-12-29 18:27:35 +08:00
Hu Tao
e7dfe00d06 domiftune: Add support of new APIs to the remote driver
* daemon/remote.c: implement the server side support
* src/remote/remote_driver.c: implement the client side support
* src/remote/remote_protocol.x: definitions for the new entry points
* src/remote_protocol-structs: structure definitions
2011-12-29 18:25:26 +08:00
Hu Tao
51fded0be9 domiftune: virDomain{S,G}etInterfaceParameters: the main entry points
* src/libvirt.c: implement the main entry points
2011-12-29 18:25:12 +08:00
Hu Tao
85f3493f34 domiftune: Add API virDomain{S,G}etInterfaceParameters
The APIs are used to set/get domain's network interface's parameters.
Currently supported parameters are bandwidth settings.

* include/libvirt/libvirt.h.in: new API and parameters definition
* python/generator.py: skip the Python API generation
* src/driver.h: add new entry to the driver structure
* src/libvirt_public.syms: export symbols
2011-12-29 18:24:43 +08:00
Eric Blake
1a3f6608aa qemu: fix inf-loop in blkio parameters
https://bugzilla.redhat.com/show_bug.cgi?id=770520

We had two nested loops both trying to use 'i' as the iteration
variable, which can result in an infinite loop when the inner
loop interferes with the outer loop.  Introduced in commit 93ab585.

* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Don't
reuse iteration variable across two loops.
2011-12-28 06:57:42 -07:00
Michal Privoznik
8a34f822e6 qemu: Keep list of USB devices attached to domains
In order to avoid situation where a USB device is
in use by two domains, we must keep a list of already
attached devices like we do for PCI.
2011-12-24 18:12:04 +01:00
Michal Privoznik
d8db0f9690 qemu: Support for overriding NOFILE limit
This patch adds max_files option to qemu.conf which can be used to
override system default limit on number of opened files that are
allowed for qemu user.
2011-12-22 17:49:04 +01:00
Osier Yang
a1a83c5874 qemu: Support readonly filesystem passthrough
Upstream QEMU starts to support it from commit 2c74c2cb.
2011-12-22 12:29:58 +08:00
Stefan Berger
1c8f0cbb83 nwfilter: Do not require DHCP requests to be broadcasted
Remove the requirement that DHCP messages have to be broadcasted.
DHCP requests are most often sent via broadcast but can be directed
towards a specific DHCP server. For example 'dhclient' takes '-s <server>'
as a command line parameter thus allowing DHCP requests to be sent to a
specific DHCP server.
2011-12-21 10:54:47 -05:00
Osier Yang
33eca17f6a qemu: Release the lock on domobj if fails on finding the disk path 2011-12-21 10:22:08 +08:00
Michael Ellerman
d64955a91a qemu: Add spapr-vio address assignment
Add logic to assign addresses for devices with spapr-vio addresses.

We also do validation of addresses specified by the user, ie. ensuring
that there are not duplicate addresses on the bus.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2011-12-20 16:09:21 -07:00
Michael Ellerman
7e4d896b5e Add address type for SPAPR VIO devices
For QEMU PPC64 we have a machine type ("pseries") which has a virtual
bus called "spapr-vio". We need to be able to create devices on this
bus, and as such need a way to specify the address for those devices.

This patch adds a new address type "spapr-vio", which achieves this.

The addressing is specified with a "reg" property in the address
definition. The reg is optional, if it is not specified QEMU will
auto-assign an address for the device.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2011-12-20 15:39:16 -07:00
Michael Ellerman
5abbe04d68 qemu: Add a capability flag for -no-acpi
Currently non-x86 guests must have <acpi/> defined in <features> to
prevent libvirt from running qemu with -no-acpi. Although it works, it
is a hack.

Instead add a capability flag which indicates whether qemu understands
the -no-acpi option. Use it to control whether libvirt emits -no-acpi.

Current versions of qemu always display -no-acpi in their help output,
so this patch has no effect. However the development version of qemu
has been modified such that -no-acpi is only displayed when it is
actually supported.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2011-12-20 12:33:55 -07:00
Hu Tao
6758a01b18 Implement virDomain{G, S}etNumaParameters for the qemu driver 2011-12-20 11:01:27 -07:00
Hu Tao
1b051d8652 Add virDomain{G, S}etNumaParameters support to the remote driver 2011-12-20 10:47:17 -07:00
Hu Tao
c57ca57034 add new API virDomain{G, S}etNumaParameters
Set up the types for the numa functions and insert them into the
virDriver structure definition.
2011-12-20 10:21:37 -07:00
Hu Tao
9d3a721ad5 use cpuset to manage numa
This patch also sets cgroup cpuset parameters for numatune.
2011-12-20 09:32:23 -07:00
Hu Tao
059425ae45 Add functions to set/get cgroup cpuset parameters 2011-12-20 09:13:36 -07:00
Eric Blake
4e394dea1f rpc: handle param_int, plug memory leaks
The RPC code had several latent memory leaks and an attempt to
free the wrong string, but thankfully nothing triggered them
(blkiotune was the only one returning a string, and always as
the last parameter).  Also, our cleanups for rpcgen ended up
nuking a line of code that renders VIR_TYPED_PARAM_INT broken,
because it was the only use of 'i' in a function, even though
it was a member usage rather than a standalone declaration.

* daemon/remote.c (remoteSerializeTypedParameters): Free the
correct array element.
(remoteDispatchDomainGetSchedulerParameters)
(remoteDispatchDomainGetSchedulerParametersFlags)
(remoteDispatchDomainBlockStatsFlags)
(remoteDispatchDomainGetMemoryParameters): Don't leak strings.
* src/rpc/genprotocol.pl: Don't nuke member-usage of 'buf' or 'i'.
2011-12-20 08:41:10 -07:00
Daniel P. Berrange
707781fe12 Only add the timer when a callback is registered
The lifetime of the virDomainEventState object is tied to
the lifetime of the driver, which in stateless drivers is
tied to the lifetime of the virConnectPtr.

If we add & remove a timer when allocating/freeing the
virDomainEventState object, we can get a situation where
the timer still triggers once after virDomainEventState
has been freed. The timeout callback can't keep a ref
on the event state though, since that would be a circular
reference.

The trick is to only register the timer when a callback
is registered with the event state & remove the timer
when the callback is unregistered.

The demo for the bug is to run

  while true ; do date ; ../tools/virsh -q -c test:///default 'shutdown test; undefine test; dominfo test' ; done

prior to this fix, it will frequently hang and / or
crash, or corrupt memory
2011-12-19 11:08:25 +00:00
Daniel P. Berrange
34ad13536e Hide use of timers for domain event dispatch
Currently all drivers using domain events need to provide a callback
for handling a timer to dispatch events in a clean stack. There is
no technical reason for dispatch to go via driver specific code. It
could trivially be dispatched directly from the domain event code,
thus removing tedious boilerplate code from all drivers

Also fix the libxl & xen drivers to pass 'true' when creating the
virDomainEventState, since they run inside the daemon & thus always
expect events to be present.

* src/conf/domain_event.c, src/conf/domain_event.h: Internalize
  dispatch of events from timer callback
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
  src/qemu/qemu_domain.c, src/qemu/qemu_driver.c,
  src/remote/remote_driver.c, src/test/test_driver.c,
  src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
  src/xen/xen_driver.c: Remove all timer dispatch functions
2011-12-19 11:08:24 +00:00
Daniel P. Berrange
2c2d533768 Remove decl of all APIs related to domain event callbacks & queues
The virDomainEventCallbackList and virDomainEventQueue APIs are
now solely helpers used internally by virDomainEventState APIs.
Remove their decls from domain_event.h since no driver code should
need to use them any more.

* src/conf/domain_event.c: Make virDomainEventCallbackList and
  virDomainEventQueue APIs static & remove some unused APIs
* src/conf/domain_event.h, src/libvirt_private.syms: Remove
  virDomainEventCallbackList and virDomainEventQueue APIs
2011-12-19 11:08:11 +00:00
Daniel P. Berrange
06eb22df01 Remove all domain event structs from header
No caller of the domain events APIs should need to poke at the
struct internals. Thus they should all be removed from the
header file

* src/conf/domain_event.h: Remove struct definitions
* src/conf/domain_event.c: Add struct definitions
2011-12-19 11:08:10 +00:00
Daniel P. Berrange
7b87a30f15 Convert drivers to thread safe APIs for adding callbacks
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
  src/qemu/qemu_driver.c, src/remote/remote_driver.c,
  src/test/test_driver.c, src/uml/uml_driver.c,
  src/vbox/vbox_tmpl.c, src/xen/xen_driver.c: Convert
  to threadsafe APIs
2011-12-19 11:08:10 +00:00
Daniel P. Berrange
4f5326c315 Add APIs to allow management of callbacks purely with virDomainEventState
While virDomainEventState has APIs for managing removal of callbacks,
while locked, adding callbacks in the first place requires direct
access to the virDomainEventCallbackList structure. This is not
threadsafe since it is bypassing the virDomainEventState locks

* src/conf/domain_event.c, src/conf/domain_event.h,
  src/libvirt_private.syms: Add APIs for managing callbacks
  via virDomainEventState.
2011-12-19 11:08:10 +00:00
Daniel P. Berrange
d09f6ba5fe Return count of callbacks when registering callbacks
When registering a callback for a particular event some callers
need to know how many callbacks already exist for that event.
While it is possible to ask for a count, this is not free from
race conditions when threaded. Thus the API for registering
callbacks should return the count of callbacks. Also rename
virDomainEventStateDeregisterAny to virDomainEventStateDeregisterID

* src/conf/domain_event.c, src/conf/domain_event.h,
  src/libvirt_private.syms: Return count of callbacks when
  registering callbacks
* src/libxl/libxl_driver.c, src/libxl/libxl_driver.c,
  src/qemu/qemu_driver.c, src/remote/remote_driver.c,
  src/remote/remote_driver.c, src/uml/uml_driver.c,
  src/vbox/vbox_tmpl.c, src/xen/xen_driver.c: Update
  for change in APIs
2011-12-19 11:08:10 +00:00
Daniel P. Berrange
a86bbc6003 Convert Xen & VBox drivers to use virDomainEventState
The Xen & VBox drivers deal with callbacks & dispatching of
events directly. All the other drivers use a timer to dispatch
events from a clean stack state, rather than deep inside the
drivers. Convert Xen & VBox over to virDomainEventState so
that they match behaviour of other drivers

* src/conf/domain_event.c: Return count of remaining
  callbacks when unregistering event callback
* src/vbox/vbox_tmpl.c, src/xen/xen_driver.c,
  src/xen/xen_driver.h: Convert to virDomainEventState
2011-12-19 11:08:09 +00:00
Stefan Berger
b4d579de1e nwfilter: do not create ebtables chain unnecessarily
If only iptables rules are created then two unnecessary ebtables chains
are also created. This patch fixes this and prevents these chains from
being created. They have been cleaned up properly, though.
2011-12-16 16:54:49 -05:00
Peter Krempa
8fb2aeb662 migration: Add more specific error code/message on migration abort
A generic error code was returned, if the user aborted a migration job.
This made it hard to distinguish between a user requested abort and an
error that might have occured. This patch introduces a new error code,
which is returned in the specific case of a user abort, while leaving
all other failures with their existing code. This makes it easier to
distinguish between failure while mirgrating and an user requested
abort.

 * include/libvirt/virterror.h: - add new error code
 * src/util/virterror.c: - add message for the new error code
 * src/qemu/qemu_migration.h: - Emit operation aborted error instead of
                                operation failed, on migration abort
2011-12-16 16:38:26 +01:00
Eric Blake
d99fe011a2 qemu: detect truncated file as invalid save image
If managed save fails at the right point in time, then the save
image can end up with 0 bytes in length (no valid header), and
our attempts in commit 55d88def to detect and skip invalid save
files missed this case.

* src/qemu/qemu_driver.c (qemuDomainSaveImageOpen): Also unlink
empty file as corrupt.  Reported by Dennis Householder.
2011-12-16 08:29:31 -07:00
Michal Privoznik
13d5a6b83d qemu: Don't drop hostdev config until security label restore
Currently, on device detach, we parse given XML, find the device
in domain object, free it and try to restore security labels.
However, in some cases (e.g. usb hostdev) parsed XML contains
less information than freed device. In usb case it is bus & device
IDs. These are needed during label restoring as a symlink into
/dev/bus is generated from them. Therefore don't drop device
configuration until security labels are restored.
2011-12-16 11:53:03 +01:00
Jim Fehlig
d8916dc8e2 Fix default migration speed in qemu driver
In commit 6f84e110 I mistakenly set default migration speed to
33554432 Mb!  The units of migMaxBandwidth is Mb, with conversion
handled in qemuMonitor{JSON,Text}SetMigrationSpeed().

Also, remove definition of QEMU_DOMAIN_FILE_MIG_BANDWIDTH_MAX since
it is no longer used after reverting commit ef1065cf.
2011-12-15 11:25:07 -07:00
Jiri Denemark
6948b725e7 qemu: Fix race between async and query jobs
If an async job run on a domain will stop the domain at the end of the
job, a concurrently run query job can hang in qemu monitor and nothing
can be done with that domain from this point on. An attempt to start
such domain results in "Timed out during operation: cannot acquire state
change lock" error.

However, quite a few things have to happen at the right time... There
must be an async job running which stops a domain at the end. This race
was reported with dump --crash but other similar jobs, such as
(managed)save and migration, should be able to trigger this bug as well.
While this async job is processing its last monitor command, that is a
query-migrate to which qemu replies with status "completed", a new
libvirt API that results in a query job must arrive and stay waiting
until the query-migrate command finishes. Once query-migrate is done but
before the async job closes qemu monitor while stopping the domain, the
other thread needs to wake up and call qemuMonitorSend to send its
command to qemu. Before qemu gets a chance to respond to this command,
the async job needs to close the monitor. At this point, the query job
thread is waiting for a condition that no-one will ever signal so it
never finishes the job.
2011-12-15 11:53:20 +01:00
Osier Yang
3f29d6c91f qemu: Do not free the device from activePciHostdevs if it's in use
* src/qemu/qemu_hostdev.c (qemuDomainReAttachHostdevDevices):
pciDeviceListFree(pcidevs) in the end free()s the device even if
it's in use by other domain, which can cause a race.

How to reproduce:

<script>

virsh nodedev-dettach pci_0000_00_19_0
virsh start test
virsh attach-device test hostdev.xml
virsh start test2

for i in {1..5}; do
        echo "[ -- ${i}th time --]"
        virsh nodedev-reattach pci_0000_00_19_0
done

echo "clean up"
virsh destroy test
virsh nodedev-reattach pci_0000_00_19_0
</script>

Device pci_0000_00_19_0 dettached

Domain test started

Device attached successfully

error: Failed to start domain test2
error: Requested operation is not valid: PCI device 0000:00:19.0 is in use by domain test

[ -- 1th time --]
Device pci_0000_00_19_0 re-attached

[ -- 2th time --]
Device pci_0000_00_19_0 re-attached

[ -- 3th time --]
Device pci_0000_00_19_0 re-attached

[ -- 4th time --]
Device pci_0000_00_19_0 re-attached

[ -- 5th time --]
Device pci_0000_00_19_0 re-attached

clean up
Domain test destroyed

Device pci_0000_00_19_0 re-attached

The patch also fixes another problem, there won't be error like
"qemuDomainReAttachHostdevDevices: Not reattaching active
device 0000:00:19.0" in daemon log if some device is in active.
As pciResetDevice and pciReattachDevice won't be called for
the device anymore. This is sensible as we already reported
error when preparing the device if it's active. Blindly trying
to pciResetDevice & pciReattachDevice on the device and getting
an error is just redundant.
2011-12-15 10:18:20 +08:00
Osier Yang
a0aec362e8 qemu: Honor the original properties of PCI device when detaching
This patch fixes two problems:
    1) The device will be reattached to host even if it's not
       managed, as there is a "pciDeviceSetManaged".
    2) The device won't be reattached to host with original
       driver properly. As it doesn't honor the device original
       properties which are maintained by driver->activePciHostdevs.
2011-12-15 10:14:11 +08:00
Lei Li
ae52342754 Provide a helper method virDomainLiveConfigHelperMethod
This chunk of code below repeated in several functions, factor it into
a helper method virDomainLiveConfigHelperMethod to eliminate duplicated code
based on Eric and Adam's suggestion. I have tested it for all the
relevant APIs changed.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
2011-12-13 15:10:42 -07:00
Osier Yang
380f326955 storage: Fix a potential crash when creating vol object
If the vol object is newly created, it increases the volumes count,
but doesn't decrease the volumes count when do cleanup. It can
cause libvirtd to crash when one trying to free the volume objects
like:
    for (i = 0; i < pool->volumes.count; i++)
        virStorageVolDefFree(pool->volumes.objs[i]);

It's more reliable if we add the newly created vol object in the
end.
2011-12-13 11:14:26 +08:00
Jiri Denemark
5547d2b81c qemu: Disable EOF processing during qemuDomainDestroy
When destroying a domain qemuDomainDestroy kills its qemu process and
starts a new job, which means it unlocks the domain object and locks it
again after some time. Although the object is usually unlocked for a
pretty short time, chances are another thread processing an EOF event on
qemu monitor is able to lock the object first and does all the cleanup
by itself. This leads to wrong shutoff reason and lifecycle event detail
and virDomainDestroy API incorrectly reporting failure to destroy an
inactive domain.

Reported by Charlie Smurthwaite.
2011-12-12 16:31:19 +01:00
Rommer
95ab415417 storage: Activate/deactivate logical volumes only on local node
Current "-ay | -an" has problems on pool starting/refreshing if
the volumes are clustered. Rommer has posted a patch to list 2
months ago.

https://www.redhat.com/archives/libvir-list/2011-October/msg01116.html

But IMO we shouldn't skip the inactived vols. So this is a squashed
patch by Rommer.

Signed-off-by: Rommer <rommer@active.by>
2011-12-12 21:55:47 +08:00
Josh Durgin
20e1233c31 security: don't try to label network disks
Network disks don't have paths to be resolved or files to be checked
for ownership. ee3efc41e6 checked this
for some image label functions, but was partially reverted in a
refactor.  This finishes adding the check to each security driver's
set and restore label methods for images.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
2011-12-12 11:52:15 +01:00
Laine Stump
ae1232b298 network: don't add iptables rules for externally managed networks
This patch addresses https://bugzilla.redhat.com/show_bug.cgi?id=760442

When a network has any forward type other than route, nat or none, the
network configuration should be done completely external to libvirt -
libvirt only uses these types to allow configuring guests in a manner
that isn't tied to a specific host (all the host-specific information,
in particular interface names, port profile data, and bandwidth
configuration is in the network definition, and the guest
configuration only references it).

Due to a bug in the bridge network driver, libvirt was adding iptables
rules for networks with forward type='bridge' etc. any time libvirtd
was restarted while one of these networks was active.

This patch eliminates that error by only "reloading" iptables rules if
forward type is route, nat, or none.
2011-12-09 19:21:33 -05:00
Michael Ellerman
9f406c5838 qemu: Prepare to cater for more general address assignment
Currently qemuDomainAssignPCIAddresses() is called to assign addresses
to PCI devices.

We need to do something similar for devices with spapr-vio addresses.
So create one place where address assignment will be done, that is
qemuDomainAssignAddresses().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2011-12-09 15:01:52 -07:00
Michael Ellerman
2a994a3b1e qemu: Add address in qemuBuildChrDeviceStr() on pseries
For the PPC64 pseries machine type we need to add address information
for the spapr-vty device.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
2011-12-09 13:27:57 -07:00