8774 Commits

Author SHA1 Message Date
Martin Kletzander
b12a9cdedd conf: eliminate redundant use of VIR_ALLOC
We can use VIR_REALLOC_N with NULL pointer, which behaves the same way
as VIR_ALLOC_N in that case, so no need for a condition that's
checking if some data are allocated already.

---

I tried to find other parts of the code similar to this, so I can do a
full cleanup for the whole repository, so I used this (excuse the long
line, but that's how I was writing it):

git grep -nHC 5 -e VIR_REALLOC_N -e VIR_ALLOC_N | while read line; do if [[ "$line" == "--" ]]; then if [[ ${#tmpbuf} -gt 10 && "$REALLOC_N" == "true" && "$ALLOC_N" == "true" ]]; then echo $line; while [[ ${#tmpbuf[*]} -gt 0 ]]; do echo "${tmpbuf[0]}"; tmpbuf=( "${tmpbuf[@]:1:${#tmpbuf[*]}}" ); done; fi; unset tmpbuf REALLOC_N ALLOC_N; else if [[ "$ALLOC_N" != "true" && "${line/VIR_ALLOC_N//}" != "${line}" ]]; then ALLOC_N="true"; fi; if [[ "$REALLOC_N" != "true" && "${line/VIR_REALLOC_N//}" != "${line}" ]]; then REALLOC_N="true"; fi; tmpbuf[${#tmpbuf[*]}]="$line"; fi; done | less

And reviewed the output just to find out this was the only occurrence of
the inconsistency.
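
The one-liner above is hard to read, so here is a rough Python sketch of the same filter. It splits `git grep -C` style output into context groups on the `--` separator lines and keeps only the groups that mention both macros. The function name and the sample input are made up for illustration; the length check on the first buffered line is left out, and unlike the shell loop this version also checks the final group.

```python
def groups_with_both(lines, first="VIR_REALLOC_N", second="VIR_ALLOC_N"):
    """Return the context groups that mention both macros."""
    groups, current = [], []
    # A trailing separator flushes the last group.
    for line in list(lines) + ["--"]:
        if line == "--":
            has_first = any(first in entry for entry in current)
            has_second = any(second in entry for entry in current)
            if has_first and has_second:
                groups.append(current)
            current = []
        else:
            current.append(line)
    return groups


# Hypothetical grep output, two context groups.
sample = [
    "a.c-10-    if (!buf)",
    "a.c:11:        VIR_ALLOC_N(buf, 1);",
    "a.c:13:        VIR_REALLOC_N(buf, n + 1);",
    "--",
    "b.c:20:    VIR_ALLOC_N(list, count);",
]
matches = groups_with_both(sample)
for group in matches:
    print("\n".join(group))
```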
2012-12-19 02:21:54 +01:00
Martin Kletzander
7affb25be9 conf: minor indentation cleanups
In a few places there are too many levels of indentation; some of
them can be removed by negating the condition they sit under or by
omitting a useless condition altogether.
2012-12-19 02:21:47 +01:00
Martin Kletzander
b72c97e732 fix typo in the word affinities
This patch fixes just the word Affinites to Affinities (it's really
painful to search in TAGS without being able to find the right
function).
2012-12-19 02:17:38 +01:00
Daniel P. Berrange
8db1f2d228 Fix libxl driver for virArch changes 2012-12-18 19:50:24 +00:00
Daniel P. Berrange
473011334c Fix XenAPI driver for virArch changes 2012-12-18 19:32:15 +00:00
Daniel P. Berrange
5411e7e176 Export all symbols from virarch.{c,h} to drivers/tests/etc
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-18 19:32:04 +00:00
Daniel P. Berrange
aaf1636875 Convert QEMU capabilities code to use virArch
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-18 18:54:50 +00:00
Daniel P. Berrange
1846b80be8 Convert CPU APIs to use virArch
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-18 16:53:03 +00:00
Daniel P. Berrange
c25c18f71b Convert capabilities / domain_conf to use virArch
Convert the host capabilities and domain config structs to
use the virArch datatype. Update the parsers and all drivers
to take account of datatype change

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-18 16:53:03 +00:00
Daniel P. Berrange
2f4a139a4c Convert QEMU command line builder to virArch APIs
Use virArch APIs to determine host architecture when launching
QEMU.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-18 16:53:03 +00:00
Daniel P. Berrange
5a217e84c4 Convert nodeGetInfo to virArch APIs
Replace use of uname in nodeGetInfo with virArch APIs to
provide canonicalization of host architecture name

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-18 16:53:03 +00:00
Daniel P. Berrange
0333180185 Introduce a set of APIs for managing architectures
Introduce a 'virArch' enum for CPU architectures. Include
data type providing wordsize and endianness, and APIs to
query this info and convert to/from enum and string form.
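
As a rough illustration of the idea (not the actual C API), an architecture enum that carries word size and endianness and converts to and from strings could look like this in Python. The class and function names are invented for this sketch, and only three architectures are listed.

```python
from enum import Enum


class Endian(Enum):
    LITTLE = "little"
    BIG = "big"


class Arch(Enum):
    # value: (canonical name, word size in bits, endianness)
    I686 = ("i686", 32, Endian.LITTLE)
    X86_64 = ("x86_64", 64, Endian.LITTLE)
    S390X = ("s390x", 64, Endian.BIG)

    @property
    def canonical(self):
        return self.value[0]

    @property
    def wordsize(self):
        return self.value[1]

    @property
    def endian(self):
        return self.value[2]


def arch_from_string(name):
    """Map a canonical name back to the enum, or None if unknown."""
    for arch in Arch:
        if arch.canonical == name:
            return arch
    return None


print(arch_from_string("s390x").wordsize, arch_from_string("s390x").endian.value)
```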

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-18 16:53:02 +00:00
Laine Stump
4b31da3478 network: don't require private addresses if dnsmasq uses SO_BINDTODEVICE
This is yet another refinement to the fix for CVE-2012-3411:

   https://bugzilla.redhat.com/show_bug.cgi?id=833033

It turns out that it would be very intrusive to correctly backport the
entire --bind-dynamic option to older dnsmasq versions
(e.g. dnsmasq-2.48 that is used on RHEL6.x and CentOS 6.x), but very
simple to patch those versions to just use SO_BINDTODEVICE on all
their listening sockets (SO_BINDTODEVICE also has the desired effect
of permitting only traffic that was received on the interface(s) where
dnsmasq was set to listen.)

This patch modifies the dnsmasq capabilities detection to detect the
string:

    --bind-interfaces with SO_BINDTODEVICE

in the output of "dnsmasq --version", and in that case realize that
using the old --bind-interfaces option is just as safe as
--bind-dynamic (and therefore *not* forbid creation of networks that
use public IP address ranges).

If --bind-dynamic is available, it is still preferred over
--bind-interfaces.
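
A minimal sketch of that decision, in Python rather than the driver's C. How --bind-dynamic support is detected is not shown in this message, so it is passed in as a plain boolean here; the function name and the return shape are assumptions for illustration.

```python
# Marker string named in the commit message.
SO_BINDTODEVICE_MARKER = "--bind-interfaces with SO_BINDTODEVICE"


def pick_bind_option(version_output, has_bind_dynamic):
    """Return (dnsmasq option, whether public IP ranges may be allowed)."""
    if has_bind_dynamic:
        # Still preferred whenever it is available.
        return "--bind-dynamic", True
    if SO_BINDTODEVICE_MARKER in version_output:
        # Patched older dnsmasq: the old option is considered just as safe.
        return "--bind-interfaces", True
    # Unpatched older dnsmasq: keep requiring private address ranges.
    return "--bind-interfaces", False


# Hypothetical version text from a patched older dnsmasq.
patched = "Dnsmasq version 2.48\n--bind-interfaces with SO_BINDTODEVICE\n"
print(pick_bind_option(patched, has_bind_dynamic=False))
```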

Note that this patch does no harm in upstream, or in any distro's
downstream if it happens to end up there, but builds for distros that
have a new enough dnsmasq to support --bind-dynamic do *NOT* need to
specifically backport this patch; it's only required for distro
releases that have dnsmasq too old to have --bind-dynamic (and those
distros will need to add the SO_BINDTODEVICE patch to dnsmasq,
*including the extra string in the --version output*, as well).
2012-12-17 15:51:19 -05:00
Jiri Denemark
cdfe739c97 apparmor: Fix build 2012-12-17 21:17:55 +01:00
Laine Stump
bc5b270c44 network: fix indentation of networkDnsmasqConfContents
Somehow I managed to push the changes to this file with improper
indentation. This patch just re-indents, reformats the comment lines,
and re-groups a couple of multi-line strings so that they fit within
80 columns. The resulting binary should be identical.
2012-12-17 15:08:54 -05:00
Cole Robinson
2628ad8368 hostusb: Move USB_DEVFS define to hostusb.h to fix the build 2012-12-17 14:37:11 -05:00
Daniel P. Berrange
4ad6a01330 Add support for hotplug/unplug of host misc devices in LXC
Wire up the attach/detach device drivers in LXC to support the
hotplug/unplug of host misc devices.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:52 +00:00
Daniel P. Berrange
a5efb31909 Add support for hotplug/unplug of host storage devices in LXC
Wire up the attach/detach device drivers in LXC to support the
hotplug/unplug of host storage devices.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
ed77abc58b Add support for hotplug/unplug of USB host devices in LXC
Wire up the attach/detach device drivers in LXC to support the
hotplug/unplug of USB host devices.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
af7ab7fc5d Add support for hotplug/unplug of NIC devices in LXC
Wire up the attach/detach device drivers in LXC to support the
hotplug/unplug of NICs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
de858e3fa7 Add support for hotplug/unplug of disk devices in LXC
Wire up the attach/detach device drivers in LXC to support the
hotplug/unplug of disks.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
986c270dac Add support for attach/detach/update hostdev devices in config for LXC
Wire up the attach/detach/update device APIs to support changing
of hostdevs in the persistent config file

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
8cacd8b4ea Add support for attach/detach/update disk devices in config for LXC
Wire up the attach/detach/update device APIs to support changing
of disks in the persistent config file

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
74a909fef1 Add support for attach/detach/update net devices in config for LXC
Wire up the attach/detach/update device APIs to support changing
of network interfaces in the persistent config file

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
d4e5359a1c Add basic driver API framework for device attach/detach support in LXC
This wires up the LXC driver to support the domain device attach/
detach/update APIs, following the same code design as used in
the QEMU driver. No actual changes are possible with this commit,
it is only providing the framework

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
83a9c93807 Add support for misc host device passthrough with LXC
This extends support for host device passthrough with LXC to
cover misc devices. In this case all we need to do is a
mknod in the container's /dev and whitelist the device in
cgroups

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
313669d1c1 Add support for storage host device passthrough with LXC
This extends support for host device passthrough with LXC to
cover storage devices. In this case all we need to do is a
mknod in the container's /dev and whitelist the device in
cgroups

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
95fef5f407 Add support for USB host device passthrough with LXC
This adds support for host device passthrough with the
LXC driver. Since there is only a single kernel image,
it doesn't make sense to pass through PCI devices, but
USB devices are fine. For the latter we merely need to
make the /dev/bus/usb/NNN/MMM character device exist
in the container's /dev

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
368e341ac1 Add support for disks with LXC
Currently LXC guests can be given arbitrary pre-mounted
filesystems, however, for some usecases it is more appropriate
to provide block devices which the container can mount itself.
This first impl only allows for <disk type='block'>, in other
words exposing a host disk device to a container. Since LXC
does not have device namespace virtualization, we are cheating
a little bit. If the XML specifies /dev/sdc4 to be given to
the container as /dev/sda1, when we do the mknod /dev/sda1
in the container's /dev, we actually use the major:minor
number of /dev/sdc4, not /dev/sda1.
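
To make the device-number trick concrete, here is a small Python sketch. It only computes what a mknod call would use and creates nothing; the helper name is hypothetical, and 8:36 is the number /dev/sdc4 commonly has on Linux, used here in place of a real stat() call.

```python
import os


def mknod_plan(source_rdev, target_path):
    """Describe the node to create: target name, source device numbers."""
    return {
        "path": target_path,
        "major": os.major(source_rdev),
        "minor": os.minor(source_rdev),
    }


# Stands in for os.stat("/dev/sdc4").st_rdev on the host.
source_rdev = os.makedev(8, 36)
plan = mknod_plan(source_rdev, "/dev/sda1")
print(plan)
```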

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
a6cbdd7b81 Add support for SELinux labelling of hostdev storage/misc devices
The SELinux security driver needs to learn to label storage/misc
hostdev devices for LXC

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
570ad09ef9 Refactor SELinux security driver hostdev labelling
Prepare to support different types of hostdevs by refactoring
the current SELinux security driver code

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
df5928ea56 Allow passing a vroot into security manager hostdev labelling
When LXC labels USB devices during hotplug, it is running in
host context, so it needs to pass in a vroot path to the
container root.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
89c5a9d0e8 Skip bulk relabelling of resources in SELinux driver when used with LXC
The virSecurityManager{Set,Restore}AllLabel methods are invoked
at domain startup/shutdown to relabel resources associated with
a domain. This works fine with QEMU, but with LXC they are in
fact both currently no-ops since LXC does not support disks,
hostdevs, or kernel/initrd files. Worse, when LXC gains support
for disks/hostdevs, they will do the wrong thing, since they
run in host context, not container context. Thus this patch
turns them into a formal no-op when used with LXC. The LXC
controller will call out to specific security manager labelling
APIs as required during startup.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
e89c68b8bb Refactor LXC NIC creation to allow reuse by hotplug code
The code for creating veth/macvlan devices is part of the
LXC process startup code. Refactor this a little and export
the methods to the rest of the LXC driver. This allows them
to be reused for NIC hotplug code

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:51 +00:00
Daniel P. Berrange
aae0fc2a92 Add support for <hostdev mode="capabilities">
The <hostdev> device type has long had a redundant "mode"
attribute, which has always been "subsys". This finally
introduces a new mode "capabilities", which will be used
by the LXC driver for device assignment. Since container
based virtualization uses a single kernel, the idea of
assigning physical PCI devices doesn't make sense. It is
still reasonable to assign USB devices, but for assigning
arbitrary nodes in /dev, the new 'capabilities' mode is
to be used.

The first capability support is 'storage', which is for
assignment of block devices. Functionally this is really
pretty similar to the <disk> support. The only difference
is the device node name is identical in both host and
container namespaces.

    <hostdev mode='capabilities' type='storage'>
      <source>
        <block>/dev/sdf1</block>
      </source>
    </hostdev>

The second capability support is 'misc', which is for
assignment of character devices. There is no existing
parallel to this. Again the device node is the same
inside & outside the container.

    <hostdev mode='capabilities' type='misc'>
      <source>
        <char>/dev/input/event3</char>
      </source>
    </hostdev>
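
For illustration only, the two snippets above can be read with Python's standard XML parser as follows. The wrapping `<devices>` element and the function name are assumptions made for this sketch; the real parser lives in the C domain config code.

```python
import xml.etree.ElementTree as ET

XML = """
<devices>
  <hostdev mode='capabilities' type='storage'>
    <source><block>/dev/sdf1</block></source>
  </hostdev>
  <hostdev mode='capabilities' type='misc'>
    <source><char>/dev/input/event3</char></source>
  </hostdev>
</devices>
"""

# Which child of <source> carries the device node for each capability type.
SOURCE_TAG = {"storage": "block", "misc": "char"}


def capability_hostdevs(xml_text):
    """Return (type, device node) pairs for mode='capabilities' hostdevs."""
    result = []
    for hostdev in ET.fromstring(xml_text).iter("hostdev"):
        if hostdev.get("mode") != "capabilities":
            continue
        kind = hostdev.get("type")
        tag = SOURCE_TAG.get(kind)
        node = hostdev.find("source/" + tag) if tag else None
        if node is not None:
            result.append((kind, node.text))
    return result


print(capability_hostdevs(XML))
```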

The reason for keeping the char & storage devices
separate in the domain XML, is to mirror the split
in the node device XML. NB the node device XML does
not yet report character devices, but that's another
new patch to come

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:50 +00:00
Viktor Mihajlovski
cab938c993 S390: Fix virSysinfoRead memory corruption
There was a double free issue caused by virSysinfoRead on s390,
as the same manufacturer string instance was assigned to more
than one processor record.
Cleaned up other potential memory issues and restructured the sysinfo
parsing code by moving repeating patterns into a helper function.

The restructuring made it necessary to conditionally disable
-Wlogical-op for some older GCC versions, using pragma GCC diagnostic.
This is a GCC specific pragma, which is acceptable, since we're
using it to work around a GCC specific bug.

Finally, added a function virSysinfoSetup to configure the sysinfo
data source files/script during run time, to facilitate writing test
programs. This function is not published in sysinfo.h and only
there for testing.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-12-17 17:36:58 +00:00
Peter Krempa
41bd91f8ad conf: cpu: Break some long lines 2012-12-17 17:28:04 +01:00
Peter Krempa
4a9c179325 conf: cpu: Refactor parsing of vendor_id and fallback attributes
This patch simplifies the code that parses the fallback and vendor_id
attributes from the domain xml cpu definition.

Changes done:
- free temp variables in the cleanup section instead of local use
- replace checking for presence of the attribute with directly getting
the value (saving a call to virXPathBoolean)
- replace loop used to check for ',' in the vendor_id string with strchr
2012-12-17 17:27:56 +01:00
Peter Krempa
fb49ffc3bb conf: cpu: Fix memory leak when specifying cpu vendor_id manually
The field was not freed from the cpu definition.
2012-12-17 16:55:54 +01:00
Ken ICHIKAWA
1190a82469 conf: cpu: Fix parsing of vendor_id
This patch fixes a problem where the vendor_id attribute cannot be
defined when the fallback attribute is not defined.

If I define domain xml like below:
<domain>
  <cpu>
    <model vendor_id='aaaabbbbcccc'>core2duo</model>
  </cpu>
</domain>

In dumpxml, vendor_id is not reflected:
<domain>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>core2duo</model>
  </cpu>
</domain>

The expected output is:
<domain>
  <cpu mode='custom' match='exact'>
    <model fallback='allow' vendor_id='aaaabbbbcccc'>core2duo</model>
  </cpu>
</domain>

If the fallback attribute and vendor_id attribute are defined at the same
time, it's reflected as expected.

Signed-off-by: Ken ICHIKAWA <ichikawa.ken@jp.fujitsu.com>
2012-12-17 16:55:54 +01:00
Daniel P. Berrange
77d3a80974 Support custom 'svirt_tcg_t' context for TCG based guests
The current SELinux policy only works for KVM guests, since
TCG requires the 'execmem' privilege. There is a 'virt_use_execmem'
boolean to turn this on globally, but that is unpleasant for users.
This changes libvirt to automatically use a new 'svirt_tcg_t'
context for TCG based guests. This obsoletes the previous
boolean tunable and makes things 'just work(tm)'

Since we can't assume we run with new enough policy, I also
make us log a warning message (once only) if we find the policy
lacks support. In this case we fall back to the normal label and
expect users to set the boolean tunable

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 11:22:36 +00:00
Guannan Ren
aa51202b72 qemu: use newer -device video device in qemu commandline
'-device VGA' maps to '-vga std'
'-device cirrus-vga' maps to '-vga cirrus'
'-device qxl-vga' maps to '-vga qxl'
             (there is also '-device qxl' for secondary devices)
'-device vmware-svga' maps to '-vga vmware'

For qemu(>=1.2), we can use -device to replace -vga for video
device. For the primary video device, the patch tries to use 0x2
slot for matching old qemu. If the 0x2 slot is allocated already,
the addr property could help for using any available slot.
For qemu(< 1.2), we keep using -vga for primary device.
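
The mapping and the version split can be sketched in Python as below. This is a simplification: it ignores the slot/addr handling described above and uses a plain version tuple where the driver uses capability flags; the function name is hypothetical.

```python
# '-vga' type -> '-device' model, as listed in this commit message.
VGA_TO_DEVICE = {
    "std": "VGA",
    "cirrus": "cirrus-vga",
    "qxl": "qxl-vga",
    "vmware": "vmware-svga",
}


def primary_video_args(vga_type, qemu_version):
    """Return the argv fragment for the primary video device."""
    if qemu_version >= (1, 2):
        return ["-device", VGA_TO_DEVICE[vga_type]]
    # Older qemu: keep using -vga for the primary device.
    return ["-vga", vga_type]


print(primary_video_args("qxl", (1, 2)))
print(primary_video_args("cirrus", (1, 1)))
```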
2012-12-17 14:02:50 +08:00
Guannan Ren
09938bb3b0 conf: add optional attribute primary to video <model> element
If there are multiple video devices,
primary = 'yes' marks this video device as the primary one.
The rest are secondary video devices. No more than one can be
marked as primary. If none of them has the primary attribute, the first
one will be the primary by default, as before.
The reason for this change is that for qemu, only one primary video
device is permitted, which can be of any type. For secondary video
devices, only qxl is allowed. The primary attribute removes the restriction
that the first one has to be the primary.

We always put the primary video device into the first position of
video device structure array after parsing.
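
A small Python sketch of that ordering rule, assuming each parsed video device is a dict with an optional "primary" flag (a made-up representation, not the real struct):

```python
def order_videos(videos):
    """Move the device marked primary to index 0; keep the rest in order."""
    primaries = [v for v in videos if v.get("primary")]
    if len(primaries) > 1:
        raise ValueError("no more than one video device can be primary")
    if not primaries:
        # No explicit primary: the first device stays the primary.
        return list(videos)
    primary = primaries[0]
    return [primary] + [v for v in videos if v is not primary]


videos = [
    {"model": "qxl"},
    {"model": "cirrus", "primary": True},
    {"model": "qxl"},
]
ordered = order_videos(videos)
print([v["model"] for v in ordered])
```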
2012-12-17 14:01:20 +08:00
Guannan Ren
4c993d8ab5 qemu: add qemu vga devices caps and one cap to mark them usable
QEMU_CAPS_DEVICE_QXL          -device qxl
QEMU_CAPS_DEVICE_VGA          -device VGA
QEMU_CAPS_DEVICE_CIRRUS_VGA   -device cirrus-vga
QEMU_CAPS_DEVICE_VMWARE_SVGA  -device vmware-svga

QEMU_CAPS_DEVICE_VIDEO_PRIMARY  /* safe to use -device XXX
                                 for primary video device */

Fix a typo in qemuCapsObjectTypes, the string 'qxl' here
should be -device qxl rather than -vga [...|qxl|..]
2012-12-17 13:55:50 +08:00
Eric Blake
70743daeec build: minor build fixes for BSD
Noticed these while building on FreeBSD.

* src/qemu/qemu_monitor.c (qemuMonitorBlockInfoLookup): Rename
variable to avoid 'devname' collision.
* src/qemu/qemu_driver.c (qemuDomainInterfaceStats): Mark unused
variable.
2012-12-14 12:14:52 -07:00
Roman Bogorodskiy
0c94357f9d Socket identity support for FreeBSD.
This adds an implementation of virNetSocketGetUNIXIdentity()
using LOCAL_PEERCRED socket option and xucred struct, defined
in <sys/ucred.h> on systems that have it.
2012-12-14 11:49:31 -07:00
Laine Stump
e3802e13df network: fix (non)update of dnsmasq config during virDomainUpdateDeviceFlags
A forgotten "!" in recently-modified code at the top of
networkRefreshDaemon() meant an improper early return, which led to 1)
dnsmasq config files not being updated from the newly modified config,
and 2) dnsmasq not being sent a SIGHUP so that it could learn about
the changes to the config.

virNetworkDefGetIpByIndex() returns NULL if there are no ip objects of
the requested type, and if there are no IP elements, then dnsmasq
shouldn't be running, so we can return early. Otherwise we should
rewrite the config files and send a SIGHUP.
2012-12-14 13:37:17 -05:00
Michal Privoznik
11cfa28850 sanlock: Re-add lockspace unconditionally
Currently, if sanlock is already registering a lockspace, other
libvirtd instances (from other hosts) obtain -EINPROGRESS. On
sufficiently new sanlock, sanlock_inq_lockspace() is called,
which suspends execution until the lockspace state is changed. With
the current libvirt implementation, we fail to retry adding the
lockspace and continue on the error path instead. Therefore we produce
a meaningless error message:

virLockManagerSanlockSetupLockspace:363 : Unable to add lockspace
/var/lib/libvirt/sanlock/__LIBVIRT__DISKS__: Success
qemudLoadDriverConfig:558 : Failed to load lock manager sanlock

We should try to re-add the lockspace after its state change to
be sure it was added successfully. In fact, with sufficiently new
sanlock we can just avoid dummy usleep() which is used if there's
no inquire API.
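
The retry logic can be sketched roughly as follows. Both callables are stand-ins supplied by the caller (one for the add-lockspace call, one for the blocking inquire call), so nothing here is the real sanlock API; the retry count and delay are arbitrary.

```python
import errno
import time


def setup_lockspace(add_lockspace, wait_for_change=None, retries=10, delay=0.0):
    """Keep re-adding the lockspace while the add reports -EINPROGRESS."""
    for _ in range(retries):
        rc = add_lockspace()
        if rc == 0:
            return 0
        if rc != -errno.EINPROGRESS:
            return rc               # a real error, report it as-is
        if wait_for_change is not None:
            wait_for_change()       # blocks until the lockspace state changes
        else:
            time.sleep(delay)       # fallback when no inquire API exists
    return -errno.EINPROGRESS


# Fake backend: first call is still in progress, second one succeeds.
answers = [-errno.EINPROGRESS, 0]
waits = []
result = setup_lockspace(lambda: answers.pop(0), lambda: waits.append(1))
print(result, len(waits))
```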
2012-12-14 15:01:03 +01:00
Eric Blake
8d59a025bb install: fix virtlockd installation
The virtlockd daemon scripts were lousy, when compared to their
counterparts in daemon/Makefile.am.  In particular, when init
scripts were selected, this resulted in 'make distcheck' failing
due to failure to clean up src/virtlockd.init.

* src/Makefile.am (install-systemd): Fix dependencies.  Use MKDIR_P.
(uninstall-systemd): Remove empty directory.  Use fewer processes.
(install-init, install-sysconfig): Use MKDIR_P.
(uninstall-init): Remove correct file, and also empty directory.
(uninstall-sysconfig): Remove empty directory.
(DISTCLEANFILES): Clean up trivially built sources.
2012-12-14 06:27:10 -07:00
Laine Stump
9cf8734e7c qemu: don't fail update netdev on bridge detach failure
When a network device's bridge connection is changed by
virDomainUpdateDevice, libvirt first removes the netdev's tap from its
old bridge, then adds it to the new bridge. Sometimes, due to a
network being destroyed while a guest device is still attached, the
tap may already be "removed" from the old bridge (or the old bridge
may not even exist any more); the existing code was needlessly failing
the update when this happened, making it impossible to recover from
the situation without completely detaching (i.e. removing) the netdev
from the guest and re-attaching.

Instead of failing the entire operation when removal of the tap from
the old bridge fails, this patch changes qemuDomainChangeNetBridge to
just log a warning and continue, allowing a reasonable recovery from
the situation.

(you'll appreciate this change if you ever accidentally destroy a
network while your guests are still using it).
2012-12-14 07:14:10 -05:00
Jiri Denemark
2e59e1207a build: Install both qemu-lockd.conf and qemu-sanlock.conf
With sanlock enabled, only one of those files was installed.
2012-12-14 11:59:37 +01:00
Eric Blake
c0a8056ee2 build: use fewer cat processes
* src/Makefile.am (libvirt.syms): Let cat loop for us.
2012-12-13 15:45:40 -07:00
Ján Tomko
b28fb61fd7 selinux: fix NULL dereference in GetSecurityMountOptions
In the case of an OOM error in virDomainDefGetSecurityLabelDef, secdef
is set to NULL, then dereferenced while printing the debug message.
2012-12-13 15:41:44 -07:00
Jiri Denemark
912a4e9c06 build: Distribute more files 2012-12-13 23:17:34 +01:00
Jiri Denemark
809473ba6c locking: Fix VPATH build and distribute generated files 2012-12-13 23:17:34 +01:00
Laine Stump
d66eb78667 network: prevent dnsmasq from listening on localhost
This patch resolves the problem reported in:

   https://bugzilla.redhat.com/show_bug.cgi?id=886663

The source of the problem was the fix for CVE-2012-3411:

   https://bugzilla.redhat.com/show_bug.cgi?id=833033

which was originally committed upstream in commit
753ff83a50. That commit improperly
removed the "--except-interface lo" from dnsmasq commandlines when
--bind-dynamic was used (based on comments in the latter bug).

It turns out that the problem reported in the CVE could be eliminated
without removing "--except-interface lo", and removing it actually
caused each instance of dnsmasq to listen on localhost on port 53,
which created a new problem:

If another instance of dnsmasq using "bind-interfaces" (instead of
"bind-dynamic") had already been started (or if another instance
started later used "bind-dynamic"), this wouldn't have any immediately
visible ill effects, but if you tried to start another dnsmasq
instance using "bind-interfaces" *after* starting any libvirt
networks, the new dnsmasq would fail to start, because there was
already another process listening on port 53.

(Subsequent to the CVE fix, another patch changed the network driver
to put dnsmasq options in a conf file rather than directly on the
dnsmasq commandline, but preserved the same options.)

This patch changes the network driver to *always* add
"except-interface=lo" to dnsmasq conf files, regardless of whether we use
bind-dynamic or bind-interfaces. This way no libvirt dnsmasq instances
are listening on localhost (and the CVE is still fixed).

The actual code change is minuscule, but must be propagated through all
of the test files as well.
2012-12-13 12:15:03 -05:00
Jiri Denemark
d0d3e92d0b build: Fix VPATH build
$(srcdir) is already part of $$file since commit f1f9a7ac7e.
2012-12-13 17:06:36 +01:00
Daniel P. Berrange
64f0e145c1 Add support for locking based on SCSI volume ID 2012-12-13 15:26:58 +00:00
Daniel P. Berrange
565d040f43 Add support for locking based on LVM volume uuid 2012-12-13 15:26:58 +00:00
Daniel P. Berrange
f14fdae368 Add ability to maintain disk leases indirectly
The default lockd driver behaviour is to acquire leases
directly on the disk files. This introduces an alternative
mode, where leases are acquired indirectly on a file that
is based on a SHA256 hash of the disk filename.
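
In other words, the lease file name is derived from the disk path rather than being the disk itself. A minimal Python sketch, assuming the hash is taken over the path string and written as hex; the directory used below is a placeholder, not a documented default:

```python
import hashlib
import os.path


def indirect_lease_path(lease_dir, disk_path):
    """Return the lease file path for a disk: <lease_dir>/<sha256 of path>."""
    digest = hashlib.sha256(disk_path.encode("utf-8")).hexdigest()
    return os.path.join(lease_dir, digest)


# "/tmp/leases" is a hypothetical lease directory for this example.
lease = indirect_lease_path("/tmp/leases", "/var/lib/libvirt/images/shared.img")
print(lease)
```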

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-13 15:26:57 +00:00
Daniel P. Berrange
eb8268a4f6 Add a virtlockd client as a lock driver impl
This adds a 'lockd' lock driver which is just a client which
talks to the lockd daemon to perform all locking. This will
be the default lock driver for any hypervisor which needs one.

* src/Makefile.am: Add lockd.so plugin
* src/locking/lock_driver_lockd.c: Lockd driver impl

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-13 15:26:57 +00:00
Daniel P. Berrange
f234dc9366 Add support for re-exec() of virtlockd upon SIGUSR1
The virtlockd daemon maintains file locks on behalf of libvirtd
and any VMs it is running. These file locks must be held for as
long as any VM is running. If virtlockd itself ever quits, then
it is expected that a node would be fenced/rebooted. Thus to
allow for software upgrades on live systems, virtlockd needs the
ability to re-exec() itself.

Upon receipt of SIGUSR1, virtlockd will save its current live
state out to a file /var/run/virtlockd-restart-exec.json
It then re-exec()'s itself with exactly the same argv as it
originally had, and loads the state file, reconstructing any
objects as appropriate.

The state file contains information about all locks held and
all network services and clients currently active. An example
state document is

 {
    "server": {
        "min_workers": 1,
        "max_workers": 20,
        "priority_workers": 0,
        "max_clients": 20,
        "keepaliveInterval": 4294967295,
        "keepaliveCount": 0,
        "keepaliveRequired": false,
        "services": [
            {
                "auth": 0,
                "readonly": false,
                "nrequests_client_max": 1,
                "socks": [
                    {
                        "fd": 6,
                        "errfd": -1,
                        "pid": 0,
                        "isClient": false
                    }
                ]
            }
        ],
        "clients": [
            {
                "auth": 0,
                "readonly": false,
                "nrequests_max": 1,
                "sock": {
                    "fd": 9,
                    "errfd": -1,
                    "pid": 0,
                    "isClient": true
                },
                "privateData": {
                    "restricted": true,
                    "ownerPid": 1722,
                    "ownerId": 6,
                    "ownerName": "f18x86_64",
                    "ownerUUID": "97586ba9-df27-9459-c806-f016c8bbd224"
                }
            },
            {
                "auth": 0,
                "readonly": false,
                "nrequests_max": 1,
                "sock": {
                    "fd": 10,
                    "errfd": -1,
                    "pid": 0,
                    "isClient": true
                },
                "privateData": {
                    "restricted": true,
                    "ownerPid": 1784,
                    "ownerId": 7,
                    "ownerName": "f16x86_64",
                    "ownerUUID": "7b8e5e42-b875-61e9-b981-91ad8fa46979"
                }
            }
        ]
    },
    "defaultLockspace": {
        "resources": [
            {
                "name": "/var/lib/libvirt/images/f16x86_64.raw",
                "path": "/var/lib/libvirt/images/f16x86_64.raw",
                "fd": 14,
                "lockHeld": true,
                "flags": 0,
                "owners": [
                    1784
                ]
            },
            {
                "name": "/var/lib/libvirt/images/shared.img",
                "path": "/var/lib/libvirt/images/shared.img",
                "fd": 12,
                "lockHeld": true,
                "flags": 1,
                "owners": [
                    1722,
                    1784
                ]
            },
            {
                "name": "/var/lib/libvirt/images/f18x86_64.img",
                "path": "/var/lib/libvirt/images/f18x86_64.img",
                "fd": 11,
                "lockHeld": true,
                "flags": 0,
                "owners": [
                    1722
                ]
            }
        ]
    },
    "lockspaces": [

    ],
    "magic": "30199"
 }
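
As a reading aid, this Python sketch loads a trimmed copy of the document above and lists which client owns each locked resource. It assumes that the numbers in "owners" correspond to the clients' "ownerPid" values, which is what the example suggests but is not stated here; the function name is made up.

```python
import json

# Trimmed copy of the example state document.
STATE = json.loads("""
{
  "server": {"clients": [
    {"privateData": {"ownerPid": 1722, "ownerName": "f18x86_64"}},
    {"privateData": {"ownerPid": 1784, "ownerName": "f16x86_64"}}
  ]},
  "defaultLockspace": {"resources": [
    {"path": "/var/lib/libvirt/images/shared.img", "owners": [1722, 1784]},
    {"path": "/var/lib/libvirt/images/f18x86_64.img", "owners": [1722]}
  ]}
}
""")


def lock_owners(state):
    """Map each resource path to the names of the clients holding it."""
    names = {
        client["privateData"]["ownerPid"]: client["privateData"]["ownerName"]
        for client in state["server"]["clients"]
    }
    return {
        res["path"]: [names.get(pid, str(pid)) for pid in res["owners"]]
        for res in state["defaultLockspace"]["resources"]
    }


owners = lock_owners(STATE)
for path, who in owners.items():
    print(path, who)
```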

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-13 15:26:57 +00:00
Daniel P. Berrange
74c0353e4f Enable systemd socket activation with virtlockd
This enhances virtlockd so that it can receive a pre-opened
UNIX domain socket from systemd at launch time, and adds the
systemd service/socket unit files

* daemon/libvirtd.service.in: Require virtlockd to be running
* libvirt.spec.in: Add virtlockd systemd files
* src/Makefile.am: Install systemd files
* src/locking/lock_daemon.c: Support socket activation
* src/locking/virtlockd.service.in, src/locking/virtlockd.socket.in:
  systemd unit files
* src/rpc/virnetserverservice.c, src/rpc/virnetserverservice.h:
  Add virNetServerServiceNewFD() method
* src/rpc/virnetsocket.c, src/rpc/virnetsocket.h: Add virNetSocketNewListenFD
  method

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-13 15:26:57 +00:00
Daniel P. Berrange
0e49b83912 Implement dispatch functions for lock protocol in virtlockd
Introduce a lock_daemon_dispatch.c file which implements the
server side dispatcher for the RPC APIs previously defined in the
lock protocol.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-13 15:26:57 +00:00
Daniel P. Berrange
ad39fd83a8 Define a wire protocol for talking to the virtlockd daemon
The virtlockd daemon will be responsible for managing locks
on virtual machines. Communication will be via the standard
RPC infrastructure. This provides the XDR protocol definition

* src/locking/lock_protocol.x: Wire protocol for virtlockd
* src/Makefile.am: Include lock_protocol.[ch] in virtlockd

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-13 15:26:57 +00:00
Daniel P. Berrange
c57e3d8994 Introduce basic infrastructure for virtlockd daemon
The virtlockd daemon will maintain locks on behalf of libvirtd.
There are two reasons for it to be separate

 - Avoid risk of other libvirtd threads accidentally
   releasing fcntl() locks by opening + closing a file
   that is locked
 - Ensure locks can be preserved across libvirtd restarts.
   virtlockd will need to be able to re-exec itself while
   maintaining locks. This is simpler to achieve if its
   sole job is maintaining locks
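
The first reason can be reproduced in isolation. The sketch below assumes standard POSIX fcntl() record-lock semantics, where closing any descriptor for a file drops all of the process's locks on that file. It uses a temporary file under /tmp and a forked child to probe the lock; all names are illustrative.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Ask a child process whether it could take a write lock on `path`.
 * Returns 1 if the file is unlocked, 0 if it is locked, -1 on error.
 */
static int other_process_can_lock(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct flock probe = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
        int fd = open(path, O_RDWR);

        if (fd < 0 || fcntl(fd, F_GETLK, &probe) < 0)
            _exit(2);
        _exit(probe.l_type == F_UNLCK ? 1 : 0);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status);
}

/*
 * Lock a temp file, then open and close the same file through an
 * unrelated descriptor. Returns 1 if the lock was silently dropped,
 * 0 if it survived, -1 on error.
 */
int lock_lost_after_unrelated_close(void)
{
    char path[] = "/tmp/fcntl-demo-XXXXXX";
    struct flock lk = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
    int fd = mkstemp(path);
    int fd2, before, after;

    if (fd < 0)
        return -1;
    if (fcntl(fd, F_SETLK, &lk) < 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    before = other_process_can_lock(path);  /* 0 while the lock is held */

    fd2 = open(path, O_RDONLY);             /* "some other thread" */
    if (fd2 >= 0)
        close(fd2);                         /* this releases our lock too */

    after = other_process_can_lock(path);

    close(fd);
    unlink(path);
    assert(fd2 >= 0);
    if (before != 0 || after < 0)
        return -1;
    return after;
}
```

On a POSIX-conforming system lock_lost_after_unrelated_close() is expected to report that the lock was dropped, which is the hazard a dedicated lock daemon sidesteps.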

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-13 15:26:57 +00:00
Daniel P. Berrange
f199f75e9b Refactor creation of lock manager plugins
Refactor virLockManagerPluginNew() so that the caller does
not need to pass in the config file path itself - just the
config directory and driver name.

Fix QEMU to actually pass in a config file when creating the
default lock manager plugin, rather than NULL.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-13 15:26:57 +00:00
Daniel P. Berrange
41ac222e52 Fix error reporting when fetching SCSI/LVM keys
The current virStorageFileGet{LVM,SCSI}Key methods return
the key as the return value. Unfortunately it is desirable
for "NULL" to be a valid return value, as well as an error
indicator. Thus the returned key must instead be provided
as an out-parameter.

When we invoke lvs or scsi_id to extract ID for block devices,
we don't want virCommandWait logging error messages. Thus we
must explicitly check 'status != 0', rather than letting
virCommandWait do it.
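
The calling convention described in the first paragraph can be sketched as follows. The function name and the "/dev/" rule are made up for illustration; only the shape (status as return value, key as out-parameter, NULL key allowed on success) reflects the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical lookup illustrating the convention only.
 * Returns 0 on success and -1 on error. On success *key is either a
 * heap-allocated string that the caller must free, or NULL when the
 * path simply has no key. NULL is therefore not an error indicator.
 */
int example_get_key(const char *path, char **key)
{
    const char *name;

    assert(key != NULL);
    *key = NULL;

    if (!path || *path == '\0')
        return -1;                      /* genuine error */

    if (strncmp(path, "/dev/", 5) != 0)
        return 0;                       /* success, but no key */

    name = path + 5;
    *key = malloc(strlen(name) + 1);
    if (!*key)
        return -1;                      /* allocation failure */
    strcpy(*key, name);
    return 0;
}
```

With the old style, a caller could not tell "no key" from "lookup failed"; here the two cases are example_get_key() == 0 with *key == NULL versus a return of -1.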

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-13 15:26:57 +00:00
Jim Fehlig
f6b5ed5ef0 Support network boot for HVM guests in libxl
The libxl driver ignored boot devices in the domain config,
preventing PXE booting HVM domains.  This patch accounts for
user-specified boot devices when building the libxl domain
configuration.
2012-12-13 08:05:12 -07:00
Daniel P. Berrange
32bef82a2d Fix probing of QED file format
The QED file format is non-versioned, so although the magic
value matched, libvirt rejected it due to lack of a version
number to compare against. We need to distinguish this case
by allowing a value of '-2' to indicate a non-versioned file
where only the magic is required to match
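
A minimal sketch of the idea, assuming a probe table entry with a magic string and a version offset where a sentinel of -2 means "no version field". The struct layout and names are hypothetical and much simpler than libvirt's real probing table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sentinel: the format has no version field, so the magic alone decides. */
#define VERSION_NONE (-2)

struct format_probe {
    const char *magic;      /* bytes expected at the start of the file */
    size_t magic_len;
    int version_offset;     /* offset of a one-byte version, or VERSION_NONE */
    unsigned char version;  /* expected version, ignored for VERSION_NONE */
};

/* Return 1 if `buf` looks like the described format, 0 otherwise. */
int probe_matches(const struct format_probe *fmt,
                  const unsigned char *buf, size_t len)
{
    assert(fmt != NULL && buf != NULL);

    if (len < fmt->magic_len ||
        memcmp(buf, fmt->magic, fmt->magic_len) != 0)
        return 0;

    if (fmt->version_offset == VERSION_NONE)
        return 1;               /* non-versioned format: magic is enough */

    if (fmt->version_offset < 0 || (size_t)fmt->version_offset >= len)
        return 0;

    return buf[fmt->version_offset] == fmt->version;
}
```

Without the sentinel branch, a non-versioned format would always fall through to the version comparison and be rejected even though its magic matched.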

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-13 15:01:38 +00:00
Daniel P. Berrange
24643c780b Add lots of debugging to storage file probing code
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-13 15:01:25 +00:00
Daniel P. Berrange
dfba37048a Log warning if storage magic matches, but version does not
To help us detect when new storage file versions come into
existence, log a warning if the storage file magic matches,
but the version does not

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-13 15:00:12 +00:00
Daniel P. Berrange
f6bd0a8899 Fix memory leak in QEMU QMP capabilities initialization
The qemuCapsInitQMP method never frees the QEMU 'package'
version string.
2012-12-13 14:45:53 +00:00
Daniel P. Berrange
cc5c7f9865 Change virCgroupGetAppRoot stub on non-Linux to avoid unused param warning
Fully stub out the virCgroupGetAppRoot method as done with other
methods in the file, rather than just the body. This lets us
annotate the unused parameter to avoid a warning
2012-12-13 13:11:44 +00:00
Eric Blake
7339bc4ced network: match xml warning message
I noticed that /var/lib/libvirt/dnsmasq/*.conf used the wrong word;
it was intended to match the wording in src/util/xml.c.

* src/network/bridge_driver.c (networkDnsmasqConfContents): Fix typo.
* tests/networkxml2confdata/*.conf: Update accordingly.
2012-12-12 15:12:58 -07:00
Roman Bogorodskiy
9a2f36ec04 Qemu FreeBSD: fix compilation
* Autotools changes:
  - Don't assume Qemu is Linux-only
  - Check Linux headers only on Linux
  - Disable firewalld on FreeBSD
* Initctl:
  Initctl seems to be present only on Linux, so stub it on other platforms
* Raw I/O: Linux-only as well
* Headers cleanup
2012-12-12 11:59:53 -07:00
Roman Bogorodskiy
b467e9323c Drop mntent.h include.
It's no longer used and also causes the build to fail on FreeBSD.
2012-12-12 11:07:24 -07:00
Viktor Mihajlovski
f1f9a7ac7e Fix make check with different object directory
make check fails in check-symsorting if configure is not run in
the source directory. Prefixing symfile names with $(srcdir)
fixes this.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-12-12 16:15:25 +00:00
Peter Krempa
ed0bfd04f8 qemu: Improve error reporting from qemuDomainManagedSaveRemove
Report an error if unlink of the managedsave file fails.
2012-12-12 14:34:12 +01:00
Peter Krempa
a02579141e qemu: Small code cleanups in the managedsave functions
Save a few lines by moving assignments into conditions and fix brace
positions.
2012-12-12 14:34:12 +01:00
Peter Krempa
2745177b34 qemu: Refactor managed save functions to use domain lookup helpers 2012-12-12 14:34:12 +01:00
Peter Krempa
7fc06b0480 qemu: Add a new domain lookup helper and improve the docs
This patch adds a new domain lookup helper qemuDomObjFromDomainDriver
that looks up the domain and leaves the driver locked. The driver is
returned as the second argument of that function. If the lookup fails
the driver is unlocked to help avoid cleanup codepaths.

This patch also improves docs for the helpers.
2012-12-12 14:34:12 +01:00
Peter Krempa
ab8d323319 util: Fix warning message in previous patch
I didn't notice the extra "does" in the previous patch. Remove it.
2012-12-12 14:19:03 +01:00
Peter Krempa
96460a1987 util: rework error reporting in virGet(User|Group)IDByName
This patch gets rid of the nondeterministic error reporting code done on
return values of get(pw|gr)nam_r. With this patch, if the group record
is not returned by the corresponding function this error is not
considered fatal even if errno != 0. The error is logged in such a case.
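
The policy can be sketched for the user lookup as below. This is an illustrative stand-alone function, not virGetUserIDByName itself: a missing record is reported as "not found" whether or not getpwnam_r() also returned an error code, and the error code is only logged.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Returns 0 and fills *uid if the user exists, 1 if no record was
 * returned (non-fatal, logged when an error code accompanied it),
 * and -1 only if memory could not be allocated.
 */
int lookup_uid(const char *name, uid_t *uid)
{
    struct passwd pwbuf, *pw = NULL;
    size_t buflen = 1024;
    char *buf = NULL;
    int rc;

    assert(name != NULL && uid != NULL);

    for (;;) {
        char *tmp = realloc(buf, buflen);

        if (!tmp) {
            free(buf);
            return -1;
        }
        buf = tmp;

        rc = getpwnam_r(name, &pwbuf, buf, buflen, &pw);
        if (rc != ERANGE || buflen >= (1u << 20))
            break;
        buflen *= 2;            /* buffer too small, retry with more */
    }

    if (!pw) {
        if (rc != 0)
            fprintf(stderr, "lookup of user '%s' failed: %s\n",
                    name, strerror(rc));
        free(buf);
        return 1;               /* not found, but not fatal */
    }

    *uid = pw->pw_uid;
    free(buf);
    return 0;
}
```

A caller can then treat a return of 1 as "try something else" (for example, parsing the name as a numeric ID) instead of aborting.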
2012-12-12 14:06:59 +01:00
Daniel P. Berrange
9cdd9ea20e Refactor virDomainHostdevFind method
Move the code for matching hostdev instances out of virDomainHostdevFind
and into virDomainHostdevMatch method, which in turn calls out to other
helper methods depending on the type of hostdev.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-12 12:27:53 +00:00
Daniel P. Berrange
50897ffbb6 Slightly refactor hostdev parsing / formatting
Rename virDomainHostdevPartsParse to virDomainHostdevDefParseSubsys
to reflect the fact that it only deals with hostdevs using the
traditional mode=subsystem, and not mode=capabilities

Rename virDomainHostSourceFormat to virDomainHostdevDefFormatSubsys
for the same reason.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-12 12:26:53 +00:00
Daniel P. Berrange
3f0010a673 Remove bogus const return values in storage file APIs
virStorageFileGetLVMKey and virStorageFileGetSCSIKey
both return heap allocated strings, so the return value
should not be marked const.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-12 10:50:55 +00:00
Daniel P. Berrange
64212ed20e Add missing export of virStorageFileGetLVMKey & virStorageFileGetSCSIKey 2012-12-12 10:50:11 +00:00
Daniel P. Berrange
a8c8685eaa Fix sorting of libvirt_private.syms and add syntax check rule
Add check-symsorting.pl to perform case-insensitive alphabetical
sorting of groups of symbols. Fix all violations it reports

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-12 10:45:37 +00:00
Cole Robinson
7b97030ad4 uml: Report error if inotify fails on driver startup 2012-12-11 20:03:08 -05:00
Serge Hallyn
a4e44e674e add vnc unix sockets to apparmor policy
When using vnc graphics over a unix socket, virt-aa-helper needs to provide
access for the qemu domain to access the sockfile.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2012-12-11 14:32:39 -07:00
Serge Hallyn
88bd1a644b add security hook for permitting hugetlbfs access
When a qemu domain is backed by huge pages, apparmor needs to grant the domain
rw access to files under the hugetlbfs mount point.  Add a hook, called in
qemu_process.c, which ends up adding the read-write access through
virt-aa-helper.  Qemu will be creating a randomly named file under the
mountpoint and unlinking it as soon as it has mmap()d it, therefore we
cannot predict the full pathname, but for the same reason it is generally
safe to provide access to $path/**.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
2012-12-11 14:27:20 -07:00
Peter Krempa
08379dbd45 qemu: reuse qemuMigrationIsAllowed when doing save and managedsave
Save and managedsave both use migration to file. This patch reuses
qemuMigrationIsAllowed to check if the migration could happen before
trying.
2012-12-11 19:48:37 +01:00
Peter Krempa
98e92ba83b qemu: snapshot: Report better error message if migration isn't allowed
Qemu doesn't support migration on guests with host devices. This patch
adds a check to ensure migration is safe before actually doing so.
2012-12-11 19:48:37 +01:00
Peter Krempa
e5d3ab5e21 qemu: Make qemuMigrationIsAllowed more reusable
This patch exports qemuMigrationIsAllowed and adds a new parameter to it
to denote if it's a remote migration or a local migration. Local
migrations are used in snapshots and saving of the machine state and
have fewer restrictions. This patch also adjusts callers of the function
and tweaks some error messages to be more universal.
2012-12-11 19:48:37 +01:00
Ján Tomko
6543a459ef qemu: assume seccomp sandbox is supported since qemu 1.2
Currently there is no way to detect it via QMP and requesting "-sandbox
off" works correctly even if it was compiled out, so this will work
unless someone both requests the sandbox in qemu.conf and builds QEMU
without the support for it.
2012-12-11 18:52:29 +01:00
Michal Privoznik
c2fbb3c656 domain: Keep assigned class_id in domstatus XML
Interfaces keep a class_id, which is an ID from which the bridge
part of QoS settings is derived. We need to store class_id
in domain status file, so we can later pass it to
virNetDevBandwidthUnplug.
2012-12-11 18:42:54 +01:00
Michal Privoznik
ae757743dc network: Create real network status files
Currently, we are only keeping an inactive XML configuration
in the status dir. This is no longer enough as we need to keep
this class_id attribute so we don't overwrite old entries
when the daemon restarts. However, since there has already
been a release which has just <network/> as the root element,
and we want to keep things compatible, detect that the loaded
status file is an older one, and don't scream about it.
2012-12-11 18:42:54 +01:00
Michal Privoznik
07d1b6b5b1 bandwidth: Create network bandwidth (un)plug functions
Network should be notified if we plug in or unplug an
interface, so it can perform some action, e.g. set/unset
network part of QoS. However, we are doing this at a very
early stage, so iface->ifname isn't filled in yet. So
whenever we want to report an error, we must use a different
identifier, e.g. the MAC address.
2012-12-11 18:41:47 +01:00
Michal Privoznik
b697411ca0 bandwidth: Create rate update function
This will be used whenever a NIC with guaranteed throughput is to
be plugged into a bridge. It will adjust the average throughput of
non guaranteed NICs (classid 1:2) to meet new requirements.
2012-12-11 18:36:55 +01:00
Michal Privoznik
7cdbacb472 bandwidth: Create (un)plug functions
These set bridge part of QoS when bringing domain's interface up.
Long story short, if there's a 'floor' set, a new QoS class is created.
ClassID MUST be unique within the bridge and should be kept for
unplug phase.
2012-12-11 18:36:55 +01:00
Michal Privoznik
67159f1c60 bandwidth: Create hierarchical shaping classes
These classes can borrow unused bandwidth. Basically,
only egress qdiscs can have classes, therefore we can
do this kind of traffic shaping only on the host's outgoing,
that is domain's incoming traffic.
2012-12-11 18:36:55 +01:00
Michal Privoznik
ec6474b245 bandwidth: add new 'floor' attribute
This is however supported only on domain interfaces with
type='network'. Moreover, target network needs to have at least
inbound QoS set. This is required by hierarchical traffic shaping.

From now on, the required attribute for <inbound/> is either 'average'
(old) or 'floor' (new). This new attribute can be used just for
interfaces type of network (<interface type='network'/>) currently.
2012-12-11 18:35:12 +01:00
Michal Privoznik
7e5040bd20 bandwidth: Attach sfq to leaf node
Stochastic Fairness Queuing (SFQ) is a queuing discipline
(qdisc) which doesn't really shape any traffic but 'just'
re-arranges packets in the sending buffer so that no stream starves.
The goal is to ensure fairness. There is basically only one
configuration parameter (perturb), which is set to the advised
value of 10.
2012-12-11 18:16:52 +01:00
Dmitry Guryanov
ad9d8dbcae parallels: handle network adapters of type 'routed'
Network adapters of type 'routed' are a special case. Other adapters
have 'network' parameter in prlctl's output instead.

Routed network adapters should be connected to 'routed' network
from libvirt's view.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 22:50:38 +08:00
Dmitry Guryanov
84f0a0b8f2 parallels: add routed pseudo network
Historically if traffic from the adapter is routed to LAN without
NAT, it isn't connected to any virtual networks, but has a 'type'
instead. Since libvirt has a special virtual network type for such a case,
let's add pseudo network 'routed' to fit libvirt's API well.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 22:50:38 +08:00
Dmitry Guryanov
56494d2b57 parallels: parse virtual network properties
Fill bridge name and mac for bridged network and
DHCP server parameter for host-only network.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 22:50:38 +08:00
Dmitry Guryanov
6034ce3130 parallels: add network driver
Parallels Cloud Server uses a virtual networks model for network
configuration. It uses its own tools for virtual network management.
So add a network driver, which will be responsible for listing
virtual networks and performing different operations on them
(in subsequent patches).

This patch only allows listing virtual network names, without
any parameters like DHCP server settings.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 22:46:16 +08:00
Dmitry Guryanov
68c6d3dc31 parallels: move parallelsParseError to parallels_utils.h
This macro will be used in another file in the next
patch, so move it to common header file.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 22:46:16 +08:00
Dmitry Guryanov
880fcf6ab2 parallels: add support of network interfaces to parallelsDomainDefineXML
Allow changing network interfaces in domain configuration.

ifname is used as the interface identifier: if there is an interface
with some ifname in old config and there are no interfaces with
such name in the new config - issue prlctl command to delete
the network interface. And vice versa - if interface with
some ifname exists only in new config - issue prlctl command
to create it.
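
The ifname comparison amounts to a set difference in both directions. The sketch below only plans the actions as text ("del NAME" / "add NAME"); the function names and output format are hypothetical, and the real driver issues prlctl commands instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `name` occurs in the list, 0 otherwise. */
static int has_name(const char *const *names, size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(names[i], name) == 0)
            return 1;
    }
    return 0;
}

/* Append one "verb name" line to out; returns 0 on success, -1 if full. */
static int append_action(char *out, size_t outlen, size_t *used,
                         const char *verb, const char *name)
{
    int n = snprintf(out + *used, outlen - *used, "%s %s\n", verb, name);

    if (n < 0 || (size_t)n >= outlen - *used)
        return -1;
    *used += (size_t)n;
    return 0;
}

/*
 * Interfaces only in the old config are deleted, interfaces only in
 * the new config are added. Returns the number of planned actions,
 * or -1 if `out` is too small.
 */
int plan_iface_changes(const char *const *old_names, size_t nold,
                       const char *const *new_names, size_t nnew,
                       char *out, size_t outlen)
{
    size_t used = 0, i;
    int actions = 0;

    assert(out != NULL);
    if (outlen == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < nold; i++) {
        if (has_name(new_names, nnew, old_names[i]))
            continue;
        if (append_action(out, outlen, &used, "del", old_names[i]) < 0)
            return -1;
        actions++;
    }
    for (i = 0; i < nnew; i++) {
        if (has_name(old_names, nold, new_names[i]))
            continue;
        if (append_action(out, outlen, &used, "add", new_names[i]) < 0)
            return -1;
        actions++;
    }
    return actions;
}
```

Interfaces present in both configs produce no action in this sketch; updating their parameters would be a separate step.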

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 22:46:16 +08:00
Dmitry Guryanov
8ce9e2abc3 parallels: parse information about network interfaces
Parse network interfaces info from prlctl output.

Parallels Cloud Server uses virtual networks model for
network configuration: You can add network adapter to
VM and connect it to some predefined virtual network.

Fill type, mac, network name and linkstate fields of
virDomainNetDef structure.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 22:05:15 +08:00
Peter Krempa
a912977a65 qemu: snapshot: Remove memory image if external checkpoint fails
When the disk snapshot part of an external system checkpoint fails the
memory image is retained. This patch adds code to remove the image in
such case.
2012-12-11 13:59:14 +01:00
Peter Krempa
d5b2828763 qemu: snapshot: Don't leak XML definition if restarting of CPUs fails
In case the snapshot code isn't able to restart CPUs after an external
checkpoint we would leak a copy of the domains XML definition. This
patch fixes the cleanup path.
2012-12-11 13:48:15 +01:00
Ján Tomko
07b64de505 qemu: fix uninitialized variable warning in doPeer2PeerMigrate
False positive, but it breaks the build with gcc-4.6.3.

qemu/qemu_migration.c:2931:37: error: 'offline' may be used
uninitialized in this function [-Werror=uninitialized]
qemu/qemu_migration.c:2887:10: note: 'offline' was declared here
2012-12-11 13:38:22 +01:00
Jiri Denemark
8075687679 conf: Remove duplicate declaration of virNetworkDNSDefPtr 2012-12-11 13:27:53 +01:00
Gene Czarcinski
8b32c80df0 network: put dnsmasq parameters in conf-file instead of command line
This patch changes how parameters are passed to dnsmasq.  Instead of
being on the command line, the parameters are put into a file (one
parameter per line) and a commandline --conf-file= specifies the
location of the file.  The file is located in the same directory as
the leases file.
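
As a rough sketch of the conversion, the helper below turns argv-style long options into conf-file lines, one per line, relying on dnsmasq accepting long options in its configuration file without the leading "--". The function name and the sample options are illustrative, not taken from bridge_driver.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Write each option on its own line, dropping a leading "--".
 * Returns 0 on success, -1 if `out` is too small.
 */
int format_conf_lines(const char *const *args, size_t nargs,
                      char *out, size_t outlen)
{
    size_t used = 0, i;

    assert(out != NULL);
    if (outlen == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < nargs; i++) {
        const char *opt = args[i];
        int n;

        if (strncmp(opt, "--", 2) == 0)
            opt += 2;

        n = snprintf(out + used, outlen - used, "%s\n", opt);
        if (n < 0 || (size_t)n >= outlen - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

The resulting text would be written to a file next to the leases file, and dnsmasq would then be started with only --conf-file= pointing at it.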

Putting the dnsmasq parameters into a configuration file
allows them to be examined and more easily understood than
examining the command lines displayed by "ps ax".  This is
especially true when a number of networks have been started.

When the use of dnsmasq was originally done, the required command line
was simple, but it has gotten more complicated over time and will
likely become even more complicated in the future.

Note: The test conf files have all been renamed .conf instead of
.argv, and tests/networkxml2xmlargvdata was moved to
tests/networkxml2xmlconfdata.
2012-12-11 05:49:45 -05:00
Gene Czarcinski
2d5cd1d724 network: add support for DHCPv6
The DHCPv6 support includes IPV6 dhcp-range and dhcp-host for one
IPv6 subnetwork on one interface.  This support will only work
if dnsmasq version >= 2.64; otherwise an error occurs if
dhcp-range or dhcp-host is specified for an IPv6 address.

Essentially, this change provides the same DHCP support for IPv6
that has been available for IPv4.

With dnsmasq >= 2.64, support for the RA service is also now provided
by dnsmasq (radvd is no longer used/started). (Although at least one
version of dnsmasq prior to 2.64 "supported" IPv6 Router
Advertisement, there were bugs (fixed in 2.64) that rendered it
unusable.)

Documentation and the network schema has been updated
to reflect the new support.
2012-12-11 05:49:45 -05:00
Laine Stump
71e30eff46 conf: split <forward> parser/clear into separate functions
virNetworkDefUpdateForward requires separate functions to parse and
clear a virNetworkForwardDef by itself, but they were previously just
inlined in the virNetworkDef parse and free functions. This patch
makes them into separate functions.
2012-12-11 05:49:45 -05:00
Laine Stump
47c94b6563 conf: put data for network <forward> element into its own struct
The attributes of a <network> element's <forward> element were
previously stored directly in the virNetworkDef object, but
virNetworkUpdateForward() needs to operate on a <forward> in
isolation, so this patch pulls out all those attributes into a
separate virNetworkForwardDef struct (and shortens their names
appropriately). This new object is contained in the virNetworkDef, not
pointed to by it, so there is no extra memory management.

This patch makes no functional changes, it only changes, e.g.,
"nForwardIfs" to "forward.nifs".
2012-12-11 05:49:44 -05:00
Laine Stump
31d21197d3 conf: make virNetworkIpDefClear consistent with other functions
The other clear functions in network_conf.c that clear out arrays of
sub-objects do so by using the n[itemname]s value as a counter going
down to 0. Make this one consistent. There's no functional value, just
makes the style more consistent with the rest of the file.
2012-12-11 05:49:44 -05:00
Laine Stump
dc9d8d6810 conf: rename some labels and functions in network_conf
This makes some function names and arg lists more consistent with other
parse functions in network_conf.c. While modifying
virNetworkIPParseXML(), also change its "error" label to "cleanup",
since the code at that label is executed on success as well as
failure.
2012-12-11 05:49:44 -05:00
Laine Stump
fc19a00597 network: backend functions for updating network dns host/srv/txt
These three functions are very similar - none allow a MODIFY
operation; you can only add or delete.

The biggest difference between them (other than the data itself) is in
the criteria for determining a match, and whether or not multiple
matches are possible:

1) for HOST records, it's considered a match if the IP address or any
of the hostnames of an existing record matches.

2) for SRV records, it's a match if all of
domain+service+protocol+target *which have been specified* are
matched.

3) for TXT records, there is only a single field to match - name
(value can be the same for multiple records, and isn't considered a
search term), so by definition there can be no ambiguous matches.

In all three cases, if any matches are found, ADD will fail; if
multiple matches are found, it means the search term was ambiguous,
and a DELETE will fail.
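
The rules above can be sketched for HOST records as follows. The struct and function names are hypothetical. The sketch also assumes that a DELETE with no match fails, which the message does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum dns_op { DNS_OP_ADD, DNS_OP_DELETE };

struct host_record {
    const char *ip;         /* may be NULL in a search key */
    const char *names[4];
    size_t nnames;
};

/* HOST rule: a match if the IP address or any hostname matches. */
static int host_matches(const struct host_record *rec,
                        const struct host_record *key)
{
    size_t i, j;

    if (rec->ip && key->ip && strcmp(rec->ip, key->ip) == 0)
        return 1;
    for (i = 0; i < rec->nnames; i++) {
        for (j = 0; j < key->nnames; j++) {
            if (strcmp(rec->names[i], key->names[j]) == 0)
                return 1;
        }
    }
    return 0;
}

size_t count_host_matches(const struct host_record *list, size_t n,
                          const struct host_record *key)
{
    size_t i, found = 0;

    assert(key != NULL);
    for (i = 0; i < n; i++)
        found += (size_t)host_matches(&list[i], key);
    return found;
}

/* ADD needs zero matches; DELETE needs exactly one (unambiguous). */
int dns_update_allowed(enum dns_op op, size_t nmatches)
{
    if (op == DNS_OP_ADD)
        return nmatches == 0;
    return nmatches == 1;
}
```

SRV and TXT records would differ only in the matching predicate, so the same dns_update_allowed() check could be reused for them.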

The upper level code in bridge_driver.c is already implemented for
these functions - appropriate conf files will be re-written, and
dnsmasq will be SIGHUPed or restarted as appropriate.
2012-12-11 05:49:44 -05:00
Laine Stump
ab297becc1 conf: clear and parse functions for dns host/srv/txt records
Since there is only a single virNetworkDNSDef for any virNetworkDef,
and it's trivial to determine whether or not it contains any real
data, it's much simpler (and fits more uniformly with the parse
function calling sequence of the parsers for many other objects that
are subordinates of virNetworkDef) if virNetworkDef *contains* an
virNetworkDNSDef rather than pointing to one.

Since it is now just a part of another object rather than its own
object, it no longer makes sense to have a *Free() function, so that
is changed to a *Clear() function.

More importantly though, ParseXML and Clear functions are needed for
the individual items contained in a virNetworkDNSDef (srv, txt, and
host records), but none of them have a *Clear(), and only two of the
three had *ParseXML() functions (both of which used a non-uniform
arglist). Those problems are cleared up by this patch - it splits the
higher-level Clear function into separate functions for each of the
three, creates a parse for txt records, and cleans up the srv and host
parsers, so we now have all the utility functions necessary to
implement virNetworkDefUpdateDNS(Host|Srv|Txt).
2012-12-11 05:49:44 -05:00
Laine Stump
8b7d187417 conf: rename network dns host/srv/txt arrays
This shortens the name of the structs for srv and txt, and their
instances in virNetworkDNSDef, to be more compact and uniform with the
naming of the dns host array. It also changes the type of ntxts, etc
from unsigned int to size_t, so that they can be used directly as args
to VIR_*_ELEMENT.
2012-12-11 05:49:44 -05:00
Laine Stump
2dc5839a16 conf: use VIR_(INSERT|DELETE)_ELEMENT in virNetworkUpdate backend
The already-written backend functions for virNetworkUpdate that add
and delete items into lists within a network were already debugged
to work properly, but future such functions will use
VIR_(INSERT|DELETE)_ELEMENT instead, so these are changed for
uniformity.
2012-12-11 05:49:44 -05:00
Laine Stump
85b22f528f util: add VIR_(APPEND|INSERT|DELETE)_ELEMENT
I noticed when writing the backend functions for virNetworkUpdate that
I was repeating the same sequence of memmove, VIR_REALLOC, nXXX-- (and
messed up the args to memmove at least once), and had seen the same
sequence in a lot of other places, so I decided to write a few
utility functions/macros - see the .h file for full documentation.

The intent is to reduce the number of lines of code, but more
importantly to eliminate the need to check the element size and
element count arithmetic every time we need to do this (I *always*
make at least one mistake.)

VIR_INSERT_ELEMENT: insert one element at an arbitrary index within an
  array of objects. The size of each object is determined
  automatically by the macro using sizeof(*array). The new element's
  contents are copied into the inserted space, then the original copy
  of contents are 0'ed out (if everything else was
  successful). Compile-time assignment and size compatibility between
  the array and the new element is guaranteed (see explanation below
  [*])

VIR_INSERT_ELEMENT_COPY: identical to VIR_INSERT_ELEMENT, except that
  the original contents of newelem are not cleared to 0 (i.e. a copy
  is made).

VIR_APPEND_ELEMENT: This is just a special case of VIR_INSERT_ELEMENT
  that "inserts" one past the current last element.

VIR_APPEND_ELEMENT_COPY: identical to VIR_APPEND_ELEMENT, except that
  the original contents of newelem are not cleared to 0 (i.e. a copy
  is made).

VIR_DELETE_ELEMENT: delete one element at an arbitrary index within an
  array of objects. It's assumed that the element being deleted is
  already saved elsewhere (or cleared, if that's what is appropriate).

All five of these macros have an _INPLACE variant, which skips the
memory re-allocation of the array, assuming that the caller has
already done it (when inserting) or will do it later (when deleting).

Note that VIR_DELETE_ELEMENT* can return a failure, but only if an
invalid index is given (index + amount to delete is > current array
size), so in most cases you can safely ignore the return (that's why
the helper function virDeleteElementsN isn't declared with
ATTRIBUTE_RETURN_CHECK). A warning is logged if this ever happens,
since it is surely a coding error.
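
For readers unfamiliar with the open-coded pattern being replaced, here is a simplified sketch of single-element insert and delete on a heap array. The function names and signatures are invented for illustration and are not the virInsertElementsN/virDeleteElementsN helpers; see the .h file for the real interface.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Insert one element at index `at`, growing the array by one.
 * If clear_original is nonzero, the caller's copy is zeroed after the
 * move (INSERT semantics); otherwise it is left intact (COPY semantics).
 * Returns 0 on success, -1 on a bad index or allocation failure.
 */
int insert_element(void **arrayptr, size_t elem_size, size_t at,
                   size_t *count, void *newelem, int clear_original)
{
    char *arr;

    assert(arrayptr && count && newelem && elem_size > 0);
    if (at > *count)
        return -1;

    arr = realloc(*arrayptr, (*count + 1) * elem_size);
    if (!arr)
        return -1;
    *arrayptr = arr;

    /* Shift the tail up by one slot, then copy the new element in. */
    memmove(arr + (at + 1) * elem_size, arr + at * elem_size,
            (*count - at) * elem_size);
    memcpy(arr + at * elem_size, newelem, elem_size);
    if (clear_original)
        memset(newelem, 0, elem_size);
    (*count)++;
    return 0;
}

/* Delete the element at index `at`; fails only on an invalid index. */
int delete_element(void **arrayptr, size_t elem_size, size_t at,
                   size_t *count)
{
    char *arr;

    assert(arrayptr && count && elem_size > 0);
    if (at >= *count)
        return -1;

    arr = *arrayptr;
    memmove(arr + at * elem_size, arr + (at + 1) * elem_size,
            (*count - at - 1) * elem_size);
    (*count)--;

    if (*count == 0) {
        free(arr);
        *arrayptr = NULL;
    } else {
        char *tmp = realloc(arr, *count * elem_size);

        if (tmp)                /* failing to shrink is harmless */
            *arrayptr = tmp;
    }
    return 0;
}
```

Appending is then just insert_element() with at == *count, which mirrors how the message describes VIR_APPEND_ELEMENT as a special case of insertion.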

[*] One initial problem with the INSERT and APPEND macros was that,
due to both the array pointer and newelem pointer being cast to void*
when passing to virInsertElementsN(), any chance of type-checking was
lost. If we were going to move in newelem with a memmove anyway, we
would be no worse off for this. However, most current open-coded
insert/append operations use direct struct assignment to move the new
element into place (or just populate the new element directly) - thus
use of the new macros would open a possibility for new usage errors
that didn't exist before (e.g. accidentally sending &newelemptr rather
than newelemptr - I actually did this quite a lot in my test
conversions of existing code).

But thanks to Eric Blake's clever thinking, I was able to modify the
INSERT and APPEND macros so that they *do* check for both assignment
and size compatibility of *ptr (an element in the array) and newelem
(the element being copied into the new position of the array). This is
done via clever use of the C89-guaranteed fact that the sizeof()
operator must have *no* side effects (so an assignment inside sizeof()
is checked for validity, but not actually evaluated), and the fact
that virInsertElementsN has a "# of new elements" argument that we
want to always be 1.
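
The sizeof() trick can be shown in a few lines. The macro below is an illustrative stand-in, not the one in libvirt: it evaluates to 1 when newelem is assignable to *ptr and has the same size, to 0 when the sizes differ, and fails to compile when the assignment itself is invalid.

```c
#include <assert.h>
#include <stddef.h>

/*
 * The operand of sizeof is type-checked but never evaluated, so the
 * assignment below is validated by the compiler without ever running.
 */
#define ASSIGN_COMPATIBLE(ptr, newelem) \
    (sizeof(*(ptr) = (newelem)) == sizeof(newelem))

/* Returns 1 if the check passed and the "assigned" slot is untouched. */
int demo_no_side_effect(void)
{
    int slot = 7;
    int value = 42;
    int ok = ASSIGN_COMPATIBLE(&slot, value);

    assert(slot == 7);          /* the assignment never happened */
    return ok == 1 && slot == 7;
}
```

Because the result is 1 exactly when the types line up, an expression like this can be passed where a "number of new elements" argument of 1 is expected, which is how the message describes the check being folded into the macro call.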
2012-12-11 05:49:44 -05:00
Peter Krempa
46b0c93332 qemu: Restart CPUs with valid async job type when doing external snapshots
When restarting CPUs after an external snapshot, the restarting function
was called without the appropriate async job type. As a result, a
new sync job wasn't created, which allowed races in the monitor.
2012-12-11 11:20:53 +01:00
Dmitry Guryanov
84e27a6f2a parallels: add support of removing disks
If some hard disk is not found in new domain configuration, it
should be removed.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:32 +08:00
Dmitry Guryanov
d5c4783c64 parallels: apply config after VM creation
New VM will have default values for all parameters, like
cpu number, we have to change its configuration as provided
by xml definition, given to parallelsDomainDefineXML.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:32 +08:00
Dmitry Guryanov
b4f0c19eed parallels: add support of disks creation
Implement creation of new disks - if a new disk found
in configuration, find a volume by disk path and
actually create a disk image by issuing prlctl command.
If it's successfully finished - remove the file with volume
definition.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:32 +08:00
Dmitry Guryanov
592664c181 parallels: add function parallelsGetDiskBusName
Add a function for converting the bus from libvirt's numeric constant
to a name used in the parallels command-line tools.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:32 +08:00
Dmitry Guryanov
944705e28f parallels: split parallelsStorageVolumeDelete function
Move part, which deletes existing volume, to a new function
parallelsStorageVolumeDefRemove so that we can use it later
in parallels_driver.c

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:31 +08:00
Dmitry Guryanov
a9bd9b94e1 parallels: fill volumes capacity parameter
Read disk images size from xml description and fill
virStorageVolDef.capacity and allocation (let's consider
that allocation is the same as capacity, calculating real
allocation will be implemented later).

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:31 +08:00
Dmitry Guryanov
9b4c03ae5d parallels: add info about volumes
Disk images in Parallels Cloud Server are stored in directories. Each
one has files with data and an xml description of the image stored in
the file DiskDescriptor.xml.

Since we have to support 'detached' images, which are not used by
any VM, the better way to collect info about volumes is searching for
directories with a file DiskDescriptor.xml in each VM directory.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:31 +08:00
Dmitry Guryanov
7abe342d96 parallels: fix leaks in parallelsFindVolumes
We always have to close opened dir and free 'path'.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:31 +08:00
Dmitry Guryanov
766e0c91d7 parallels: create storage pools by VM list
There are no storage pools in Parallels Cloud Server -
all VM data is stored in a single directory: config, snapshots,
memory dump together with disk images.

Let's look through list of VMs and create a storage pool for
each directory, containing VMs.

So if you have 3 vms: /var/parallels/vm-1.pvm,
/var/parallels/vm-2.pvm and /root/test.pvm - 2 storage pools
appear: -var-parallels and -root. xml descriptions of the pools
will be saved in /etc/libvirt/parallels-storage, so UUIDs will
not change between connections to libvirt.
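
The naming in the example (/var/parallels becomes -var-parallels, /root becomes -root) is consistent with taking the VM's parent directory and replacing each '/' with '-'. The sketch below implements that reading; the function name is hypothetical and the driver's real logic may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Derive a pool name from a VM path: drop the last path component,
 * then replace every '/' with '-'. Returns 0 on success, -1 if the
 * path has no parent directory or `out` is too small.
 */
int pool_name_from_vm_path(const char *vm_path, char *out, size_t outlen)
{
    const char *slash;
    size_t len, i;

    assert(vm_path != NULL && out != NULL);

    slash = strrchr(vm_path, '/');
    if (!slash || slash == vm_path)
        return -1;

    len = (size_t)(slash - vm_path);
    if (len + 1 > outlen)
        return -1;

    for (i = 0; i < len; i++)
        out[i] = (vm_path[i] == '/') ? '-' : vm_path[i];
    out[len] = '\0';
    return 0;
}
```

Two VMs in the same directory map to the same name, which is why the three VMs in the example yield only two pools.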

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:31 +08:00
Dmitry Guryanov
4dc52e1e2f parallels: remove unused code from storage driver
We don't support unprivileged users anymore, so remove code, which
selects configuration directory depending on user.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:31 +08:00
Dmitry Guryanov
21e1bdeb3d parallels: split parallelsStorageOpen function
Move code for loading information about pools to a separate
function - parallelsLoadPools.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:31 +08:00
Dmitry Guryanov
45e6317158 parallels: handle disk devices in parallelsDomainDefineXML
Allow changing some parameters of the hard disks: bus,
image and drive address.

Creating new disk devices and removing existing ones
require changes in the storage driver, so it will be
implemented later.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:31 +08:00
Dmitry Guryanov
6718b2d711 parallels: add info about hard disk devices
Parse information about hard disks and fill disks array
in virDomainDef structure.

Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
2012-12-11 16:26:31 +08:00
liguang
8b9bf7879b Add support for offline migration
Offline migration transfers inactive definition of a domain (which may
or may not be active). After successful completion, the domain remains
in its current state on source host and is defined but inactive on
destination host. It's a bit more clever than virDomainGetXMLDesc() on
source host followed by virDomainDefineXML() on destination host, as
offline migration will run pre-migration hook to update the domain XML
on destination host. Currently, copying non-shared storage is not
supported during offline migration.

Offline migration can be requested with a new migration flag called
VIR_MIGRATE_OFFLINE (which has to be combined with
VIR_MIGRATE_PERSIST_DEST flag).
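
The flag requirement can be sketched as a small check. This is an
illustration only: the numeric values are assumed to mirror the
virDomainMigrateFlags enum, and check_offline_flags is a hypothetical
helper, not a libvirt function.

```python
# Illustrative flag values (assumed; consult libvirt's virDomainMigrateFlags).
VIR_MIGRATE_PERSIST_DEST = 1 << 3
VIR_MIGRATE_OFFLINE = 1 << 10


def check_offline_flags(flags: int) -> None:
    """Reject VIR_MIGRATE_OFFLINE unless VIR_MIGRATE_PERSIST_DEST is also set."""
    if flags & VIR_MIGRATE_OFFLINE and not flags & VIR_MIGRATE_PERSIST_DEST:
        raise ValueError("offline migration requires VIR_MIGRATE_PERSIST_DEST")
```

Passing both flags, or neither, is accepted; VIR_MIGRATE_OFFLINE on its
own raises.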
2012-12-10 21:52:15 +01:00
Laine Stump
e5577872cb qemu: eliminate bogus error log when changing netdev's bridge
This fixes a problem that showed up during testing of:

  https://bugzilla.redhat.com/show_bug.cgi?id=881480

Due to a logic error in the function that gets the name of the bridge
an interface connects to, any time a bridge was specified directly
(type='bridge') rather than indirectly (type='network'), an error
would be logged (although the operation would then complete
successfully):

   Network type 6 is not supported

The final virReportError() in the function
qemuDomainNetGetBridgeName() was apparently avoided in the past with a
"goto cleanup" at the end of each case, but the case of bridge somehow
no longer has that final goto cleanup.

The proper solution is anyway to not rely on goto's, but put the error
log inside an else {} clause, so that it's executed only if the type
is neither bridge nor network (in reality, this function should only
ever be called for those two types, that's why this is an internal
error).

While making this change, the error message was also tuned to be more
correct (since it's not really the type of the network, but the type
of the interface, and it *is* otherwise supported, it's just that the
interface type in question doesn't *have* a bridge device associated
with it, or at least we don't know how to get it).
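
The control flow described here, with the error confined to an else
clause, might look roughly like this. The dictionary layout and the
function name are hypothetical stand-ins for the real C structures.

```python
def net_get_bridge_name(iface: dict) -> str:
    """Return the bridge an interface connects to (hypothetical data model)."""
    if iface["type"] == "bridge":
        # Bridge named directly in the interface definition.
        return iface["bridge"]
    elif iface["type"] == "network":
        # Bridge resolved indirectly through the named network.
        return iface["network_bridge"]
    else:
        # Only reached for types that have no bridge device we know about.
        raise RuntimeError(
            "Interface type %s has no bridge name" % iface["type"])
```

Because the error sits in the else branch, neither of the two supported
cases can fall through to it, with or without a goto.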
2012-12-10 13:17:41 -05:00
Viktor Mihajlovski
539d73dbf6 S390: Assign default model "virtio" for network interfaces
If a network interface model is not specified, libvirt will run
into an unchecked NULL pointer coredump. On the other hand if
the empty model is ignored, a PCI bus address would be generated,
which is not supported by S390.
Since the only valid network type model for S390 is virtio,
we use this as the default value, which is the same for QEMU.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-12-10 14:57:17 +01:00
Michal Privoznik
28de547997 Revert "dnsmasq: Fix parsing of the version number"
This reverts commit 5114431396
which was pushed accidentally.
2012-12-10 14:00:02 +01:00
Cole Robinson
3130541ebf qemu: capabilities: fix machine name/canonical swappage
Things are supposed to look like:

<machine canonical='pc-0.12'>pc</machine>

But are currently swapped. This can cause many VMs to revert to having
machine type='pc' which will affect save/restore across qemu upgrades.
2012-12-07 11:30:34 -05:00
Ján Tomko
1c9a2fb1ae storage: allow metadata preallocation when creating qcow2 images
Add VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA flag to virStorageVolCreateXML
and virStorageVolCreateXMLFrom. This flag requests metadata
preallocation when creating/cloning qcow2 images, resulting in creating
a sparse file with qcow2 metadata. It has only slightly larger disk usage
compared to new image with no allocation, but offers higher performance.
2012-12-07 11:46:48 +01:00
Osier Yang
b718ded39a qemu: Allow the user to specify vendor and product for disk
QEMU has supported setting vendor and product strings for disks since
1.2.0 (only scsi-disk, scsi-hd, and scsi-cd support it). This patch
exposes them with new XML elements <vendor> and <product> of the disk
device.
2012-12-07 16:53:27 +08:00
Jim Fehlig
dfa1e1dd53 Convert libxl driver to Xen 4.2
Based on a patch originally authored by Daniel De Graaf

  http://lists.xen.org/archives/html/xen-devel/2012-05/msg00565.html

This patch converts the Xen libxl driver to support only Xen >= 4.2.
Support for Xen 4.1 libxl is dropped since that version of libxl is
designated 'technology preview' only and is incompatible with Xen 4.2
libxl.  Additionally, the default toolstack in Xen 4.1 is still xend,
for which libvirt has a stable, functional driver.
2012-12-06 16:15:54 -07:00
Christophe Fergeau
a33f4eae83 util: Don't fail virGetGroupIDByName when group not found
virGetGroupIDByName is documented as returning 1 if the groupname
cannot be found. getgrnam_r is documented as returning:
« 0 or ENOENT or ESRCH or EBADF or EPERM or ...  The given name
or gid was not found. »
 and that:
« The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
It  does  not  call  "not  found"  an error, hence does not specify what
value errno might have in this situation.  But that makes it impossible to
recognize errors.  One might argue that according to POSIX errno should be
left unchanged if an entry is not found.  Experiments on various UNIX-like
systems shows that lots of different values occur in this situation: 0,
ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »

virGetGroupIDByName returns an error when the return value of getgrnam_r
is non-0. However on my RHEL system, getgrnam_r returns ENOENT when the
requested group cannot be found, which then causes virGetGroupID not
to behave as documented (it returns an error instead of falling back
to parsing the passed-in value as a gid).

This commit makes virGetGroupIDByName only report an error when errno
is set to one of the values in the POSIX description of getgrnam_r
(which are the same as the ones described in the manpage on my system).
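
A minimal sketch of that classification, assuming the set of "real"
errors is the one listed in the getgrnam_r man page (the exact set used
by the patch is not shown here, and classify_lookup is a hypothetical
name):

```python
import errno

# Assumed set of "real" failures, taken from the getgrnam_r man page.
REAL_ERRORS = {errno.EINTR, errno.EIO, errno.EMFILE,
               errno.ENFILE, errno.ENOMEM, errno.ERANGE}


def classify_lookup(rc: int, found: bool) -> str:
    """Classify a getgrnam_r-style result as 'found', 'not-found' or 'error'."""
    if found:
        return "found"
    if rc in REAL_ERRORS:
        return "error"
    # 0, ENOENT, ESRCH, EBADF, EPERM, ... all mean "no such entry".
    return "not-found"
```

With this split, an ENOENT return is treated as "not found", so the
caller can still fall back to parsing the value as a numeric id.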
2012-12-06 17:21:54 +01:00
Christophe Fergeau
6c6c03dc0e util: Don't fail virGetUserIDByName when user not found
virGetUserIDByName is documented as returning 1 if the username
cannot be found. getpwnam_r is documented as returning:
« 0 or ENOENT or ESRCH or EBADF or EPERM or ...  The given name
or uid was not found. »
 and that:
« The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
It  does  not  call  "not  found"  an error, hence does not specify what
value errno might have in this situation.  But that makes it impossible to
recognize errors.  One might argue that according to POSIX errno should be
left unchanged if an entry is not found.  Experiments on various UNIX-like
systems shows that lots of different values occur in this situation: 0,
ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »

virGetUserIDByName returns an error when the return value of getpwnam_r
is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the
requested user cannot be found, which then causes virGetUserID not
to behave as documented (it returns an error instead of falling back
to parsing the passed-in value as a uid).

This commit makes virGetUserIDByName only report an error when errno
is set to one of the values in the POSIX description of getpwnam_r
(which are the same as the ones described in the manpage on my system).
2012-12-06 17:21:54 +01:00
Michal Privoznik
ff33f80773 dnsmasq: Fix parsing of the version number
If debugging is enabled, the debug messages are sent to stderr.
Moreover, if a command has catching of stderr set, the messages
get mixed with stdout output (assuming both outputs are stored
in the same variable). The resulting string then doesn't
necessarily start with the desired prefix. This bug
exposes itself when parsing dnsmasq output:

2012-12-06 11:18:11.445+0000: 18491: error :
dnsmasqCapsSetFromBuffer:664 : internal error cannot parse
/usr/sbin/dnsmasq version number in '2012-12-06
11:11:02.232+0000: 18492: debug : virFileClose:72 : Closed fd 22'

We can clearly see that the output of dnsmasq --version doesn't
start with the expected "Dnsmasq version " string but with libvirt
debug output.
2012-12-06 13:48:11 +01:00
Michal Privoznik
5114431396 dnsmasq: Fix parsing of the version number
If debugging is enabled, the virCommand subsystem catches debug
messages in the command output as well. In that case, we can't assume
the string corresponding to the command's stdout will start with a
specific prefix; the prefix may instead appear deeper in the string.
This bug shows itself when parsing dnsmasq output:

2012-12-06 11:18:11.445+0000: 18491: error :
dnsmasqCapsSetFromBuffer:664 : internal error cannot parse
/usr/sbin/dnsmasq version number in '2012-12-06 11:11:02.232+0000:
18492: debug : virFileClose:72 : Closed fd 22'

We can clearly see that the output of dnsmasq --version
doesn't start with the expected "Dnsmasq version " string but with
libvirt debug output.
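
The idea of looking for the prefix anywhere in the captured output,
rather than only at its start, can be sketched as follows.
parse_dnsmasq_version is a hypothetical helper, and the two-component
version format is an assumption.

```python
import re

PREFIX = "Dnsmasq version "


def parse_dnsmasq_version(output: str):
    """Return (major, minor) found after the prefix, or None if absent."""
    pos = output.find(PREFIX)  # search anywhere, not only at offset 0
    if pos < 0:
        return None
    match = re.match(r"(\d+)\.(\d+)", output[pos + len(PREFIX):])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Output that begins with a libvirt debug line still parses, as long as
the "Dnsmasq version " string appears somewhere after it.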
2012-12-06 12:25:50 +01:00
Laine Stump
fd54f1de53 network: prevent a few invalid configuration combinations
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=767057

It was possible to define a network with <forward mode='bridge'> that
had both a bridge device and a forward device defined. These two are
mutually exclusive by definition (if you are using a bridge device,
then this is a host bridge, and if you have a forward dev defined,
this is using macvtap). It was also possible to put <ip>, <dns>, and
<domain> elements in this definition, although those aren't supported
by the current driver (although it's conceivable that some other
driver might support that).

The items that are invalid by definition, are now checked in the XML
parser (since they will definitely *always* be wrong), and the others
are checked in networkValidate() in the network driver (since, as
mentioned, it's possible that some other network driver, or even this
one, could some day support setting those).
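
Both kinds of check can be sketched over a network XML document. This
is a rough illustration with a hypothetical function name; it merges
the parser-level and driver-level checks into one function for brevity,
and the forward-device detection is an assumption about the XML layout.

```python
import xml.etree.ElementTree as ET


def validate_bridge_network(xml: str) -> list:
    """Return a list of problems for a <forward mode='bridge'> network."""
    root = ET.fromstring(xml)
    forward = root.find("forward")
    if forward is None or forward.get("mode") != "bridge":
        return []
    problems = []
    has_bridge = root.find("bridge") is not None
    has_dev = (forward.get("dev") is not None
               or forward.find("interface") is not None)
    if has_bridge and has_dev:
        # Invalid by definition: host bridge and macvtap are exclusive.
        problems.append("bridge and forward dev are mutually exclusive")
    for tag in ("ip", "dns", "domain"):
        if root.find(tag) is not None:
            # Unsupported by the current driver, not wrong by definition.
            problems.append("<%s> is not supported in bridge mode" % tag)
    return problems
```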
2012-12-05 18:03:34 -05:00
Gene Czarcinski
705e67d40b network: allow guest to guest IPv6 without gateway definition
This patch adds the capability for virtual guests to do IPv6
communication via a virtual network interface with no IPv6 (gateway)
addresses specified.  This capability has always been enabled by
default for IPv4, but disabled for IPv6 for security concerns, and
because it requires the ip6tables command to be operational (which
isn't the case on a system with the ipv6 module completely disabled).

This patch adds a new attribute "ipv6" at the toplevel of a <network>
object.  If ipv6='yes', the extra ip6tables rules required to permit
inter-guest communications are added when the network is started. If
it is 'no', or not present, those rules will not be added; thus the
default behavior doesn't change, so there should be no compatibility
issues with any existing installations.

Note that virtual guests cannot communicate with the virtualization
host via this interface, because the following kernel tunable has
been set:

   net.ipv6.conf.<bridge_interface_name>.disable_ipv6 = 1

This assures that the bridge interface will not have an IPv6
link-local (fe80::) address.

To control this behavior so that it is not enabled by default, the parameter
ipv6='yes' on the <network> statement has been added.

Documentation related to this patch has been updated.
The network schema has also been updated.
2012-12-05 14:58:32 -05:00
Osier Yang
d1f3d14974 storage: Error out earlier if the volume target path already exists
https://bugzilla.redhat.com/show_bug.cgi?id=832302

It's odd to fall through to buildVol, and the existing file is
removed when buildVol fails. This checks if the volume target
path already exists in createVol. The reason for not using
an error like "Volume already exists" is that there isn't a volume
maintained by libvirt for the path until an operation like
pool-refresh, so using an error like that would just cause confusion.
2012-12-06 01:10:00 +08:00
Daniel P. Berrange
b362938e57 remote: Avoid the thread race condition
https://bugzilla.redhat.com/show_bug.cgi?id=866524

Since the virConnect object is not wholly locked when doing
virConnectDispose, a thread can get the lock and thus might
cause the race.

Detected by valgrind:

==23687== Invalid read of size 4
==23687==    at 0x38BAA091EC: pthread_mutex_lock (pthread_mutex_lock.c:61)
==23687==    by 0x3FBA919E36: remoteClientCloseFunc (remote_driver.c:337)
==23687==    by 0x3FBA936BF2: virNetClientCloseLocked (virnetclient.c:688)
==23687==    by 0x3FBA9390D8: virNetClientIncomingEvent (virnetclient.c:1859)
==23687==    by 0x3FBA851AAE: virEventPollRunOnce (event_poll.c:485)
==23687==    by 0x3FBA850846: virEventRunDefaultImpl (event.c:247)
==23687==    by 0x40CD61: vshEventLoop (virsh.c:2128)
==23687==    by 0x3FBA8626F8: virThreadHelper (threads-pthread.c:161)
==23687==    by 0x38BAA077F0: start_thread (pthread_create.c:301)
==23687==    by 0x33F68E570C: clone (clone.S:115)
==23687==  Address 0x4ca94e0 is 144 bytes inside a block of size 312 free'd
==23687==    at 0x4A0595D: free (vg_replace_malloc.c:366)
==23687==    by 0x3FBA8588B8: virFree (memory.c:309)
==23687==    by 0x3FBA86AAFC: virObjectUnref (virobject.c:145)
==23687==    by 0x3FBA8EA767: virConnectClose (libvirt.c:1458)
==23687==    by 0x40C8B8: vshDeinit (virsh.c:2584)
==23687==    by 0x41071E: main (virsh.c:3022)

The above race is caused by the eventLoop thread trying to handle
the net client event by calling the callback set by:
    virNetClientSetCloseCallback(priv->client,
                                 remoteClientCloseFunc,
                                 conn, NULL);

i.e. remoteClientCloseFunc, which locks/unlocks the virConnect object.

This patch fixes the bug by setting the callback to NULL in
doRemoteClose.
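
The shape of the fix, unregistering the callback before the connection
goes away, can be modelled with a toy client. The class and function
names are hypothetical, and the sketch leaves out the locking of the
real code.

```python
class Client:
    """Toy stand-in for a network client with a close callback."""

    def __init__(self):
        self.close_cb = None

    def set_close_callback(self, cb):
        self.close_cb = cb

    def incoming_event(self):
        # Runs on the event-loop thread; fires the callback if one is set.
        if self.close_cb is not None:
            self.close_cb()


def remote_close(client: Client) -> None:
    # Unregister first so a late event cannot touch the freed connection.
    client.set_close_callback(None)
```

After remote_close(), a late incoming_event() becomes a no-op instead
of calling into an object that is about to be freed.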
2012-12-06 00:43:18 +08:00
Peter Krempa
35aa14fcd0 pci: Fix building of 32bit PCI command array
The pciWrite32 function assembled the array of data to be written to the
fd with a bad offset on the last byte. This issue was probably caused by
a typo (14, 24).
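
For illustration, splitting a 32-bit value into four bytes uses shifts
of 0, 8, 16 and 24. The sketch assumes least-significant-byte-first
order, and pack32_le is a hypothetical name.

```python
def pack32_le(value: int) -> bytes:
    """Split a 32-bit value into four bytes, least significant first."""
    return bytes([
        (value >> 0) & 0xFF,
        (value >> 8) & 0xFF,
        (value >> 16) & 0xFF,
        (value >> 24) & 0xFF,  # a shift of 14 here would corrupt the top byte
    ])
```

For 0x12345678 this yields the bytes 0x78, 0x56, 0x34, 0x12, which
matches (0x12345678).to_bytes(4, "little").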
2012-12-05 14:04:54 +01:00
Jiri Denemark
ad65d1e502 util: Do not keep PCI device config file open
Directly open and close PCI config file in the APIs that need it rather
than keeping the file open for the whole life of PCI device structure.
2012-12-05 13:45:35 +01:00
Jiri Denemark
6910318798 qemu: Fix memory (and FD) leak on PCI device detach
Unmanaged PCI devices were only leaked if pciDeviceListAdd failed but
managed devices were always leaked. And leaking PCI device is likely to
leave PCI config file descriptor open. This patch fixes
qemuReattachPciDevice to either free the PCI device or add it to the
inactivePciHostdevs list.
2012-12-05 13:45:34 +01:00
Jiri Denemark
5eb8a7ac4d util: Slightly refactor PCI list functions
In order to be able to steal PCI device by its index in the list.
2012-12-05 13:45:34 +01:00
Jiri Denemark
ea1a9b5fdd qemu: Don't free PCI device if adding it to activePciHostdevs fails
The device is still referenced from pcidevs and freeing it would leave
an invalid pointer there.
2012-12-05 13:45:34 +01:00
Jiri Denemark
935550c6d3 qemu: Fix error code when attaching existing device
An attempt to attach a device that is already attached to a domain results
in the following error:

virsh # attach-device rhel6 pci2 --persistent
error: Failed to attach device from pci2
error: invalid argument: device is already in the domain configuration

The "invalid argument" error code looks wrong, we usually use "operation
invalid" when the action cannot be done in current state.
2012-12-05 13:45:34 +01:00
Osier Yang
9ee809d60c qemu: Simplify the code
"disk" is initialized to "dev->data.disk" in the beginning of the
function.
2012-12-05 12:45:10 +08:00
Osier Yang
8f218fbdfa storage: Remove the redundant white lines
Pushed under trivial rule.
2012-12-05 12:17:18 +08:00
Eric Blake
149fa591c1 qemu: improve error for failed JSON commands
Only one error in qemu_monitor was already using the relatively
new OPERATION_UNSUPPORTED error, even though it is a better fit
for all of the messages related to options that are unsupported
due to the version of qemu in use rather than due to a user's
XML or .conf file choice.  Suggested by Osier Yang.

* src/qemu/qemu_monitor.c (qemuMonitorSendFileHandle)
(qemuMonitorAddHostNetwork, qemuMonitorRemoveHostNetwork)
(qemuMonitorAttachDrive, qemuMonitorDiskSnapshot)
(qemuMonitorDriveMirror, qemuMonitorTransaction)
(qemuMonitorBlockCommit, qemuMonitorDrivePivot)
(qemuMonitorBlockJob, qemuMonitorSystemWakeup)
(qemuMonitorGetVersion, qemuMonitorGetMachines)
(qemuMonitorGetCPUDefinitions, qemuMonitorGetCommands)
(qemuMonitorGetEvents, qemuMonitorGetKVMState)
(qemuMonitorGetObjectTypes, qemuMonitorGetObjectProps)
(qemuMonitorGetTargetArch): Use better error category.
2012-12-04 15:56:03 -07:00
Eric Blake
3bef4adf73 qemu: nicer error message if live disk snapshot unsupported
Without this patch, attempts to create a disk snapshot when qemu
is too old results in a cryptic message:

virsh # snapshot-create 23 --disk-only
error: operation failed: Failed to take snapshot: unknown command: 'snapshot_blkdev'

Now it reports:

virsh # snapshot-create 23 --disk-only
error: unsupported configuration: live disk snapshot not supported with this QEMU binary

All versions of qemu that support live disk snapshot also support
QMP (basically upstream qemu 1.1 and later, and backports to RHEL 6.2).

* src/qemu/qemu_capabilities.h (QEMU_CAPS_DISK_SNAPSHOT): New
capability.
* src/qemu/qemu_capabilities.c (qemuCaps): Track it.
(qemuCapsProbeQMPCommands): Set it.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive): Use
it.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): Simplify.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextDiskSnapshot):
Delete.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextDiskSnapshot):
Likewise.
2012-12-04 15:53:41 -07:00
Eric Blake
2215befc8b rpc: fix build failure with older dbus
RHEL 6.3 uses dbus-devel-1.2.24, which lacked support for the
DBUS_TYPE_UNIX_FD define (contrast with Fedora 18 using 1.6.8).
But since it is an older dbus, it also lacks support for shutdown
inhibitions as provided by newer systemd.

Compilation failure introduced in commit 31330926.

* src/rpc/virnetserver.c (virNetServerAddShutdownInhibition):
Compile out if dbus is too old.
2012-12-04 15:50:11 -07:00
Jim Fehlig
cab0cfd5cf Fix memory leak introduced by commit 501bfad1
501bfad1 missed freeing priv->saveDir when opening the Xen unified
driver failed.
2012-12-04 10:39:07 -07:00
Bamvor Jian Zhang
501bfad194 implement managedsave in libvirt xen legacy driver
Implement the domainManagedSave, domainHasManagedSaveImage, and
domainManagedSaveRemove functions in the libvirt legacy xen driver.

domainHasManagedSaveImage checks for the managedsave image on the filesystem
every time. This is different from the qemu and libxl drivers. In the qemu or
libxl driver, there is a hasManagedSave flag in virDomainObjPtr which
is not used in the xen legacy driver. This flag could not be added to the xen
driver ptr either, because the driver ptr will be released at the end of
every libvirt api call. Meanwhile, AFAIK, xen stores all the flags in
xen, not in the libvirt xen driver. There is no need to add this flag in xen.

Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com>
2012-12-04 09:59:23 -07:00
Osier Yang
090eb35c0c Do not export symbol virStateActive anymore
Commit 79b8a56995 removed virStateActive, but forgot to
remove the exported symbol along with it. Pushed under build-breaker rule.
2012-12-04 23:41:10 +08:00
Daniel P. Berrange
313309261d Inhibit desktop shutdown while any virtual machines are running
Use the freedesktop inhibition DBus service to prevent host
shutdown or session logout while any VMs are running.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-04 12:14:04 +00:00
Daniel P. Berrange
79b8a56995 Replace polling for active VMs with signalling by drivers
Currently to deal with auto-shutdown libvirtd must periodically
poll all stateful drivers. This sucks because it requires
acquiring both the driver lock and locks on every single virtual
machine. Instead pass in an "inhibit" callback to virStateInitialize
which drivers can invoke whenever they want to inhibit shutdown
due to existence of active VMs.
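
The switch from polling to signalling can be sketched with a counter
that drivers update through a callback. The class is a toy model under
assumed semantics (one call per VM start or stop), not the daemon's
actual bookkeeping.

```python
class Daemon:
    """Toy daemon that tracks shutdown inhibitions pushed by drivers."""

    def __init__(self):
        self.inhibitions = 0

    def inhibit(self, active: bool) -> None:
        # Drivers call this when a VM starts (True) or stops (False).
        self.inhibitions += 1 if active else -1

    def may_shut_down(self) -> bool:
        return self.inhibitions == 0
```

The daemon never has to take driver or VM locks to decide whether it
may shut down; it only reads its own counter.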

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-04 12:14:04 +00:00
Daniel P. Berrange
ae2163f852 Only let VM drivers block libvirtd timed shutdown
The only important state that should prevent libvirtd shutdown
is from running VMs. Networks, host devices, network filters
and storage pools are all long lived resources that have no
significant in-memory state. They should not block shutdown.
2012-12-04 12:12:51 +00:00
Daniel P. Berrange
8f9a69317d Make QEMU perform managed save of all VMs on stop of libvirtd
When the virStateStop() method is invoked, perform a managed
save of all VMs currently running

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-04 12:07:49 +00:00
Ata E Husain Bohra
60f0f55ee4 Add iSCSI backend storage driver for ESX
The patch adds the backend driver to support iSCSI format storage pools
and volumes for ESX host. The mapping of ESX iSCSI specifics to Libvirt
is as follows:

1. ESX static iSCSI target <------> Libvirt Storage Pools
2. ESX iSCSI LUNs          <------> Libvirt Storage Volumes.

The above understanding is based on http://libvirt.org/storage.html.

The operations supported on iSCSI pools include:

1. List storage pools & volumes.
2. Get XML descriptor operation on pools & volumes.
3. Lookup operation on pools & volumes by name, UUID and path (if applicable).

iSCSI pools do not support operations such as creating or removing pools
and volumes.
2012-12-03 21:12:23 +01:00
Laine Stump
258fb278f2 qemu: support live update of an interface's filter
Since we can't (currently) rely on the ability to provide blanket
support for all possible network changes by calling the toplevel
netdev hostside disconnect/connect functions (due to qemu only
supporting a lockstep between initialization of host side and guest
side of devices), in order to support live change of an interface's
nwfilter we need to make a special purpose function to only call the
nwfilter teardown and setup functions if the filter for an interface
(or its parameters) changes. The pattern is nearly identical to that
used to change the bridge that an interface is connected to.

This patch was inspired by a request from Guido Winkelmann
<guido@sagersystems.de>, who tested an earlier version.
2012-12-03 14:35:58 -05:00
Stefan Berger
ab4139a493 nwfilter: utility function virNWFilterVarValueEqual
To detect if an interface's nwfilter has changed, we need to also
compare the filterparams, which is a hashtable of virNWFilterVarValue.
virHashEqual can do this nicely, but requires a pointer to a function
that will compare two of the items being stored in the hashes.
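
A rough model of comparing two tables with a caller-supplied value
comparator follows. The names are hypothetical, and treating a filter
variable value as either a string or a list of strings is an assumption
made for the sketch.

```python
def hash_equal(a: dict, b: dict, value_eq) -> bool:
    """True if both tables have the same keys and value_eq matches each pair."""
    if len(a) != len(b):
        return False
    return all(k in b and value_eq(v, b[k]) for k, v in a.items())


def var_value_equal(x, y) -> bool:
    """Compare two filter variable values (a string or a list of strings)."""
    return type(x) is type(y) and x == y
```

Two interfaces would then count as having the same filter parameters
when hash_equal(old, new, var_value_equal) is true.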
2012-12-03 14:35:58 -05:00
Laine Stump
3738cf41f1 conf: fix virDomainNetGetActualDirect*() and BridgeName()
This resolves:

   https://bugzilla.redhat.com/show_bug.cgi?id=881480

These three functions:

  virDomainNetGetActualBridgeName
  virDomainNetGetActualDirectDev
  virDomainNetGetActualDirectMode

return attributes that are in a union whose contents are interpreted
differently depending on the actual->type and so they should only
return non-0 when actual->type is 'bridge' (in the first case) or
'direct' (in the other two cases), but I had neglected to do that, so
...DirectDev() was returning bridge.brname (which happens to share the
same spot in the union with direct.linkdev) if actual->type was
'bridge', and ...BridgeName was returning direct.linkdev when
actual->type was 'direct'.

How does this involve Bug 881480 (which was about the inability to
switch between two networks that both have "<forward mode='bridge'/>
<bridge name='xxx'/>")? Whenever the return value of
virDomainNetGetActualDirectDev() for the new and old network
definitions doesn't match, qemuDomainChangeNet() requires a "complete
reconnect" of the device, which qemu currently doesn't
support. ...DirectDev() *should* have been returning NULL for old and
new, but was instead returning the old and new bridge names, which
differ.

(The other two functions weren't causing any behavioral problems in
virDomainChangeNet(), but their problem and fix was identical, so I
included them in this same patch).
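
The corrected accessors can be modelled by guarding on the type before
reading the shared slot. In this sketch the "data" key stands in for
the union storage, and the function names are hypothetical.

```python
def actual_bridge_name(actual: dict):
    """Return the bridge name only when the actual type is 'bridge'."""
    return actual["data"] if actual["type"] == "bridge" else None


def actual_direct_dev(actual: dict):
    """Return the link device only when the actual type is 'direct'."""
    return actual["data"] if actual["type"] == "direct" else None
```

For two bridge-type definitions with different bridge names,
actual_direct_dev() now returns None for both, so a comparison of the
old and new values no longer reports a difference.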
2012-12-03 14:01:34 -05:00
Peter Krempa
8312435707 maint: Misc whitespace cleanups 2012-12-03 15:13:32 +01:00
Ján Tomko
bc680e1381 conf: prevent crash with no uuid in cephx auth secret
Fix the null pointer access when UUID is not specified.
Introduce a bool 'uuidUsable' to virStoragePoolAuthCephx that indicates
if uuid was specified or not and use it instead of the pointless
comparison of the static UUID array to NULL.
Add an error message if both uuid and usage are specified.

Fixes:
Error: FORWARD_NULL (CWE-476):
libvirt-0.10.2/src/conf/storage_conf.c:461: var_deref_model: Passing
    null pointer "uuid" to function "virUUIDParse(char const *, unsigned
    char *)", which dereferences it. (The dereference is assumed on the
    basis of the 'nonnull' parameter attribute.)
Error: NO_EFFECT (CWE-398):
    libvirt-0.10.2/src/conf/storage_conf.c:979: array_null: Comparing an
    array to null is not useful: "src->auth.cephx.secret.uuid != NULL".
2012-12-03 15:13:32 +01:00
Osier Yang
05858b27d4 Fix the coding style
Fix the "if ... else" coding style, and indentation problems.
2012-12-03 21:20:50 +08:00
Osier Yang
cc3548abe3 Fix indentions 2012-12-03 09:58:57 +08:00
Eric Blake
5a608c3dee logging: more API needing to log flags
Commit a21f5112 fixed one API, but missed two others that also
failed to log their 'flags' argument.

* src/libvirt.c (virNodeSuspendForDuration, virDomainGetHostname):
Log flags parameter.
2012-11-30 13:23:32 -07:00
Daniel P. Berrange
47e99e0d77 s/flags=%u/flags=%x/ in earlier commit
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-30 20:05:43 +00:00
Daniel P. Berrange
76c1fd33c8 Introduce APIs for splitting/joining strings
This introduces a few new APIs for dealing with strings.
One to split a char * into a char **, another to join a
char ** into a char *, and finally one to free a char **

There is a simple test suite to validate the edge cases
too. No more need to use the horrible strtok_r() API,
or hand-written code for splitting strings.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-30 20:05:43 +00:00
Daniel P. Berrange
cbb106f807 Add support for shutdown / reboot APIs in LXC driver
Add support for doing controlled shutdown / reboot in the LXC
driver. The default behaviour is to try talking to /dev/initctl
inside the container's virtual root (/proc/$INITPID/root). This
works with sysvinit or systemd. If that file does not exist
then send SIGTERM (for shutdown) or SIGHUP (for reboot). These
signals are not any kind of particular standard for shutdown
or reboot, just something apps can choose to handle. The new
virDomainSendProcessSignal allows for sending custom signals.

We might allow the choice of SIGTERM/HUP to be configured for
LXC containers via the XML in the future.
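
The fallback order described above might be sketched like this.
pick_shutdown_method is a hypothetical helper that only decides what to
do; it does not write to initctl or send any signal. The exists
parameter is there so the decision can be exercised without a real
container.

```python
import os
import signal


def pick_shutdown_method(init_pid: int, reboot: bool, exists=os.path.exists):
    """Return ('initctl', path) or ('signal', signum) for a container."""
    initctl = "/proc/%d/root/dev/initctl" % init_pid
    if exists(initctl):
        return ("initctl", initctl)
    # No initctl: fall back to a signal the init process may choose to handle.
    return ("signal", signal.SIGHUP if reboot else signal.SIGTERM)
```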

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-30 19:20:14 +00:00
Daniel P. Berrange
dff4a753c4 Move reboot/shutdown flags combination check into QEMU driver
The fact that only the guest agent, or ACPI flag can be used
when requesting reboot/shutdown is merely a limitation of the
QEMU driver impl at this time. Thus it should not be in
libvirt.c code

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-30 19:18:27 +00:00
Daniel P. Berrange
c4ef575c97 Add APIs for talking to init via /dev/initctl
To be able to do controlled shutdown/reboot of containers an
API to talk to init via /dev/initctl is required. Fortunately
this is quite straightforward to implement, and is supported
by both sysvinit and systemd. Upstart support for /dev/initctl
is unclear.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-30 19:17:30 +00:00
Daniel P. Berrange
a21f51121d Ensure virDomainShutdownFlags logs flags parameter
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-30 19:16:48 +00:00
Daniel P. Berrange
07da0a6b54 Quote client identity in SASL whitelist log message
When seeing a message

 virNetSASLContextCheckIdentity:146 : SASL client admin not allowed in whitelist

it isn't immediately obvious that 'admin' is the identity
being checked. Quote the string to make it more obvious.
2012-11-30 19:16:05 +00:00
Viktor Mihajlovski
3c465728bf qemu: Fix up the default machine type for QMP probing
The default machine type must be stored in the first element of
the caps->machineTypes array. This was done for help output
parsing but not for QMP probing.

Added a helper function qemuSetDefaultMachine to apply the same
fix up for both probing methods.

Further, it was necessary to set caps->nmachineTypes after QMP
probing.
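
Moving the default entry to the front of the list can be sketched as
below. The helper name is hypothetical, and keeping the remaining
entries in their original order is an assumption of the sketch.

```python
def set_default_machine(machines: list, default_idx: int) -> None:
    """Move the default machine type to index 0, keeping the rest in order."""
    machines.insert(0, machines.pop(default_idx))
```

Applying the same helper after either probing method keeps the "first
element is the default" rule in one place.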

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-11-30 11:56:57 -07:00
Guido Günther
d01e427e01 Fix uninitialized variables
detected by

	http://honk.sigxcpu.org:8001/job/libvirt-build/348/console
2012-11-30 19:12:06 +01:00
Eric Blake
3d7f6649e8 qemu: don't attempt undefined QMP commands
https://bugzilla.redhat.com/show_bug.cgi?id=872292

Libvirt should not attempt to call a QMP command that has not been
documented in qemu.git - if future qemu introduces a command by the
same name but with subtly different semantics, then libvirt will be
broken when trying to use that command.

We also had some code that could never be reached - some of our
commands have an alternate for new vs. old qemu HMP commands; but
if we are new enough to support QMP, we only need a fallback to
the new HMP counterpart, and don't need to try for a QMP counterpart
for the old HMP version.

See also this attempt to convert the three snapshot commands to QMP:
https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg01597.html
although it looks like that will still not happen before qemu 1.3.
That thread eventually decided that qemu would use the name
'save-vm' rather than 'savevm', which mitigates the fact that
libvirt's attempt to use a QMP 'savevm' would be broken, but we
might not be as lucky on the other commands.

* src/qemu/qemu_monitor_json.c (qemuMonitorJSONSetCPU)
(qemuMonitorJSONAddDrive, qemuMonitorJSONDriveDel)
(qemuMonitorJSONCreateSnapshot, qemuMonitorJSONLoadSnapshot)
(qemuMonitorJSONDeleteSnapshot): Use only HMP fallback for now.
(qemuMonitorJSONAddHostNetwork, qemuMonitorJSONRemoveHostNetwork)
(qemuMonitorJSONAttachDrive, qemuMonitorJSONGetGuestDriveAddress):
Delete; QMP implies QEMU_CAPS_DEVICE, which prefers AddNetdev,
RemoveNetdev, and AddDrive anyways (qemu_hotplug.c has all callers).
* src/qemu/qemu_monitor.c (qemuMonitorAddHostNetwork)
(qemuMonitorRemoveHostNetwork, qemuMonitorAttachDrive): Reflect
deleted commands.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONAddHostNetwork)
(qemuMonitorJSONRemoveHostNetwork, qemuMonitorJSONAttachDrive):
Likewise.
2012-11-30 09:51:09 -07:00
Eric Blake
ddd103d342 storage: fix scsi detach regression with cgroup ACLs
https://bugzilla.redhat.com/show_bug.cgi?id=876828

Commit 38c4a9cc introduced a regression in hot unplugging of disks
from qemu, where cgroup device ACLs were no longer being revoked
(thankfully not a security hole: cgroup ACLs only prevent open()
of the disk; so reverting the ACL prevents future abuse but doesn't
stop abuse from an fd that was already opened before the ACL change).

Commit 1b2ebf95 overlooked that there were two spots affected.

* src/qemu/qemu_hotplug.c (qemuDomainDetachDiskDevice):
Transfer backing chain before deletion.
* src/qemu/qemu_driver.c (qemuDomainDetachDeviceDiskLive): Fix
spacing (partly to ensure a different-looking patch).
2012-11-30 08:26:34 -07:00
Ján Tomko
4f9af0857c nwfilter: report an error on OOM
Also removed some unreachable code found by coverity:
libvirt-0.10.2/src/nwfilter/nwfilter_driver.c:259: unreachable: This
code cannot be reached: "nwfilterDriverUnlock(driver...".
2012-11-30 15:35:14 +01:00
Peter Krempa
6c5c4b8d4d qemu: Refactor error reporting in qemu driver configuration parser
This patch adds two labels and gets rid of a ton of duplicated code.
This patch also fixes some error message and switches most of them to
proper error reporting functions.
2012-11-29 22:23:16 +01:00
Peter Krempa
7aba113ca7 qemu: Refactor config parameter retrieval
This patch adds macros to help retrieve configuration values from qemu
driver's configuration. Some configuration options are grouped
together in the process.
2012-11-29 21:54:16 +01:00
Laine Stump
753ff83a50 network: use dnsmasq --bind-dynamic when available
This patch resolves CVE-2012-3411, which is described in the following
bugzilla report:

  https://bugzilla.redhat.com/show_bug.cgi?id=833033

The following report is specifically for libvirt on Fedora:

  https://bugzilla.redhat.com/show_bug.cgi?id=874702

In short, a dnsmasq instance run with the intention of listening for
DHCP/DNS requests only on a libvirt virtual network (which is
constructed using a Linux host bridge) would also answer queries sent
from outside the virtualization host.

This patch takes advantage of a new dnsmasq option "--bind-dynamic",
which will cause the listening socket to be set up such that it will
only receive those requests that actually come in via the bridge
interface. In order for this behavior to actually occur, not only must
"--bind-interfaces" be replaced with "--bind-dynamic", but also all
"--listen-address" options must be replaced with a single
"--interface" option. Fully:

   --bind-interfaces --except-interface lo --listen-address x.x.x.x ...

(with --listen-address possibly repeated) is replaced with:

   --bind-dynamic --interface virbrX

Of course libvirt can't use this new option if the host's dnsmasq
doesn't have it, but we still want libvirt to function (because the
great majority of libvirt installations, which only have mode='nat'
networks using RFC1918 private address ranges (e.g. 192.168.122.0/24),
are immune to this vulnerability from anywhere beyond the local subnet
of the host), so we use the new dnsmasqCaps API to check if dnsmasq
supports the new option and, if not, we use the "old" option style
instead. In order to ensure that this permissiveness doesn't lead to a
vulnerable system, we do check for non-private addresses in this case,
and refuse to start the network if both a) we are using the old-style
options, and b) the network has a publicly routable IP
address. Hopefully this will provide the proper balance of not being
disruptive to those not practically affected, and making sure that
those who *are* affected get their dnsmasq upgraded.

(--bind-dynamic was added to dnsmasq in upstream commit
54dd393f3938fc0c19088fbd319b95e37d81a2b0, which was included in
dnsmasq-2.63)
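
The decision described above can be sketched as a small function. This is only an illustration of the logic in the message, not the code in bridge_driver.c; the enum, the function name and both parameters are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; these are not libvirt identifiers. */
enum dnsmasq_plan {
    PLAN_BIND_DYNAMIC,     /* --bind-dynamic --interface virbrX */
    PLAN_BIND_INTERFACES,  /* --bind-interfaces --except-interface lo
                              --listen-address x.x.x.x ... */
    PLAN_REFUSE            /* old dnsmasq plus a publicly routable address */
};

enum dnsmasq_plan
choose_dnsmasq_plan(bool has_bind_dynamic, bool all_addrs_private)
{
    /* A new enough dnsmasq can always use the safe option set. */
    if (has_bind_dynamic)
        return PLAN_BIND_DYNAMIC;

    /* Old-style options with a public address: refuse to start. */
    if (!all_addrs_private)
        return PLAN_REFUSE;

    /* Old-style options, private addresses only: keep working. */
    return PLAN_BIND_INTERFACES;
}
```

Only the combination of a missing --bind-dynamic and a non-private address ends in a refusal, which matches conditions a) and b) in the message.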
2012-11-29 15:02:39 -05:00
Laine Stump
bf402e77b6 util: new virSocketAddrIsPrivate function
This new function returns true if the given address is in the range of
any "private" or "local" networks as defined in RFC1918 (IPv4) or
RFC3484/RFC4193 (IPv6), otherwise it returns false.

These ranges are:

   192.168.0.0/16
   172.16.0.0/12
   10.0.0.0/8
   FC00::/7
   FEC0::/10
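
A minimal sketch of the IPv4 half of such a check, using the RFC1918 blocks 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16. The function name and the host-byte-order uint32_t argument are assumptions made for the example; the real function operates on a virSocketAddr and also covers the IPv6 ranges.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper. 'addr' is an IPv4 address in host byte order,
 * e.g. 192.168.122.1 is 0xC0A87A01. */
bool ipv4_is_private(uint32_t addr)
{
    return (addr & 0xFF000000u) == 0x0A000000u ||   /* 10.0.0.0/8     */
           (addr & 0xFFF00000u) == 0xAC100000u ||   /* 172.16.0.0/12  */
           (addr & 0xFFFF0000u) == 0xC0A80000u;     /* 192.168.0.0/16 */
}
```

Each test masks off the host bits and compares the remaining prefix, so 172.31.255.255 matches the second block while 172.32.0.0 does not.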
2012-11-29 15:02:39 -05:00
Laine Stump
719c2c7665 util: capabilities detection for dnsmasq
In order to optionally take advantage of new features in dnsmasq when
the host's version of dnsmasq supports them, but still be able to run
on hosts that don't support the new features, we need to be able to
detect the version of dnsmasq running on the host, and possibly
determine from the help output what options are in this dnsmasq.

This patch implements a greatly simplified version of the capabilities
code we already have for qemu. A dnsmasqCaps device can be created and
populated either from running a program on disk, reading a file with
the concatenated output of "dnsmasq --version; dnsmasq --help", or
examining a buffer in memory that contains the concatenated output of
those two commands. Simple functions to retrieve capabilities flags,
the version number, and the path of the binary are also included.

bridge_driver.c creates a single dnsmasqCaps object at driver startup,
and disposes of it at driver shutdown. Any time it must be used, the
dnsmasqCapsRefresh method is called - it checks the mtime of the
binary, and re-runs the checks if the binary has changed.

networkxml2argvtest.c creates 2 "artificial" dnsmasqCaps objects at
startup - one "restricted" (doesn't support --bind-dynamic) and one
"full" (does support --bind-dynamic). Some of the test cases use one
and some the other, to make sure both code paths are tested.
2012-11-29 15:02:39 -05:00
Ján Tomko
892582f9de conf: fix uninitialized variable in virDomainListSnapshots
If allocation of names fails, list is uninitialized.
2012-11-29 10:10:08 -07:00
Ján Tomko
6e1fc35546 rpc: don't destroy xdr before creating it in virNetMessageEncodeHeader
On OOM, xdr_destroy got called even though it wasn't created yet.

Found by coverity:
Error: UNINIT (CWE-457):
    libvirt-0.10.2/src/rpc/virnetmessage.c:214: var_decl: Declaring
    variable "xdr" without initializer.
    libvirt-0.10.2/src/rpc/virnetmessage.c:219: cond_true: Condition
    "virReallocN(&msg->buffer, 1UL /* sizeof (*msg->buffer) */,
    msg->bufferLength) < 0", taking true branch
    libvirt-0.10.2/src/rpc/virnetmessage.c:221: goto: Jumping to label
    "cleanup"
    libvirt-0.10.2/src/rpc/virnetmessage.c:257: label: Reached label
    "cleanup"
    libvirt-0.10.2/src/rpc/virnetmessage.c:258: uninit_use: Using
    uninitialized value "xdr.x_ops".
2012-11-29 10:10:08 -07:00
Ján Tomko
7730257db3 util: fix virBitmap allocation in virProcessInfoGetAffinity
Found by coverity:
Error: REVERSE_INULL (CWE-476):
    libvirt-0.10.2/src/util/processinfo.c:141: deref_ptr: Directly
    dereferencing pointer "map".
    libvirt-0.10.2/src/util/processinfo.c:142: check_after_deref:
    Null-checking "map" suggests that it may be null, but it has already
    been dereferenced on all paths leading to the check.
2012-11-29 10:10:08 -07:00
Ján Tomko
d5e8842538 conf: fix NULL check in virNetDevBandwidthParse
Found by coverity:
Error: REVERSE_INULL (CWE-476):
    libvirt-0.10.2/src/conf/netdev_bandwidth_conf.c:99: deref_ptr:
    Directly dereferencing pointer "node".
    libvirt-0.10.2/src/conf/netdev_bandwidth_conf.c:107:
    check_after_deref: Null-checking "node" suggests that it may be
    null, but it has already been dereferenced on all paths leading to
    the check.
2012-11-29 10:10:08 -07:00
Daniel P. Berrange
f4ea67f5b3 Turn some dual-state int parameters into booleans
The virStateInitialize method and several cgroups methods were
using an 'int privileged' parameter or similar for dual-state
values. These are better represented with the bool type.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-29 16:14:43 +00:00
Daniel P. Berrange
d442ee23bd Introduce a 'stop' method to virDriverState
To allow actions to be performed in libvirtd when the host
shuts down, or user session exits, introduce a 'stop'
method to virDriverState. This will do things like saving
the VM state to a file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-29 16:14:36 +00:00
Daniel P. Berrange
992ed55fea Implement virDomainSendProcessSignal for LXC driver
Implement the new API for sending signals to processes in a guest
for the LXC driver. Only support sending signals to the init
process for now, because

 - The kernel does not appear to expose the mapping between
   container PID numbers and host PID numbers anywhere in the
   host OS namespace
 - There is no race-free way to validate whether a host PID
   corresponds to a process in a container.

* src/lxc/lxc_driver.c: Allow sending processes signals

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-29 15:50:12 +00:00
Daniel P. Berrange
c51babd90e Specify remote protocol for virDomainSendProcessSignal
* src/remote/remote_protocol.x: message definition
* src/remote/remote_driver.c: Register driver function
* src/remote_protocol-structs: Test case

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-29 15:50:12 +00:00
Daniel P. Berrange
46c329bcc5 Add virDomainSendProcessSignal API
Add an API for sending signals to arbitrary processes in the
guest OS. This is primarily useful for container based virt,
but can be used for machine virt too, if there is a suitable
guest agent.

* include/libvirt/libvirt.h.in: Add virDomainSendProcessSignal
  and virDomainProcessSignal enum
* src/driver.h: Driver entry point
* src/libvirt.c, src/libvirt_public.syms: Impl for new API

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-29 15:50:12 +00:00
Jiri Denemark
c0ee3d3b54 qemu: Remove full stop from error messages 2012-11-29 14:16:48 +01:00
Guido Günther
d521119c09 Don't fail hard when we can't connect to the monitor
As of 1a50ba2cb0 we fail to connect to the
monitor instead of getting an exit status != 0 from qemu itself.  This
breaks capabilities probing for the non QMP case.
2012-11-29 13:54:44 +01:00
Michal Privoznik
5049b53689 libvirt.c: Fix wording and grammar in virDomainFSTrim
The documentation to this API has some defects from
grammar and wording POV. These were raised after I've
pushed the patches, so they are in a separate commit.
2012-11-29 09:30:58 +01:00
Osier Yang
ebdbe25a97 node_memory: Do not fail if there is parameter unsupported
It makes no sense to fail the whole get command if a parameter is
unsupported by the kernel. This patch fixes that by omitting
unsupported parameters from getMemoryParameters.

For setMemoryParameters, this checks up front whether any parameter
is unsupported, and returns failure before setting anything if not
all parameters are supported.
2012-11-29 15:36:23 +08:00
Daniel P. Berrange
b7aba48bca Rename misc QEMU structs/enums to use normal naming style
Replace the following names

 * struct qemu_snap_remove  with virQEMUSnapRemovePtr
 * struct qemu_snap_reparent with virQEMUSnapReparentPtr
 * struct qemu_save_header with virQEMUSaveHeaderPtr
 * enum qemu_save_formats with virQEMUSaveFormat

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-28 18:17:31 +00:00
Daniel P. Berrange
4738c2a7e7 Replace 'struct qemud_driver *' with virQEMUDriverPtr
Remove the obsolete 'qemud' naming prefix and underscore
based type name. Introduce virQEMUDriverPtr as the replacement,
in common with LXC driver naming style
2012-11-28 18:17:25 +00:00
Laine Stump
012d69dff1 network: fix crash when portgroup has no name
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=879473

The name attribute is required for portgroup elements (yes, the RNG
specifies that), and there is code in libvirt that assumes it is
non-null.  Unfortunately, the portgroup parsing function wasn't
checking for lack of a name. One adverse result of this was that
attempts to update a network by adding a portgroup with no name would
cause libvirtd to segfault. For example:

   virsh net-update default add portgroup "<portgroup default='yes'/>"

This patch causes virNetworkPortGroupParseXML to fail if no name is
specified, thus avoiding any later problems.
2012-11-28 11:59:30 -05:00
Michal Privoznik
4ded3fb1c2 maint: Fix use of invalid reboot flags
Throughout the code, we've always used VIR_DOMAIN_SHUTDOWN* flags
even for virDomainReboot() API and its implementation. Fortunately,
the appropriate macros have the same values. But if we want to keep
things consistent, we should be using the correct macros. This
patch doesn't break anything, luckily.
2012-11-28 17:45:30 +01:00
Hu Tao
39ad0001ca build: more fix to avoid C99 for loop
see commit 7e5aa78d0f

* src/interface/interface_backend_udev.c: Declare variable sooner.
2012-11-28 09:34:51 -07:00
Eric Blake
89cf363061 nwfilter: drop dead code
Commit cb022152 went overboard and introduced a dead conditional
while trying to get rid of a potential NULL dereference.

* src/nwfilter/nwfilter_dhcpsnoop.c (virNWFilterSnoopReqNew):
Remove redundant conditional.
2012-11-28 09:21:33 -07:00
Ján Tomko
7794e02c56 util: check for NULL parameter in virFileWrapperFdCatchError
This reverts 8927c0e qemu: fix a crash when save file can't be opened
and allows virFileWrapperFdCatchError to be called with NULL instead.
2012-11-29 00:00:39 +08:00
Ján Tomko
0361917619 conf: snapshot: check return value of virDomainSnapshotObjListNum
If it's negative, this might result in a request to allocate lots of
memory.
2012-11-29 00:00:39 +08:00
Ján Tomko
34e5791332 conf: check the return value of virXPathNodeSet
In a few places, the return value could get passed to VIR_ALLOC_N without
being checked, resulting in a request to allocate a lot of memory if the
return value was negative.
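
A minimal sketch of the pattern, assuming a hypothetical helper whose 'count' argument stands in for such a return value. A negative int converted to size_t becomes a huge number ((size_t)-1 is SIZE_MAX), so the sign has to be checked before the value reaches an allocator.

```c
#include <stdlib.h>

/* Hypothetical helper, not libvirt code. 'count' plays the role of a
 * return value where a negative number signals an error. */
int *alloc_items(int count)
{
    if (count < 0)
        return NULL;    /* the lookup failed: do not allocate anything */

    /* calloc(0, ...) may legally return NULL, so ask for at least one
     * element to keep NULL unambiguous as the error value. */
    return calloc(count > 0 ? (size_t)count : 1, sizeof(int));
}
```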
2012-11-29 00:00:39 +08:00
Ján Tomko
7475ee0f75 libssh2_session: support DSS keys as well
Missing break in the switch.
2012-11-29 00:00:39 +08:00
Ján Tomko
28a6fd9396 cgroup: fix impossible overrun in virCgroupAddTaskController
The size of the controllers array is VIR_CGROUP_CONTROLLER_LAST, however
we only call it with values less than VIR_CGROUP_CONTROLLER_LAST.
2012-11-29 00:00:39 +08:00
Ján Tomko
cb02215252 nwfilter: fix NULL pointer check in virNWFilterSnoopReqNew
This can't lead to a crash since virNWFilterSnoopReqNew is only called
with a static array as the argument, but if we check for NULL we should
do it right.
2012-11-29 00:00:39 +08:00
Peter Krempa
d3337028f5 qemu: Fix error messages when dispatching guest agent commands
Error messages produced while dispatching guest agent commands didn't
have an apparent reference to the fact that they are dealing with guest
agent commands. This patch fixes up some of the messages to contain that
reference.
2012-11-28 16:36:34 +01:00
Peter Krempa
86727836c2 qemu: Drop word "either" from comments for agent monitor functions 2012-11-28 16:36:34 +01:00
Michal Privoznik
6092fea93a qemu: Implement virDomainFSTrim
using qemu guest agent. As said in previous patch,
@mountPoint must be NULL and @flags zero because
qemu guest agent doesn't support these arguments
yet. If qemu learns them, we can start supporting
them as well.
2012-11-28 16:15:01 +01:00
Michal Privoznik
bcbe646d92 remote: Implement virDomainFSTrim
A new rule to fixup_name() in gendispatch.pl needs to be added,
otherwise we are left with remoteDomainFstrim which is not wanted.
2012-11-28 16:15:01 +01:00
Michal Privoznik
0fbf3704fd Introduce virDomainFSTrim() public API
This will call FITRIM within the guest. The API has 4 arguments;
however, only 2 will be used for now (@dom and @minimum).
The remaining two are there in case the qemu guest agent learns them in the future.
2012-11-28 16:15:01 +01:00
Viktor Mihajlovski
856a482207 qemu: Add QEMU version computation to QMP probing
With QMP capability probing, the version was not set.
virsh version returns:
...
Cannot extract running QEMU hypervisor version

This is fixed by computing caps->version from QMP major,
minor, micro values.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-11-28 14:54:44 +00:00
Viktor Mihajlovski
1a50ba2cb0 qemu: Fix QMP Capability Probing Failure
QMP Capability probing will fail if QEMU cannot bind to the
QMP monitor socket in the qemu_driver->libDir directory.
That's because the child process is stripped of all
capabilities and this directory is chown'ed to the configured
QEMU user/group (normally qemu:qemu) by the QEMU driver.

To prevent this from happening, the driver startup will now pass
the QEMU uid and gid down to the capability probing code.
All capability probing invocations of QEMU will be run with
the configured QEMU uid instead of libvirtd's.

Further, the pid file handling is moved to libvirt, as QEMU
cannot write to the qemu_driver->runDir (root:root). This also
means that the libvirt daemonizing must be used.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-11-28 14:54:29 +00:00
Viktor Mihajlovski
7a95eccc81 qemu: Wait for monitor socket even without pid
If qemuMonitorOpenUnix is called without a related pid, i.e. for
QMP probing, a connect failure can happen as the result of a race.
Without a pid there is no retry and thus we give up too early.
This changes the code to retry if no pid is supplied.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-11-28 14:54:21 +00:00
Gao feng
df33ecdd9e mount fuse's meminfo file to container's /proc/meminfo
We already have a virtualized meminfo for the container through the fuse
filesystem; add function lxcContainerMountProcFuse to mount this meminfo
file at the container's /proc/meminfo.

So we can now isolate the container's /proc/meminfo from the host.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2012-11-28 10:28:49 +00:00
Gao feng
d671c0ed1b make /proc/meminfo isolate with host through fuse
With this patch, the container's meminfo is shown based on the
container's memory cgroup.

Right now it's impossible to virtualize all values in meminfo, so
I collect only some of them: MemTotal, MemFree, Cached, Active,
Inactive, Active(anon), Inactive(anon), Active(file), Inactive(file),
Unevictable, SwapTotal, SwapFree.

If I missed something, please let me know.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2012-11-28 10:28:49 +00:00
Gao feng
729acc23df add interface virCgroupGetAppRoot
Because libvirt_lxc's cgroup mountpoint is what is shown
in /proc/self/cgroup, we can get the container's cgroup through
virCgroupNew("/", &group). Add interface virCgroupGetAppRoot to
help the container get its cgroup.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2012-11-28 10:28:49 +00:00
Gao feng
4d4f371e09 add interface virCgroupGetMemSwapUsage
virCgroupGetMemSwapUsage is used to get the container's swap usage;
with this interface, we can get swap usage in the fuse filesystem.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2012-11-28 10:28:49 +00:00
Gao feng
2a596dac5e add fuse support for libvirt lxc
This patch adds fuse support for libvirt lxc.
We can use a fuse filesystem to generate sysinfo dynamically,
so we can isolate /proc/meminfo, cpuinfo and so on through
the fuse filesystem.

We mount a fuse filesystem for every container.
The mount name is libvirt, and the mount point is
localstatedir/run/libvirt/lxc/containername.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2012-11-28 10:28:49 +00:00
Guannan Ren
237629d204 bitmap: fix typo to use UL type of integer constant in virBitmapIsAllSet
This bug leads to getting incorrect vcpupin information via
the qemudDomainGetVcpuPinInfo() API when the maximum number of
CPUs on a host falls in the range 31 < ncpus < 64.

gcc warning:
left shift count >= width of type

The following bug is one such case:
https://bugzilla.redhat.com/show_bug.cgi?id=876415
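
The class of bug can be illustrated with a simplified all-bits-set check. This is a sketch written for this note, not the virBitmapIsAllSet source, and the function and macro names are hypothetical. On platforms where int is 32 bits and unsigned long is 64 bits, `1 << n` is evaluated as an int, so shift counts of 32 and above are invalid; `1UL << n` performs the shift at unsigned long width.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_UNIT (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: true if the low 'nbits' bits of 'map' are all set. */
bool bitmap_is_all_set(const unsigned long *map, size_t nbits)
{
    size_t full = nbits / BITS_PER_UNIT;   /* completely used words */
    size_t tail = nbits % BITS_PER_UNIT;   /* bits used in the last word */
    size_t i;

    for (i = 0; i < full; i++)
        if (map[i] != ULONG_MAX)
            return false;

    if (tail) {
        /* 1UL, not 1: the mask must be built at unsigned long width. */
        unsigned long mask = (1UL << tail) - 1;
        if ((map[full] & mask) != mask)
            return false;
    }
    return true;
}
```

Because tail is always smaller than the width of unsigned long, the shift count stays in range on any platform.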
2012-11-28 18:30:28 +08:00
Ján Tomko
8927c0eab6 qemu: fix a crash when save file can't be opened
In qemuDomainSaveMemory, wrapperFd might be NULL and should be checked before
calling virFileWrapperFdCatchError. Same in doCoreDump.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=880919
2012-11-28 10:24:31 +01:00
Daniel P. Berrange
ebb1ccb517 Alphabetically sort libvirt_daemon.syms
Sort the symbols listed in libvirt_daemon.syms

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 19:37:13 +00:00
Daniel P. Berrange
54f89ef1fc Change bridge driver to use named initializers with virDriverState
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 19:37:07 +00:00
Alexander Larsson
d74b03e51c virdbus: Add virDBusGetSessionBus helper
This splits out some common code from virDBusGetSystemBus and
uses it to implement a new virDBusGetSessionBus helper.
2012-11-27 19:37:00 +00:00
Daniel P. Berrange
7492276317 s/qemud/qemu/ in QEMU driver sources
Change some legacy function names to use 'qemu' as their
prefix instead of 'qemud' which was a hang over from when
the QEMU driver ran inside a separate daemon

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 19:36:36 +00:00
Daniel P. Berrange
509ce9437f Fix leak of virNetworkPtr in LXC startup failure path
When starting an LXC guest with a virNetwork based NIC device,
if the network was not active, the virNetworkPtr device would
be leaked

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:59:28 +00:00
Daniel P. Berrange
0584d6626b Fix error reporting in virNetDevVethDelete
In virNetDevVethDelete the virRun method will properly report
errors, but when checking the exit status for non-zero exit
code no error is reported

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:59:28 +00:00
Daniel P. Berrange
9d2bfc1ca7 Ensure transient def is removed if LXC start fails
When starting a container, newDef is initialized to a
copy of 'def', but when startup fails newDef is never
removed. This causes later attempts to use 'virDomainDefine'
to lose the new data being defined.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:59:23 +00:00
Daniel P. Berrange
43db9cf4ed Ensure failure to create macvtap device aborts LXC start
A mistaken initialization of 'ret' caused failure to create
macvtap devices to be ignored. The libvirt_lxc process
would later fail to start due to missing devices

Also make sure code checks '< 0' and not '!= 0' since only
-1 is considered an error condition

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
68dceb635d Avoid crash when LXC start fails with no interface target
If the <interface> device did not contain any <target>
element, LXC would crash on a NULL pointer if starting
the container failed

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
e11daa2b60 Specify name of target interface with macvlan error
When failing to create a macvlan interface, make sure the
error message contains the name of the host interface

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
7c5ba648f7 Treat missing driver cgroup as fatal in LXC driver
The LXC driver relies on use of cgroups to kill off LXC processes
in shutdown. If cgroups aren't available, we're unable to kill
off processes, so we must treat lack of cgroups as a fatal startup
error.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
8e1f0c38fa Ensure LXC container exits if cgroups setup fails
The code setting up LXC cgroups used an 'rc' variable both
for capturing the return value of methods it calls, and
its own return status. The result was that several failures
in setting up cgroups would actually result in success being
returned.

Use a separate 'ret' for tracking return value as per normal
code design in other parts of libvirt
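
A minimal sketch of the rc/ret split, with made-up step functions standing in for the individual cgroup setup calls. The names below are hypothetical and only illustrate the convention described above.

```c
#include <stddef.h>

typedef int (*setup_step)(void);

/* Stand-in steps for the example: one succeeds, one fails. */
int step_ok(void)   { return 0; }
int step_fail(void) { return -1; }

int run_setup(const setup_step *steps, size_t nsteps)
{
    int ret = -1;   /* this function's own status, pessimistic by default */
    int rc;         /* scratch value for each callee's result */
    size_t i;

    for (i = 0; i < nsteps; i++) {
        rc = steps[i]();
        if (rc < 0)
            goto cleanup;   /* ret is still -1 here */
    }

    ret = 0;        /* only reached when every step succeeded */

cleanup:
    return ret;
}
```

Since ret is assigned success in exactly one place, a later successful callee cannot overwrite an earlier failure.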

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
ea2fec86dd Store initpid in the domain status XML for LXC
The initpid will be required long term to enable LXC to
implement various hotplug operations. Thus it needs to be
persisted in the domain status XML. LXC has not used the
domain status XML before, so this introduces use of the
helpers.
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
a33d8fceee Remove bogus newline at end of debug log message
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 17:02:22 +00:00
Daniel P. Berrange
f999e2fdce Pass virSecurityManagerPtr object further down into LXC setup code
Currently the lxcContainerSetupMounts method uses the
virSecurityManagerPtr instance to obtain the mount options
string and then only passes the string down into methods
it calls. As functionality in LXC grows though, those
methods need to have direct access to the virSecurityManagerPtr
instance. So push the code down a level.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 16:45:09 +00:00
Daniel P. Berrange
3f6470f753 Fix error handling in virSecurityManagerGetMountOptions
The impls of virSecurityManagerGetMountOptions had no way to
return errors, since the code was treating 'NULL' as a success
value. This is somewhat pointless, since the calling code did
not want NULL in the first place and has to translate it into
the empty string "". So change the code so that the impls can
return "" directly, allowing use of NULL for error reporting
once again
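
The convention can be sketched as follows: an empty string means "success, nothing to add" and NULL is reserved for errors. The function name, its parameter and the option format are assumptions for illustration, not the actual security driver code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in. 'label' may be NULL when no label is configured.
 * Returns a heap string the caller must free, or NULL on error. */
char *get_mount_options(const char *label)
{
    size_t len;
    char *opts;

    if (!label) {
        /* No options is not an error: return "" rather than NULL. */
        opts = malloc(1);
        if (opts)
            opts[0] = '\0';
        return opts;
    }

    /* Room for ,context="<label>" plus the terminating NUL. */
    len = strlen(label) + sizeof(",context=\"\"");
    opts = malloc(len);
    if (!opts)
        return NULL;    /* NULL now always means failure */

    snprintf(opts, len, ",context=\"%s\"", label);
    return opts;
}
```

A caller can then append the result to its mount data unconditionally and treat NULL as a reason to abort.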

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-27 16:45:04 +00:00
Eric Blake
1b2ebf9502 storage: fix device detach regression with cgroup ACLs
https://bugzilla.redhat.com/show_bug.cgi?id=876828

Commit 38c4a9cc introduced a regression in hot unplugging of disks
from qemu, where cgroup device ACLs were no longer being revoked
(thankfully not a security hole: cgroup ACLs only prevent open()
of the disk; so reverting the ACL prevents future abuse but doesn't
stop abuse from an fd that was already opened before the ACL change).

The actual regression is due to a latent bug.  The hot unplug code
was computing the set of files needing cgroup ACL revocation based
on the XML passed in by the user, rather than based on the domain's
details on which disk was being deleted.  As long as the revoke
path was always recomputing the backing chain, this didn't really
matter; but now that we want to compute the chain exactly once and
remember that computation, we need to hang on to the backing chain
until after the revoke has happened.

* src/qemu/qemu_hotplug.c (qemuDomainDetachPciDiskDevice):
Transfer backing chain before deletion.
2012-11-27 08:02:26 -07:00
Harsh Prateek Bora
c33c36d28f qemu: Add support for gluster protocol based network storage backend.
Qemu accepts gluster protocol as supported storage backend beside others.

Signed-off-by: Harsh Prateek Bora <harsh@linux.vnet.ibm.com>
2012-11-27 10:19:22 +01:00
Harsh Prateek Bora
a2d2b80fbd Add Gluster protocol as supported network disk backend
This patch introduces the RNG schema and updates necessary data structures
to allow various hypervisors to make use of Gluster protocol as one of the
supported network disk backends. Next patch will add support to make use of
this feature in Qemu since it now supports Gluster protocol as one of the
network based storage backends.

Two new optional attributes for <host> element are introduced - 'transport'
and 'socket'. Valid transport values are tcp, unix or rdma. If none specified,
tcp is assumed. If transport is unix, socket specifies path to unix socket.

This patch allows users to specify disks on gluster backends like this:

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <source protocol='gluster' name='Volume1/image'>
        <host name='example.org' port='6000' transport='tcp'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <source protocol='gluster' name='Volume2/image'>
        <host transport='unix' socket='/path/to/sock'/>
      </source>
      <target dev='vdb' bus='virtio'/>
    </disk>

Signed-off-by: Harsh Prateek Bora <harsh@linux.vnet.ibm.com>
2012-11-27 10:19:22 +01:00
Eric Blake
7e5aa78d0f build: avoid C99 for loop
Although we require various C99 features, we don't yet require a
complete C99 compiler.  On RHEL 5, compilation complained:

qemu/qemu_command.c: In function 'qemuBuildGraphicsCommandLine':
qemu/qemu_command.c:4688: error: 'for' loop initial declaration used outside C99 mode

* src/qemu/qemu_command.c (qemuBuildGraphicsCommandLine): Declare
variable sooner.
* src/qemu/qemu_process.c (qemuProcessInitPasswords): Likewise.
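
For illustration, the portable form of such a loop declares the counter at the top of the block instead of in the 'for' header. The function below is a made-up example, not code from either file.

```c
#include <stddef.h>

/* Hypothetical example function. */
int sum_values(const int *values, size_t n)
{
    size_t i;        /* declared up front, not as 'for (size_t i = 0; ...)' */
    int total = 0;

    for (i = 0; i < n; i++)
        total += values[i];

    return total;
}
```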
2012-11-26 15:28:25 -07:00
Ata E Husain Bohra
067e83ebee Refactor ESX storage driver to implement facade pattern
The patch refactors the current ESX storage driver due to following reasons:

1. Given most of the public APIs exposed by the storage driver in Libvirt
remains same, ESX storage driver should not implement logic specific
for only one supported format (current implementation only supports VMFS).
2. Decoupling interface from specific storage implementation gives us an
extensible design to hook implementation for other supported storage
formats.

This patch refactors the current driver to implement it as a facade pattern i.e.
the driver exposes all the public libvirt APIs, but uses backend drivers to get
the required task done. The backend drivers provide implementation specific to
the type of storage device.

File changes:
------------------
esx_storage_driver.c ----> esx_storage_driver.c (base storage driver)
                     |
                     |---> esx_storage_backend_vmfs.c (VMFS backend)
2012-11-26 22:46:13 +01:00
Peter Krempa
99a388e612 lxc: Don't crash if no security driver is specified in libvirt_lxc
When no security driver is specified libvirt_lxc segfaults as a debug
message tries to access security labels for the container that are not
present.

This problem was introduced in commit 6c3cf57d6c.
2012-11-26 15:48:31 +01:00
Peter Krempa
81efb13b4a lxc: Avoid segfault of libvirt_lxc helper on early cleanup paths
Early jumps to the cleanup label caused a crash of the libvirt_lxc
container helper as the cleanup section called
virLXCControllerDeleteInterfaces(ctrl) without checking the ctrl argument
for NULL. The argument was de-referenced soon after.

$ /usr/libexec/libvirt_lxc
/usr/libexec/libvirt_lxc: missing --name argument for configuration
Segmentation fault
2012-11-26 15:48:31 +01:00
Ata E Husain Bohra
2b121dbc10 Add private data pointer to virStoragePool and virStorageVol
This will simplify the refactoring of the ESX storage driver to support
a VMFS and an iSCSI backend.

One of the tasks the storage driver needs to do is to decide which backend
driver needs to be invoked for a given request. This approach extends
virStoragePool and virStorageVol to store extra parameters:

1. privateData: stores pointer to respective backend storage driver.
2. privateDataFreeFunc: stores cleanup function pointer.

virGetStoragePool and virGetStorageVol are modified to accept these extra
parameters as user params. virStoragePoolDispose and virStorageVolDispose
check for a cleanup operation if available.

The private data pointer allows the ESX storage driver to store a pointer
to the used backend with each storage pool and volume. This avoids the need
to detect the correct backend in each storage driver function call.
2012-11-26 14:39:39 +01:00
Peter Krempa
bb2704e7b5 cpu: Add Intel Haswell cpu model
The new model supports following features in addition to those supported
by SandyBridge:

fma, pcid, movbe, fsgsbase, bmi1, hle, avx2, smep, bmi2, erms, invpcid,
rtm
2012-11-26 14:19:57 +01:00
Ján Tomko
70f0bbe8e0 storage: fix logical volume cloning
Commit 258e06c removed setting of the volume type to
VIR_STORAGE_VOL_BLOCK, which leads to failures in
storageVolumeCreateXMLFrom.

The type (and target.format) of the volume was set to zero. In
virStorageBackendGetBuildVolFromFunction, this gets interpreted as
VIR_STORAGE_FILE_NONE and the qemu-img tool is called with unknown
"none" format.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=879780
2012-11-26 14:01:29 +01:00
Ján Tomko
5efacd7813 build: fix build --without-network
bridge_driver.h: silence gcc warnings:
statement with no effect [-Wunused-value]
unused variable 'net' [-Wunused-variable]

virdrivermoduletest.c: don't require network driver module
if it hasn't been built.
2012-11-26 14:01:23 +01:00
Osier Yang
a703566201 util: Use virReportSystemError for system error in pci.c 2012-11-26 09:59:04 +08:00
Osier Yang
3d77b98ca6 util: Fix the indentation 2012-11-25 23:22:43 +08:00
Daniel P. Berrange
37db3f5dfe Fix exiting of libvirt_lxc program on container quit
The virLXCControllerClientCloseHook method was mistakenly
assuming that the private data associated with the network
client was the virLXCControllerPtr. In fact it was just a
dummy int, so we were dereferencing a bogus struct. The
frequent result of this was that we would never quit, because
we tried to arm a non-existent timer.

Fix the code by removing the dummy private data and just
using the virLXCControllerPtr instance as private data

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-23 10:11:56 +00:00
Daniel P. Berrange
afbd96678e Skip deleted timers when calculating next timeout
It is possible for there to be deleted timers when we
calculate the next timeout, and they must be skipped.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-23 10:11:55 +00:00
Daniel P. Berrange
39064f0ff9 Warn if requesting update to non-existent timer/handle watch
The event code is a no-op if requested to update a non-existent
timer/handle watch. This makes it hard to detect bugs in
callers that have passed bogus data. Add a VIR_WARN output in
such cases, since the API does not allow for return errors.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-23 10:11:42 +00:00
Daniel P. Berrange
81d6c4defe Fix virDiskNameToIndex to actually ignore partition numbers
The docs for virDiskNameToIndex claim it ignores partition
numbers. In actual fact though, a code ordering bug means
that a partition number will cause the code to accidentally
multiply the result by 26.
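
The ordering issue can be sketched as follows. This is an illustrative re-implementation under assumptions (the function name, the prefix list and the error convention are hypothetical), not the libvirt source. The point is that the loop stops at the first non-letter before the index is touched, so a trailing partition number never triggers the multiplication:

```c
#include <stddef.h>
#include <string.h>

/* Maps "sda" -> 0, "sdb" -> 1, ..., "sdaa" -> 26; returns -1 on bad input.
 * Trailing digits (a partition number) are accepted and ignored. */
int
sketch_disk_name_to_index(const char *name)
{
    static const char *const prefixes[] = { "xvd", "sd", "hd", "vd" };
    const char *first = NULL;
    const char *p;
    int idx = 0;
    size_t i;

    for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
        size_t len = strlen(prefixes[i]);
        if (strncmp(name, prefixes[i], len) == 0) {
            first = name + len;
            break;
        }
    }
    if (!first)
        return -1;

    /* Check the character class first, then scale and add. */
    for (p = first; *p >= 'a' && *p <= 'z'; p++)
        idx = (idx + (p == first ? 0 : 1)) * 26 + (*p - 'a');

    if (p == first)
        return -1; /* no drive letters after the prefix */

    /* Whatever remains must be a pure partition number. */
    for (; *p; p++) {
        if (*p < '0' || *p > '9')
            return -1;
    }
    return idx;
}
```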

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-23 10:10:55 +00:00
Peter Krempa
58a54dc373 qemu: Stop recursive detection of image chains when an image is missing
Commit e0c469e58b, which fixes the detection
of the image chain, wasn't complete. Iteration through the backing image
chain has to stop at the last existing image if some of the images are
missing; otherwise the cached backing chain contains entries with
paths set to NULL, resulting in:

error: Unable to allow access for disk path (null): Bad address

Fortunately stat() is kind enough not to crash when it's presented with
a NULL argument. At least on Linux.
2012-11-22 16:04:17 +01:00
Martin Kletzander
03cd6e4ae8 conf: Report sensible error for invalid disk name
The error "... but the cause is unknown" appeared for XMLs similar to
this:

 <disk type='file' device='cdrom'>
   <driver name='qemu' type='raw'/>
   <source file='/dev/zero'/>
   <target dev='sr0'/>
 </disk>

Notice unsupported disk type (for the driver), but also no address
specified. The first part is not a problem and we should not abort
immediately because of that, but the combination with the address
unknown was causing an unspecified error.

While fixing this, I added an error to one place where this return
value was not managed properly.
2012-11-22 15:23:40 +01:00
Natanael Copa
89ad205f32 build: trivial fix error: implicit declaration of function 'malloc'
Fixes this error when building with -Werror on Alpine Linux:

util/processinfo.c: In function 'virProcessInfoSetAffinity':
util/processinfo.c:52:5: error: implicit declaration of function 'malloc' [-Werror=implicit-function-declaration]

Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
2012-11-22 06:49:06 -07:00
Daniel P. Berrange
a615833664 Log an audit message with the LXC init pid
Currently the LXC driver logs audit messages when a container
is started or stopped. These audit messages, however, contain
the PID of the libvirt_lxc supervisor process. To enable
sysadmins to correlate with audit messages generated by
processes /inside/ the container, we need to include the
container init process PID.

We can't do this in the main 'start' audit message, since
the init PID is not available at that point. Instead we output
a completely new audit record, that lists both PIDs.

type=VIRT_CONTROL msg=audit(1353433750.071:363): pid=20180 uid=0 auid=501 ses=3 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 msg='virt=lxc op=init vm="busy" uuid=dda7b947-0846-1759-2873-0f375df7d7eb vm-pid=20371 init-pid=20372 exe="/home/berrange/src/virt/libvirt/daemon/.libs/lt-libvirtd" hostname=? addr=? terminal=pts/6 res=success'

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-22 10:46:40 +00:00
Daniel P. Berrange
f33e43c235 Use virNetServerRun instead of custom main loop
The LXC controller code currently directly invokes the
libvirt main loop code. The problem is that this misses
the cleanup of virNetServerClient connections that
virNetServerRun takes care of.

The result is that when libvirtd is stopped, the
libvirt_lxc controller process gets stuck in an I/O loop.
When libvirtd is then started again, it fails to connect
to the controller and thus kills off the entire domain.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-22 08:51:03 +00:00
Osier Yang
104650db3e storage: Improve virStorageBackendFileSystemStop
It's actually not used for DIR pools, so remove the check.
2012-11-22 11:23:11 +08:00
Osier Yang
f4ac06569a storage: Fix bug of fs pool destroying
Regression introduced by commit 258e06c85b: "ret" could be set to 1
or 0 by virStorageBackendFileSystemIsMounted before goto cleanup.
This could mislead the callers (up to the public API
virStoragePoolDestroy) into returning success even if the underlying
umount command fails.
2012-11-22 11:22:12 +08:00
Scott Sullivan
f0e72b2f5c qemu: fix RBD attach regression
I have been testing libvirt v1.0.0 for deployment within my
organization, and in the process discovered what appears to be a bug
that breaks virsh attach-device, when attaching an RBD volume to an
instance. First, here is the error presented, with v1.0.0 (this worked
in v0.10.2):

[root@host ~]# virsh attach-device W5APQ8  G84VV1.xml
error: Failed to attach device from G84VV1.xml
error: cannot open file 'dc3-1-test/G84VV1': No such file or directory

Using git bisect, I narrowed the problem down to this as the first
commit to break this setup:

4d34c92947 is the first bad commit
2012-11-21 12:33:23 -07:00
Ján Tomko
cc244e2441 conf: add support for booting from redirected USB devices
Commit a4c19459aa only added the
QEMU capability flag, command line option and added the boot element
for redirdev's in the XML schema.

This patch adds support for parsing and writing the XML with redirdevs
with the boot flag. It also ignores unknown XML elements in redirdev
instead of failing with:
"error: An error occurred, but the cause is unknown"

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=805414
2012-11-21 17:54:35 +01:00
Alon Levy
283aafdb29 qemu/qemu_command.c: fix indent of label 2012-11-20 19:57:39 +01:00
Alon Levy
37b415200d qemu: graphics support for simultaneous one of each sdl, vnc, spice 2012-11-20 19:57:39 +01:00
Alon Levy
23e8b5d8e7 qemu: refactor graphics code to not hardcode a single display
The check for a single display remains so no new functionality is added.
2012-11-20 19:57:39 +01:00
Eric Blake
0b5617a607 snapshot: make cloning of domain definition easier
Upcoming patches for revert-and-clone branching of snapshots need
to be able to copy a domain definition; make this step reusable.

* src/conf/domain_conf.h (virDomainDefCopy): New prototype.
* src/conf/domain_conf.c (virDomainObjCopyPersistentDef): Split...
(virDomainDefCopy): ...into new function.
(virDomainObjSetDefTransient): Use it.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Use it.
2012-11-20 08:41:45 -07:00
Eric Blake
62711817db snapshot: implement new filter sets
Relatively straight-forward.  And since qemu was already using
VIR_DOMAIN_SNAPSHOT_FILTERS_ALL, with 6 different APIs all calling
into this common code, I've instantly added all 5 flags to 6 APIs.

* src/conf/snapshot_conf.h (VIR_DOMAIN_SNAPSHOT_FILTERS_ALL):
Enable new filters.
* src/conf/snapshot_conf.c (virDomainSnapshotObjListGetNames):
Prep the new flags.
(virDomainSnapshotObjListCopyNames): Actually do the filtering.
2012-11-19 14:16:51 -07:00
Eric Blake
e9028f4b73 snapshot: add two more filter sets to API
As we enable more modes of snapshot creation, it becomes more important
to be able to quickly filter based on snapshot properties.  This patch
introduces new filter flags; subsequent patches will introduce virsh
back-compat filtering, as well as actual libvirt filtering.

* include/libvirt/libvirt.h.in (virDomainSnapshotListFlags): Add
five new flags in two new groups.
* src/libvirt.c (virDomainSnapshotNum, virDomainSnapshotListNames)
(virDomainListAllSnapshots, virDomainSnapshotNumChildren)
(virDomainSnapshotListChildrenNames)
(virDomainSnapshotListAllChildren): Document them.
* src/conf/snapshot_conf.h (VIR_DOMAIN_SNAPSHOT_FILTERS_STATUS)
(VIR_DOMAIN_SNAPSHOT_FILTERS_LOCATION): Add new convenience filter
collection macros.
* tools/virsh-snapshot.c (cmdSnapshotList): Add 5 new flags.
* tools/virsh.pod (snapshot-list): Document them.
2012-11-19 08:43:00 -07:00
Laine Stump
89204fca7f qemu: allow larger discrepancy between memory & currentMemory in domain xml
This resolves:

  https://bugzilla.redhat.com/show_bug.cgi?id=873134

The reported problem is that an attempt to restore a saved domain that
was configured with <currentMemory> and <memory> set to some (same for
both) number that's not a multiple of 4096KiB results in an error like
this:

  error: Failed to start domain libvirt_test_api
  error: XML error: current memory '4001792k' exceeds maximum '4000768k'

(in this case, currentMemory was set to 4000000KiB).

The reason for this failure is:

1) a saved image contains the "live xml" of the domain at the time of
the save.

2) the live xml of a running domain gets its currentMemory
(a.k.a. cur_balloon) directly from the qemu monitor rather than from
the configuration of the domain.

3) the value reported by qemu is (sometimes) not exactly what was
originally given to qemu when the domain was started, but is rounded
up to [some indeterminate granularity] - in some versions of qemu that
granularity is apparently 1MiB, and in others it is 4MiB.

4) When the XML is parsed to setup the state of the restored domain,
the XML parser for <currentMemory> compares it to <memory> (which is
the maximum allowed memory size for the domain) and if <currentMemory>
is greater than the next 1024KiB boundary above <memory>, it spits out
an error and fails.

For example (from the BZ) if you start qemu on RHEL6 with both
<currentMemory> and <memory> of 4000000 (this number is in KiB),
libvirt's dominfo or dumpxml will report "4001792" back (rounded up to
next 4MiB) for 10-20 seconds after the start, then revert to reporting
"4000000". On Fedora 16 (which uses qemu-1.0), it will instead report
"4000768" (rounded up to next 1MiB). On Fedora 17 (qemu-1.2), it seems
to always report "4000000". ("4000000" is of course okay, and
"4000768" is also okay since that's the next 1024KiB boundary above
"4000000" and the parser was already allowing for that. But "4001792"
is *not* okay and produces the error message.)

This patch solves the problem by changing the allowed "fudge factor"
when parsing from 1024KiB to 4096KiB to match the maximum up-rounding
that could be done in qemu.
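
The arithmetic can be checked with a small sketch. The helper names are hypothetical and the comparison follows the description in this message rather than the actual parser code:

```c
#include <stdbool.h>

/* Round v up to the next multiple of boundary (both in KiB). */
unsigned long long
sketch_round_up(unsigned long long v, unsigned long long boundary)
{
    return (v + boundary - 1) / boundary * boundary;
}

/* True if cur_kib does not exceed max_kib rounded up to fudge_kib. */
bool
sketch_cur_memory_ok(unsigned long long cur_kib,
                     unsigned long long max_kib,
                     unsigned long long fudge_kib)
{
    return cur_kib <= sketch_round_up(max_kib, fudge_kib);
}
```

With <memory> of 4000000, the 1024KiB boundary is 4000768 and the 4096KiB boundary is 4001792, so the value reported on RHEL6 passes only with the larger fudge factor.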

(I had earlier thought to fix this by up-rounding <memory> in the
dumpxml that's put into the saved image, but that wouldn't have fixed
the case where the save image was produced by an "unfixed"
libvirtd.)
2012-11-16 16:56:41 -05:00
Eric Blake
9504ae5b67 nodeinfo: port nodecpumap to RHEL5
Prior to this patch, 'virsh nodecpumap' on older kernels reported:
error: Unable to get cpu map
error: out of memory

* src/nodeinfo.c (linuxParseCPUmax): Don't overwrite error.
(nodeGetCPUBitmap): Provide backup implementation.
2012-11-16 10:12:36 -07:00
Eric Blake
47976b484c nodeinfo: support kernels that lack socket information
On RHEL 5, I was getting a segfault trying to start libvirtd,
because we were failing virNodeParseSocket but not checking
for errors, and then calling CPU_SET(-1, &sock_map) as a result.
But if you don't have a topology/physical_package_id file,
then you can just assume that the cpu belongs to socket 0.
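
A rough sketch of the "default when the file is missing, error otherwise" behaviour, assuming a hypothetical helper sketch_read_cpu_value() that uses -1 as its error value (the real virNodeGetCpuValue signature is not reproduced here):

```c
#include <errno.h>
#include <stdio.h>

/* Reads one integer from path.
 * Missing file -> default_value; any other failure -> -1. */
int
sketch_read_cpu_value(const char *path, int default_value)
{
    FILE *fp = fopen(path, "r");
    int value;

    if (!fp)
        return errno == ENOENT ? default_value : -1;

    if (fscanf(fp, "%d", &value) != 1)
        value = -1; /* unreadable content is a fatal error, not a default */

    fclose(fp);
    return value;
}
```

A caller would then check for -1 before passing the result on to something like CPU_SET().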

* src/nodeinfo.c (virNodeGetCpuValue): Change bool into
default_value.
(virNodeParseSocket): Allow for default value when file is missing,
different from fatal error on reading file.
(virNodeParseNode): Update call sites to fail on error.
2012-11-16 10:12:36 -07:00
Eric Blake
516c12237b snapshot: require user to supply external memory file name
For disk snapshots, the user could request an external snapshot
but not supply a filename; later on, we would check this condition
and generate a suitable name if possible, or gracefully error out
when not possible (such as when the original file was a block
device).  But unless we come up with a suitable way to generate
external memory file names, we have no later code point that was
checking for NULL, so we should forbid this up front.

* src/conf/snapshot_conf.c (virDomainSnapshotDefParseString):
Avoid NULL deref, since we don't generate names yet.
2012-11-16 08:22:13 -07:00
liguang
63158d586b qemu: Beautify code indent in migration codes
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
2012-11-16 16:42:09 +08:00
Michal Privoznik
96a02703da sanlock: Retry after EINPROGRESS
It may take some time for sanlock to add a lockspace. If the user
restarts the libvirtd service in the meantime, the fresh daemon can fail
to add the same lockspace with EINPROGRESS. Recent sanlock has the
sanlock_inq_lockspace() function, which should block until the lockspace
changes state. If we are building against an older sanlock we should
retry a few times before claiming an error. This issue can be easily
reproduced:

for i in {1..1000} ; do echo $i; service libvirtd restart; sleep 2; done
20
Stopping libvirtd daemon:                                  [FAILED]
Starting libvirtd daemon:                                  [  OK  ]
21
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]
22
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]

 error : virLockManagerSanlockSetupLockspace:334 : Unable to add
 lockspace /var/lib/libvirt/sanlock/__LIBVIRT__DISKS__: Operation now in
 progress
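
The retry fallback for older sanlock can be sketched roughly as below. The callback type, the helper names and the negative-errno return convention are assumptions made for illustration; the sanlock calls themselves are deliberately not used here:

```c
#include <errno.h>
#include <time.h>

/* Hypothetical operation: returns 0 on success or a negative errno. */
typedef int (*sketch_add_fn)(void *opaque);

/* Calls add() again while it reports -EINPROGRESS, at most `retries`
 * extra times, sleeping delay_ms between attempts. */
int
sketch_add_with_retry(sketch_add_fn add, void *opaque,
                      int retries, unsigned int delay_ms)
{
    struct timespec ts;
    int rv;

    ts.tv_sec = delay_ms / 1000;
    ts.tv_nsec = (long)(delay_ms % 1000) * 1000000L;

    for (;;) {
        rv = add(opaque);
        if (rv != -EINPROGRESS || retries-- <= 0)
            return rv;
        nanosleep(&ts, NULL);
    }
}

/* Stand-in for the real call: reports "in progress" *remaining times. */
int
sketch_fake_add(void *opaque)
{
    int *remaining = opaque;

    if (*remaining > 0) {
        (*remaining)--;
        return -EINPROGRESS;
    }
    return 0;
}
```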
2012-11-16 08:00:11 +01:00
Viktor Mihajlovski
a2b3d7cff8 qemu, lxc: Change host CPU number detection logic.
The drivers for QEMU and LXC use virNodeGetInfo only to determine
the number of host CPUs. On Linux hosts nodeGetCPUCount has less
overhead.
2012-11-15 08:48:19 -07:00
Viktor Mihajlovski
0c996c10e4 nodeinfo: enable nodeGetCPUCount for older kernels
Since /sys/devices/system/cpu/present is not available on
older kernels such as those on RHEL 5.x, nodeGetCPUCount will
fail there. The fallback implemented is to scan for
/sys/devices/system/cpu/cpuNN entries.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-11-14 20:43:54 -07:00
Miloslav Trmač
39c814ff46 Use helper functions to format the journal iov array
This simplifies the top-level code, at the cost of using a little more
stack space.  The primary benefit is being able to send more fields
without knowing in advance how many of them, and of which types, these
fields will be, and without having to individually add buffer variables.

The code imposes an upper limit on the total number of iovs/buffers
used, and fields that wouldn't fit are silently dropped.  This is not
significant in this patch, but will affect the following one.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2012-11-14 20:20:02 -07:00
Miloslav Trmač
37f7a1faf1 Add metadata to virLogOutputFunc
... and update all users.  No change in functionality, the parameter
will be used in the next patch.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2012-11-14 19:14:07 -07:00
Miloslav Trmač
c780e9b882 Add a metadata parameter to virLog{, V}Message
... and update all users.  No change in functionality, the parameter
will be used later.

The metadata representation is as minimal as possible, but requires
the caller to allocate an array on stack explicitly.

The alternative of using varargs in the virLogMessage() callers:
* Would not allow the caller to optionally omit some metadata elements,
  except by having two calls to virLogMessage.
* Would not be as type-safe (e.g. using int vs. size_t), and the compiler
  wouldn't be able to do type checking
* Depending on parameter order:
  a) virLogMessage(..., message format, message params...,
                   metadata..., NULL)
     can not be portably implemented (parse_printf_format() is a glibc
     function)
  b) virLogMessage(..., metadata..., NULL,
                   message format, message params...)
     would prevent usage of ATTRIBUTE_FMT_PRINTF and the associated
     compiler checking.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2012-11-14 19:08:31 -07:00
Ján Tomko
a4c19459aa qemu: add bootindex for usb-host and usb-redir devices
Allow bootindex to be specified for redirected USB devices and host USB
devices.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=805414
2012-11-14 19:03:18 -07:00
Laine Stump
bc4b433098 util: fix index when building lock owners array
The "restart" function for locks allocates a new array and
pre-sets its length, then reads the owner pids from a JSON
document in a loop. Rather than adding each owner at a different
index, though, it repeatedly overwrites the last element of the array
with all the owners.
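
A minimal sketch of the corrected indexing, assuming the owner pids have already been parsed into a plain array (sketch_restore_owners is a hypothetical name and the JSON handling is left out):

```c
#include <stdlib.h>
#include <sys/types.h>

/* Copies n owner pids into a freshly allocated array; caller frees it. */
pid_t *
sketch_restore_owners(const long long *src, size_t n)
{
    pid_t *owners = calloc(n ? n : 1, sizeof(*owners));
    size_t i;

    if (!owners)
        return NULL;

    for (i = 0; i < n; i++)
        owners[i] = (pid_t)src[i]; /* index i, not the last slot n - 1 */

    return owners;
}
```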
2012-11-14 12:43:49 -05:00
Daniel P. Berrange
3782814d4a Fix uninitialized variable in virLXCControllerSetupDevPTS
The lack of initialization of 'opts' caused a SEGV in the
cleanup: path if the root->src directory did not exist
2012-11-14 15:39:48 +00:00
Michal Privoznik
9f87247235 qemu: Don't force port=0 for SPICE
If domain uses only TLS port we don't want to add
'port=0' explicitly to command line.
2012-11-14 10:07:27 +01:00
Peter Krempa
30f1bccf33 snapshot: qemu: Fix detection of external snapshots when deleting
This patch adds a helper to determine if snapshots are external and uses
the helper to fix detection of those in snapshot deletion code.

Snapshots are external if they have an external memory image or if the
disk locations are external. As mixed snapshots are forbidden for now
we need to check just one disk to know.
2012-11-13 20:36:26 +01:00
Peter Krempa
9576afd110 nodeinfo: Add check and workaround to guarantee valid cpu topologies
Lately there were a few reports of the output of the virsh nodeinfo
command being inaccurate. This patch tries to avoid that by checking if
the topology actually makes sense. If it doesn't we then report a
synthetic topology that indicates to the user that the host capabilities
should be checked for the actual topology.
2012-11-13 00:35:29 +01:00
Michal Privoznik
fd723164c7 AbortJob: Fix documentation
This API was never synchronous and probably doesn't even need to be.
2012-11-12 10:39:39 +01:00
Michal Privoznik
ab5e7d4977 qemu: Allow migration to be cancelled at prepare phase
Currently, if user calls virDomainAbortJob we just issue
'migrate_cancel' and hope for the best. However, if user calls
the API in wrong phase when migration hasn't been started yet
(perform phase) the cancel request is just ignored. With this
patch, the request is remembered and as soon as perform phase
starts, migration is cancelled.
2012-11-12 10:39:39 +01:00
Viktor Mihajlovski
b1c88c1476 capabilities: defaultConsoleTargetType can depend on architecture
For S390, the default console target type cannot be of type 'serial'.
It is necessary to at least interpret the 'arch' attribute
value of the os/type element to produce the correct default type.

Therefore we need to extend the signature of defaultConsoleTargetType
to account for architecture. As a consequence all the drivers
supporting this capability function must be updated.

Despite the amount of changed files, the only change in behavior is
that for S390 the default console target type will be 'virtio'.

N.B.: A more future-proof approach could be to use hypervisor
specific capabilities to determine the best possible console type.
For instance one could add an opaque private data pointer to the
virCaps structure (in case of QEMU to hold capsCache) which could
then be passed to the defaultConsoleTargetType callback to determine
the console target type.
Seems to be however a bit overengineered for the use case...

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-11-09 09:20:59 -07:00
Peter Krempa
02cf57c0d0 qemu: Fix domain ID numbering race condition
When the libvirt daemon is restarted it tries to reconnect to running
qemu domains. Since commit d38897a5d4 the
re-connection code runs in separate threads. In the original
implementation the maximum of domain IDs (that is used as an
initializer for numbering guests created next) was determined while
libvirt was reconnecting to the guests.

With the threaded implementation this opens a possibility for race
conditions with the thread that is autostarting guests. When there's a
guest running with id 1 and the daemon is restarted, the autostart code
is reached first and spawns the first guest that should be autostarted
as id 1. This results in the following unwanted situation:

 # virsh list
   Id    Name                           State
  ----------------------------------------------------
   1     guest1                         running
   1     guest2                         running

This patch extracts the detection code before the re-connection threads
are started so that the maximum id of the guests being reconnected to is
known.

The only semantic change created by this is that if the guest with the
greatest ID quits before we are able to reconnect, its ID is used anyway
as the greatest one, whereas without this patch the greatest ID of a
process we could successfully reconnect to would be used.
2012-11-09 00:12:38 +01:00
Philipp Hahn
e0c469e58b storage: fix broken backing chain
82507838 refactored the code to keep both the raw and canonicalized form
of the backingStore, which breaks badly when the storage pool contains a
storage volume, which is missing its backing store file:
 # ./daemon/libvirtd -l
 2012-11-07 12:43:33.279+0000: 22175: info : libvirt version: 1.0.0
 2012-11-07 12:43:33.279+0000: 22175: error : absolutePathFromBaseFile:542 : Can't canonicalize path '/var/lib/libvirt/images/base.qcow2': No such file or directory
 2012-11-07 12:43:33.280+0000: 22175: error : storageDriverAutostart:115 : Failed to autostart storage pool 'default': Can't canonicalize path '/var/lib/libvirt/images/base.qcow2': No such file or directory

This is because virStorageFileGetMetadataFromBuf() aborts with -1 if the
filename of the backingStore can not be canonicalized:
 #0  absolutePathFromBaseFile () at util/storage_file.c:541
 #1  virStorageFileGetMetadataFromBuf () at util/storage_file.c:728
 #2  virStorageFileGetMetadataFromFD () at util/storage_file.c:932
 #3  virStorageBackendProbeTarget () at storage/storage_backend_fs.c:94
 #4  virStorageBackendFileSystemRefresh () at storage/storage_backend_fs.c:849
 #5  storagePoolStart () at storage/storage_driver.c:700
 #6  virStoragePoolCreate () at libvirt.c:12471
 ...

Treat files which miss their backing file as standalone files.

Signed-off-by: Philipp Hahn <hahn@univention.de>
2012-11-08 16:03:36 -07:00
Peter Krempa
e124f49890 qemu: Fix function header formatting of 2 functions
Headers of qemuDomainSnapshotLoad and qemuDomainNetsRestart were
improperly formatted.
2012-11-08 13:45:45 +01:00
Peter Krempa
9b5a514b31 snapshot: qemu: Add support for external inactive snapshots
This patch adds support for external disk snapshots of inactive domains.
The snapshot is created by calling qemu-img:

 qemu-img create -f format_of_snapshot -o
 backing_file=/path/to/src,backing_fmt=format_of_backing_image
 /path/to/snapshot

in case the backing image format is known or probing is allowed and
otherwise:

 qemu-img create -f format_of_snapshot -o  backing_file=/path/to/src
 /path/to/snapshot

on each of the disks selected for snapshotting. This patch also modifies
the snapshot preparing function to support creating external snapshots
and to sanitize arguments. For now the user isn't able to mix external
and internal snapshots but this restriction might be lifted in the
future.
2012-11-08 11:27:34 +01:00
Michal Privoznik
a08fc66d90 qemu: Emit event if 'cont' fails
Some operations and APIs need the domain to be paused before the
operation can be performed, e.g. (managed-) save of a domain. The
processors should be restored at the end. However, if 'cont' fails for
some reason, we log a message but this is not sufficient as an event
should be emitted as well. The mgmt application can then decide what to do.
2012-11-07 12:06:09 +01:00
Peter Krempa
fb58f8e2a4 qemu: Don't corrupt pointer in qemuDomainSaveMemory()
The code that was split out into the qemuDomainSaveMemory expands the
pointer containing the XML description of the domain that it gets from
higher layers. If the pointer changes the old one is invalid and the
upper layer function tries to free it causing an abort.

This patch changes the expansion of the original string to a new
allocation and copy of the contents.
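
The "new allocation and copy" approach can be sketched like this. The function name and the zero padding to a caller-chosen length are assumptions for illustration. The important property is that the caller's pointer is never passed to a reallocating call, so it stays valid for the caller to free:

```c
#include <stdlib.h>
#include <string.h>

/* Returns a new zero-padded buffer of at least padded_len bytes holding
 * a copy of xml. The input pointer is left untouched; caller frees both. */
char *
sketch_padded_copy(const char *xml, size_t padded_len)
{
    size_t len = strlen(xml) + 1; /* include the terminating NUL */
    char *copy;

    if (padded_len < len)
        padded_len = len;

    copy = calloc(padded_len, 1);
    if (!copy)
        return NULL;

    memcpy(copy, xml, len);
    return copy;
}
```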
2012-11-06 14:45:27 +01:00
Martin Kletzander
9c294e6f9a esx: Yet another connection fix for 5.1
After the connection to ESX 5.1 being broken since g1e7cd39, the fix
in bab7752c helped a bit, but still missed a spot, so the connection
is now successful, but some APIs (for example defineXML) don't work.
Two cases missing are added in this patch to avoid that.
2012-11-06 11:09:00 +01:00
Michal Privoznik
0f720ab35a qemu: Add controllers in specified order
qemu is sensitive to the order of arguments passed. Hence, if a
device requires a controller, the controller cmd string must
precede device cmd string. The same applies to controllers, when
for instance ccid controller requires usb controller. So
controllers create partial ordering in which they should be added
to qemu cmd line.
2012-11-06 10:11:34 +01:00
Michal Privoznik
77b93dbc3e qemu: Wrap controllers code into dummy loop
which just re-indents the code and prepares it for the next patch.
2012-11-06 10:11:34 +01:00
Michal Privoznik
46325e5131 iohelper: Don't report errors on special FDs
Some FDs may not implement fdatasync() functionality,
e.g.  pipes. In that case EINVAL or EROFS is returned.
We don't want to fail then nor report any error.
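
A minimal sketch of that error filtering, assuming a hypothetical wrapper named sketch_sync_fd(); fdatasync() is the standard POSIX call:

```c
#include <errno.h>
#include <unistd.h>

/* Flushes fd, treating EINVAL and EROFS as "nothing to sync".
 * Returns 0 on success or on an ignorable error, -1 otherwise. */
int
sketch_sync_fd(int fd)
{
    if (fdatasync(fd) < 0 && errno != EINVAL && errno != EROFS)
        return -1;
    return 0;
}
```

Other failures, such as EBADF for an invalid descriptor, are still reported to the caller.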

Reported-by: Christophe Fergeau <cfergeau@redhat.com>
2012-11-05 16:55:42 +01:00
Peter Krempa
0dac29d89f snapshot: qemu: Remove restrictions preventing external checkpoints
Some of the pre-snapshot checks have restrictions wired in regarding
configuration options that influence taking of external checkpoints.

This patch removes restrictions that would inhibit taking of such a
snapshot.
2012-11-04 20:17:57 +01:00
Peter Krempa
f569b87f51 snapshot: qemu: Add support for external checkpoints
This patch adds support to take external system checkpoints.

The functionality is layered on top of the previous disk-only snapshot
code. When the checkpoint is requested the domain memory is saved to the
memory image file using migration to file. (The user may specify to
take the memory image while the guest is live with the
VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag.)

The memory save image shares format with the image created by
virDomainSave() API.
2012-11-04 16:53:32 +01:00
Peter Krempa
b5fd404471 snapshot: qemu: Rename qemuDomainSnapshotCreateActive
Before now, libvirt supported only internal snapshots for active guests.
This patch renames this function to qemuDomainSnapshotCreateActiveInternal
to prepare the grounds for external active snapshots.
2012-11-03 15:06:09 +01:00
Peter Krempa
2a59a3d597 snapshot: qemu: Add async job type for snapshots
The new external system checkpoints will require an async job while the
snapshot is taken. This patch adds QEMU_ASYNC_JOB_SNAPSHOT to track this
job type.
2012-11-03 14:57:43 +01:00
Peter Krempa
5f75bd4bbe snapshot: Add flag to enable creating checkpoints in live state
The default behavior while creating external checkpoints is to pause the
guest while the memory state is captured. We want to let users sacrifice
space savings and create the memory save image while the guest is live,
to minimize downtime.

This patch adds a flag that causes the guest not to be paused before
taking the snapshot.
 *include/libvirt/libvirt.h.in:
    - add new paused reason: VIR_DOMAIN_PAUSED_SNAPSHOT
    - add new flag for taking snapshot: VIR_DOMAIN_SNAPSHOT_CREATE_LIVE
 *tools/virsh-domain-monitor.c:
    - add string representation for VIR_DOMAIN_PAUSED_SNAPSHOT
 *tools/virsh-snapshot.c:
    - add support for VIR_DOMAIN_SNAPSHOT_CREATE_LIVE
 *tools/virsh.pod:
    - add docs for --live option added to use
    VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag
2012-11-03 14:43:01 +01:00
Peter Krempa
2771f8b74c qemu: Split out domain memory saving code to allow reuse
The code that saves domain memory by migration to file can be reused
while doing external checkpoints of a machine. This patch extracts the
common code and places it in a separate function.
2012-11-03 11:49:41 +01:00
Peter Krempa
ec69ca14f9 qemu: Clean up snapshot retrieval to use the new helper
Two other places were left with the old code to look up snapshots.
Change them to use the snapshot lookup helper.
2012-11-03 11:26:39 +01:00
Peter Krempa
d38b934c49 cpu: Add AMD Opteron G5 cpu model 2012-11-02 20:57:17 +01:00
Peter Krempa
bafffe7a10 cpu: Add newly added cpu flags
This patch adds a few new processor feature flags. Namely:
 f16c rdrand lwp tbm topoext perfctr_core perfctr_nb fsgsbase bmi1 hle
 avx2 bmi2 erms invpcid rtm rdseed adx tce
2012-11-02 20:52:40 +01:00
Peter Krempa
d0fc6dc831 qemu: Fix possible race when pausing guest
When pausing the guest while migration is running (to speed up
convergence) the virDomainSuspend API checks if the migration job is
active before entering the job. This could cause a possible race if the
virDomainSuspend is called while the job is active but ends before the
Suspend API enters the job (this would require that the migration is
aborted). This would cause an incorrect event to be emitted.
2012-11-02 20:18:46 +01:00
Eric Blake
de76cae971 snapshot: merge pre-snapshot checks
Both system checkpoint snapshots and disk snapshots were iterating
over all disks, doing a final sanity check before doing any work.
But since future patches will allow offline snapshots to be either
external or internal, it makes sense to share the pass over all
disks, and then relax restrictions in that pass as new modes are
implemented.  Future patches can then handle external disks when
the domain is offline, then handle offline --disk-snapshot, and
finally, combine with migration to file to gain a complete external
system checkpoint snapshot of an active domain without using 'savevm'.

* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotIsAllowed): Merge...
(qemuDomainSnapshotPrepare): ...into one function.
(qemuDomainSnapshotCreateXML): Update caller.
2012-11-02 10:19:03 -06:00
Eric Blake
e260e401a5 snapshot: populate new XML info for qemu snapshots
Now that the XML supports listing internal snapshots, it is worth
always populating the <memory> and <disks> element to match.

* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Always
parse disk info and set memory info.
2012-11-02 10:11:50 -06:00
Eric Blake
f9670bf8a4 snapshot: improve disk align checking
There were no previous callers with require_match set to true.
I originally implemented this bool with the intent of supporting
ESX snapshot semantics, where the choice of internal vs. external
vs. non-checkpointable must be made at domain start, but as ESX
has not been wired up to use it yet, we might as well fix it to
work with our next qemu patch for now, and worry about any further
improvements (changing the bool to a flags argument) if the ESX
driver decides to use this function in the future.

* src/conf/snapshot_conf.c (virDomainSnapshotAlignDisks): Alter
logic when require_match is true to deal with new XML.
2012-11-02 10:02:57 -06:00
Eric Blake
4201a7ea1c snapshot: new XML for external system checkpoint
Each <domainsnapshot> can now contain an optional <memory>
element that describes how the VM state was handled, similar
to disk snapshots.  The new element will always appear in
output; for back-compat, an input that lacks the element will
assume 'no' or 'internal' according to the domain state.

Along with this change, it is now possible to pass <disks> in
the XML for an offline snapshot; this also needs to be wired up
in a future patch, to make it possible to choose internal vs.
external on a per-disk basis for each disk in an offline domain.
At that point, using the --disk-only flag for an offline domain
will be able to work.

For some examples below, remember that qemu supports the
following snapshot actions:

qemu-img: offline external and internal disk
savevm: online internal VM and disk
migrate: online external VM
transaction: online external disk

=====
<domainsnapshot>
  <memory snapshot='no'/>
  ...
</domainsnapshot>

implies that there is no VM state saved (mandatory for
offline and disk-only snapshots, not possible otherwise);
using qemu-img for offline domains and transaction for online.

=====
<domainsnapshot>
  <memory snapshot='internal'/>
  ...
</domainsnapshot>

state is saved inside one of the disks (as in qemu's 'savevm'
system checkpoint implementation).  If needed in the future,
we can also add an attribute pointing out _which_ disk saved
the internal state; maybe disk='vda'.

=====
<domainsnapshot>
  <memory snapshot='external' file='/path/to/state'/>
  ...
</domainsnapshot>

This is not wired up yet, but future patches will allow this to
control a combination of 'virsh save /path/to/state' plus disk
snapshots from the same point in time.

=====

So for 1.0.1 (and later, as needed), I plan to implement this table
of combinations, with '*' designating new code and '+' designating
existing code reached through new combinations of xml and/or the
existing DISK_ONLY flag:

domain  memory  disk   disk-only | result
-----------------------------------------
offline omit    omit   any       | memory=no disk=int, via qemu-img
offline no      omit   any       |+memory=no disk=int, via qemu-img
offline omit/no no     any       | invalid combination (nothing to snapshot)
offline omit/no int    any       |+memory=no disk=int, via qemu-img
offline omit/no ext    any       |*memory=no disk=ext, via qemu-img
offline int/ext any    any       | invalid combination (no memory to save)
online  omit    omit   off       | memory=int disk=int, via savevm
online  omit    omit   on        | memory=no disk=default, via transaction
online  omit    no/ext off       | unsupported for now
online  omit    no     on        | invalid combination (nothing to snapshot)
online  omit    ext    on        | memory=no disk=ext, via transaction
online  omit    int    off       |+memory=int disk=int, via savevm
online  omit    int    on        | unsupported for now
online  no      omit   any       |+memory=no disk=default, via transaction
online  no      no     any       | invalid combination (nothing to snapshot)
online  no      int    any       | unsupported for now
online  no      ext    any       |+memory=no disk=ext, via transaction
online  int/ext any    on        | invalid combination (disk-only vs. memory)
online  int     omit   off       |+memory=int disk=int, via savevm
online  int     no/ext off       | unsupported for now
online  int     int    off       |+memory=int disk=int, via savevm
online  ext     omit   off       |*memory=ext disk=default, via migrate+trans
online  ext     no     off       |+memory=ext disk=no, via migrate
online  ext     int    off       | unsupported for now
online  ext     ext    off       |*memory=ext disk=ext, via migrate+transaction

* docs/schemas/domainsnapshot.rng (memory): New RNG element.
* docs/formatsnapshot.html.in: Document it.
* src/conf/snapshot_conf.h (virDomainSnapshotDef): New fields.
* src/conf/domain_conf.c (virDomainSnapshotDefFree)
(virDomainSnapshotDefParseString, virDomainSnapshotDefFormat):
Manage new fields.
* tests/domainsnapshotxml2xmltest.c: New test.
* tests/domainsnapshotxml2xmlin/*.xml: Update existing tests.
* tests/domainsnapshotxml2xmlout/*.xml: Likewise.
2012-11-02 09:56:23 -06:00
Eric Blake
e66bdbb784 snapshot: simplify OOM checking during parse
* src/conf/snapshot_conf.c (virDomainSnapshotDefParseString):
Simplify OOM reporting.
2012-11-02 09:43:49 -06:00
Daniel P. Berrange
1c04f99970 Remove spurious whitespace between function name & open brackets
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-02 13:36:49 +00:00
Peter Krempa
0211fd6e04 net: Mark network persistent when assigning persistent definition
When assigning the new persistent definition for a transient network
(thus making it persistent) the network needs to be marked persistent
before actually attempting to assign the definition.
2012-11-02 13:28:40 +01:00
Peter Krempa
fa16957ccd net: Add support for changing persistent networks to transient
Until now, the network undefine API was able to undefine only inactive
networks. The restriction doesn't make sense any more so this patch
implements changing networks to transient.
2012-11-02 13:28:40 +01:00
Peter Krempa
b6dbbae128 net: Re-use checks when creating transient networks
When a transient network was created, some of the checks weren't run on
the definition, which allowed invalid networks to be started.

This patch splits out code to the network validation function and
re-uses that code when creating transient networks.
2012-11-02 13:28:40 +01:00
Peter Krempa
e87af617fc net: Remove dnsmasq and radvd files also when destroying transient nets
The network driver didn't care about config files when a network was
destroyed, only when it was undefined, leaving behind files for transient
networks.

This patch splits out the cleanup code to a helper function that handles
the cleanup if the inactive network object is being removed and re-uses
this code when getting rid of inactive networks.
2012-11-02 13:28:40 +01:00
Peter Krempa
23ae3fe425 net: Move creation of dnsmasq hosts file to function starting dnsmasq
The hosts file was created in the network definition function. This
patch moves the place the file is being created to the point where
dnsmasq is being started.
2012-11-02 13:28:40 +01:00
Peter Krempa
a3258c0eb9 net: Change argument type of virNetworkObjIsDuplicate()
The argument check_active is used only as a boolean so this patch
changes the type and updates callers.
2012-11-02 13:28:39 +01:00
Peter Krempa
f823089124 conf: net: Fix deadlock if assignment of network def fails
When the assignment fails, the network object is not unlocked and the next
call that would use it deadlocks.
2012-11-02 13:28:39 +01:00
Peter Krempa
947230fb56 conf: net: Fix helper for applying new network definition
When there's no new definition the helper overwrote the old one with
NULL.
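
A minimal sketch of the guard this kind of fix calls for, assuming a
simplified object with a current and a pending definition.  The struct
and function names are hypothetical, not the actual libvirt helper:

```c
#include <stddef.h>

/* Hypothetical, simplified network object. */
struct net_obj {
    const char *def;      /* definition currently in effect */
    const char *new_def;  /* pending definition, or NULL if none */
};

/* Switch to the pending definition only if there is one. */
void net_obj_apply_new_def(struct net_obj *obj)
{
    if (!obj->new_def)
        return;               /* nothing pending: keep the old definition */

    obj->def = obj->new_def;
    obj->new_def = NULL;
}
```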
2012-11-02 13:28:39 +01:00
Daniel Veillard
bd0cb27cf6 Remove a chunk which should not have been pushed as part of 1.0.0
I didn't notice that that small old patch was still applied locally
2012-11-02 19:23:13 +08:00
Michal Privoznik
30b398d5ef logging.c: Properly indent and ignore one syntax-check rule
With our fix of mkostemp (pushed as 2b435c15) we define a macro
to compile with uclibc. However, this definition is conditional
and thus needs to be properly indented. Moreover, with this definition
sc_prohibit_mkstemp syntax-check rule keeps yelling:

  src/util/logging.c:63:# define mkostemp(x,y) mkstemp(x)
  maint.mk: use mkostemp with O_CLOEXEC instead of mkstemp

Therefore we should ignore this file for this rule.
2012-11-02 11:19:04 +01:00
Guannan Ren
1851a0c864 qemu: use default machine type if missing it in qemu command line
BZ:https://bugzilla.redhat.com/show_bug.cgi?id=871273
When using virsh qemu-attach to attach to an existing qemu process
whose command line lacks the -M option, libvirtd crashed because
def->os.machine was still NULL when used later.

Example:
/usr/libexec/qemu-kvm -name foo \
                      -cdrom /var/lib/libvirt/images/boot.img \
                      -monitor unix:/tmp/demo,server,nowait \

error: End of file while reading data: Input/output error
error: Failed to reconnect to the hypervisor

This patch tries to set default machine type if the value of
def->os.machine is still NULL after qemu command line parsing.
2012-11-02 12:55:29 +08:00
Daniel Veillard
2b435c153e Release of libvirt-1.0.0
* configure.ac docs/news.html.in libvirt.spec.in: update for the new release
* po/*.po*: update from transifex, a lot of added support e.g. Indian
  languages, and regenerate
2012-11-02 12:08:11 +08:00
Eric Blake
3d0130cbcc cpumap: optimize for clients that don't need online count
It turns out that calling virNodeGetCPUMap(conn, NULL, NULL, 0)
is both useful, and with Viktor's patches, common enough to
optimize.  Since this interface hasn't been released yet, we
can change the RPC call.

A bit more background on the optimization - learning the cpu count
is a single file read (/sys/devices/system/cpu/possible), but
learning the number of online cpus can possibly trigger a file
read per cpu, depending on the age of the kernel, and all wasted
if the caller passed NULL for both arguments.
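
The shape of the optimization can be sketched as follows.  This is a
stand-alone illustration, not the nodeinfo.c code: node_get_cpu_map(),
cpu_is_online() and probe_count are hypothetical, and the "online"
pattern is made up so that the cost of probing is visible:

```c
#include <stddef.h>
#include <stdlib.h>

/* Simulated cost: one "file read" per CPU whose online state is probed. */
int probe_count;

int cpu_is_online(int cpu)
{
    probe_count++;
    return cpu % 2 == 0;   /* made-up pattern: even CPUs are online */
}

/*
 * Hypothetical stand-in for a node CPU map query.  Returns the number of
 * CPUs, or -1 on allocation failure.  Per-CPU probing runs only when the
 * caller asked for the map or for the online count.
 */
int node_get_cpu_map(int ncpus, unsigned char **cpumap, unsigned int *online)
{
    unsigned char *map = NULL;
    unsigned int count = 0;

    if (!cpumap && !online)
        return ncpus;              /* cheap path: count only */

    if (cpumap && !(map = calloc((size_t)(ncpus + 7) / 8, 1)))
        return -1;

    for (int cpu = 0; cpu < ncpus; cpu++) {
        if (cpu_is_online(cpu)) {
            count++;
            if (map)
                map[cpu / 8] |= (unsigned char)(1 << (cpu % 8));
        }
    }

    if (cpumap)
        *cpumap = map;
    if (online)
        *online = count;
    return ncpus;
}
```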

* src/nodeinfo.c (nodeGetCPUMap): Avoid bitmap when not needed.
* src/remote/remote_protocol.x (remote_node_get_cpu_map_args):
Supply two separate flags for needed arguments.
* src/remote/remote_driver.c (remoteNodeGetCPUMap): Update
caller.
* daemon/remote.c (remoteDispatchNodeGetCPUMap): Likewise.
* src/remote_protocol-structs: Regenerate.
2012-11-01 20:36:01 -06:00
Doug Goldstein
ba804d9fd1 qemu: QMP capabilities support starts with 1.2
Per the code comment in qemuCapsInitQMPBasic() and commit 43e23c7, we
should only use QMP for capabilities probing starting with 1.2 and
newer.  The old code had dead logic that probed on 1.0 and newer.

Signed-off-by: Eric Blake <eblake@redhat.com>
2012-11-01 17:50:02 -06:00
Dan Walsh
2e03b08ead Linux Containers are not allowed to create device nodes.
This needs to be done before the container starts. Turning
off the mknod capability is noticed by systemd, which will
no longer attempt to create device nodes.

This eliminates SELinux AVC messages and ugly failure messages in the journal.
2012-11-01 15:14:25 -06:00
Stefan Hajnoczi
23d47b33a2 qemu: Fix name comparison in qemuMonitorJSONBlockIoThrottleInfo()
The string comparison logic was inverted and matched the first drive
that does *not* have the name we search for.
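
For reference, strcmp() returns 0 when two strings are equal, so a
lookup by name has to test for a zero result.  A minimal sketch,
assuming a plain array of names (find_drive() is a hypothetical helper,
not the monitor code itself):

```c
#include <stddef.h>
#include <string.h>

/* Return the index of the drive named `wanted`, or -1 if none matches. */
int find_drive(const char *const *names, size_t n, const char *wanted)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(names[i], wanted) == 0)   /* 0 means "equal" */
            return (int)i;
    }
    return -1;
}
```

Testing for a non-zero result instead would select the first entry
whose name differs from the one requested.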

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-11-01 13:23:27 -06:00
Stefan Hajnoczi
04ee70bfda qemu: Keep QEMU host drive prefix in BlkIoTune
The QEMU -drive id= begins with libvirt's QEMU host drive prefix
("drive-"), which is stripped off in several places to convert between
host ("-drive") and guest ("-device") device names.

In the case of BlkIoTune it is unnecessary to strip the QEMU host drive
prefix because we operate on "info block"/"query-block" output that uses
host drive names.

Stripping the prefix incorrectly caused string comparisons to fail since
we were comparing the guest device name against the host device name.
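
To illustrate the mismatch, here is a small sketch of prefix stripping.
The helper name and the example drive name are assumptions made for
illustration only:

```c
#include <stddef.h>
#include <string.h>

#define HOST_DRIVE_PREFIX "drive-"

/* Return `name` without the host drive prefix, if it has one. */
const char *strip_host_prefix(const char *name)
{
    size_t len = strlen(HOST_DRIVE_PREFIX);

    if (strncmp(name, HOST_DRIVE_PREFIX, len) == 0)
        return name + len;
    return name;
}
```

With a host name such as "drive-virtio-disk0", the stripped form
"virtio-disk0" no longer compares equal to the name reported for the
host drive, which is why the lookup has to keep the prefix.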

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-11-01 13:03:26 -06:00
Michal Privoznik
f32e3a2dd6 iohelper: fdatasync() at the end
Currently, when we are doing a (managed) save, we insert the
iohelper between qemu and the OS. A pipe is created; the
writing end is passed to qemu and the reading end to the
iohelper, which reads the data and writes it into the given file.
However, because write() is asynchronous, data may still be in OS
caches, and hence in some (corner) cases all migration data
may have been read and written (though not physically). So
qemu will report success, as will the iohelper. However, with
some non-local filesystems, where ENOSPC is polled every X
time units, we may get into a situation where all operations
succeeded but the data hasn't reached the disk, and in fact never
will. Therefore we ought to sync caches to make sure the data
has reached the block device on the remote host.
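
The general pattern, write everything and then flush before reporting
success, can be sketched like this.  write_and_sync() is a hypothetical
helper written for this note and is not the iohelper source:

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write the whole buffer to fd, then ask the kernel to flush the file
 * data to storage.  Returns 0 on success, -1 on error (errno is set).
 */
int write_and_sync(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted: retry the write */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }

    /* A late error such as ENOSPC would surface here, not in write(). */
    return fdatasync(fd);
}
```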
2012-11-01 16:55:01 +01:00
Peter Krempa
8cd327fa7f conf: Fix private symbols exported by files in conf
Some of the functions were moved to other files but the private symbol
file wasn't tweaked to reflect that.
2012-11-01 10:21:52 +01:00
Daniel P. Berrange
6fea88a119 Fix arch detection for qemu-system-i386 with QMP
QEMU uses 'i386' for its 32-bit x86 architecture, but libvirt
wants that to be 'i686', so we must fix it up
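
A minimal sketch of such a fixup, assuming the architecture arrives as
a plain string (qemu_arch_to_libvirt() is a hypothetical name):

```c
#include <string.h>

/* Map QEMU's architecture name to the one libvirt expects. */
const char *qemu_arch_to_libvirt(const char *qemu_arch)
{
    if (strcmp(qemu_arch, "i386") == 0)
        return "i686";
    return qemu_arch;      /* other names pass through unchanged */
}
```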

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-01 09:16:37 +00:00
Daniel P. Berrange
6bf55a9752 Don't assume pid_t is the same size as an int
virPidFileReadPathIfAlive passed in an 'int *' where a 'pid_t *'
was expected, which breaks on Mingw64 targets. Also a few places
were using '%d' for formatting pid_t; change them to '%lld' and
force a cast to the longer type as done elsewhere in the same
file.
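
The formatting idiom in question looks roughly like this.  format_pid()
is a hypothetical helper, shown only to illustrate the cast:

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * pid_t has no printf conversion of its own and need not be the size of
 * an int, so cast to long long and format with %lld.
 */
int format_pid(char *buf, size_t buflen, pid_t pid)
{
    return snprintf(buf, buflen, "%lld", (long long)pid);
}
```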

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-01 09:16:04 +00:00
Eric Blake
4dbd6e9654 build: prefer mkostemp for multi-thread safety
https://bugzilla.redhat.com/show_bug.cgi?id=871756

Commit cd1e8d1 assumed that systems new enough to have journald
also have mkostemp; but this is not true for uclibc.

For that matter, use of mkstemp[s] is unsafe in a multi-threaded
program.  We should prefer mkostemp[s] in the first place.
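
A short sketch of the preferred call, assuming a libc that provides
mkostemp() (open_temp_cloexec() is a hypothetical wrapper).  The point
of O_CLOEXEC is that the descriptor is close-on-exec from the moment it
exists, leaving no window in which another thread could fork and exec
while the descriptor is still inheritable:

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdlib.h>

/*
 * Create a temporary file from a template ending in "XXXXXX".
 * Returns an open descriptor with close-on-exec set, or -1 on error.
 */
int open_temp_cloexec(char *template_path)
{
    return mkostemp(template_path, O_CLOEXEC);
}
```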

* bootstrap.conf (gnulib_modules): Add mkostemp, mkostemps; drop
mkstemp and mkstemps.
* cfg.mk (sc_prohibit_mkstemp): New syntax check.
* tools/virsh.c (vshEditWriteToTempFile): Adjust caller.
* src/qemu/qemu_driver.c (qemuDomainScreenshot)
(qemudDomainMemoryPeek): Likewise.
* src/secret/secret_driver.c (replaceFile): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Likewise.
2012-10-31 10:06:10 -06:00
Martin Kletzander
10c5212b10 qemu: Fix EmulatorPinInfo without emulatorpin
https://bugzilla.redhat.com/show_bug.cgi?id=871312

Recent fixes took almost all the right steps to pin the emulator
to the cpuset of the whole domain when <emulatorpin> isn't
specified, but qemudDomainGetEmulatorPinInfo still reports all the
CPUs even when cpuset is specified.  This patch fixes that.
2012-10-31 16:27:02 +01:00
Peter Krempa
ca043b8c06 util: Improve error reporting from absolutePathFromBaseFile helper
There are multiple reasons why canonicalize_file_name(), used in the
absolutePathFromBaseFile helper, can fail. This patch enhances error
reporting from that helper.
2012-10-31 11:53:07 +01:00
Martin Kletzander
037a49dc66 Make non-KVM machines work with QMP probing
When there is no 'qemu-kvm' binary and the emulator used for a machine
is, for example, 'qemu-system-x86_64' that, by default, runs without
kvm enabled, libvirt still supplies the '-no-kvm' option to this process,
even though it does not recognize that option (making the start of a
domain fail in that case).

This patch fixes building a command-line for QEMU machines without KVM
acceleration and is based on the following assumptions:

 - QEMU_CAPS_KVM flag means that QEMU is running KVM accelerated
   machines by default (without explicitly requesting that using a
   command-line option).  It is the closest to the truth according to
   the code with the only exception being the comment next to the
   flag, so it's fixed in this patch as well.

 - QEMU_CAPS_ENABLE_KVM flag means that QEMU is, by default, running
   without KVM acceleration and in case we need KVM acceleration it
   needs to be explicitly instructed to do so.  This is partially
   true for the past (this option essentially means that QEMU
   recognizes the '-enable-kvm' option, even though it's almost the
   same).
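
Read together, the two assumptions suggest a decision along these
lines.  This is only a sketch under those assumptions: kvm_option() and
its boolean parameters are hypothetical stand-ins for the capability
flags, not the command-line builder itself:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Pick the KVM-related option to pass, or NULL if none is needed.
 *   caps_kvm:        the binary runs KVM-accelerated by default
 *   caps_enable_kvm: the binary needs an explicit request for KVM
 */
const char *kvm_option(bool want_kvm, bool caps_kvm, bool caps_enable_kvm)
{
    if (want_kvm) {
        if (!caps_kvm && caps_enable_kvm)
            return "-enable-kvm";   /* KVM is off unless requested */
    } else if (caps_kvm) {
        return "-no-kvm";           /* KVM is on unless disabled */
    }
    return NULL;                    /* the default already matches */
}
```

In particular, a binary that neither defaults to KVM nor is asked to
use it gets no '-no-kvm' option at all.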
2012-10-31 08:31:49 +01:00
Gene Czarcinski
adaa7ab653 bugfix: ip6tables rule removal
Three FORWARD chain rules are added and two INPUT chain rules
are added when a network is started but only the FORWARD chain
rules are removed when the network is destroyed.
2012-10-30 16:04:25 -06:00
Eric Blake
270a9fef37 maint: log xml during volume creation
I noticed this while answering a list question about Java bindings
of volume creation.  All other functions that take xml logged xmlDesc.

* src/libvirt.c (virStorageVolCreateXML)
(virStorageVolCreateXMLFrom): Use consistent spelling of xmlDesc,
and log the argument.
2012-10-30 14:59:31 -06:00
Laine Stump
7bafe009d9 util: do a better job of matching up pids with their binaries
This patch resolves: https://bugzilla.redhat.com/show_bug.cgi?id=871201

If libvirt is restarted after updating the dnsmasq or radvd packages,
a subsequent "virsh net-destroy" will fail to kill the dnsmasq/radvd
process.

The problem is that when libvirtd restarts, it re-reads the dnsmasq
and radvd pidfiles, then does a sanity check on each pid it finds,
including checking that the symbolic link in /proc/$pid/exe actually
points to the same file as the path used by libvirt to execute the
binary in the first place. If this fails, libvirt assumes that the
process is no longer alive.

But if the original binary has been replaced, the link in /proc is set
to "$binarypath (deleted)" (it literally has the string " (deleted)"
appended to the link text stored in the filesystem), so even if a new
binary exists in the same location, attempts to resolve the link will
fail.

In the end, not only is the old dnsmasq/radvd not terminated when the
network is stopped, but a new dnsmasq can't be started when the
network is later restarted (because the original process is still
listening on the ports that the new process wants).

The solution is, when the initial "use stat to check for identical
inodes" check for identity between /proc/$pid/exe and $binpath fails,
to check /proc/$pid/exe for a link ending with " (deleted)" and if so,
truncate that part of the link and compare what's left with the
original binarypath.

A twist to this problem is that on systems with "merged" /sbin and
/usr/sbin (i.e. /sbin is really just a symlink to /usr/sbin; Fedora
17+ is an example of this), libvirt may have started the process using
one path, but /proc/$pid/exe lists a different path (indeed, on F17
this is the case - libvirtd uses /sbin/dnsmasq, but /proc/$pid/exe
shows "/usr/sbin/dnsmasq"). The further bit of code to resolve this is
to call virFileResolveAllLinks() on both the original binarypath and
on the truncated link we read from /proc/$pid/exe, and compare the
results.
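
The string handling at the heart of this can be sketched as below.  The
sketch covers only the " (deleted)" truncation and comparison; it
leaves out the initial stat()-based inode check and the symlink
resolution step, and exe_link_matches() is a hypothetical name:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DELETED_SUFFIX " (deleted)"

/*
 * Compare the text of a /proc/$pid/exe link with the expected binary
 * path, ignoring a trailing " (deleted)" marker on the link text.
 */
bool exe_link_matches(const char *link_target, const char *binpath)
{
    size_t len = strlen(link_target);
    size_t slen = strlen(DELETED_SUFFIX);

    if (len > slen &&
        strcmp(link_target + len - slen, DELETED_SUFFIX) == 0)
        len -= slen;               /* drop the marker before comparing */

    return strlen(binpath) == len &&
           strncmp(link_target, binpath, len) == 0;
}
```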

The resulting code still succeeds in all the same cases it did before,
but also succeeds if the binary was deleted or replaced after it was
started.
2012-10-30 13:28:47 -04:00
Peter Krempa
7af929d065 cpu: Fix definition of flag smap
A mild case of dyslexia caused commit
012f9b19ef to specify the wrong mask for the
smap cpu feature flag. This patch fixes that mistake.
2012-10-30 15:01:27 +01:00
Michal Privoznik
9af1b30da3 sanlock: Introduce 'user' and 'group' conf variables
through which the user specifies under what permissions the sanlock
daemon runs, so that libvirt will set the same permissions for
files exposed to it.
2012-10-30 10:12:10 +01:00
Vladislav Bogdanov
81af5336ac qemu: pass -usb and usb hubs earlier, so USB disks with static address are handled properly 2012-10-30 08:54:32 +01:00
Vladislav Bogdanov
8f708761c0 qemu: Do not ignore address for USB disks 2012-10-30 08:54:28 +01:00
Martin Kletzander
bab7752c0c esx: Fix connection to ESX 5.1
After separating 5.x and 5.1 versions of ESX, we forgot to add 5.1
into the list of allowed connections, so connections to 5.1 fail since
v1.0.0-rc1-5-g1e7cd39
2012-10-30 08:35:24 +01:00
Eric Blake
c047f54749 build: place attributes in correct location
Ever since commit eefb881, ATTRIBUTE_NONNULL has normally been a
no-op under gcc (since it tends to cause more bugs than it cures
given gcc's current lame implementation of the attribute).  However,
the macro is still useful to Coverity and other static-analysis
tools, but only if we use it correctly.  Coverity follows gcc's lead
in accepting function declarations with attributes at the end, but
function bodies must attach attributes to the return type.  That is,
these are valid:

void foo(void *arg) ATTRIBUTE_NONNULL(1);
void ATTRIBUTE_NONNULL(1) foo(void *arg);
void ATTRIBUTE_NONNULL(1) foo(void *arg) {}

but this is not:

void foo(void *arg) ATTRIBUTE_NONNULL(1) {}

even though you don't get a compile failure until you do static
analysis.  Bug introduced in commit 80533ca, with these symptoms:

nodeinfo.c:206: error: expected ',' or ';' before '{' token
cc1: warning: unrecognized command line option "-Wno-suggest-attribute=const"
cc1: warning: unrecognized command line option "-Wno-suggest-attribute=pure"
make[3]: *** [libvirt_driver_la-nodeinfo.lo] Error 1

* src/nodeinfo.c (virNodeParseNode): Fix syntax error when
non-null attribute is in use.
2012-10-29 16:53:44 -06:00
Eric Blake
a047a24d11 build: fix linking with systemtap probes
Commit 34e8f63a3 altered virfile.o to drag in additional symbols,
which in turn led to pulling in other .o files and eventually causing
a link failure when systemtap probes are enabled, such as:

./.libs/libvirt_util.a(libvirt_util_la-event_poll.o): In function `virEventPollRunOnce':
/home/dummy/libvirt/src/util/event_poll.c:614: undefined reference to `libvirt_event_poll_run_semaphore'
./.libs/libvirt_util.a(libvirt_util_la-event_poll.o):(.note.stapsdt+0x24): undefined reference to `libvirt_event_poll_add_handle_semaphore'

Even though libvirt_iohelper and libvirt_parthelper don't directly
use the portion of virfile.o that drags in probing, it was easier
to satisfy the linker and get the build back up, than to figure out
whether it is even possible or worth trying to disentangle the mess.

* src/Makefile.am (libvirt_iohelper_LDADD)
(libvirt_parthelper_LDADD): Use libvirt_probes.lo when needed.
2012-10-29 14:22:28 -06:00
Michal Privoznik
34e8f63a32 qemu: Report errors from iohelper
Currently, we use iohelper when saving/restoring a domain.
However, if there's some kind of error (like an I/O error), it is not
propagated to libvirt. Since it is not qemu that is doing
the actual write(), it will not get the error; the iohelper does.
Therefore we should check for iohelper errors, as it makes
libvirt more user friendly.
2012-10-29 17:04:26 +01:00
Peter Krempa
cbd10126ed util: Re-format literal strings in virXMLEmitWarning
And drop a stray space at the end of the first line of the warning.
2012-10-29 15:19:26 +01:00
Ján Tomko
0b121614a2 xml: print uuids in the warning
In the XML warning, we print a virsh command line that can be used to
edit that XML. This patch prints UUIDs if the entity name contains
special characters (like shell metacharacters, or "--" that would break
parsing of the XML comment). If the entity doesn't have a UUID, just
print the virsh command that can be used to edit it.
2012-10-29 14:38:43 +01:00
Jiri Denemark
23f5e74ed3 Revert "qemu: Do not require hostuuid in migration cookie"
This reverts commit 8d75e47ede.

Libvirt was never released with support for migration cookies without
hostuuid.
2012-10-29 09:04:27 +01:00
Cole Robinson
9a2975786b qemu: Fix domxml-to-native network model conversion
https://bugzilla.redhat.com/show_bug.cgi?id=636832
2012-10-27 12:20:49 -04:00
Eric Blake
dd0a7040f7 build: typo fix for qemu cpu affinity
Introduced in commit 0039a32f.

* src/qemu/qemu_process.c (qemuPrepareCpumap): s/covert/convert/
2012-10-27 08:09:51 -06:00
Eric Blake
5a3501be9e blockjob: relabel entire existing chain
When using block copy to pivot over to a new chain, the backing files
for the new chain might still need labeling (particularly if the user
passes --reuse-ext with a relative backing file name).  Relabeling a
file that is already labeled won't hurt, so this just labels the entire
chain at the point of the pivot.  Doing the relabel of the chain uses
the fact that we already safely probed the file type of an external
file at the start of the block copy.

* src/qemu/qemu_driver.c (qemuDomainBlockPivot): Relabel chain before
asking qemu to pivot.
2012-10-27 07:43:39 -06:00
Eric Blake
35c7701c64 blockjob: allow mirroring under SELinux and cgroup
Use the recent addition of qemuDomainPrepareDiskChainElement to
obtain locking manager lease, permit a block device through cgroups,
and set the SELinux label; then audit the fact that we hand a new
file over to qemu.  Alas, releasing the lease and label at the end
of the mirroring is a trickier prospect (we would have to trace the
backing chain of both source and destination, and be sure not to
revoke rights to any part of the chain that is shared), so for now,
virDomainBlockJobAbort still leaves things with additional access
granted (as block-pull and block-commit have the same problem of
not clamping access after completion, a future cleanup would cover
all three commands).

* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Set up labeling.
2012-10-27 07:43:39 -06:00
Eric Blake
8ee5073c1e blockjob: allow for existing files in block-copy
Support the REUSE_EXT flag, in part by copying sanity checks from
snapshot code.  This code introduces a case of probing an external
file for its type; such an action would be a security risk if the
existing file is supposed to be raw but the contents resemble some
other format; however, since the virDomainBlockRebase API has a
flag to force treating the file as raw rather than probe, we can
assume that probing is safe in all other instances.  Besides, if
we don't probe or force raw, then qemu will.

* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Allow REUSE_EXT
flag.
(qemuDomainBlockCopy): Wire up flag, and add some sanity checks.
2012-10-27 07:43:39 -06:00
Eric Blake
c1eb38053d blockjob: implement block copy for qemu
Minimal patch to wire up all the pieces in the previous patches
to actually enable a block copy job.  By minimal, I mean that
qemu creates the file (that is, no REUSE_EXT flag support yet),
SELinux must be disabled, a lock manager is not informed, and the
audit logs aren't updated.  But those will be added as
improvements in future patches.

This patch is designed so that if we ever add a future API
virDomainBlockCopy with more bells and whistles (such as letting
the user specify a destination image format different than the
source), where virDomainBlockRebase is a wrapper around the
simpler portions of the new functionality, then the new API can
just reuse the new qemuDomainBlockCopy function and already
support _SHALLOW and _REUSE_EXT flags.  Also note that libvirt.c
already filtered the new flags if _COPY is not present, so that
we are not impacting the case of BlockRebase being a wrapper
around BlockPull.

* src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function.
(qemuDomainBlockRebase): Call it when appropriate.
2012-10-27 07:43:39 -06:00
Eric Blake
400ac797ef blockjob: make block pivot safer
Since libvirt drops locks between issuing a monitor command and
getting a response, it is possible for libvirtd to be restarted
before getting a response on a block-job-complete command; worse, it
is also possible for the guest to shut itself down during the window
while libvirtd is down, ending the qemu process.  A management app
needs to know if the pivot happened (and the destination file
contains guest contents not in the source) or failed (and the source
file contains guest contents not in the destination), but since
the job is finished, 'query-block-jobs' no longer tracks the
status of the job, and if the qemu process itself has disappeared,
even 'query-block' cannot be checked to ask qemu its current state.

At the time of this patch, the design for persistent bitmap has not
been clarified, so a followup patch will be needed once qemu
actually figures out how to expose it, and we figure out how to use
it.  In the meantime, we have a solution that avoids the worst of
the problem.  [This problem was first analyzed with the RHEL 6.3
__com.redhat_drive-reopen command; which partly explains why
upstream qemu 1.3 ditched the drive-reopen idea and went with
block-job-complete plus persistent bitmap instead.]

If we surround 'drive-reopen' with a pause/resume pair, then we can
guarantee that the guest cannot modify either source or destination
files in the window of libvirtd uncertainty, and the management app
is guaranteed that either libvirt knows the outcome and reported it
correctly; or that on libvirtd restart, the guest will still be
paused and that the qemu process cannot have disappeared due to
guest shutdown; and use that as a clue that the management app must
implement recovery protocol, with both source and destination files
still being in sync and with 'query-block' still being an option as
part of that recovery.  My testing shows that the pause window will
typically be only a fraction of a second.

* src/qemu/qemu_driver.c (qemuDomainBlockPivot): Pause around
drive-reopen.
(qemuDomainBlockJobImpl): Update caller.
2012-10-27 07:43:38 -06:00
Eric Blake
eaba79d22e blockjob: support pivot operation on cancel
This is the bare minimum to end a copy job (of course, until a
later patch adds the ability to start a copy job, this patch
doesn't do much in isolation; I've just split the patches to
ease the review).

This patch intentionally avoids SELinux, lock manager, and audit
actions.  Also, if libvirtd restarts at the exact moment that a
'block-job-complete' is in flight, the proposed proper way to
detect the outcome of that would be with a persistent bitmap and
some additional query commands when libvirtd restarts.  This
patch is enough to test the common case of success when used
correctly, while saving the subtleties of proper cleanup for
worst-case errors for later.

When a mirror job is started, cancelling the job safely reverts back
to the source disk, regardless of whether the destination is in
phase 1 (streaming, in which case the destination is worthless) or
phase 2 (mirroring, in which case the destination is synced up to
the source at the time of the cancel).  Our existing code does just
fine in either phase, other than some bookkeeping cleanup; this
implements live block copy.

Ideas for future enhancements via new flags:

Depending on when persistent bitmap support is added, it may be
worth adding a VIR_DOMAIN_REBASE_COPY_ATOMIC flag that fails up
front if we detect an older qemu with risky pivot operation.

Interesting side note: while snapshot-create --disk-only creates a
copy of the disk at a point in time by moving the domain on to a
new file (the copy is the file now in the just-extended backing
chain), blockjob --abort of a copy job creates a copy of the disk
while keeping the domain on the original file.  There may be
potential improvements to the snapshot code to exploit block copy
over multiple disks all at one point in time.  And, if
'block-job-cancel' were made part of 'transaction', you could
copy multiple disks at the same point in time without pausing
the domain.  This also implies we may want to add a --quiesce flag
to virDomainBlockJobAbort, so that when breaking a mirror (whether
by cancel or pivot), the side of the mirror that we are abandoning
is at least in a stable state with regards to guest I/O.

* src/qemu/qemu_driver.c (qemuDomainBlockJobAbort): Accept new flag.
(qemuDomainBlockPivot): New helper function.
(qemuDomainBlockJobImpl): Implement it.
2012-10-27 07:43:38 -06:00
Eric Blake
edecd45c78 blockjob: return appropriate event and info
Handle the new type of block copy event and info.  Of course,
this patch does nothing until a later patch actually allows the
creation/abort of a block copy job.

* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_JOB_READY): New
block job status.
* src/libvirt.c (virDomainBlockRebase): Document the event.
* src/qemu/qemu_monitor_json.c (eventHandlers): New event.
(qemuMonitorJSONHandleBlockJobReady): New function.
(qemuMonitorJSONGetBlockJobInfoOne): Translate new job type.
(qemuMonitorJSONHandleBlockJobImpl): Handle new event and job type.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Recognize
the event to minimize snooping.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Snoop a successful
info query to save effort on a pivot request.
2012-10-27 07:43:38 -06:00
Eric Blake
b3822ed04a blockjob: react to active block copy
For now, disk migration via block copy job is not implemented in
libvirt.  But when we do implement it, we have to deal with the
fact that qemu does not yet provide an easy way to re-start a qemu
process with mirroring still intact.  Paolo has proposed an idea
for a persistent dirty bitmap that might make this possible, but
until that design is complete, it's hard to say what changes
libvirt would need.  Even something like 'virDomainSave' becomes
hairy, if you realize the implications that 'virDomainRestore'
would be stuck with recreating the same mirror layout.

But if we step back and look at the bigger picture, we realize that
the initial client of live storage migration via disk mirroring is
oVirt, which always uses transient domains, and that if a transient
domain is destroyed while a mirror exists, oVirt can easily restart
the storage migration by creating a new domain that visits just the
source storage, with no loss in data.

We can make life a lot easier by being cowards for now, forbidding
certain operations on a domain.  This patch guarantees that we
never get in a state where we would have to restart a domain with
a mirroring block copy, by preventing saves, snapshots, migration,
hot unplug of a disk in use, and conversion to a persistent domain
(thankfully, it is still relatively easy to 'virsh undefine' a
running domain to temporarily make it transient, run tests on
'virsh blockcopy', then 'virsh define' to restore the persistence).
Later, if the qemu design is enhanced, we can relax our code.

The change to qemudDomainDefine looks a bit odd for undoing an
assignment, rather than probing up front to avoid the assignment,
but this is because of how virDomainAssignDef combines both a
lookup and assignment into a single function call.

* src/conf/domain_conf.h (virDomainHasDiskMirror): New prototype.
* src/conf/domain_conf.c (virDomainHasDiskMirror): New function.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal)
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot)
(qemuDomainBlockJobImpl, qemudDomainDefine): Prevent dangerous
actions while block copy is already in action.
* src/qemu/qemu_hotplug.c (qemuDomainDetachDiskDevice): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Likewise.
2012-10-27 07:43:38 -06:00
Eric Blake
6d264c9182 blockjob: add qemu capabilities related to block jobs
Upstream qemu 1.3 is adding two new monitor commands, 'drive-mirror'
and 'block-job-complete'[1], which can drive live block copy and
storage migration.  [Additionally, RHEL 6.3 had backported an earlier
version of most of the same functionality, but under the names
'__com.redhat_drive-mirror' and '__com.redhat_drive-reopen' and with
slightly different JSON arguments, and has been using patches similar
to these upstream patches for several months now.]

The libvirt API virDomainBlockRebase as already committed for 0.9.12
is flexible enough to expose the basics of block copy, but some
additional features in the 'drive-mirror' qemu command, such as
setting error policy, setting granularity, or using a persistent
bitmap, may later require a new libvirt API virDomainBlockCopy.  I
will wait to add that API until we know more about what qemu 1.3
will finally provide.

This patch caters only to the upstream qemu 1.3 interface, although
I have proven that the changes for RHEL 6.3 can be isolated to
just qemu_monitor_json.c, and the rest of this series will
gracefully handle either interface once the JSON differences are
papered over in a downstream patch.

For consistency with other block job commands, libvirt must handle
the bandwidth argument as MiB/sec from the user, even though qemu
exposes the speed argument as bytes/sec; then again, qemu rounds
up to cluster size internally, so using MiB hides the worst effects
of that rounding if you pass small numbers.

[1]https://lists.gnu.org/archive/html/qemu-devel/2012-10/msg04123.html

* src/qemu/qemu_capabilities.h (QEMU_CAPS_DRIVE_MIRROR)
(QEMU_CAPS_DRIVE_REOPEN): New bits.
* src/qemu/qemu_capabilities.c (qemuCaps): Name them.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set
them.
(qemuMonitorJSONDriveMirror, qemuMonitorDrivePivot): New functions.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror)
(qemuMonitorDrivePivot): Declare them.
* src/qemu/qemu_monitor.c (qemuMonitorDriveMirror)
(qemuMonitorDrivePivot): New passthroughs.
* src/qemu/qemu_monitor.h (qemuMonitorDriveMirror)
(qemuMonitorDrivePivot): Declare them.
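
The unit handling described above (MiB/sec from the user, bytes/sec
towards qemu) could be sketched roughly as follows.  The helper name
mib_per_sec_to_bytes is hypothetical and not taken from libvirt, and
the overflow check is only an assumption about what a careful caller
would want.

```c
#include <limits.h>

/* Hypothetical helper: convert a bandwidth given in MiB/sec into
 * bytes/sec.  Returns 0 on success, -1 if the result would overflow. */
static int
mib_per_sec_to_bytes(unsigned long mib, unsigned long long *bytes)
{
    const unsigned long long mib_size = 1024ULL * 1024ULL;

    if (mib > ULLONG_MAX / mib_size)
        return -1;
    *bytes = mib * mib_size;
    return 0;
}
```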
2012-10-27 07:43:37 -06:00
Laine Stump
def31e4c58 qemu: fix attach/detach of netdevs with matching mac addrs
This resolves:

   https://bugzilla.redhat.com/show_bug.cgi?id=862515

which describes inconsistencies in dealing with duplicate mac
addresses on network devices in a domain.

(at any rate, it resolves *almost* everything, and prints out an
informative error message for the one problem that isn't solved, but
has a workaround.)

A synopsis of the problems:

1) you can't do a persistent attach-interface of a device with a mac
address that matches an existing device.

2) you *can* do a live attach-interface of such a device.

3) you *can* directly edit a domain and put in two devices with
matching mac addresses.

4) When running virsh detach-device (live or config), only MAC address
is checked when matching the device to remove, so the first device
with the desired mac address will be removed. This isn't always the
one that's wanted.

5) when running virsh detach-interface (live or config), the only two
items that can be specified to match against are mac address and model
type (virtio, etc) - if multiple netdevs match both of those
attributes, it again just finds the first one added and assumes that
is the only match.

Since it is completely valid to have multiple network devices with the
same MAC address (although it can cause problems in many cases, there
*are* valid use cases), what is needed is:

1) remove the restriction that prohibits doing a persistent add of a
netdev with a duplicate mac address.

2) enhance the backend of virDomainDetachDeviceFlags to check for
something that *is* guaranteed unique (but still work with just mac
address, as long as it yields only a single result).

This patch does three things:

1) removes the check for duplicate mac address during a persistent
netdev attach.

2) unifies the searching for both live and config detach of netdevices
in the subordinate functions of qemuDomainModifyDeviceFlags() to use the
new function virDomainNetFindIdx (which matches mac address and PCI
address if available, checking for duplicates if only mac address was
specified). This function returns -2 if multiple matches are found,
allowing the callers to print out an appropriate message.

Steps 1 & 2 are enough to fully fix the problem when using virsh
attach-device and detach-device (which require an XML description of
the device rather than a bunch of commandline args)

3) modifies the virsh detach-interface command to check for multiple
matches of mac address and show an error message suggesting use of the
detach-device command in cases where there are multiple matching mac
addresses.

Later we should decide how we want to input a PCI address on the virsh
commandline, and enhance detach-interface to take a --address option,
eliminating the need to use detach-device

* src/conf/domain_conf.c
* src/conf/domain_conf.h
* src/libvirt_private.syms
  * added new virDomainNetFindIdx function
  * removed now unused virDomainNetIndexByMac and
    virDomainNetRemoveByMac

* src/qemu/qemu_driver.c
  * remove check for duplicate mac from qemuDomainAttachDeviceConfig
  * use virDomainNetFindIdx/virDomainNetRemove instead
    of virDomainNetRemoveByMac in qemuDomainDetachDeviceConfig
  * use virDomainNetFindIdx instead of virDomainNetIndexByMac
    in qemuDomainUpdateDeviceConfig

* src/qemu/qemu_hotplug.c
  * use virDomainNetFindIdx instead of a homespun loop in
    qemuDomainDetachNetDevice.

* tools/virsh-domain.c: modified detach-interface command as described
    above
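
The lookup semantics described for virDomainNetFindIdx (match on mac
address, use the PCI address when given, report ambiguity with -2)
can be illustrated with a simplified sketch.  The struct and function
below are hypothetical stand-ins, not the libvirt definitions, and the
PCI address is reduced to a single slot number for brevity.

```c
#include <string.h>

/* Hypothetical, simplified stand-in for a network device definition. */
struct net_def {
    unsigned char mac[6];
    int has_pci;            /* nonzero if pci_slot is meaningful */
    unsigned int pci_slot;  /* simplified PCI address */
};

/* Return the index of the device matching 'want', -1 if there is no
 * match, or -2 if only the MAC was given and several devices share it. */
static int
net_find_idx(const struct net_def *nets, size_t nnets,
             const struct net_def *want)
{
    int found = -1;
    size_t i;

    for (i = 0; i < nnets; i++) {
        if (memcmp(nets[i].mac, want->mac, sizeof(want->mac)) != 0)
            continue;
        if (want->has_pci) {
            if (nets[i].has_pci && nets[i].pci_slot == want->pci_slot)
                return (int) i;   /* MAC and address both match */
            continue;
        }
        if (found >= 0)
            return -2;            /* ambiguous: duplicate MAC */
        found = (int) i;
    }
    return found;
}
```

A caller can then map -2 to an error message suggesting that the
device be identified more precisely.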
2012-10-26 20:47:54 -04:00
Eric Blake
4fbf322fe9 cpustat: fix regression when cpus are offline
It turns out that the cpuacct results properly account for offline
cpus, and always include results for every possible cpu, not just
the online ones.  So there is no need to check the map of online
cpus in the first place, only a need to know the maximum
possible cpu.  Meanwhile, virNodeGetCPUBitmap had a subtle change
from returning the maximum id to instead returning the width of
the bitmap (one larger than the maximum id) in commit 2f4c5338,
which made this code encounter some off-by-one logic leading to
bad error messages when a cpu was offline:

$ virsh cpu-stats dom
error: Failed to virDomainGetCPUStats()

error: An error occurred, but the cause is unknown

Cleaning this up unraveled a chain of other unused variables.

* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Drop
pointless check for cpumap changes, and use correct number of
cpus.  Simplify signature.
(qemuDomainGetCPUStats): Adjust caller.
* src/nodeinfo.h (nodeGetCPUCount): New prototype.
(nodeGetCPUBitmap): Drop unused parameter.
* src/nodeinfo.c (nodeGetCPUBitmap): Likewise.
(nodeGetCPUMap): Adjust caller.
(nodeGetCPUCount): New function.
* src/libvirt_private.syms (nodeinfo.h): Export it.
2012-10-26 15:34:52 -06:00
Eric Blake
60f54f6146 build: silence compiler warning about signedness
Commit 246143b fixed a warning on older gcc, but caused a warning
on newer gcc.

../../src/rpc/virnetserverservice.c: In function 'virNetServerServiceNewPostExecRestart':
../../src/rpc/virnetserverservice.c:277:41: error: pointer targets in passing argument 3 of 'virJSONValueObjectGetNumberUint' differ in signedness [-Werror=pointer-sign]

* src/rpc/virnetserverservice.c: Use correct types.
2012-10-26 14:29:51 -06:00
Eric Blake
246143b69f build: fix type-punning bug
With older gcc and 64-bit size_t, the compiler issues a real warning:
rpc/virnetserverservice.c:277: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]

Introduced in commit 0cc79255.  Depending on machine endianness,
this warning represents a real bug that could mis-interpret the
value by a factor of 2^32.  I don't know why I couldn't get newer
gcc to report the same warning message.

* src/rpc/virnetserverservice.c
(virNetServerServiceNewPostExecRestart): Use temporary instead.
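
A minimal sketch of the "use temporary" pattern, assuming a getter that
writes through an 'unsigned int *' while the destination field is a
size_t.  Both function names are hypothetical; the getter merely stands
in for a JSON number accessor.

```c
#include <stddef.h>

/* Hypothetical getter that stores its result through an
 * 'unsigned int *', standing in for a JSON number accessor. */
static int
get_number_uint(unsigned int *value)
{
    *value = 42;
    return 0;
}

/* Fill a size_t field without type-punning.  Casting the size_t
 * pointer to 'unsigned int *' would write only part of a 64-bit
 * size_t and violate strict aliasing. */
static int
load_max_clients(size_t *max)
{
    unsigned int tmp;

    if (get_number_uint(&tmp) < 0)
        return -1;
    *max = tmp;   /* plain integer conversion, well defined */
    return 0;
}
```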
2012-10-26 13:00:27 -06:00
Laine Stump
73ebd86d73 parallels: fix build for some older compilers
Found this when building on RHEL5:

parallels/parallels_storage.c: In function 'parallelsStorageOpen':
parallels/parallels_storage.c:180: error: 'for' loop initial declaration used outside C99 mode

(and similar error in parallels_driver.c). This was in spite of
configuring with "-Wno-error".
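
For illustration, a loop written so that it also compiles outside C99
mode looks like the sketch below (the function is a made-up example,
not code from the parallels driver):

```c
/* Sum an array in a way that also compiles in C89/gnu89 mode:
 * the loop counter is declared at the top of the block rather
 * than inside the 'for' statement. */
static int
sum_array(const int *vals, int nvals)
{
    int i;          /* not: for (int i = 0; ...) */
    int total = 0;

    for (i = 0; i < nvals; i++)
        total += vals[i];
    return total;
}
```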
2012-10-26 13:23:56 -04:00
Cole Robinson
eba36a3878 daemon: Fix LIBVIRT_DEBUG=1 default output
This commit changes the behavior of LIBVIRT_DEBUG=1 libvirtd:

$ git show 7022b09111
commit 7022b09111
Author: Daniel P. Berrange <berrange@redhat.com>
Date:   Thu Sep 27 13:13:09 2012 +0100

    Automatically enable systemd journal logging

    Probe to see if the systemd journal is accessible, and if
    so enable logging to the journal by default, rather than
    stderr (current default under systemd).

Previously, 'LIBVIRT_DEBUG=1 /usr/sbin/libvirtd' would send all debug
output to stderr; now it sends debug output to the journal.

Only use the journal by default if running in daemon mode, or
if stdin is _not_ a tty. This should make libvirtd launched from
systemd use the journal, but preserve the old behavior in most
situations.
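
The rule above can be written down as a small predicate.  This is only
a sketch: the function names are hypothetical, and the real daemon code
may structure the decision differently.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: decide whether the journal should be the
 * default log output.  'stdin_is_tty' would normally come from
 * isatty(STDIN_FILENO). */
static bool
use_journal_by_default(bool daemon_mode, bool stdin_is_tty)
{
    return daemon_mode || !stdin_is_tty;
}

/* Convenience wrapper that queries the real stdin. */
static bool
use_journal_now(bool daemon_mode)
{
    return use_journal_by_default(daemon_mode, isatty(STDIN_FILENO) != 0);
}
```

Keeping the tty state as a parameter makes the decision testable
without a real terminal.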
2012-10-25 16:46:23 -04:00
Laine Stump
d8aae15aa1 network: fix networkValidate check for default portgroup and vlan
This was found during testing of the fix for:

   https://bugzilla.redhat.com/show_bug.cgi?id=868483

networkValidate was supposed to check for the existence of multiple
default portgroups and report an error if this was encountered. It did, but
there were two problems:

1) even though it logged an error, it still returned success, allowing
the operation to continue.

2) It could exit the portgroup checking loop early (or possibly not
even do it once) if a vlan tag was supplied in the base network config
or one of the portgroups.

This patch fixes networkValidate to return failure in addition to
logging the error, and also changes it to not exit the portgroup
checking loop early. The logic was a bit off in the checking for vlan
anyway, and it's intertwined with fixing the early loop exit, so I
fixed that as well. Now it correctly checks for combinations where a
<virtualport> is specified in the base network def and <vlan> is given
in a portgroup, as well as the opposite (<vlan> in base network def
and <virtualport> in portgroup), and ignores the case of a disallowed
vlan when using *no* portgroup if there is a default portgroup (since
in that case there is no way to not use any portgroup).
2012-10-25 16:32:04 -04:00
Viktor Mihajlovski
e3ba67037b virNodeGetCPUMap: Implement driver support
Driver support added for:
- test: pretending 8 host CPUS, 3 being online
- qemu, lxc, openvz, uml: using nodeGetCPUMap

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-10-25 11:20:15 -06:00
Viktor Mihajlovski
d34439c9e4 virNodeGetCPUMap: Implement support function in nodeinfo
Added an implementation of virNodeGetCPUMap to nodeinfo.c,
(nodeGetCPUMap) which can be used by all drivers for a Linux
hypervisor host.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-10-25 11:20:08 -06:00
Eric Blake
2f4c5338a6 nodeinfo: improve probing node cpu bitmap
Callers should not need to know what the name of the file to
be read in the Linux-specific version of nodeGetCPUmap;
furthermore, qemu cares about online cpus, not present cpus,
when determining which cpus to skip.

While at it, I fixed the fact that we were computing the maximum
online cpu id by doing a slow iteration, when what we really want
to know is the max available cpu.

* src/nodeinfo.h (nodeGetCPUmap): Rename...
(nodeGetCPUBitmap): ...and simplify signature.
* src/nodeinfo.c (linuxParseCPUmax): New function.
(linuxParseCPUmap): Simplify and alter signature.
(nodeGetCPUBitmap): Change implementation.
* src/libvirt_private.syms (nodeinfo.h): Reflect rename.
* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Update
caller.
2012-10-25 11:20:08 -06:00
Eric Blake
0711c4b74d bitmap: add virBitmapCountBits
Sometimes it's handy to know how many bits are set.

* src/util/bitmap.h (virBitmapCountBits): New prototype.
(virBitmapNextSetBit): Use correct type.
* src/util/bitmap.c (virBitmapNextSetBit): Likewise.
(virBitmapSetAll): Maintain invariant of clear tail bits.
(virBitmapCountBits): New function.
* src/libvirt_private.syms (bitmap.h): Export it.
* tests/virbitmaptest.c (test2): Test it.
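
One common way to count set bits over an array of words is sketched
below.  This is not necessarily how virBitmapCountBits is implemented;
the function name and signature are assumptions.  Note that counting
whole words is only correct if the bits past the logical end of the
bitmap are kept clear, which is why the invariant mentioned for
virBitmapSetAll matters.

```c
#include <stddef.h>

/* Count set bits in an array of words using Kernighan's trick:
 * 'w &= w - 1' clears the lowest set bit on each iteration. */
static size_t
bitmap_count_bits(const unsigned long *map, size_t nwords)
{
    size_t count = 0;
    size_t i;

    for (i = 0; i < nwords; i++) {
        unsigned long w = map[i];

        while (w) {
            w &= w - 1;
            count++;
        }
    }
    return count;
}
```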
2012-10-25 11:19:23 -06:00
Jiri Denemark
0111b409a3 Fix build with apparmor
Recent storage patches changed signature of virStorageFileGetMetadata
and replaced chain with backingChain in virDomainDiskDef.
2012-10-25 10:21:57 +02:00
Matthias Bolte
1e7cd39511 esx: Update version checks for vSphere 5.1
Also remove warnings for upcoming versions. There haven't been any
compatibility problems with new ESX versions over the whole lifetime
of the ESX driver, so I don't expect any in the future.

Update documentation to mention vSphere 5.x support.
2012-10-24 19:50:28 +02:00
Peter Krempa
012f9b19ef cpu: Add recently added cpu feature flags.
Qemu has added some new feature flags. This patch adds them to libvirt.

The new features are for the cpuid function 0x7 that takes an argument
in the ecx register. Currently only 0x0 is used as the argument so I was
lazy and just clear the registers to 0 before calling cpuid. In the
future, if other arguments become possible, we will need to
improve the cpu detection code to take this into account.
2012-10-24 17:36:03 +02:00
Osier Yang
a6bd7c22ea qemu: Prohibit changing affinity of domain process if placement is 'auto'
On one hand, numad will probably manage the affinity of the domain process
dynamically in the future. On the other hand, even if numad does not manage
it, changing it could still cause confusion. Let's keep things simple and
avoid the problem for now.
2012-10-24 22:26:11 +08:00
Osier Yang
bb81021bfe qemu: Keep the affinity when creating cgroup for emulator thread
When the cpu placement model is "auto", libvirt sets the affinity for
the domain process from the advisory nodeset returned by numad. However,
creating the cgroup for the domain process (called emulator thread
in some contexts) later overrides that by pinning it to all
available pCPUs.

How to reproduce:

  * Configure the domain with "auto" placement for <vcpu>, e.g.
    <vcpu placement='auto'>4</vcpu>
  * % virsh start dom
  * % cat /proc/$dompid/status

Though the emulator cgroup causes conflicts, we can't simply
prohibit creating it, as other tunables are still useful, such
as "emulator_period", which is used by API
virDomainSetSchedulerParameter. So this patch doesn't prohibit
creating the emulator cgroup, but inherits the nodeset from numad,
and resets the affinity for the domain process.

* src/qemu/qemu_cgroup.h: Modify definition of qemuSetupCgroupForEmulator
                          to accept the passed nodeset
* src/qemu/qemu_cgroup.c: Set the affinity with the passed nodeset
2012-10-24 21:46:24 +08:00
Osier Yang
0039a32fca qemu: Add helper to prepare cpumap for affinity setting
Abstract the code that prepares the cpumap into a helper function,
which can be used later.

* src/qemu/qemu_process.h: Declare qemuPrepareCpumap
* src/qemu/qemu_process.c: Implement qemuPrepareCpumap, and use it.
2012-10-24 21:24:10 +08:00
Viktor Mihajlovski
d804d35fac virNodeGetCPUMap: Implement wire protocol.
- Defined the wire protocol format for virNodeGetCPUMap and its
  arguments
- Implemented remote method invocation (remoteNodeGetCPUMap)
- Implemented method dispatcher (remoteDispatchNodeGetCPUMap)

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2012-10-23 18:46:48 -06:00
Viktor Mihajlovski
7ecc1d814a virNodeGetCPUMap: Define public API.
Adding a new API to obtain information about the
host node's present, online and offline CPUs.

int virNodeGetCPUMap(virConnectPtr conn,
                     unsigned char **cpumap,
                     unsigned int *online,
                     unsigned int flags);

The function will return the number of CPUs present on the host
or -1 on failure.
If cpumap is non-NULL, virNodeGetCPUMap will allocate an array
containing a bit map representation of the online CPUs. It's
the caller's responsibility to deallocate cpumap using free().
If online is non-NULL, the variable pointed to will contain
the number of online host node CPUs.
The variable flags has been added to support future extensions
and must be set to 0.

Extend the driver structure by nodeGetCPUMap entry in support of the
new API virNodeGetCPUMap.
Added implementation of virNodeGetCPUMap to libvirt.c

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
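
A caller might interpret the returned bit map along the lines of the
sketch below.  It assumes that CPU n is stored in bit (n % 8) of byte
(n / 8), the layout used by libvirt's other cpumap interfaces; the
helper names are hypothetical and the map is passed in directly so the
example does not depend on a live connection.

```c
/* Test whether CPU 'cpu' is marked online in a byte-array bitmap,
 * assuming CPU n lives in bit (n % 8) of byte (n / 8). */
static int
cpu_is_online(const unsigned char *cpumap, unsigned int cpu)
{
    return (cpumap[cpu / 8] >> (cpu % 8)) & 1;
}

/* Count online CPUs among the first 'ncpus' entries of the map. */
static unsigned int
count_online(const unsigned char *cpumap, unsigned int ncpus)
{
    unsigned int i;
    unsigned int online = 0;

    for (i = 0; i < ncpus; i++)
        online += cpu_is_online(cpumap, i);
    return online;
}
```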
2012-10-23 18:46:47 -06:00
Kyle Mestery
2f3e2c0c43 qemu_migration: Transport OVS per-port data during live migration
Transport Open vSwitch per-port data during live
migration by using the utility functions
virNetDevOpenvswitchGetMigrateData() and
virNetDevOpenvswitchSetMigrateData().

Signed-off-by: Kyle Mestery <kmestery@cisco.com>
2012-10-23 15:26:04 -04:00
Kyle Mestery
f6a2f97eb9 openvswitch: Add utility functions for getting and setting Open vSwitch per-port data
Add utility functions for Open vSwitch to both save
per-port data before a live migration, and restore the
per-port data after a live migration.

Signed-off-by: Kyle Mestery <kmestery@cisco.com>
2012-10-23 15:26:04 -04:00
Kyle Mestery
694d0c520b qemu_migration: Add hooks to transport network data during migration
Add the ability for the Qemu V3 migration protocol to
include transporting network configuration. A generic
framework is proposed with this patch to allow for the
transfer of opaque data.

Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Laine Stump <laine@laine.org>
2012-10-23 15:26:04 -04:00
Jim Fehlig
9785f2b6f2 Fix detection of Xen sysctl version 9
In commit 371ddc98, I mistakenly added the check for sysctl
version 9 after setting the hypercall version to 1, which will
fail with

error : xenHypervisorDoV1Op:967 : Unable to issue hypervisor
ioctl 3166208: Function not implemented

This check should be included along with the others that use
hypercall version 2.
2012-10-23 11:18:20 -06:00
Cole Robinson
767be8be72 selinux: Don't fail RestoreAll if file doesn't have a default label
When restoring selinux labels after a VM is stopped, any non-standard
path that doesn't have a default selinux label causes the process
to stop and exit early. This isn't really an error condition IMO.

Of course the selinux API could be erroring for some other reason
but hopefully that's rare enough to not need explicit handling.

Common example here is storing disk images in a non-standard location
like under /mnt.
2012-10-23 11:45:24 -04:00
Eric Blake
add633bdf9 build: print uids as unsigned
Reported by Michal Privoznik.

* src/security/security_dac.c (virSecurityDACGenLabel): Use
correct format.
2012-10-23 08:38:33 -06:00
Ján Tomko
9b704ab823 xml: omit domain name from comment if it contains double hyphen
We put a comment containing "virsh edit <domain_name>" at the start of
the XML. W3C recommendation forbids the use of "--" in comments [1] and
libvirt can't parse it either. This patch omits the domain name if it
contains a double hyphen.

[1] http://www.w3.org/TR/REC-xml/#sec-comments
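
The check itself is simple; a sketch with a hypothetical function name
(not the one used in libvirt) could look like this:

```c
#include <stdbool.h>
#include <string.h>

/* A name may be embedded in an XML comment only if it does not
 * contain "--", which the XML specification forbids in comments. */
static bool
name_safe_for_xml_comment(const char *name)
{
    return strstr(name, "--") == NULL;
}
```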
2012-10-23 14:24:31 +02:00
Ján Tomko
b326765c80 storage: don't shadow global 'wait' declaration
Rename the 'wait' parameter to 'loop'.
This silences the warning:
storage/storage_backend.c:1348:34: error: declaration of 'wait' shadows
a global declaration [-Werror=shadow]
and fixes the build with -Werror.
--
Note: loop is pool backwards.
2012-10-23 13:56:59 +02:00
Eric Blake
33eaebe48e snapshot: sanity check when reusing file for snapshot
The snapshot code when reusing an existing file had hard-to-read
logic, as well as a missing sanity check: REUSE_EXT should require
the destination to already be present.

* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare): Require
destination on REUSE_EXT, rename variable for legibility.
2012-10-22 15:10:16 -06:00
Eric Blake
23a4df886d build: use correct printf types for uid/gid
Fixes a build failure on cygwin:
cc1: warnings being treated as errors
security/security_dac.c: In function 'virSecurityDACSetProcessLabel':
security/security_dac.c:862:5: error: format '%u' expects type 'unsigned int', but argument 7 has type 'uid_t' [-Wformat]
security/security_dac.c:862:5: error: format '%u' expects type 'unsigned int', but argument 8 has type 'gid_t' [-Wformat]

* src/security/security_dac.c (virSecurityDACSetProcessLabel)
(virSecurityDACGenLabel): Use proper casts.
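
The portable pattern is to cast uid_t/gid_t to a type that matches the
conversion specifier.  The sketch below uses a made-up helper name and
a simple "uid:gid" format for illustration only:

```c
#include <stdio.h>
#include <sys/types.h>

/* Format "uid:gid" portably.  uid_t and gid_t have platform-dependent
 * underlying types, so cast to match the '%u' conversion specifier. */
static int
format_dac_label(char *buf, size_t buflen, uid_t uid, gid_t gid)
{
    return snprintf(buf, buflen, "%u:%u",
                    (unsigned int) uid, (unsigned int) gid);
}
```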
2012-10-22 14:41:00 -06:00
Cole Robinson
77eff5eeb2 storage: Don't do wait loops from VolLookupByPath
virStorageVolLookupByPath is an API call that virt-manager uses
quite a bit when dealing with storage. This call uses BackendStablePath
which has several usleep() heuristics that can be tripped up
and hang virt-manager for a while.

Current example: an empty mpath pool pointing to /dev/mapper makes
_any_ calls to virStorageVolLookupByPath take 5 seconds.

The sleep heuristics are actually only needed in certain cases
when we are waiting for new storage to appear, so let's skip the
timeout steps when calling from LookupByPath.
2012-10-22 16:15:12 -04:00
Cole Robinson
e58dfad4a4 qemu: Don't use -enable-nesting with qemu 1.2.0+
Since the option doesn't exist. Fixes booting with
cpu mode='host-model' and qemu 1.2.0
2012-10-22 16:15:12 -04:00
Doug Goldstein
2da776b1d6 qemu: Don't blindly assume VNC is supported
Currently it's assumed that qemu always supports VNC, however it is
definitely possible to compile qemu without VNC support so we should at
the very least check for it and handle that correctly.
2012-10-22 23:16:17 +08:00
Eric Blake
d9d77bfa80 storage: let format probing work on root-squash NFS
Yet another instance of where using plain open() mishandles files
that live on root-squash NFS, and where improving the API can
improve the chance of a successful probe.

* src/util/storage_file.h (virStorageFileProbeFormat): Alter
signature.
* src/util/storage_file.c (virStorageFileProbeFormat): Use better
method for opening file.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Update caller.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise.
2012-10-22 09:04:57 -06:00
Ján Tomko
b6ab7a067f migrate: v2: use VIR_DOMAIN_XML_MIGRATABLE when available
In v2 migration protocol, XML is obtained by calling domainGetXMLDesc.
This includes the default USB controller in XML, which breaks migration
to older libvirt (before 0.9.2).

Commit 409b5f5495
    qemu: Emit compatible XML when migrating a domain
only fixed this for v3 migration.

This patch uses the new VIR_DOMAIN_XML_MIGRATABLE flag (detected by
VIR_DRV_FEATURE_XML_MIGRATABLE) to obtain XML without the default controller,
enabling backward v2 migration.
2012-10-22 10:48:50 +02:00
Michal Privoznik
508451e4ad qemu: set seamless migration capability
As we switched to setting capabilities based on QMP communication,
qemu seamless-migration capability was not set. In the -help output
this knob is called seamless-migration=[on|off]. The equivalent in
QMP world is SPICE_MIGRATE_COMPLETED event (qemu upstream commit
2fdd16e2).
2012-10-22 10:09:47 +02:00
Osier Yang
b0f1ba47dd qemu: Fix the unused parameter which causes the build failure 2012-10-22 15:51:13 +08:00
Osier Yang
5828080f71 qemu: Cleanup the unused 'nodeinfo'
"nodeinfo" is not used in these two functions, and it's a waste
of a goto in qemuProcessSetEmulatorAffinites
2012-10-22 15:12:57 +08:00
Cole Robinson
b62f9b99dd Log parameters passed to virFileMakePath 2012-10-21 13:21:50 -04:00
Cole Robinson
7fcf8d9d69 Log file name passed to virConfReadFile 2012-10-21 13:21:50 -04:00
Laine Stump
6f8a8b30c9 network: don't allow multiple default portgroups
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=868483

virNetworkUpdate, virNetworkDefine, and virNetworkCreate all
allow network definitions to contain multiple <portgroup> elements
with default='yes'. Only a single default portgroup should be allowed
for each network.

This patch updates networkValidate() (called by both
virNetworkCreate() and virNetworkDefine()) and
virNetworkDefUpdatePortGroup (called by virNetworkUpdate()) to not
allow multiple default portgroups.
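
The validation rule amounts to counting default portgroups and failing
when there is more than one.  A simplified sketch, with a hypothetical
struct and function name rather than the real network definitions:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified portgroup record. */
struct portgroup {
    const char *name;
    bool is_default;
};

/* Return 0 if at most one portgroup is marked default, -1 otherwise.
 * The loop visits every entry before deciding. */
static int
validate_default_portgroups(const struct portgroup *pgs, size_t npgs)
{
    size_t i;
    size_t ndefaults = 0;

    for (i = 0; i < npgs; i++) {
        if (pgs[i].is_default)
            ndefaults++;
    }
    return ndefaults > 1 ? -1 : 0;
}
```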
2012-10-20 21:29:19 -04:00
Laine Stump
1cb1f9dabf network: always create dnsmasq hosts and addnhosts files, even if empty
This fixes the problem reported in:

  https://bugzilla.redhat.com/show_bug.cgi?id=868389

Previously, the dnsmasq hosts file (used for static dhcp entries) and
addnhosts file (used for additional dns host entries) were only
created/referenced on the dnsmasq commandline if there was something
to put in them at the time the network was started. Once we can update
a network definition while it's active (which is now possible with
virNetworkUpdate), this is no longer a valid strategy - if there were
0 dhcp static hosts (resulting in no reference to the hosts file on the
commandline), then one was later added, the commandline wouldn't have
linked dnsmasq up to the file, so even though we create it, dnsmasq
doesn't pay any attention.

The solution is to just always create these files and reference them
on the dnsmasq commandline (almost always, anyway). That way dnsmasq
can notice when a new entry is added at runtime (a SIGHUP is sent to
dnsmasq by virNetworkUpdate whenever a host entry is added or removed).

The exception to this is that the dhcp static hosts file isn't created
if there are no lease ranges *and* no static hosts. This is because in
this case dnsmasq won't be setup to listen for dhcp requests anyway -
in that case, if the count of dhcp hosts goes from 0 to 1, dnsmasq
will need to be restarted anyway (to get it listening on the dhcp
port). Likewise, if the dhcp hosts count goes from 1 to 0 (and there
are no dhcp ranges) we need to restart dnsmasq so that it will stop
listening on port 67. These special situations are handled in the
bridge driver's networkUpdate() by checking for ((bool)
nranges||nhosts) both before and after the update, and triggering a
dnsmasq restart if the before and after don't match.
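
The before/after comparison can be expressed as a small predicate.
This is a sketch with a hypothetical function name; the real
networkUpdate() code has more context to consider.

```c
#include <stdbool.h>
#include <stddef.h>

/* dnsmasq listens for DHCP only if there is at least one range or
 * static host; a restart is needed when that condition flips. */
static bool
dnsmasq_needs_restart(size_t nranges_old, size_t nhosts_old,
                      size_t nranges_new, size_t nhosts_new)
{
    bool before = nranges_old || nhosts_old;
    bool after = nranges_new || nhosts_new;

    return before != after;
}
```

In every other case a SIGHUP is enough for dnsmasq to pick up the
changed hosts file.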
2012-10-20 21:29:19 -04:00
Laine Stump
78fab2770b network: free/null newDef if network fails to start
https://bugzilla.redhat.com/show_bug.cgi?id=866364

pointed out a crash due to virNetworkObjAssignDef free'ing
network->newDef without NULLing it afterward. A fix for this is in
upstream commit b7e9202401. While the
NULLing of newDef was a legitimate fix, newDef should have already
been empty (NULL) anyway (as indicated in the comment that was deleted
by that commit).

The reason that newDef had a non-NULL value (i.e. the root cause) was
that networkStartNetwork() had failed after populating
network->newDef, but then neglected to free/NULL newDef in the
cleanup.

(A bit of background here: network->newDef should contain the
persistent config of a network when a network is active (and of course
only when it is persistent), and NULL at all other times. There is also
a network->def which should contain the persistent definition of the
network when it is inactive, and the current live state at all other
times. The idea is that you can make changes to network->newDef which
will take effect the next time the network is restarted, but won't
mess with the current state of the network (virDomainObj has a similar
pair of virDomainDefs that behave in the same fashion). Personally I
think there should be a network->live and network->config, and the
location of the persistent config should *always* be in
network->config, but that's for a later cleanup).

Since I love things to be symmetric, I created a new function called
virNetworkObjUnsetDefTransient(), which reverses the effects of
virNetworkObjSetDefTransient(). I don't really like the name of the
new function, but then I also didn't really like the name of the old
one either (it's just named that way to match a similar function in
the domain conf code).
2012-10-20 02:43:16 -04:00
Eric Blake
a172dfbe2e blockjob: avoid segv on early error
Gcc with optimization warns:
../../src/qemu/qemu_driver.c: In function 'qemuDomainBlockCommit':
../../src/qemu/qemu_driver.c:12813:46: error: 'disk' may be used uninitialized in this function [-Werror=maybe-uninitialized]
../../src/qemu/qemu_driver.c:12698:25: note: 'disk' was declared here
cc1: all warnings being treated as errors

so obviously I had only been testing with optimization off.

* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Guard cleanup.
2012-10-19 21:17:00 -06:00
Eric Blake
2e43cb8e90 blockjob: properly label disks for qemu block-commit
I finally have all the pieces in place to perform a block-commit with
SELinux enforcing.  There's still missing cleanup work when the commit
completes, but doing that requires tracking both the backing chain and
the base and top files within that chain in domain XML across libvirtd
restarts.  Furthermore, from a security standpoint, once you have
granted access, you must assume any damage that can be done will be
done; later revoking access is nice to minimize the window of damage,
but less important as it does not affect the fact that damage can be
done in the first place.  Therefore, deferring the revoke efforts until
we have better XML tracking of what chain operations are in effect,
including across a libvirtd restart, is reasonable.

* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Label disks as
needed.
(qemuDomainPrepareDiskChainElement): Cast away const.
2012-10-19 17:56:39 -06:00
Eric Blake
35a2f5bc52 blockjob: refactor qemu disk chain permission grants
Previously, snapshot code did its own permission granting (lock
manager, cgroup device controller, and security manager labeling)
inline.  But now that we are adding block-commit and block-copy
which also have to change permissions, it's better to reuse
common code for the task.  While snapshot should fall back to
no access if read-write access failed, block-commit will want to
fall back to read-only access.  The common code doesn't know
whether failure to grant read-write access should revert to no
access (snapshot, block-copy) or read-only access (block-commit).
This code can also be used to revoke access to unused files after
block-pull.

It might be nice to clean things up in a future patch by adding
new functions to the lock manager, cgroup manager, and security
manager that take a single file name and apply the context of a
disk to that file, rather than the current semantics of applying
context to the entire chain already associated to a disk.  That
way, we could avoid the games this patch plays of temporarily
swapping out the disk->src and related fields of the disk.  But
that would involve more code changes, so this patch really is
the smallest hack for doing the necessary work; besides, this
patch is more or less code motion (the hack was already employed
by the snapshot creation code, we are just making it reusable).

* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Refactor labeling hacks...
(qemuDomainPrepareDiskChainElement): ...into new function.
2012-10-19 17:49:06 -06:00
Eric Blake
0a220e2225 blockjob: implement shallow commit flag in qemu
Now that we can crawl the chain of backing files, we can do
argument validation and implement the 'shallow' flag.  In
testing this, I discovered that it can be handy to pass the
shallow flag and an explicit base, as a means of validating
that the base is indeed the file we expected.
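
As a rough illustration of the validation described above, here is a minimal
sketch, not the qemu driver code. It assumes the backing chain is available
as a simple list ordered from the active layer down to the deepest backing
file; the function name and the sample file names are hypothetical.

```python
def resolve_commit_base(chain, top, base=None, shallow=False):
    """Pick the commit target for 'top' within 'chain'.

    chain is ordered from the active layer down to the deepest backing file.
    """
    if top not in chain:
        raise ValueError(f"{top} is not in the backing chain")
    top_idx = chain.index(top)
    if top_idx == len(chain) - 1:
        raise ValueError(f"{top} has no backing file to commit into")

    if shallow:
        # Shallow means: commit into the immediate backing file of 'top'.
        expected = chain[top_idx + 1]
        if base is not None and base != expected:
            # An explicit base combined with shallow acts as a sanity check.
            raise ValueError(f"{base} is not the immediate backing file of {top}")
        return expected

    if base is None:
        # Assumed default: commit all the way down to the deepest file.
        return chain[-1]
    if base not in chain[top_idx + 1:]:
        raise ValueError(f"{base} is not below {top} in the backing chain")
    return base
```

With a chain such as ["active.qcow2", "snap2.qcow2", "snap1.qcow2", "base.img"],
passing top="snap2.qcow2" with shallow=True selects "snap1.qcow2", while also
passing base="base.img" is rejected because it is not the file directly
below top.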

* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Crawl through
chain to implement shallow flag.
* src/libvirt.c (virDomainBlockCommit): Relax API.
2012-10-19 17:35:11 -06:00
Eric Blake
2cbc1fd892 blockjob: wire up online qemu block-commit
This is the bare minimum to kick off a block commit.  In particular,
flags support is missing (shallow requires us to crawl the backing
chain to determine the file name to pass to the qemu monitor command;
delete requires us to track what needs to be deleted at the time
the completion event fires).  Also, we are relying on qemu to do
error checking (such as validating 'top' and 'base' as being members
of the backing chain), including the fact that the current qemu code
does not support committing the active layer (although it is still
planned to add that before qemu 1.3).  Since the active layer won't
change, we have it easy and do not have to alter the domain XML.
Additionally, this will fail if SELinux is enforcing, because we fail
to grant qemu proper read/write access to the files it will modify.

* src/qemu/qemu_driver.c (qemuDomainBlockCommit): New function.
(qemuDriver): Register it.
2012-10-19 17:35:11 -06:00
Eric Blake
3f38c7e3a9 blockjob: manage qemu block-commit monitor command
qemu 1.3 will be adding a 'block-commit' monitor command, per
qemu.git commit ed61fc1.  It matches nicely to the libvirt API
virDomainBlockCommit.

* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCK_COMMIT): New bit.
* src/qemu/qemu_capabilities.c (qemuCapsProbeQMPCommands): Set it.
* src/qemu/qemu_monitor.h (qemuMonitorBlockCommit): New prototype.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockCommit): Implement it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise.
(qemuMonitorJSONHandleBlockJobImpl)
(qemuMonitorJSONGetBlockJobInfoOne): Handle new event type.
2012-10-19 17:35:11 -06:00
Eric Blake
67aea3fb78 blockjob: remove unused parameters after previous patch
Minor cleanup made possible by previous simplifications.

* src/qemu/qemu_cgroup.h (qemuSetupDiskCgroup)
(qemuTeardownDiskCgroup): Alter signature.
* src/qemu/qemu_cgroup.c (qemuSetupDiskCgroup)
(qemuTeardownDiskCgroup, qemuSetupCgroup): Update all uses.
* src/qemu/qemu_hotplug.c (qemuDomainDetachPciDiskDevice)
(qemuDomainDetachDiskDevice): Likewise.
* src/qemu/qemu_driver.c (qemuDomainAttachDeviceDiskLive)
(qemuDomainChangeDiskMediaLive)
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Likewise.
2012-10-19 17:35:11 -06:00
Eric Blake
38c4a9cc40 storage: use cache to walk backing chain
We used to walk the backing file chain at least twice per disk,
once to set up cgroup device whitelisting, and once to set up
security labeling.  Rather than walk the chain every iteration,
which possibly includes calls to fork() in order to open root-squashed
NFS files, we can exploit the cache of the previous patch.

* src/conf/domain_conf.h (virDomainDiskDefForeachPath): Alter
signature.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Require caller
to supply backing chain via disk, if recursion is desired.
* src/security/security_dac.c
(virSecurityDACSetSecurityImageLabel): Adjust caller.
* src/security/security_selinux.c
(virSecuritySELinuxSetSecurityImageLabel): Likewise.
* src/security/virt-aa-helper.c (get_files): Likewise.
* src/qemu/qemu_cgroup.c (qemuSetupDiskCgroup)
(qemuTeardownDiskCgroup): Likewise.
(qemuSetupCgroup): Pre-populate chain.
2012-10-19 17:35:11 -06:00
Eric Blake
4d34c92947 storage: cache backing chain while qemu domain is live
Technically, we should not be re-probing any file that qemu might
be currently writing to.  As such, we should cache the backing
file chain prior to starting qemu.  This patch adds the cache,
but does not use it until the next patch.

Ultimately, we want to also store the chain in domain XML, so that
it is remembered across libvirtd restarts, and so that the only
kosher way to modify the backing chain of an offline domain will be
through libvirt API calls, but we aren't there yet.  So for now, we
merely invalidate the cache any time we do a live operation that
alters the chain (block-pull, block-commit, external disk snapshot),
as well as tear down the cache when the domain is not running.

* src/conf/domain_conf.h (_virDomainDiskDef): New field.
* src/conf/domain_conf.c (virDomainDiskDefFree): Clean new field.
* src/qemu/qemu_domain.h (qemuDomainDetermineDiskChain): New
prototype.
* src/qemu/qemu_domain.c (qemuDomainDetermineDiskChain): New
function.
* src/qemu/qemu_driver.c (qemuDomainAttachDeviceDiskLive)
(qemuDomainChangeDiskMediaLive): Pre-populate chain.
(qemuDomainSnapshotCreateSingleDiskActive): Uncache chain before
snapshot.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Update
chain after block pull.
2012-10-19 17:35:10 -06:00
Eric Blake
5eaf605447 storage: make it easier to find file within chain
In order to temporarily label files read/write during a commit
operation, we need to crawl the backing chain and find the absolute
file name that needs labeling in the first place, as well as the
name of the file that owns the backing file.

* src/util/storage_file.c (virStorageFileChainLookup): New
function.
* src/util/storage_file.h: Declare it.
* src/libvirt_private.syms (storage_file.h): Export it.
2012-10-19 17:35:10 -06:00
Eric Blake
82507838e0 storage: remember relative names in backing chain
In order to search for a backing file name as literally present
in a chain, we need to remember if the chain had relative names.
Also, searching for absolute names is easier if we only have
to canonicalize once, rather than on every iteration.

* src/util/storage_file.h (_virStorageFileMetadata): Add field.
* src/util/storage_file.c (virStorageFileGetMetadataFromBuf):
(virStorageFileFreeMetadata): Manage it
(absolutePathFromBaseFile): Store absolute names in canonical form.
2012-10-19 17:35:10 -06:00
Eric Blake
1fc9593271 storage: don't require caller to pre-allocate metadata struct
Requiring pre-allocation was an unusual idiom.  It allowed iteration
over the backing chain to use fewer mallocs, but made one-shot
clients harder to read.  Also, this makes it easier for a future
patch to move away from opening fds on every iteration over the chain.

* src/util/storage_file.h (virStorageFileGetMetadataFromFD): Alter
signature.
* src/util/storage_file.c (virStorageFileGetMetadataFromFD): Allocate
return value.
 (virStorageFileGetMetadata): Update clients.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Likewise.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Likewise.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise.
2012-10-19 17:35:10 -06:00
Eric Blake
35c74c1733 storage: get entire metadata chain in one call
Previously, no one was using virStorageFileGetMetadata, and for good
reason - it couldn't support root-squash NFS.  Change the signature
and make it useful to future patches, including enhancing the metadata
to recursively track the entire chain.

* src/util/storage_file.h (_virStorageFileMetadata): Add field.
(virStorageFileGetMetadata): Alter signature.
* src/util/storage_file.c (virStorageFileGetMetadata): Rewrite.
(virStorageFileGetMetadataRecurse): New function.
(virStorageFileFreeMetadata): Handle recursion.
2012-10-19 17:35:10 -06:00
Eric Blake
eac74c1f47 storage: don't probe non-files
Backing chains can end on a network protocol, such as nbd:xxx; we
should not attempt to probe the file system in this case.

* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Only probe files.
2012-10-19 17:35:10 -06:00
Eric Blake
1246640b3d storage: use enum for snapshot driver type
This is the last use of raw strings for disk formats throughout
the src/conf directory.

* src/conf/snapshot_conf.h (_virDomainSnapshotDiskDef): Store enum
rather than string for disk type.
* src/conf/snapshot_conf.c (virDomainSnapshotDiskDefClear)
(virDomainSnapshotDiskDefParseXML, virDomainSnapshotDefFormat):
Adjust users.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotCreateSingleDiskActive): Likewise.
2012-10-19 17:35:10 -06:00
Eric Blake
e5e8d5d082 storage: use enum for disk driver type
Actually use the enum in the domain conf structure.

* src/conf/domain_conf.h (_virDomainDiskDef): Store enum rather
than string for disk type.
* src/conf/domain_conf.c (virDomainDiskDefFree)
(virDomainDiskDefParseXML, virDomainDiskDefFormat)
(virDomainDiskDefForeachPath): Adjust users.
* src/xenxs/xen_sxpr.c (xenParseSxprDisks, xenFormatSxprDisk):
Likewise.
* src/xenxs/xen_xm.c (xenParseXM, xenFormatXMDisk): Likewise.
* src/vbox/vbox_tmpl.c (vboxAttachDrives): Likewise.
* src/libxl/libxl_conf.c (libxlMakeDisk): Likewise.
2012-10-19 17:35:09 -06:00
Eric Blake
09e7fb5e1f storage: use enum for default driver type
Express the default disk type as an enum, for easier handling.

* src/conf/capabilities.h (_virCaps): Store enum rather than
string for disk type.
* src/conf/domain_conf.c (virDomainDiskDefParseXML): Adjust
clients.
* src/qemu/qemu_driver.c (qemuCreateCapabilities): Likewise.
2012-10-19 17:35:09 -06:00
Eric Blake
41e0edaf84 storage: treat 'aio' like 'raw' at parse time
We have historically allowed 'aio' as a synonym for 'raw' for
back-compat to xen, but since a future patch will move to using
an enum value, we have to pick one to be our preferred output
name.  This is a slight change in the output XML, but the sexpr
and xm outputs should still be identical, and the input XML can
still use either form.
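
The parse-time normalization can be pictured with the following sketch. It
is only an illustration in Python with a hypothetical helper name; the real
code works on the enum values in the C sources.

```python
# Synonyms accepted on input; the value is the preferred output name.
_DRIVER_TYPE_ALIASES = {"aio": "raw"}


def parse_disk_driver_type(name):
    """Map an input driver type to the single name used from then on."""
    return _DRIVER_TYPE_ALIASES.get(name, name)
```

Because the alias is resolved once while parsing, every later consumer only
ever sees 'raw', which is what makes the output XML change slightly.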

* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Move aio
back-compat...
(virDomainDiskDefParseXML): ...to parse time.
* src/xenxs/xen_sxpr.c (xenParseSxprDisks, xenFormatSxprDisk): ...and
to output time.
* src/xenxs/xen_xm.c (xenParseXM, xenFormatXMDisk): Likewise.
* tests/sexpr2xmldata/sexpr2xml-*.xml: Update tests.
2012-10-19 17:35:09 -06:00
Eric Blake
f772b3d91f storage: list more file types
When an image has no backing file, using VIR_STORAGE_FILE_AUTO
for its type is a bit confusing.  Additionally, a future patch
would like to reserve a default value for the case of no file
type specified in the XML, but different from the current use
of -1 to imply probing, since probing is not always safe.

Also, a couple of file types were missing compared to supported
code: libxl supports 'vhd', and qemu supports 'fat' for directories
passed through as a file system.

* src/util/storage_file.h (virStorageFileFormat): Add
VIR_STORAGE_FILE_NONE, VIR_STORAGE_FILE_FAT, VIR_STORAGE_FILE_VHD.
* src/util/storage_file.c (virStorageFileMatchesVersion): Match
documentation when version probing not supported.
(cowGetBackingStore, qcowXGetBackingStore, qcow1GetBackingStore)
(qcow2GetBackingStoreFormat, qedGetBackingStore)
(virStorageFileGetMetadataFromBuf)
(virStorageFileGetMetadataFromFD): Take NONE into account.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Likewise.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Likewise.
* src/conf/storage_conf.c (virStorageVolumeFormatFromString): New
function.
(poolTypeInfo): Use it.
2012-10-19 17:35:09 -06:00
Guannan Ren
4492ef7f48 selinux: relabel tapfd in qemuPhysIfaceConnect
Relabeling tapfd right after the tap device is created.
qemuPhysIfaceConnect is common function called both for static
netdevs and for hotplug netdevs.
2012-10-20 00:01:03 +08:00
Jiri Denemark
8d75e47ede qemu: Do not require hostuuid in migration cookie
Having hostuuid in migration cookie is a nice bonus since it provides an
easy way of detecting migration to the same host. However, requiring it
breaks backward compatibility with older libvirt releases.
2012-10-19 15:08:29 +02:00
Jiri Denemark
9fcc5436d3 qemu: Allow migration with host USB devices
Recently, patches added support for (managed)saving, restoring, and
migrating domains with host USB devices. However, qemu driver would
still forbid migration of such domains because qemuMigrationIsAllowed
was not updated.
2012-10-19 14:18:26 +02:00
Guido Günther
c324bad93a qemu: Set arch to i686 if qemu-system-i386 is found
If we can't probe the architecture from QMP we parse the architecture
from the qemu binary's name. This results in the architecture being i386
instead of i686 which then results in QEMU_CAPS_PCI_MULTIBUS being unset
which gives a broken qemu command line.

This probably didn't show up earlier since most of the time there's also
a /usr/bin/qemu around which results in i686 capabilities.
2012-10-19 08:12:21 +02:00
Guido Günther
a605594f8e qemu: Don't fail without emulatorpin or cpumask
This unbreaks qemu:///session that got broken by
ba63d8f7d8.
2012-10-19 01:25:19 +02:00
Michal Privoznik
b7e9202401 network: Set to NULL after virNetworkDefFree()
which frees all allocated memory but doesn't set the passed pointer to
NULL.  Therefore, we must do it ourselves. This causes an actual
libvirtd crash: basically, when doing 'virsh net-edit' the newDef should
be dropped.  And the memory is freed, indeed. However, the pointer is
not set to NULL but kept instead. And the next duo of calls 'virsh
net-start' and 'virsh net-destroy' starts the disaster. The latter one
does the same as 'virsh destroy'; it sees that newDef is nonNULL so it
replaces def with newDef (which has been freed already as said a few
lines above). Therefore any subsequent call accessing def will hit the ground.
2012-10-18 17:02:48 +02:00
Viktor Mihajlovski
47a7b93584 dist: added cpu/cpu_ppc_data.h to Makefile.am
Missing entry for cpu_ppc_data.h added to fix RPM build.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-10-18 16:50:47 +02:00
Jiri Denemark
f1c7010040 qemu: Always format CPU topology
When libvirt cannot find a suitable CPU model for host CPU (easily
reproducible by running libvirt in a guest), it would not provide CPU
topology in capabilities XML either, even though CPU topology is known
and can be queried by virNodeGetInfo. With this patch, CPU topology will
always be provided in capabilities XML regardless of the presence of a CPU
model.
2012-10-18 14:57:08 +02:00
Peter Krempa
09f10a12be qemu: Add support for HyperV Enlightenment feature "relaxed"
This patch adds QEMU support for the "relaxed" feature implemented by
previous patch.
2012-10-18 12:22:50 +02:00
Peter Krempa
cc922fddc3 conf: Add support for HyperV Enlightenment features
Hypervisors are starting to support HyperV Enlightenment features that
improve behavior of guests running Microsoft Windows operating systems.

This patch adds support for the "relaxed" feature that improves timer
behavior and also establishes a framework to add these features in
the future.
2012-10-18 12:22:50 +02:00
Peter Krempa
88cac66d92 conf: Make tri-state feature options more universal
The apic-eoi feature enum and implementation can be made more universal
to allow re-use of the enum for other features.
2012-10-18 12:22:49 +02:00
Michal Privoznik
998dc17da3 qemu: Correctly wait for spice to migrate
Currently we issue query-spice after the main migration has completed,
before moving to the next state. Qemu reports the migration status as a
boolean (not enclosed within quotes), so the value must be read with
virJSONValueObjectGetBoolean rather than virJSONValueObjectGetString.
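
The difference between a quoted and an unquoted JSON value can be seen in
this small sketch. The reply text and the "migrated" field name are
assumptions made for illustration, and the helper is hypothetical rather
than part of the monitor code.

```python
import json


def spice_migrated(reply):
    """Return the migration flag, insisting that it is a real JSON boolean."""
    value = reply["return"].get("migrated", False)
    if not isinstance(value, bool):
        # A quoted "true" would be a string and must not be accepted here.
        raise TypeError("expected a JSON boolean for 'migrated'")
    return value


# Unquoted true decodes to a boolean, not to the string "true".
sample_reply = json.loads('{"return": {"enabled": true, "migrated": true}}')
```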
2012-10-18 10:31:56 +02:00
Viktor Mihajlovski
1916679506 qemu: Fixed default machine detection in qemuCapsParseMachineTypesStr
The machine in the last output line of <qemu-binary> -M ?
was always reported as default machine even if this wasn't the
actual default. Trivial fix.
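
A minimal sketch of the intended behaviour follows, assuming the default is
marked with "(default)" in the description column. The parser is a
hypothetical Python stand-in for qemuCapsParseMachineTypesStr, and any
sample output fed to it is illustrative only.

```python
def parse_machine_types(output):
    """Return (machine names, default machine) from '-M ?' style output."""
    machines = []
    default = None
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("Supported machines"):
            continue  # skip blank lines and the header
        name = line.split()[0]
        machines.append(name)
        # Only the line carrying the marker is the default, wherever it is.
        if "(default)" in line:
            default = name
    return machines, default
```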

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2012-10-17 17:24:41 -06:00
Martin Kletzander
ba63d8f7d8 qemu: Pin the emulator when only cpuset is specified
According to our recent changes (clarifications), we should be pinning
qemu's emulator processes using the <vcpu> 'cpuset' attribute in case
there is no <emulatorpin> specified.  This however doesn't work
entirely as expected and this patch should resolve all the remaining
issues.
2012-10-17 17:37:10 +02:00
Jiri Denemark
837993d845 qemu: Clear async job when p2p migration fails early
When p2p migration fails early because qemuMigrationIsAllowed or
qemuMigrationIsSafe say migration should be cancelled, we fail to clear
the migration-out async job. As a result of that, further APIs called
for the same domain may fail with Timed out during operation: cannot
acquire state change lock.

Reported by Guido Winkelmann.
2012-10-17 15:43:38 +02:00
Doug Goldstein
1e7ec88d9a interface: add virInterfaceGetXMLDesc() in udev
Added support for retrieving the XML defining a specific interface via
the udev based backend to virInterface. Implement the following APIs
for the udev based backend:
* virInterfaceGetXMLDesc()

Note: Does not support bond devices.
2012-10-17 13:59:16 +02:00
Li Zhang
40f58ca75d Doc-fix for PowerPC CPU model driver
Some descriptions in the PowerPC CPU model driver are incorrect.
This patch fixes them.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
2012-10-17 10:03:34 +02:00
Li Zhang
9943a7341c Implement CPU model driver for PowerPC
Currently, the CPU model driver is not implemented for PowerPC.
The host's CPU information sometimes needs to be exposed in the
guest's XML file.

This patch implements the callback functions of the CPU model driver.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
2012-10-17 10:03:34 +02:00
Li Zhang
309f03db40 Add one file cpu_ppc_data.h to define CPU data for PPC
On PowerPC, the CPU version can be obtained from the PVR, so the PVR
is defined in the CPU data of the cpuData structure.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
2012-10-17 10:03:34 +02:00
Guannan Ren
d37a3a1d6c selinux: remove unused variables in socket labelling
2012-10-17 13:13:17 +08:00
Guannan Ren
89b63f0ad4 selinux: fix wrong tapfd relablling
It should relabel tapfd of virtual network of type VIR_DOMAIN_NET_TYPE_DIRECT
rather than VIR_DOMAIN_NET_TYPE_NETWORK and VIR_DOMAIN_NET_TYPE_BRIDGE
(commit ae368ebfcc introduced this bug)

Caution: The context of the two hunks is identical other than indentation.
Please be extremely cautious of where the patch gets applied.
2012-10-17 13:13:14 +08:00
Cole Robinson
9f0e9cba27 storage: lvm: lvcreate fails with allocation=0, don't do that
On F17 at least, this command fails:

$ sudo /usr/sbin/lvcreate --name sparsetest -L 0K --virtualsize 16384K vgvirt
  Unable to create new logical volume with no extents

Which is unfortunate since allocation=0 is what virt-manager tries to use
by default.

Rather than telling the user 'don't do that', let's just give them the
smallest allocation possible if alloc=0 is requested.

https://bugzilla.redhat.com/show_bug.cgi?id=866481
2012-10-16 21:16:44 -04:00
Cole Robinson
01df6f2bff storage: lvm: Don't overwrite lvcreate errors
Before:
$ sudo virsh vol-create-as --pool vgvirt sparsetest --capacity 16M --allocation 0
error: Failed to create vol sparsetest
error: internal error Child process (/usr/sbin/lvchange -aln vgvirt/sparsetest) unexpected exit status 5:   One or more specified logical volume(s) not found.

After:
$ sudo virsh vol-create-as --pool vgvirt sparsetest --capacity 16M --allocation 0
error: Failed to create vol sparsetest
error: internal error Child process (/usr/sbin/lvcreate --name sparsetest -L 0K --virtualsize 16384K vgvirt) unexpected exit status 5:   Unable to create new logical volume with no extents
2012-10-16 21:16:44 -04:00
Jiri Denemark
5ce6d95eed locking: Fix build with sanlock < 2.4
libvirt started using sanlock_killpath to implement on_lockfailure
action. Since sanlock_killpath was introduced in sanlock 2.4, libvirt
fails to build with older sanlock.
2012-10-16 21:32:05 +02:00
Daniel P. Berrange
7bd744c401 Fix typo in previous commit s/lik/like/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 16:37:50 +01:00
Daniel P. Berrange
d507f8f9b9 Make virInitialize thread safe
Currently there is a restriction that multi-threaded applications
must manually call virInitialize, before threads start using
libvirt, because it is not thread-safe. By switching it to use
a virOnceControl initializer we gain thread safety, and thus
applications no longer need to manually call it. They can rely
on virConnectOpen invoking it for them.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 16:33:38 +01:00
Daniel P. Berrange
84912e9c91 Fix virProcessKillPainfully on Win32
Win32 platforms don't have SIGKILL defined, but they do have
SIGABRT. Since our virProcess wrapper treats anything which
isn't SIGTERM/SIGINT as equivalent to SIGKILL, just use
SIGABRT on Win32.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:47:14 +01:00
Daniel P. Berrange
381a339e98 Add JSON serialization of virNetServerPtr objects for process re-exec()
Add two new APIs virNetServerNewPostExecRestart and
virNetServerPreExecRestart which allow a virNetServerPtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.

This includes serialization of all registered services
and clients

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Daniel P. Berrange
3cfc3d7d2c Add JSON serialization of virNetServerClientPtr objects for process re-exec()
Add two new APIs virNetServerClientNewPostExecRestart and
virNetServerClientPreExecRestart which allow a virNetServerClientPtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.

This includes serialization of the connected socket associated
with the client

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Daniel P. Berrange
0cc7925520 Add JSON serialization of virNetServerServicePtr objects for process re-exec()
Add two new APIs virNetServerServiceNewPostExecRestart and
virNetServerServicePreExecRestart which allow a virNetServerServicePtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.

This includes serialization of the listening sockets associated
with the service

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Daniel P. Berrange
c298145344 Add JSON serialization of virNetSocketPtr objects for process re-exec()
Add two new APIs virNetSocketNewPostExecRestart and
virNetSocketPreExecRestart which allow a virNetSocketPtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.

As well as saving the state in JSON format, the second
method will disable the O_CLOEXEC flag so that the open
file descriptors are preserved across the process re-exec()
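
What clearing the close-on-exec flag amounts to can be shown with a short
sketch. The helper names and the JSON layout are hypothetical; only the
fcntl() calls reflect the actual mechanism.

```python
import fcntl
import json
import os


def is_close_on_exec(fd):
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


def pre_exec_restart(fd):
    """Clear FD_CLOEXEC and describe the descriptor for the new process."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags & ~fcntl.FD_CLOEXEC)
    # The re-exec'd process would read this state and adopt the same fd.
    return json.dumps({"fd": fd})
```

Without the flag change the kernel would close the descriptor during
exec(), so the number saved in the JSON state would refer to nothing.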

Since it is not possible to serialize SASL or TLS encryption
state, an error will be raised if attempting to perform
serialization on non-raw sockets

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Daniel P. Berrange
8057c04e8d Add JSON serialization of virLockSpacePtr objects for process re-exec()
Add two new APIs virLockSpaceNewPostExecRestart and
virLockSpacePreExecRestart which allow a virLockSpacePtr
object to be created from a JSON object and saved to a
JSON object, for the purposes of re-exec'ing a process.

As well as saving the state in JSON format, the second
method will disable the O_CLOEXEC flag so that the open
file descriptors are preserved across the process re-exec()

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Daniel P. Berrange
eca72d4759 Introduce an internal API for handling file based lockspaces
The previously introduced virFile{Lock,Unlock} APIs provide a
way to acquire/release fcntl() locks on individual files. For some
unknown reason, though, the POSIX spec says that fcntl() locks
are released when *any* file handle referring to the same path
is closed. In the following sequence

  threadA: fd1 = open("foo")
  threadB: fd2 = open("foo")
  threadA: virFileLock(fd1)
  threadB: virFileLock(fd2)
  threadB: close(fd2)

you'd expect threadA to come out holding a lock on 'foo', and
indeed it does hold a lock for a very short time. Unfortunately
when threadB does close(fd2) this releases the lock associated
with fd1. For the current libvirt use case for virFileLock -
pidfiles - this doesn't matter since the lock is acquired
at startup while single threaded and never released until
exit.
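
The surprising release can be reproduced with a small sketch. It assumes a
POSIX system with fork(); the helper names are hypothetical. Since fcntl()
locks are owned by the process, a forked child is used to check whether the
parent still holds the lock.

```python
import fcntl
import os
import tempfile


def lock_is_held_by_parent(path):
    """Fork a child that tries to lock 'path'; True if the child is refused."""
    pid = os.fork()
    if pid == 0:
        code = 2
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            code = 0  # lock acquired, so nobody else held it
        except OSError:
            code = 1  # refused: another process holds the lock
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 1


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    fd1 = os.open(path, os.O_RDWR)
    fd2 = os.open(path, os.O_RDWR)
    fcntl.lockf(fd1, fcntl.LOCK_EX | fcntl.LOCK_NB)
    held_before = lock_is_held_by_parent(path)
    os.close(fd2)  # closing the *other* descriptor drops the lock
    held_after = lock_is_held_by_parent(path)
    os.close(fd1)
    os.unlink(path)
    return held_before, held_after
```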

To provide a more generally useful API though, it is necessary
to introduce a slightly higher level abstraction, which is to
be referred to as a "lockspace".  This is to be provided by
a virLockSpacePtr object in src/util/virlockspace.{c,h}. The
core idea is that the lockspace keeps track of what files are
already open+locked. This means that when a 2nd thread comes
along and tries to acquire a lock, it doesn't end up opening
and closing a new FD. The lockspace just checks the current
list of held locks and immediately returns VIR_ERR_RESOURCE_BUSY.
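
The core idea can be sketched as follows. This is a simplified Python
model, not the virlockspace.c API; the class, method, and exception names
are hypothetical, and ResourceBusy merely stands in for
VIR_ERR_RESOURCE_BUSY.

```python
import fcntl
import os
import tempfile
import threading


class ResourceBusy(Exception):
    pass


class LockSpace:
    def __init__(self, directory):
        self.directory = directory
        self._held = {}  # lock name -> open file descriptor
        self._mutex = threading.Lock()

    def acquire(self, name):
        with self._mutex:
            # Already locked by this process: answer without touching any fd.
            if name in self._held:
                raise ResourceBusy(name)
            path = os.path.join(self.directory, name)
            fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
            try:
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError:
                os.close(fd)  # held by another process
                raise ResourceBusy(name)
            self._held[name] = fd

    def release(self, name):
        with self._mutex:
            os.close(self._held.pop(name))


def demo():
    space = LockSpace(tempfile.mkdtemp())
    space.acquire("foo")
    try:
        space.acquire("foo")
        second = "acquired"
    except ResourceBusy:
        second = "busy"
    space.release("foo")
    space.acquire("foo")  # free again once released
    space.release("foo")
    return second
```

Because the second request is answered from the table of held locks, no
extra descriptor is ever opened and closed on a file that is already locked.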

NB, the API as it stands is designed on the basis that the
files being locked are not being otherwise opened and used
by the application code. One approach to using this API is to
acquire locks based on a hash of the filepath.

eg to lock /var/lib/libvirt/images/foo.img the application
might do

   virLockSpacePtr lockspace = virLockSpaceNew("/var/lib/libvirt/imagelocks");
   lockname = md5sum("/var/lib/libvirt/images/foo.img");
   virLockSpaceAcquireLock(lockspace, lockname);

NB, in this example, the caller should ensure that the path
is canonicalized before calculating the checksum.
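
A minimal sketch of deriving such a lock name, assuming MD5 as in the
example above; the helper name is hypothetical. Canonicalizing first
ensures that two spellings of the same path map to the same lock.

```python
import hashlib
import os


def lock_name_for(path):
    """Hash the canonical form of 'path' to get a flat lock name."""
    canonical = os.path.realpath(path)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()
```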

It is also possible to do locks directly on resources by
using a NULL lockspace directory and then using the file
path as the lock name eg

   virLockSpacePtr lockspace = virLockSpaceNew(NULL);
   virLockSpaceAcquireLock(lockspace, "/var/lib/libvirt/images/foo.img");

This is only safe to do though if no other part of the process
will be opening the files. This will be the case when this
code is used inside the soon-to-be-reposted virlockd daemon

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-16 15:45:55 +01:00
Eric Blake
819c8ce043 maint: prepare for next release number
Given Daniel's announcement[1], code targeting the next release will
be in 1.0.0, not 0.10.3.  Changed mechanically with:

for f in $(git grep -l '0\(.\)10\13\b') ; do
   sed -i -e 's/0\(.\)10\13/1\10\10/g' $f
done
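
For readers puzzling over the expression: in sed, '\13' is the backreference
'\1' followed by a literal 3, so the pattern matches 0, any separator, 10,
the same separator, 3. The sketch below mirrors that substitution in Python
for illustration only; the function name is hypothetical, and a named group
is used because Python would read '\13' as group 13.

```python
import re

# 0 <sep> 10 <same sep> 3, e.g. "0.10.3" or "0_10_3"
_OLD_VERSION = re.compile(r"0(?P<sep>.)10(?P=sep)3")


def bump_version(text):
    """Rewrite 0.10.3 style strings to 1.0.0, keeping the separator."""
    return _OLD_VERSION.sub(r"1\g<sep>0\g<sep>0", text)
```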

[1]https://www.redhat.com/archives/libvir-list/2012-October/msg00403.html

* docs/formatdomain.html.in: Use 1.0.0 for next release.
* src/interface/interface_backend_udev.c: Likewise.
2012-10-16 08:09:01 -06:00
Martin Kletzander
280b8c9e7c conf: Fix crash with cleanup
There was a crash possible when both <boot dev... and <boot
order... were specified due to virDomainDefParseBootXML() erroring out
before setting *tmp (which was free'd in cleanup).  As a fix, I
created this cleanup that uses one pointer for all the temporary
stored XPath strings and values, plus this pointer is correctly
initialized to NULL.
2012-10-16 11:15:04 +02:00
Martin Kletzander
6676c1fc8f selinux: Use raw contexts 2
In commit 9674f2c637, I forgot to change
selabel_lookup with the other functions, so this one-liner does exactly
that.
2012-10-16 10:30:18 +02:00
Eric Blake
2cfa14bc8a maint: drop spurious semicolons
Detected with:
git grep ';;$' -- '**/*.[ch]'

* src/network/bridge_driver.c (networkRadvdConfContents): Fix
harmless typo.
* src/phyp/phyp_driver.c (phypUUIDTable_Pull): Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDriveDel):
Likewise.
2012-10-15 09:08:19 -06:00
Guannan Ren
ae368ebfcc selinux: add security selinux function to label tapfd
BZ:https://bugzilla.redhat.com/show_bug.cgi?id=851981
When using macvtap, a character device is first created by the
kernel with the name /dev/tapN; its selinux context is:
system_u:object_r:device_t:s0

Shortly afterwards, when udev is notified that a new file has been
created in /dev, it jumps in and relabels this file back to the
expected default context:
system_u:object_r:tun_tap_device_t:s0

This leaves a time gap during which the device carries the wrong label.
Sometimes this makes migration fail with an AVC error message:
type=AVC msg=audit(1349858424.233:42507): avc:  denied  { read write } for
pid=19926 comm="qemu-kvm" path="/dev/tap33" dev=devtmpfs ino=131524
scontext=unconfined_u:system_r:svirt_t:s0:c598,c908
tcontext=system_u:object_r:device_t:s0 tclass=chr_file

This patch will label the tapfd device before the qemu process starts:
system_u:object_r:tun_tap_device_t:MCS(MCS from seclabel->label)
2012-10-15 21:01:07 +08:00
Martin Kletzander
7ba5defb5a Add support for SUSPEND_DISK event
This patch adds support for SUSPEND_DISK event; both lifecycle and
separated.  The support is added for QEMU, machines are changed to
PMSUSPENDED, but as QEMU sends SHUTDOWN afterwards, the state changes
to shut-off.  This and much more needs to be done in order for libvirt
to work with transient devices, wake-ups etc.  This patch is not
aiming for that functionality.
2012-10-15 12:09:10 +02:00
Ján Tomko
a9e3b4f78e util: switch virLogEatParams to virLogSource
Commit e8fd8757c8 changed 'const char *'
category to virLogSource enum. This changes it in virLogEatParams as
well, thus fixing the build with --disable-debug.
--
Hopefully moving the enum declarations is less ugly than using int.
2012-10-15 11:13:43 +02:00
Osier Yang
f81f0f2f1d node_memory: Add new parameter field to tune the new sysfs knob
The upstream kernel introduced a new sysfs knob "merge_across_nodes" to
specify if pages from different numa nodes can be merged. When set
to 0, only pages which physically reside in the memory area of
same NUMA node can be merged. When set to 1, pages from all nodes
can be merged.

This patch supports the tuning by adding new param field
"shm_merge_across_nodes".
2012-10-15 17:35:54 +08:00
Laine Stump
6bde0a1a37 qemu: reorganize qemuDomainChangeNet and qemuDomainChangeNetBridge
This patch resolves:

  https://bugzilla.redhat.com/show_bug.cgi?id=805071

to the extent that it can be resolved with current qemu functionality.
It attempts to detect as many situations as possible when the simple
operation of disconnecting an existing tap device from one bridge and
attaching it to another will satisfy the change requested in
virDomainUpdateDeviceFlags() for a network device. Before this patch,
that situation could only be detected if the pre-change interface
*and* the post-change interface definition were both "type='bridge'".
After this patch, it can also be detected if the before or after
interfaces are any combination of type='bridge' and type='network'
(the networks can be <forward mode='nat|route|bridge'>, as long as
they use a Linux host bridge and not macvtap connections).

This extra effort is especially useful since the recent discovery that
a netdev_del+netdev_add combo (to reconnect the network device with
completely different hostside configuration) doesn't work properly
with current qemu (1.2) unless it is accompanied by the matching
device_del+device_add - see this mailing list message for details:

  http://lists.nongnu.org/archive/html/qemu-devel/2012-10/msg02355.html

(A slight modification of the patch referenced there has been prepared
to apply on top of this patch, but won't be pushed until qemu can be
made to work with it.)

* qemuDomainChangeNet needs access to the virDomainDeviceDef that
holds the new netdef (so that it can clear out the virDomainDeviceDef
if it ends up using the NetDef to replace the original), so the
virDomainNetDefPtr arg is replaced with a virDomainDeviceDefPtr.

* qemuDomainChangeNet previously checked for *some* changes to the
interface config, but this check was by no means complete. It was also
a bit disorganized.

This refactoring of the code is (I believe) complete in its check of
all NetDef attributes that might be changed, and either returns a
failure (for changes that are simply impossible), or sets one of three
flags:

  needLinkStateChange - if the device link state needs to go up/down
  needBridgeChange    - if everything else is the same, but it needs
                        to be connected to a different Linux host
                        bridge
  needReconnect       - if the entire host side of the device needs
                        to be torn down and reconstructed (currently
                        non-working, as mentioned above)

Note that this function will refuse to make any change that requires
the *guest* side of the device to be detached (e.g. changing the PCI
address or mac address). Those would be disruptive enough to the guest
that it's reasonable to require an explicit detach/attach sequence
from the management application.

* As mentioned above, qemuDomainChangeNet also does its best to
understand when a simple change in attached bridge for the existing
tap device will work vs. the need to completely tear down/reconstruct
the host side of the device (including tap device).

This patch *does not* implement the "reconnect" code anyway - there is
a placeholder that turns that into an error. Rather, the purpose of
this patch is to replicate existing behavior with code that is ready
to have that functionality plugged in in a later patch.

* The expanded uses for qemuDomainChangeNetBridge meant that it needed
to be enhanced as well - it no longer replaces the original brname
string in olddev with the new brname; instead, it relies on the
caller to replace the *entire* olddev with newdev (since we've gone
to great lengths to assure they are functionally identical other
than the name of the bridge, this is now not only safe, but more
correct). Additionally, qemuDomainChangeNetBridge can now set the
bridge for type='network' interfaces as well as plain type='bridge'
interfaces. (Note that I had to make this change simultaneous to the
reorganization of qemuDomainChangeNet because the two are too
closely intertwined to separate).
2012-10-15 04:36:39 -04:00
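
The three flags described for qemuDomainChangeNet in commit 6bde0a1a37 can be sketched as a small classification function. Everything here is a simplified assumption for illustration: NetDefSketch, NetChangeSketch and classifyNetChange are hypothetical names, and the real check covers far more NetDef attributes than the four modelled below.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a network device
 * definition; the real structure has many more attributes. */
typedef struct {
    const char *mac;        /* guest-visible, cannot change in place */
    const char *bridge;     /* host bridge the tap device is attached to */
    bool usesHostBridge;    /* false for e.g. macvtap connections */
    bool linkUp;            /* requested link state */
} NetDefSketch;

typedef struct {
    bool needLinkStateChange;
    bool needBridgeChange;
    bool needReconnect;
} NetChangeSketch;

/* Returns 0 and fills 'out', or -1 if the change would require the
 * guest side of the device to be detached. */
static int classifyNetChange(const NetDefSketch *olddev,
                             const NetDefSketch *newdev,
                             NetChangeSketch *out)
{
    memset(out, 0, sizeof(*out));

    /* Guest-side changes are refused outright. */
    if (strcmp(olddev->mac, newdev->mac) != 0)
        return -1;

    if (olddev->linkUp != newdev->linkUp)
        out->needLinkStateChange = true;

    if (olddev->usesHostBridge != newdev->usesHostBridge) {
        /* Host-side connection type differs: full teardown needed. */
        out->needReconnect = true;
    } else if (olddev->usesHostBridge &&
               strcmp(olddev->bridge, newdev->bridge) != 0) {
        /* Same kind of connection, only the bridge name differs. */
        out->needBridgeChange = true;
    }

    return 0;
}
```

Separating "what changed" from "how to apply it" is what lets the reconnect path stay a placeholder until the host-side teardown actually works.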
Guido Günther
dc9d7a171c Avoid stray </cpuset>
by using the same condition as for the <cpuset>.

Fixes a "make check" failure found by
    http://honk.sigxcpu.org:8001/job/libvirt-check/160/
2012-10-15 17:14:25 +08:00
Laine Stump
11c47d979c conf: virDomainDeviceInfoCopy utility function
This does a shallow copy of all the bits, then strdups the two items
that are actually allocated separately.
2012-10-15 04:03:06 -04:00
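
The "shallow copy, then duplicate the separately allocated items" approach described for virDomainDeviceInfoCopy can be sketched as follows. The struct and its member names are hypothetical stand-ins, not the real libvirt definition, and a small malloc-based helper is used in place of strdup so the sketch builds as plain C99.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified device-info struct: plain members plus two
 * separately allocated strings (names are illustrative only). */
typedef struct {
    int type;
    unsigned int bootIndex;
    char *alias;
    char *romfile;
} DeviceInfoSketch;

/* Stand-in for strdup(); returns NULL for NULL input or on OOM. */
static char *dupOrNull(const char *s)
{
    char *copy;

    if (!s)
        return NULL;
    copy = malloc(strlen(s) + 1);
    if (copy)
        strcpy(copy, s);
    return copy;
}

/* Returns 0 on success, -1 on allocation failure (dst is zeroed). */
static int deviceInfoCopy(DeviceInfoSketch *dst, const DeviceInfoSketch *src)
{
    *dst = *src;            /* shallow copy of every member */
    dst->alias = NULL;      /* do not share the source's strings */
    dst->romfile = NULL;

    if (src->alias && !(dst->alias = dupOrNull(src->alias)))
        goto error;
    if (src->romfile && !(dst->romfile = dupOrNull(src->romfile)))
        goto error;
    return 0;

error:
    free(dst->alias);
    free(dst->romfile);
    memset(dst, 0, sizeof(*dst));
    return -1;
}
```

Clearing the pointer members right after the struct assignment means an error path never frees memory that still belongs to the source.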
Laine Stump
310945597c conf: fix virDevicePCIAddressEqual args
This function really should have been taking virDevicePCIAddress*
instead of the inefficient virDevicePCIAddress (results in copying two
entire structs onto the stack rather than just two pointers), and
returning a bool true/false (not matching is not necessarily a
"failure", as a -1 return would imply, and also using "if
(!virDevicePCIAddressEqual(x, y))" to mean "if x == y" is just a bit
counterintuitive).
2012-10-15 04:03:06 -04:00
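
The signature change argued for in commit 310945597c (pointers instead of by-value structs, and a bool result) looks roughly like this. PCIAddressSketch and its field names are assumptions made for the example, not the actual virDevicePCIAddress layout.

```c
#include <stdbool.h>

/* Simplified stand-in for a PCI address; field names are assumptions. */
typedef struct {
    unsigned int domain;
    unsigned int bus;
    unsigned int slot;
    unsigned int function;
} PCIAddressSketch;

/* Takes two pointers (no struct copies on the stack) and returns true
 * when the addresses match, so "if (pciAddressEqual(x, y))" reads as
 * "if x == y". */
static bool pciAddressEqual(const PCIAddressSketch *a,
                            const PCIAddressSketch *b)
{
    return a->domain == b->domain &&
           a->bus == b->bus &&
           a->slot == b->slot &&
           a->function == b->function;
}
```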
Guido Günther
a2b80edbc6 Fix tab vs space
that broke "make syntax-check"

found by http://honk.sigxcpu.org:8001/job/libvirt-syntax-check/157/

Pushed under the build breaker rule.
2012-10-15 09:18:18 +02:00
Osier Yang
3635b41e15 qemu: Ignore def->cpumask if emulatorpin is specified
If the vcpu placement is "static", it's just fine to ignore the
def->cpumask if emulatorpin is specified.
2012-10-15 12:20:37 +08:00
Osier Yang
5378effd57 conf: Ignore emulatorpin if vcpu placement is auto
When vcpu placement is "auto", the domain process is pinned to the
advisory nodeset obtained by querying numad, while emulatorpin
overrides that pinning. Both settings define the pinning policy for
the domain process, so they conflict with each other.

This patch ignores emulatorpin if vcpu placement is "auto", because
<vcpu> placement can't simply be ignored: <numatune> placement may
default to it.
2012-10-15 12:19:54 +08:00
Osier Yang
0df1a79089 qemu: Initialize cpuset for hotplugged vcpu as def->cpuset
The pinning policy of a newly onlined vcpu should inherit def->cpuset
if it is not specified explicitly, and the affinity should be set in
that case. Conversely, the pinning policy of an offlined vcpu should
be freed.
2012-10-15 12:16:02 +08:00
Osier Yang
a9bfe887f9 qemu: Create or remove cgroup when doing vcpu hotplugging
Various APIs use cgroups to set or get statistics of the host or
guest. Hotplugging or hot-unplugging vcpus without creating or
removing the cgroups for those vcpus can cause problems for these
APIs, e.g.:

% virsh vcpucount dom
maximum      config        10
maximum      live          10
current      config         1
current      live           1

% virsh setvcpus dom 2

% virsh schedinfo dom --set vcpu_quota=1000
Scheduler      : posix
error: Unable to find vcpu cgroup for rhel6.2(vcpu: 1): No such file or
directory

This patch fixes the problem by creating cgroups for each of the
onlined vcpus, and destroying cgroups for each of the offlined
vcpus.
2012-10-15 12:15:32 +08:00
Osier Yang
10f8a45deb conf: Initialize the pinning policy for vcpus
The documentation for <vcpu>'s "cpuset" says:

Since 0.4.4, this element can contain an optional cpuset attribute,
which is a comma-separated list of physical CPU numbers that virtual
CPUs can be pinned to.

However, this is not accurate: libvirt actually pins the domain
process to the pCPUs specified by the "cpuset" of <vcpu>, and a
vcpu thread is pinned to all available pCPUs if no <vcpupin>
is specified for it.

This patch implements inheriting <vcpu>'s "cpuset" for any vcpu that
doesn't have <vcpupin> specified; the inherited <vcpupin> for such
vcpus is ignored when formatting. The underlying driver
implementation makes sure the vcpu thread is pinned to the correct
pCPUs.
2012-10-15 12:14:22 +08:00
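
The inheritance rule from commit 10f8a45deb can be pictured with a small lookup function. This is only a sketch under stated assumptions: CPU sets are modelled as plain bitmasks, and DomainSketch, MAX_VCPUS and effectivePin are hypothetical names rather than libvirt structures.

```c
#define MAX_VCPUS 16

/* Hypothetical model of the relevant parts of a domain definition. */
typedef struct {
    unsigned int nvcpus;               /* number of online vcpus */
    unsigned long cpuset;              /* <vcpu cpuset=...>; 0 = unset */
    unsigned long vcpupin[MAX_VCPUS];  /* per-vcpu <vcpupin>; 0 = unset */
} DomainSketch;

/* Returns the pCPU mask a vcpu thread should be pinned to: its own
 * <vcpupin> if present, otherwise the inherited <vcpu> cpuset.
 * A result of 0 means "no restriction". */
static unsigned long effectivePin(const DomainSketch *def, unsigned int vcpu)
{
    if (vcpu >= def->nvcpus || vcpu >= MAX_VCPUS)
        return 0;
    if (def->vcpupin[vcpu] != 0)
        return def->vcpupin[vcpu];
    return def->cpuset;
}
```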
Osier Yang
60b176c3d0 conf: Ignore vcpupin for not onlined vcpus when parsing
Setting a pinning policy for a vcpu whose ID exceeds the current
number of vcpus makes no sense and can cause various problems, e.g.:

<vcpu current='1'>4</vcpu>
<cputune>
  <vcpupin vcpuid='3' cpuset='4'/>
</cputune>

% virsh start linux
error: Failed to start domain linux
error: cannot set CPU affinity on process 32534: No such process

Some odd underlying code must be producing the "on process 32534"
message, but the point is: why not prevent this earlier, when
parsing? Note that this is only one of the problems it could
cause.

This patch ignores <vcpupin> for vcpus that are not online.
2012-10-15 12:13:57 +08:00
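
The parse-time filtering described in commit 60b176c3d0 amounts to dropping every <vcpupin> entry whose vcpuid is not below the current vcpu count. A minimal sketch, assuming a hypothetical VcpuPinSketch array in place of the parsed XML:

```c
#include <stddef.h>

/* Hypothetical parsed <vcpupin> entry. */
typedef struct {
    unsigned int vcpuid;
    const char *cpuset;
} VcpuPinSketch;

/* Compacts 'pins' in place, keeping only entries that refer to online
 * vcpus (vcpuid < curVcpus). Returns the number of entries kept. */
static size_t dropOfflinePins(VcpuPinSketch *pins, size_t npins,
                              unsigned int curVcpus)
{
    size_t i;
    size_t kept = 0;

    for (i = 0; i < npins; i++) {
        if (pins[i].vcpuid >= curVcpus)
            continue;           /* vcpu is not online: ignore the pin */
        pins[kept++] = pins[i];
    }
    return kept;
}
```

With the XML from the commit message (current='1', vcpuid='3'), the single entry is dropped, so nothing later tries to set affinity on a thread that does not exist.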
Martin Kletzander
9674f2c637 selinux: Use raw contexts
We are currently able to work only with non-translated SELinux
contexts, but we are using functions that work with translated
contexts throughout the code.  This patch swaps all calls related to
SELinux context translation with their raw counterparts to avoid
parsing problems.

The problems can be experienced with mcstrans for example.  The
difference is that if you have translations enabled (yum install
mcstrans; service mcstrans start), fgetfilecon_raw() will get you
something like 'system_u:object_r:virt_image_t:s0', whereas
fgetfilecon() will return 'system_u:object_r:virt_image_t:SystemLow'
that we cannot parse.

I was trying to confirm that the _raw variants have been there since
the dawn of time, but the only thing I can see now is that they were
imported together into the upstream repo [1] from svn, so before 2008.

Thanks to Laurent Bigonville for finding this out.

[1] http://oss.tresys.com/git/selinux.git
2012-10-12 17:54:09 +02:00
Jiri Denemark
f95560b3fe conf: Mark missing optional USB devices in domain XML
When the startupPolicy set for a USB device allows the device to be
missing, there was no way to detect this from the domain XML. With
this patch, libvirt emits a new missing='yes' attribute for such
devices when the active domain XML is generated.
2012-10-12 10:55:32 +02:00
Ján Tomko
149c87b49d Various typos and misspellings 2012-10-12 00:03:43 +02:00