9321 Commits

Author SHA1 Message Date
Peter Krempa
fa006c4fdd qemu: Fix setting of memory tunables
The refactoring done in 19c6ad9ac7 didn't correctly take into account
the order in which cgroup limits need to be modified. This resulted in
errors when decreasing the limits.

The operations need to take place in this order:

decrease hard limit
change swap hard limit

or

change swap hard limit
increase hard limit

This patch also fixes the check that the hard_limit is less than
swap_hard_limit so that it prints better error messages. For this
purpose I introduced a helper function virCompareLimitUlong to compare
limit values where a value of 0 is equal to unlimited. Additionally, the
check is now also applied when the user does not provide all of the
tunables through the API; in that case the currently set values are used.
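
The ordering rule and the "0 means unlimited" comparison can be sketched as follows. This is an illustrative Python model, not the libvirt C code: `compare_limit` only mirrors the behaviour described for virCompareLimitUlong, and `plan_updates` is a hypothetical helper name.

```python
def compare_limit(a: int, b: int) -> int:
    """Three-way compare of two limits where 0 means "unlimited".

    Returns -1, 0 or 1; an unlimited value (0) sorts above any finite one.
    """
    if a == b:
        return 0
    if a == 0:
        return 1
    if b == 0:
        return -1
    return -1 if a < b else 1


def plan_updates(cur_hard: int, new_hard: int, new_swap: int) -> list:
    """Return the order in which the two limits should be written (hypothetical)."""
    if compare_limit(new_hard, new_swap) > 0:
        raise ValueError("hard_limit must not exceed swap_hard_limit")
    if compare_limit(new_hard, cur_hard) < 0:
        # Decreasing: lower the hard limit first, then change the swap limit.
        return ["hard_limit", "swap_hard_limit"]
    # Increasing or unchanged: change the swap limit first, then the hard limit.
    return ["swap_hard_limit", "hard_limit"]
```

The sketch assumes that equal hard and swap limits are acceptable; the message above does not say which way the real check treats that case.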

This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=950478
2013-04-23 07:10:56 +02:00
Jiri Denemark
fd2e55302b logging: Make log regexp more compact (and readable) 2013-04-22 20:13:40 +02:00
Jiri Denemark
6d1b3edc6e qemu: Ignore libvirt logs when reading QEMU error output
When QEMU fails to start, libvirt reads its error output and reports it
back in an error message. However, when libvirtd is configured to log
debug messages, one would get the following unhelpful garbage:

    virsh # start cd
    error: Failed to start domain cd
    error: internal error process exited while connecting to monitor: \
      2013-04-22 14:24:54.214+0000: 2194219: debug : virFileClose:72 : \
      Closed fd 21
    2013-04-22 14:24:54.214+0000: 2194219: debug : virFileClose:72 : \
      Closed fd 27
    2013-04-22 14:24:54.215+0000: 2194219: debug : virFileClose:72 : \
      Closed fd 3
    2013-04-22 14:24:54.215+0000: 2194220: debug : virExec:602 : Run \
      hook 0x7feb8f600bf0 0x7feb86ef9300
    2013-04-22 14:24:54.215+0000: 2194220: debug : qemuProcessHook:2507 \
      : Obtaining domain lock
    2013-04-22 14:24:54.216+0000: 2194220: debug : \
      virDomainLockProcessStart:170 : plugin=0x7feb780261f0 \
      dom=0x7feb7802a360 paused=1 fd=0x7feb86ef8ec4
    2013-04-22 14:24:54.216+0000: 2194220: debug : \
      virDomainLockManagerNew:128 : plugin=0x7feb780261f0 \
      dom=0x7feb7802a360 withResources=1
    2013-04-22 14:24:54.216+0000: 2194220: debug : \
      virLockManagerPluginGetDriver:297 : plugin=0x7feb780261f0
    2013-04-22 14:24:54.216+0000: 2194220: debug : \
      virLockManagerNew:321 : driver=0x7feb8ef08640 type=0 nparams=5 \
      params=0x7feb86ef8d60 flags=0
    2013-04-22 14:24:54.216+000

instead of (the output with this patch applied):

    virsh # start cd
    error: Reconnected to the hypervisor
    error: Failed to start domain cd
    error: internal error process exited while connecting to monitor: \
      char device redirected to /dev/pts/33 (label charserial0)
    qemu-system-x86_64: -drive file=/home/vm/systemrescuecd-x86-1.2.0.\
      iso,if=none,id=drive-ide0-1-0,readonly=on,format=raw,cache=none: \
      could not open disk image /home/vm/systemrescuecd-x86-1.2.0.iso: \
      Permission denied
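
The filtering idea can be illustrated with a short Python sketch. The regular expression is an assumption modelled on the log lines quoted above (timestamp, thread id, priority); it is not the pattern libvirt uses, and `strip_libvirt_logs` is a hypothetical name.

```python
import re

# Assumed shape of a libvirt log line, based on the sample output above:
# "2013-04-22 14:24:54.214+0000: 2194219: debug : ..."
LIBVIRT_LOG = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}\+\d{4}: \d+: "
    r"(debug|info|warning|error) : "
)


def strip_libvirt_logs(output: str) -> str:
    """Drop lines that look like libvirt's own log messages."""
    kept = [line for line in output.splitlines() if not LIBVIRT_LOG.match(line)]
    return "\n".join(kept)
```

Only whole lines that start with the log prefix are dropped, so genuine QEMU messages survive even when they are interleaved with debug output.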
2013-04-22 20:13:40 +02:00
Jiri Denemark
e4bdba8d7f qemu: Move QEMU log reading into a separate function 2013-04-22 20:13:40 +02:00
Gene Czarcinski
1e5306c77a update input ip processing
1. Handle an invalid ULong prefix.
When parsing @prefix as a ULong, -2 can be returned
if the specification is not a valid ULong.

2. Error out if address= is not specified.

3. Merge netmask process/tests under family tests.

4. Make sure that prefix does not exceed the maximum.

Signed-off-by: Gene Czarcinski <gene@czarc.net>
2013-04-22 14:10:53 -04:00
Gene Czarcinski
bd7c7c1b3c create virSocketAddrGetIpPrefix utility function
Create the utility function virSocketAddrGetIpPrefix() to
determine the prefix for this network.  The code in this
function was adapted from virNetworkIpDefPrefix().

Update virNetworkIpDefPrefix() in src/conf/network_conf.c
to use the new utility function.
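
As a rough illustration of what "determine the prefix for this network" can involve, here is a Python sketch. The fallback rules (classful defaults for IPv4, 64 for IPv6) are assumptions made for the example and are not taken from the commit; `ip_prefix` is a hypothetical name.

```python
import ipaddress
from typing import Optional


def ip_prefix(address: str, netmask: Optional[str] = None, prefix: int = 0) -> int:
    """Pick a prefix length: explicit prefix, else netmask, else a default."""
    addr = ipaddress.ip_address(address)
    if prefix > 0:
        # Reject prefixes longer than the address family allows (32 or 128).
        if prefix > addr.max_prefixlen:
            raise ValueError("prefix exceeds maximum for address family")
        return prefix
    if netmask is not None:
        # ip_network validates that the netmask is contiguous (IPv4 only here).
        return ipaddress.ip_network("0.0.0.0/" + netmask).prefixlen
    if addr.version == 4:
        # Assumed classful defaults: class A, B, then everything else.
        first_octet = int(addr) >> 24
        if first_octet < 128:
            return 8
        if first_octet < 192:
            return 16
        return 24
    # Assumed default for IPv6.
    return 64
```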

Signed-off-by: Gene Czarcinski <gene@czarc.net>
2013-04-22 14:10:53 -04:00
Daniel P. Berrange
1e05073fbb Replace more cases of /system with /machine
The change in commit aed4986322
was incomplete, missing a couple of cases of /system. This
caused failure to start VMs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-22 17:11:36 +01:00
Harry Wei
0f35e00135 sheepdog: Omit braces with a single-line body
libvirt/HACKING suggests omitting braces with a
single-line body; this patch fixes the coding style
problem for the Sheepdog storage backend driver.

Signed-off-by: Harry Wei <harryxiyou@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2013-04-22 08:33:35 -06:00
Daniel P. Berrange
aed4986322 Change default resource partition to /machine
After discussions with systemd developers it was decided that
a better default policy for resource partitions is to have
3 default partitions at the top level:

   /system  - system services
   /machine - virtual machines / containers
   /user    - user login session

This ensures that the default policy isolates guests from
user login sessions & system services, so a mis-behaving
guest can't consume 100% of CPU usage if other things are
contending for it.

Thus we change the default partition from /system to
/machine

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-22 12:10:12 +01:00
Osier Yang
a71ec98841 qemu: Fix the wrong expression
Wrong use of the parentheses causes "rc" to always hold a boolean value,
either "1" or "0", so we can't get the detailed error message
when it fails:

Before (I only have 1 node):
% virsh numatune f18 --nodeset 12
error: Unable to change numa parameters
error: unable to set numa tunable: Unknown error -1

After:
% virsh numatune f18 --nodeset 12
error: Unable to change numa parameters
error: unable to set numa tunable: Invalid argument
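
The same class of precedence mistake can be reproduced in Python with an assignment expression. This is only an analogy to the C bug described above; `set_numa_tunable` is a hypothetical stand-in that fails with a negative errno value.

```python
def set_numa_tunable() -> int:
    """Hypothetical call that fails and returns -EINVAL."""
    return -22


# Misplaced parenthesis: the comparison is evaluated first, so rc receives
# the boolean result instead of the error code.
if (rc := set_numa_tunable() != 0):
    buggy_rc = rc

# Correct grouping: assign the return value first, then compare it.
if (rc := set_numa_tunable()) != 0:
    fixed_rc = rc
```

With the first form the real error code is lost, which is consistent with the "Unknown error -1" message in the "Before" output.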
2013-04-22 18:56:20 +08:00
Eric Blake
1bf25ba249 docs: fix usage of 'onto'
http://www.uhv.edu/ac/newsletters/writing/grammartip2009.07.01.htm
(and several other sites) give hints that 'onto' is best used if
you can also add 'up' just before it and still make sense. In many
cases in the code base, we really want the two-word form, or even
a simplification to just 'on' or 'to'.

* docs/hacking.html.in: Use correct 'on to'.
* python/libvirt-override.c: Likewise.
* src/lxc/lxc_controller.c: Likewise.
* src/util/virpci.c: Likewise.
* daemon/THREADS.txt: Use simpler 'on'.
* docs/formatdomain.html.in: Better usage.
* docs/internals/rpc.html.in: Likewise.
* src/conf/domain_event.c: Likewise.
* src/rpc/virnetclient.c: Likewise.
* tests/qemumonitortestutils.c: Likewise.
* HACKING: Regenerate.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-04-19 14:31:16 -06:00
Eric Blake
31c6bf35b9 audit: properly encode device path in cgroup audit
https://bugzilla.redhat.com/show_bug.cgi?id=922186

Commit d04916fa introduced a regression in audit quality - even
though the code was computing the proper escaped name for a
path, it wasn't feeding that escaped name on to the audit message.
As a result, /var/log/audit/audit.log would mention a pair of
fields class=path path=/dev/hpet instead of the intended
class=path path="/dev/hpet", which in turn caused ausearch to
format the audit log with path=(null).

* src/conf/domain_audit.c (virDomainAuditCgroupPath): Use
constructed encoding.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-04-19 12:06:08 -06:00
Ján Tomko
6f45099723 qemu: rename CheckSlot to SlotInUse
Also change its return value from int to bool.
2013-04-19 18:16:01 +02:00
Ján Tomko
5d29ca063d qemu: switch PCI address set from hash table to an array
Each bus is represented as an array of 32 8-bit integers
where each bit represents a PCI function and each byte represents
a PCI slot.

Uses just one bus so far.
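
The data layout described above can be modelled in a few lines of Python. This is a sketch of the idea only; the class and method names are hypothetical and do not correspond to libvirt functions.

```python
SLOTS_PER_BUS = 32
FUNCTIONS_PER_SLOT = 8


class PciBus:
    """One bus: 32 bytes, one byte per slot, one bit per function."""

    def __init__(self) -> None:
        self.slots = bytearray(SLOTS_PER_BUS)

    def reserve(self, slot: int, function: int) -> None:
        if not (0 <= slot < SLOTS_PER_BUS and 0 <= function < FUNCTIONS_PER_SLOT):
            raise ValueError("slot or function out of range")
        mask = 1 << function
        if self.slots[slot] & mask:
            raise ValueError("address already in use")
        self.slots[slot] |= mask

    def slot_in_use(self, slot: int) -> bool:
        # A slot is in use if any of its eight function bits is set.
        return self.slots[slot] != 0

    def release_slot(self, slot: int) -> None:
        self.slots[slot] = 0
```

Compared with a hash table keyed by address strings, checking whether a whole slot is free becomes a single byte comparison.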
2013-04-19 18:16:01 +02:00
Ján Tomko
5c3d5b22a9 conf: add model attribute to virDomainDefMaybeAddController 2013-04-19 18:16:01 +02:00
Ján Tomko
db180a1d31 qemu: move PCI address check out of qemuPCIAddressAsString
Create a new function qemuPCIAddressValidate and call it everywhere
the user might supply an incorrect address:
* qemuCollectPCIAddress for domain definition
* qemuDomainPCIAddressEnsureAddr and ReleaseSlot for hotplug

Slot and function shouldn't be wrong at this point, since values
out of range should be rejected by the XML parser.
2013-04-19 17:50:54 +02:00
Ján Tomko
62940d6c68 qemu: QEMU_PCI constant consistency
Change QEMU_PCI_ADDRESS_LAST_SLOT to the number of slots in the bus,
not the maximum slot value, to match QEMU_PCI_ADDRESS_LAST_FUNCTION
and rename them both to have _LAST at the end.
2013-04-19 17:50:54 +02:00
Ján Tomko
ba8b8ddb7f qemu: print PCI address hexadecimally in errors
Use the same formatting as we do for XML in error and debug outputs.
2013-04-19 17:50:54 +02:00
Ján Tomko
8e5928de98 qemu: make qemuComparePCIDevice aware of multiple buses
Bus and domain need to be checked as well, otherwise we might
get false positives when searching for multi-function devices.
2013-04-19 17:50:54 +02:00
Peter Krempa
bcefb50792 conf: Reword error message to be more universal
The error message reported when attempting to change/get persistent
configuration of a transient domain suggests that changes are being
made. Reword it to suit getter APIs too.

Before:
$ virsh vcpucount transient-domain --config
error: Requested operation is not valid: cannot change persistent config of a transient domain

After:
$ virsh vcpucount transient-domain --config
error: Requested operation is not valid: transient domains do not have any persistent config
2013-04-19 16:55:59 +02:00
Peter Krempa
446dd66b7c network: bridge_driver: don't lose transient networks on daemon restart
Until now transient networks weren't really useful as libvirtd wasn't
able to remember them across restarts. This patch adds support for
loading status files of transient networks (that already were generated)
so that the status isn't lost.

This patch chops up virNetworkObjUpdateParseFile and turns it into
virNetworkLoadState and a few friends that will help us to load status
XMLs and refactors the functions that are loading the configs to use
them.
2013-04-19 16:43:47 +02:00
Jiri Denemark
f1a1ebf19d cpu: Rename PowerPCUpdate and PowerPCDataFree functions
For consistency with other functions in PowerPC CPU driver, the two
functions are renamed as ppcUpdate and ppcDataFree, respectively.
2013-04-19 14:33:16 +02:00
Jiri Denemark
7a4f12381c cpu: Remove hardcoded list of PowerPC models
The cpu_map.xml file is there to separate CPU model definitions from the
code, yet the only interesting data for PowerPC models lives in the
source code. This patch moves this data to the XML file and removes the
hardcoded list completely.
2013-04-19 14:33:16 +02:00
Jiri Denemark
f42ecaf12b cpu: Reimplement PowerPCDecode
PowerPC CPUs are either identical or incompatible and thus we just need
to look up the right model for given PVR without pretending we have
several candidates which we may choose from.

The function is also renamed as ppcDecode to match other functions in
PowerPC CPU driver.
2013-04-19 14:33:16 +02:00
Jiri Denemark
fdf6efde27 cpu: Reimplement PowerPCBaseline
Baseline API is supposed to return guest CPU definition that can be used
on any of the provided host CPUs. Since PowerPC CPUs are either
identical or incompatible, the API just needs to check that all provided
CPUs are identical. Previous implementation was completely bogus.

The function is also renamed as ppcBaseline to match other functions in
PowerPC CPU driver.
2013-04-19 14:33:16 +02:00
Jiri Denemark
ba8ba24711 cpu: Fix loading PowerPC vendor from cpu_map.xml
When ppcVendorLoad fails to parse the vendor element for whatever
reason, it is supposed to ignore it and return 0 rather than -1. The
patch also removes PowerPC vendor string from the XML as it is not
actually used for anything.
2013-04-19 14:33:16 +02:00
Jiri Denemark
70349cb90d cpu: Fix PowerPCNodeData
Make getting node CPU data for PowerPC unsupported on other
architectures. The function is also renamed as ppcNodeData to match
other functions in PowerPC CPU driver.
2013-04-19 14:33:16 +02:00
Jiri Denemark
6af5a06275 cpu: Make comparing PowerPC CPUs easier to read
Revert the condition to make it easier to read. The function is also
renamed as ppcCompare to match other functions in PowerPC CPU driver.
2013-04-19 14:33:15 +02:00
Jiri Denemark
16c6b60cbd cpu: Introduce cpuModelIsAllowed internal API
The API can be used to check if the model is on the supported models
list, which needs to be done in several places.
2013-04-19 14:33:15 +02:00
Li Zhang
88c6159ca7 Set legacy USB option with default for ppc64.
Currently, -device xxx still doesn't work well on the ppc64 platform.
It's better to use the legacy USB option by default for ppc64.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-19 11:30:49 +01:00
Ján Tomko
4327df7eee qemu: fix default spice password setting
Set spice password even if default VNC password hasn't been set.

https://bugzilla.redhat.com/show_bug.cgi?id=953720
2013-04-19 07:08:30 +02:00
Paolo Bonzini
78d7c3c569 qemu_conf: add new configuration key bridge_helper
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-18 14:58:33 -06:00
Paolo Bonzini
5c1cfea403 util: allow using virCommandAllowCap with setuid helpers
When running unprivileged, virSetUIDGIDWithCaps will fail because it
tries to add the requested capabilities to the permitted and effective
sets.

Detect this case, and invoke the child with cleared permitted and
effective sets.  If it is a setuid program, it will get them.

Some care is needed also because you cannot drop capabilities from the
bounding set without CAP_SETPCAP.  Because of that, ignore errors from
setting the bounding set.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-18 14:52:23 -06:00
Paolo Bonzini
658718454a util: simplify virSetUIDGIDWithCaps
The need_prctl variable is not really needed.  If it is false,
capng_apply will be called twice with the same set, causing
a little extra work but no problem.  This keeps the code a bit
simpler.

It is also clearer to invoke capng_apply(CAPNG_SELECT_BOUNDS)
separately, to make sure it is done while we have CAP_SETPCAP.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-18 14:33:28 -06:00
Tal Kain
9b3322c766 qemu: simplify use of virArchFromHost
Reusing the result of virArchFromHost instead of calling it multiple times

Signed-off-by: Tal Kain <tal.kain@ravellosystems.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2013-04-18 06:42:11 -06:00
Peter Krempa
45012bc85b network: remove autostart flag from network when undefining it
When turning a started persistent network into a transient one we forgot
to remove the autostart flag that is no longer valid at that point.
2013-04-18 09:44:14 +02:00
Osier Yang
1d69c6334b syntax-check: Don't include public headers in internal source
Directories python/tools/examples should include them in <> form.
This patch still allows the "" form in these directories by excluding
them; a later patch will do the cleanup.
2013-04-18 11:24:46 +08:00
Ján Tomko
9f8badbbe6 conf: fix comment about parsing graphics listen address 2013-04-17 21:01:56 +02:00
Osier Yang
f043199413 remote: Revert removing "libvirt/libvirt.h" in remote_protocol.x
Commit 2d25fd4f41 removed the inclusion of "libvirt/libvirt.h",
which breaks the build. Pushed under build-breaker rule.
2013-04-17 23:18:47 +08:00
Osier Yang
09d2547f96 qemu: Allow the disk wwn to have "0x" prefix
Recent qemu requires a "0x" prefix for the disk wwn. This patch
changes virValidateWWN to allow the prefix, and prepends "0x" if
it's not specified. E.g.

qemu-kvm: -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,\
drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,wwn=6000c60016ea71ad:
Property 'scsi-hd.wwn' doesn't take value '6000c60016ea71ad'

Although it's a qemu regression, it's nice to allow the prefix,
and it doesn't hurt for us to always output "0x".
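
A minimal Python sketch of the accept-and-normalize behaviour follows. It assumes a WWN is exactly 16 hexadecimal digits, which matches the example above but is not stated in the message; `format_wwn` is a hypothetical name.

```python
import re

# Assumption: a WWN is 16 hex digits, optionally preceded by "0x".
_WWN = re.compile(r"^(?:0x)?([0-9a-fA-F]{16})$")


def validate_wwn(wwn: str) -> bool:
    """Accept the WWN with or without the "0x" prefix."""
    return _WWN.match(wwn) is not None


def format_wwn(wwn: str) -> str:
    """Return the WWN in the form always passed on, with a "0x" prefix."""
    match = _WWN.match(wwn)
    if match is None:
        raise ValueError("malformed wwn: " + wwn)
    return "0x" + match.group(1)
```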
2013-04-17 23:05:56 +08:00
Osier Yang
5829054caf cleanup: Don't include libvirt/virterror.h
It is already included in "internal.h"; a later patch will add a
syntax-check to avoid it.
2013-04-17 15:54:07 +08:00
Osier Yang
2d25fd4f41 cleanup: Don't include libvirt/libvirt.h
It is already included by "internal.h"; a later patch will add a
syntax-check to avoid it.
2013-04-17 15:50:53 +08:00
Osier Yang
bc95be5dea cleanup: Remove the duplicate header
Detected by a simple Shell script:

for i in $(git ls-files -- '*.[ch]'); do
    awk 'BEGIN {
        fail=0
    }
    /# *include.*\.h/{
        match($0, /["<][^">]*[">]/)
        arr[substr($0, RSTART+1, RLENGTH-2)]++
    }
    END {
        for (key in arr) {
            if (arr[key] > 1) {
                fail=1
                printf("%d %s\n", arr[key], key)
            }
        }
        if (fail == 1)
            exit 1
    }' $i

    if test $? != 0; then
        echo "Duplicate header(s) in $i"
    fi
done;

A later patch will add the syntax-check to avoid duplicate
headers.
2013-04-17 15:49:35 +08:00
Stefan Berger
0cb171f60f Fix compilation error in util/vircgroup.c
Fix the error

util/vircgroup.c: In function 'virCgroupNewDomainPartition':
util/vircgroup.c:1299:11: error: declaration of 'dirname' shadows a global declaration [-Werror=shadow]


Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2013-04-16 08:16:37 -04:00
John Ferlan
d94a3cfcfb Fix build breaker with ATTRIBUTE_NONNULL defs
Using "./autogen.sh --system lv_cv_static_analysis=yes" for my daily
Coverity builds resulted in the following error when building:

In file included from util/vircgrouppriv.h:32:0,
                 from util/vircgroup.c:44:
util/vircgroup.h:59:5: error: nonnull argument with out-of-range operand number (argument 1, operand 5)
util/vircgroup.h:74:5: error: nonnull argument references non-pointer operand (argument 1, operand 4)
make[3]: *** [libvirt_util_la-vircgroup.lo] Error 1
make[3]: Leaving directory `/home/jferlan/libvirt.cov.curr/src'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/home/jferlan/libvirt.cov.curr/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/jferlan/libvirt.cov.curr'
make: *** [all] Error 2
2013-04-16 07:17:00 -04:00
Stefan Berger
8b934a5cb6 Check for unsupported QMP command
Check for an unsupported QMP command when using the query-tpm-models
and query-tpm-types commands before checking for general errors
in order to avoid error messages in the log.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2013-04-16 07:05:21 -04:00
Stefan Berger
f62cb55666 Revert checking for QMP query-tpm-models
Revert the patch checking for the QMP query-tpm-models
command.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2013-04-16 07:05:21 -04:00
Peter Krempa
cbf8ebaad4 qemu_agent: Add support for appending arrays to commands
Add support for array elements for agent commands just like 64d5e815 did for
monitor commands
2013-04-16 10:38:30 +02:00
Peter Krempa
13f2608126 lib: Fix docs about return value of virDomainGetVcpusFlags()
The return value description stated that 0 is returned in case of success
instead of the count of vCPUs.
2013-04-16 10:38:29 +02:00
Stefan Berger
3208c562b4 Check for QMP query-tpm-models
Check for QMP query-tpm-models and set a capability flag. Do not use
this QMP command if it is not supported.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
2013-04-15 16:46:53 -04:00
Daniel P. Berrange
e7d8ab016b Add support for perf_event and net_cls cgroup controllers
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:32 +01:00
Daniel P. Berrange
ff66b45e2b Replace LXC cgroup mount code with call to virCgroupIsolateMount
The LXC driver currently has code to detect cgroups mounts
and then re-mount them inside the new root filesystem. Replace
this fragile code with a call to virCgroupIsolateMount.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:32 +01:00
Daniel P. Berrange
1da631ecf3 Add an API for re-mounting cgroups, to isolate the process location
Add a virCgroupIsolateMount method which looks at where the
current process is placed in the cgroups (eg /system/demo.lxc.libvirt)
and then remounts the cgroups such that this sub-directory
becomes the root directory from the current process' POV.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:32 +01:00
Daniel P. Berrange
83336118db Track symlinks for co-mounted cgroup controllers
If a cgroup controller is co-mounted with another, eg

   /sys/fs/cgroup/cpu,cpuacct

Then it is a requirement that there exist symlinks at

   /sys/fs/cgroup/cpu
   /sys/fs/cgroup/cpuacct

pointing to the real mount point. Add support to virCgroupPtr
to detect and track these symlinks
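
The relationship between a co-mounted directory and its per-controller symlinks can be expressed as a small Python sketch. `expected_symlinks` is a hypothetical helper that only computes paths; it does not touch the filesystem.

```python
import posixpath


def expected_symlinks(mount_point: str) -> dict:
    """Map each expected symlink path to the co-mounted directory it targets."""
    parent, name = posixpath.split(mount_point)
    controllers = name.split(",")
    if len(controllers) < 2:
        # A controller mounted on its own needs no symlinks.
        return {}
    return {posixpath.join(parent, ctrl): mount_point for ctrl in controllers}
```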

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:32 +01:00
Daniel P. Berrange
767596bdb4 Remove non-functional code for setting up non-root cgroups
The virCgroupNewDriver method had a 'bool privileged' param.
If a false value was ever passed in, it would simply not
work, since non-root users don't have any privileges to create
new cgroups. Just delete this broken code entirely and make
the QEMU driver skip cgroup setup in non-privileged mode

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
db44eb1b5f Change default cgroup layout for QEMU/LXC and honour XML config
Historically QEMU/LXC guests have been placed in a cgroup layout
that is

   $LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$VMNAME

This is bad for a number of reasons

 - The cgroup hierarchy gets very deep which seriously
   impacts kernel performance due to cgroups scalability
   limitations.

 - It is hard to setup cgroup policies which apply across
   services and virtual machines, since all VMs are underneath
   the libvirtd service.

To address this the default cgroup location is changed to
be

    /system/$VMNAME.{lxc,qemu}.libvirt

This puts virtual machines at the same level in the hierarchy
as system services, allowing consistent policy to be setup
across all of them.

This also honours the new resource partition location from the
XML configuration, for example

  <resource>
    <partition>/virtualmachines/production</partition>
  </resource>

will result in the VM being placed at

    /virtualmachines/production/$VMNAME.{lxc,qemu}.libvirt

NB, with the exception of the default, /system, path which
is intended to always exist, libvirt will not attempt to
auto-create the partitions in the XML. It is the responsibility
of the admin/app to configure the partitions. Later libvirt
APIs will provide a way to do this.
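
The naming scheme described in this message can be summarised with a short Python sketch. `vm_cgroup_path` is a hypothetical helper, and its default of /system reflects this commit only.

```python
def vm_cgroup_path(vm_name: str, driver: str, partition: str = "/system") -> str:
    """Build the cgroup location for a guest under a resource partition."""
    if driver not in ("qemu", "lxc"):
        raise ValueError("unknown driver: " + driver)
    if not partition.startswith("/"):
        raise ValueError("partition must be an absolute path")
    # Layout: <partition>/<vm name>.<driver>.libvirt
    return "{}/{}.{}.libvirt".format(partition.rstrip("/"), vm_name, driver)
```

For example, `vm_cgroup_path("demo", "lxc")` yields `/system/demo.lxc.libvirt`, and passing `partition="/virtualmachines/production"` places the guest under that partition instead.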

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
8d4adf3efa Add XML config for resource partitions
Allow VMs to be placed into resource groups using the
following syntax

  <resource>
    <partition>/virtualmachines/production</partition>
  </resource>

A resource cgroup will be backed by some hypervisor specific
functionality, such as cgroups with KVM/LXC.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
aa8604dd45 Add a new virCgroupNewPartition for setting up resource partitions
A resource partition is an absolute cgroup path, ignoring the
current process placement. Expose a virCgroupNewPartition API
for constructing such cgroups

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
109554d714 Cleanup if creating cgroup directories fails
Currently if virCgroupMakeGroup fails, we can get in a situation
where some controllers have been setup, but others not. Ensure
we call virCgroupRemove to remove what we've done upon failure

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
854a004fd6 Add misc extra debugging into cgroups code
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
8d1c141a8d Refactor cgroups internal data structures
Currently the virCgroupPtr struct contains 3 pieces of
information

 - path - path of the cgroup, relative to current process'
   cgroup placement
 - placement - current process' placement in each controller
 - mounts - mount point of each controller

When reading/writing cgroup settings, the path & placement
strings are combined to form the file path. This approach
only works if we assume all cgroups will be relative to
the current process' cgroup placement.

To allow support for managing cgroups at any place in the
hierarchy a change is needed. The 'placement' data should
reflect the absolute path to the cgroup, and the 'path'
value should no longer be used to form the paths to the
cgroup attribute files.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
04c18d25f1 Rename virCgroupForXXX to virCgroupNewXXX
Rename all the virCgroupForXXX methods to use the form
virCgroupNewXXX since they are all constructors. Also
make sure the output parameter is the last one in the
list, and annotate all pointers as non-null. Fix up
all callers, and make sure they use true/false not 0/1
for the boolean parameters

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
f0e5f92434 Pull definition of structs out of vircgroup.c to vircgrouppriv.h
The definitions of structs for cgroups are kept in vircgroup.c since
they are intended to be private from users of the API. To enable
effective testing, however, they need to be accessible. To address
the latter issue, without compromising the former, this introduces
a new vircgrouppriv.h file to hold the struct definitions.

To prevent other files including this private header, it requires
that __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ be defined before inclusion

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
cfed9ad4fb Store a virCgroupPtr instance in virLXCDomainObjPrivatePtr
Instead of calling virCgroupForDomain every time we need
the virCgroupPtr instance, just do it once at VM startup
and cache a reference to the object in virLXCDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the LXC driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
632f78caaf Store a virCgroupPtr instance in qemuDomainObjPrivatePtr
Instead of calling virCgroupForDomain every time we need
the virCgroupPtr instance, just do it once at VM startup
and cache a reference to the object in qemuDomainObjPrivatePtr
until shutdown of the VM. Removing the virCgroupPtr from
the QEMU driver state also means we don't have stale mount
info, if someone mounts the cgroups filesystem after libvirtd
has been started

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
c9b8cdfec1 Add missing param to virCgroupForDriver stub
The virCgroupForDriver method recently gained an 'int controllers'
parameter, but the stub impl did not

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
035cdaa00b Introduce a virFileDeleteTree method
Introduce a method virFileDeleteTree for recursively deleting
an entire directory tree

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:31 +01:00
Daniel P. Berrange
3f85de5292 Fix signature of dummy virNetlinkCommand stub
The second param of virNetlinkCommand should be
struct nlmsghdr, not unsigned char.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:30 +01:00
Daniel P. Berrange
fd856af62b Add empty stub for virThreadCancel on Win32
Win32 does not like undefined symbols, so define an
empty virThreadCancel impl.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 17:35:30 +01:00
Osier Yang
b1ea781eaa Use unsigned int instead of unsigned
Though they are the same thing, mixed use of them is uncomfortable.
"unsigned" is used a lot in old code; this just tries to change the
ones in utils.
2013-04-15 23:07:08 +08:00
Daniel P. Berrange
e16e2a8bbb Do more complete initialization of libgcrypt
If libvirt makes any gcry_control() calls, then this
prevents gnutls from doing any initialization. As such
we must take care to do full initialization of libgcrypt
on a par with what gnutls would have done. In particular
we must disable "sec mem" for cases where the user does
not have mlock() permission. We also skip our init of
libgcrypt if something else (ie the app using libvirt)
has beaten us to it.

https://bugzilla.redhat.com/show_bug.cgi?id=951630

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-15 12:09:10 +01:00
Peter Krempa
63b68f3cb4 qemu: Report also domain name in error message when domain object wasn't found
Report the errors as:
Domain not found: no domain with matching uuid '41414141-4141-4141-4141-414141414141' (crashtest)
instead of:
Domain not found: no domain with matching uuid '41414141-4141-4141-4141-414141414141'
2013-04-15 09:43:54 +02:00
Peter Krempa
54a99ba867 qemu: Refactor lookup of domain object
Use the helper to lookup the domain object in the remaining places.

This patch also fixes error reporting when the domain was not found in several
functions that were printing the raw UUID buffer instead of the formatted
string. The offending functions were:

qemuDomainGetInterfaceParameters
qemuDomainSetInterfaceParameters
qemuGetSchedulerParametersFlags
qemuSetSchedulerParametersFlags
qemuDomainGetNumaParameters
qemuDomainSetNumaParameters
qemuDomainGetMemoryParameters
qemuDomainSetMemoryParameters
qemuDomainGetBlkioParameters
qemuDomainSetBlkioParameters
qemuDomainGetCPUStats
2013-04-15 09:43:54 +02:00
Osier Yang
2f40ede4cd storage: Fix the indention
Pushed under trivial rule
2013-04-13 15:22:01 +08:00
Osier Yang
93002b9827 cleanup: Change datatype of net->stp to boolean 2013-04-13 13:28:36 +08:00
Osier Yang
f2adc3b435 cleanup: Change datatype of usbdev->allow to boolean 2013-04-13 13:28:36 +08:00
Osier Yang
00b6828dc2 cleanup: Change datatype of graphic's members to boolean 2013-04-13 13:28:36 +08:00
Osier Yang
b044b4d78f cleanup: Change datatype of accel's members to boolean 2013-04-13 13:28:36 +08:00
Stefan Berger
291cfb83f3 TPM support for QEMU command line
For TPM passthrough device support create command line parameters like:

-tpmdev passthrough,id=tpm-tpm0,path=/dev/tpm0,cancel-path=/sys/class/misc/tpm0/device/cancel -device tpm-tis,tpmdev=tpm-tpm0,id=tpm0
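
How such a parameter pair could be assembled is sketched below in Python. The function and argument names are hypothetical; only the resulting option strings are taken from the example above.

```python
def tpm_passthrough_args(alias: str, device_path: str, cancel_path: str) -> list:
    """Build the -tpmdev/-device argument pair for a passthrough TPM."""
    backend_id = "tpm-" + alias
    return [
        "-tpmdev",
        "passthrough,id={},path={},cancel-path={}".format(
            backend_id, device_path, cancel_path
        ),
        "-device",
        "tpm-tis,tpmdev={},id={}".format(backend_id, alias),
    ]
```

Calling it with `"tpm0"`, `"/dev/tpm0"` and `"/sys/class/misc/tpm0/device/cancel"` reproduces the command line shown above.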

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:46 -04:00
Stefan Berger
22feb0d3e7 QEMU Cgroup support for TPM passthrough
Some refactoring for virDomainChrSourceDef type of devices so
we can use common code.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:46 -04:00
Stefan Berger
2c9a063973 Audit the starting of a guest using TPM passthrough
When a VM with a TPM passthrough device is started, the audit daemon
logs the following type of message:

type=VIRT_RESOURCE msg=audit(1365170222.460:3378): pid=16382 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm resrc=dev reason=start vm="TPM-PT" uuid=a4d7cd22-da89-3094-6212-079a48a309a1 device="/dev/tpm0" exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:46 -04:00
Stefan Berger
2a40a09220 Add SELinux and DAC labeling support for TPM passthrough
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:46 -04:00
Stefan Berger
f447ff5982 Convert QMP strings into QEMU capability bits
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:45 -04:00
Stefan Berger
6ecff413e1 Parse TPM passthrough XML in the domain XML
Parse the domain XML with TPM passthrough support.
The TPM passthrough XML may look like this:

    <tpm model='tpm-tis'>
      <backend type='passthrough'>
        <device path='/dev/tpm0'/>
      </backend>
    </tpm>
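As an illustration only, the XML above can be read with Python's standard library as follows. The function parse_tpm and the dictionary it returns are assumptions made for this sketch; the real parser is C code in the domain configuration module.

```python
import xml.etree.ElementTree as ET

TPM_XML = """
<tpm model='tpm-tis'>
  <backend type='passthrough'>
    <device path='/dev/tpm0'/>
  </backend>
</tpm>
"""


def parse_tpm(xml_text):
    """Extract model, backend type and host device path from a <tpm> element."""
    tpm = ET.fromstring(xml_text)
    if tpm.tag != "tpm":
        raise ValueError("expected a <tpm> element")
    backend = tpm.find("backend")
    if backend is None:
        raise ValueError("missing <backend> element")
    device = backend.find("device")
    return {
        "model": tpm.get("model"),
        "backend": backend.get("type"),
        "path": device.get("path") if device is not None else None,
    }


parsed = parse_tpm(TPM_XML)
print(parsed)
```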


Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:45 -04:00
Stefan Berger
06ba4bff91 Helper functions for host TPM support
Implement helper function to create the TPM's sysfs cancel file.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:45 -04:00
Stefan Berger
069219577b Add function to find a needle in a string array
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:45 -04:00
Stefan Berger
ed1f031850 Add QMP probing for TPM
Probe for QEMU's QMP TPM support by querying the lists of
supported TPM models (query-tpm-models) and backend types
(query-tpm-types). 

The setting of the capability flags following the strings
returned from the commands above is only provided in the
patch where domain_conf.c gets TPM support due to dependencies
on functions only introduced there. 

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Tested-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
2013-04-12 16:55:45 -04:00
Peter Krempa
039a3283fc conf: Allow for non-contiguous device boot orders
This patch adds the ability to configure non-contiguous boot orders on boot
devices. This allows unplugging devices that have boot order specified without
breaking migration.

The new code now uses a slightly less memory efficient approach to store the
boot order fields in a hashtable instead of a bitmap.
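A rough sketch of the idea, assuming boot orders are positive integers and that the only constraint is uniqueness. A dictionary stands in for the hash table; the function name and error text are hypothetical.

```python
def check_boot_orders(devices):
    """devices: iterable of (name, boot_order or None).

    Returns device names sorted by boot order. Gaps between orders are fine,
    duplicates are not.
    """
    seen = {}  # boot order -> device name (stands in for the hash table)
    for name, order in devices:
        if order is None:
            continue  # device has no boot order configured
        if order < 1:
            raise ValueError("invalid boot order %d for %s" % (order, name))
        if order in seen:
            raise ValueError("boot order %d used by both %s and %s"
                             % (order, seen[order], name))
        seen[order] = name
    return [seen[order] for order in sorted(seen)]


print(check_boot_orders([("vda", 1), ("net0", 5), ("vdb", None), ("cdrom", 3)]))
```

With a bitmap, a gap such as 1, 3, 5 needs extra handling; with a keyed table only the values actually in use are stored.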
2013-04-12 14:43:12 +02:00
Li Zhang
a6e37aedff Add USB option capability
To avoid a collision between the USB controller created in machine->init()
and one created with -device on the command line, usb=off must be set so
that machine->init() does not create a USB controller. libvirt can then use
-device or -usb to create the USB controller successfully.
The QEMU_CAPS_MACHINE_USB_OPT capability is added for this purpose; it applies
to QEMU v1.3.0 onwards, which supports the USB option.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
2013-04-12 10:56:03 +01:00
Jiri Denemark
88624b5d4c qemu: Do not report unsafe migration for local files
When migrating a domain with disk images stored locally (and using
storage migration), we should not complain about unsafe migration no
matter what cache policy is used for that disk.
2013-04-11 21:57:50 +02:00
Peter Krempa
608d149e97 qemu: Try to use QMP for send-key if supported
Instead of always using HMP, use the QMP send-key command introduced in QEMU 1.3.
2013-04-11 16:42:30 +02:00
Michal Privoznik
7f15ebc7a2 qemu: Set correct migrate host in client_migrate_info
https://bugzilla.redhat.com/show_bug.cgi?id=920441

Currently, we discard the listen attribute from the qemu cookie even though
we go to the trouble of gathering it. This results in the following bug: if
the user has different networks, one for management/migration and one for
VNC/SPICE, we pass an incorrect host to qemu in client_migrate_info. What we
actually pass is the remote hostname, while we should be passing the remote
listen address. It doesn't matter as long as these two are the same, but they
do not necessarily have to be.
2013-04-11 12:32:17 +02:00
Ján Tomko
74bff25090 qemu: fix crash in qemuOpen
If the path part of the connection URI is not present, cfg is used
uninitialized.

https://bugzilla.redhat.com/show_bug.cgi?id=950855
2013-04-11 11:41:22 +02:00
Ján Tomko
4e54714c72 conf: fix error for parallel port mismatch 2013-04-11 09:13:32 +02:00
Osier Yang
f4279c5320 cleanup: Change datatype of secret->private to boolean 2013-04-11 11:54:37 +08:00
Osier Yang
4258a548d2 cleanup: Change datatype of secret->ephemeral to boolean 2013-04-11 11:50:23 +08:00
Osier Yang
ba474c7844 cleanup: Change datatype of fs->readonly to boolean 2013-04-11 11:36:47 +08:00
Osier Yang
e9e37538bb cleanup: Change datatype of disk->readonly to boolean 2013-04-11 11:36:44 +08:00
Osier Yang
71dae03f9b cleanup: Change datatype of disk->transient to boolean 2013-04-11 11:36:41 +08:00
Osier Yang
a29bafd5de cleanup: Change datatype of disk->shared to boolean 2013-04-11 11:36:37 +08:00
Osier Yang
7a984d5713 cleanup: Change datatype of auth->expires to boolean 2013-04-11 11:36:33 +08:00
Osier Yang
1bbc1e7524 cleanup: Change datatype of hostdev->missing to boolean 2013-04-11 11:36:28 +08:00
Osier Yang
cc7da958c8 Cleanup: Change datatype of origstate's members to boolean
Members of struct virPCIDevice are changed together.
2013-04-11 11:35:17 +08:00
Osier Yang
9fda2f5cc9 Cleanup: Change datatype of hostdev->managed to boolean 2013-04-11 11:31:02 +08:00
Guannan Ren
2fff380105 conf: fix a memory leak when parsing nat port XML nodes
==5306== 8 bytes in 1 blocks are definitely lost in loss record 24 of 277
 ==5306==    at 0x4C28B2F: calloc (vg_replace_malloc.c:593)
 ==5306==    by 0x5293CAF: virAllocN (viralloc.c:152)
 ==5306==    by 0x52DFEAE: virXPathNodeSet (virxml.c:611)
 ==5306==    by 0x5313DD9: virNetworkDefParseXML (network_conf.c:1408)
 ==5306==    by 0x53170F6: virNetworkObjUpdateParseFile (network_conf.c:2031)
 ==5306==    by 0x131DA63C: networkStartup (bridge_driver.c:279)
 ==5306==    by 0x53481DF: virStateInitialize (libvirt.c:822)
 ==5306==    by 0x40DF44: daemonRunStateInit (libvirtd.c:877)
 ==5306==    by 0x52D2FF5: virThreadHelper (virthreadpthread.c:161)
 ==5306==    by 0x5D00C52: start_thread (in /usr/lib64/libpthread-2.17.so)
 ==5306==    by 0x6410ECC: clone (in /usr/lib64/libc-2.17.so)
2013-04-11 09:55:11 +08:00
Peter Krempa
b7c98329cb conf: Fix race between looking up a domain object and freeing it
This patch fixes crash of the daemon that happens due to the following race
condition:

Let's have two threads in the libvirtd daemon's qemu driver:
A - thread executing undefine on a domain
B - thread executing an API call to get information about the same domain

Assume the following serialization of operations done by the threads:
1) A has the lock on the domain object and is executing some code prior to
   virDomainObjListRemove()
2) B takes the lock on the domain object list, looks up the domain object
   pointer and blocks in the attempt to lock the domain object as A is holding
   the lock
3) A reaches virDomainObjListRemove() and unlocks the lock on the domain object
4) A blocks on the attempt to get the domain list lock
5) B is able to lock the domain object now and unlocks the domain list
6) A is now able to lock the domain list, and sheds the last reference on the
   domain object; this triggers the freeing function.
7) B starts executing the code on the pointer that is being freed
8) The libvirtd daemon crashes while attempting to access an invalid pointer in
   thread B.

This patch fixes the race by acquiring a reference on the domain object before
unlocking it in virDomainObjListRemove() and re-locks the object prior to
removing and freeing it. This ensures that no thread holds a lock on the domain
object at the time it is removed from the list, and that doing a list lookup
will never find a domain that is about to vanish.

This is a minimal fix of the problem, but a better solution will be to switch to
full reference counting for domain objects.
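The locking order described above can be modelled roughly as follows. This is a Python sketch, not libvirt's code: the classes DomainObj and DomainList and their methods are hypothetical stand-ins, and the reference count here is a plain integer rather than an atomic counter.

```python
import threading


class DomainObj:
    """Stand-in for a domain object: a lock plus a reference count."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.refs = 1           # the reference owned by the list
        self.freed = False

    def ref(self):
        self.refs += 1

    def unref(self):
        self.refs -= 1
        if self.refs == 0:
            self.freed = True   # stands in for the freeing function


class DomainList:
    def __init__(self):
        self.lock = threading.Lock()
        self.objs = {}

    def add(self, dom):
        with self.lock:
            self.objs[dom.name] = dom

    def find_locked(self, name):
        """Look up a domain and return it locked, or None if absent."""
        with self.lock:
            dom = self.objs.get(name)
            if dom is not None:
                dom.lock.acquire()
            return dom

    def remove(self, dom):
        """Caller holds dom.lock; on return the object is unlocked and gone."""
        dom.ref()                # keep the object alive while it is unlocked
        dom.lock.release()
        with self.lock:          # list lock first, then the object lock
            dom.lock.acquire()   # waits for any thread that got in meanwhile
            del self.objs[dom.name]
            dom.unref()          # drop the list's reference
            dom.lock.release()
        dom.unref()              # drop the temporary reference; may free


doms = DomainList()
demo = DomainObj("demo")
doms.add(demo)
doms.remove(doms.find_locked("demo"))
```

Because the object is re-locked while the list lock is held, a thread that looked the object up earlier finishes its work before the entry disappears, and later lookups no longer find it.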
2013-04-10 09:32:03 +02:00
Laine Stump
9579b6bc20 Fix crash in virNetDevGetVirtualFunctions
Commit 9a3ff01d7f (which was ACKed at
the end of January, but for some reason didn't get pushed until during
the 1.0.4 freeze) fixed the logic in virPCIGetVirtualFunctions().
Unfortunately, a typo in the fix (replacing VIR_REALLOC_N with
VIR_ALLOC_N during code movement) caused not only a memory leak, but
also resulted in most of the elements of the result array being
replaced with NULL. virNetDevGetVirtualFunctions() assumed (and I think
rightly so) that virPCIGetVirtualFunctions() wouldn't return any NULL
elements in the array, so it ended up segfaulting.

This was found when attempting to use a virtual network with an
auto-created pool of SRIOV VFs, e.g.:

    <forward mode='hostdev' managed='yes'>
      <pf dev='eth4'/>
    </forward>

(the pool of PCI addresses is discovered by calling
virNetDevGetVirtualFunctions() on the PF dev).
2013-04-09 14:26:12 -04:00
Ján Tomko
96c45f66fb docs: use MiB/s instead of Mbps for migration speed
https://bugzilla.redhat.com/show_bug.cgi?id=948821
2013-04-09 16:45:24 +02:00
Han Cheng
5bc5a44db9 conf: Change help function
The helper function to look up disk controller model may be used by scsi
hostdev. But it should be changed to use device info.

Signed-off-by: Han Cheng <hanc.fnst@cn.fujitsu.com>
2013-04-09 22:21:16 +08:00
Peter Krempa
b0216da8ee qemu: Remove now obsolete assignment of default network card model for s390 hosts
This effectively reverts commit 539d73dbf6 as the
changes aren't needed after introduction of the XML post parse callbacks.
2013-04-09 15:47:58 +02:00
Peter Krempa
74ba039f82 qemu: Clean up network device CLI generator
With the default model assigned in the parse callback, this code is now obsolete.
2013-04-09 15:47:58 +02:00
Viktor Mihajlovski
d8ddf522a0 qemu: Use correct default model on s390
Commit a68d672667 breaks networking on s390 as it
changes the default network card model.
2013-04-09 15:47:58 +02:00
Milos Vyletel
396c4d34f8 Generate RFC4122 compliant UUIDs
Even though http://libvirt.org/formatdomain.html#elementsMetadata
states that RFC 4122 compliant UUIDs are required, the UUIDs generated
by virUUIDGenerate() are not compliant. This patch modifies the generated
UUIDs to conform to the rules described in the RFC.
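For illustration, assuming the random (version 4) layout from RFC 4122, making 16 random bytes compliant amounts to fixing the version nibble and the variant bits. The function name below is hypothetical and the sketch is in Python, not the C used by virUUIDGenerate().

```python
import os
import uuid


def generate_uuid_bytes(rand=os.urandom):
    """Return 16 random bytes adjusted to the RFC 4122 version 4 layout."""
    raw = bytearray(rand(16))
    raw[6] = (raw[6] & 0x0F) | 0x40   # high nibble of byte 6: version 4
    raw[8] = (raw[8] & 0x3F) | 0x80   # top two bits of byte 8: variant 10
    return bytes(raw)


u = uuid.UUID(bytes=generate_uuid_bytes())
print(u)
```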

Signed-off-by: Milos Vyletel <milos.vyletel@sde.cz>
2013-04-08 13:18:07 -06:00
Daniel P. Berrange
1bd955ed60 Unmount existing filesystems under user specified mounts in LXC
If the user requests a mount for /run, this may hide any existing
mounts that are lower down in /run. The result is that the
container still sees the mounts in /proc/mounts, but cannot
access them

sh-4.2# df
df: '/run/user/501/gvfs': No such file or directory
df: '/run/media/berrange/LIVE': No such file or directory
df: '/run/media/berrange/SecureDiskA1': No such file or directory
df: '/run/libvirt/lxc/sandbox': No such file or directory
Filesystem                      1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg_t500wlan-lv_root 151476396 135390200   8384900  95% /
tmpfs                             1970888      3204   1967684   1% /run
/dev/sda1                          194241    155940     28061  85% /boot
devfs                                  64         0        64   0% /dev
tmpfs                                  64         0        64   0% /sys/fs/cgroup
tmpfs                             1970888      1200   1969688   1% /etc/libvirt-sandbox/scratch

Before mounting any filesystem at a particular location, we
must recursively unmount anything at or below the target mount
point
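A small sketch of the selection step, assuming the list of mount points has already been read (for example from /proc/mounts). The function name is hypothetical; the point is that everything at or below the target is picked and that deeper mounts come before their parents.

```python
def mounts_to_unmount(mount_points, target):
    """Return mounts at or below target, children before parents."""
    target = target.rstrip("/") or "/"
    prefix = target if target.endswith("/") else target + "/"
    victims = [m for m in mount_points if m == target or m.startswith(prefix)]
    # A child path always sorts after its parent, so reverse order puts
    # every child ahead of the directory it is mounted under.
    return sorted(victims, reverse=True)


mounts = [
    "/",
    "/run",
    "/run/user/501/gvfs",
    "/run/media/berrange/LIVE",
    "/run/media/berrange/SecureDiskA1",
    "/run/libvirt/lxc/sandbox",
    "/runtime",
    "/boot",
]
for m in mounts_to_unmount(mounts, "/run"):
    print(m)
```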

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-08 17:40:08 +01:00
Daniel P. Berrange
2863ca22f3 Move lxcContainerUnmountSubtree further up in file
Ensure lxcContainerUnmountSubtree is at the top of the
lxc_container.c file so it is easily referenced from
any other method. No functional change

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-08 17:40:08 +01:00
Bogdan Purcareata
442d6a0527 Implement support for <hostdev caps=net>
This allows a container-type domain to have exclusive access to one of
the host's NICs.

Wire <hostdev caps=net> with the lxc_controller - when moving the newly
created veth devices into a new namespace, also look for any hostdev
devices that should be moved. Note: once the container domain has been
destroyed, there is no code that moves the interfaces back to the
original namespace. This does happen, though, probably due to default
cleanup on namespace destruction.

Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
2013-04-08 17:40:08 +01:00
Bogdan Purcareata
4aafa1ff86 Update structure & XML definitions to support <hostdev caps=net>
This updates the definitions and supporting structures in the XML
schema and domain configuration files.

Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
2013-04-08 17:40:08 +01:00
Daniel P. Berrange
dca927c82f Rename virCgroupMounted to virCgroupHasController & make it more robust
The virCgroupMounted method is badly named, since a controller can be
mounted, but disabled in the current object. Rename the method to be
virCgroupHasController. Also make it tolerant of a NULL virCgroupPtr
and an out-of-range controller index, to avoid duplicating these
checks in all callers
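The tolerant check can be pictured like this. It is a Python model with illustrative controller indexes and a hypothetical Cgroup class, not the C implementation.

```python
# Illustrative controller indexes; the real enum lives in libvirt's C code.
CPU, CPUACCT, CPUSET, MEMORY, DEVICES, FREEZER, BLKIO, LAST = range(8)


class Cgroup:
    def __init__(self, mounted):
        # One slot per controller: a mount point, or None when unavailable.
        self.controllers = [mounted.get(i) for i in range(LAST)]


def has_controller(group, controller):
    """Return True only when the group exists and the controller is usable."""
    if group is None:
        return False                      # tolerate a missing group
    if controller < 0 or controller >= LAST:
        return False                      # tolerate an out-of-range index
    return group.controllers[controller] is not None


group = Cgroup({CPU: "/sys/fs/cgroup/cpu", MEMORY: "/sys/fs/cgroup/memory"})
print(has_controller(group, MEMORY), has_controller(None, CPU))
```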

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-08 14:49:12 +01:00
Osier Yang
70bb34eb2e qemu: Allow volume type disk for device 'lun'
This allows one to use a block type volume as the disk source for device
'lun'.
2013-04-08 19:10:34 +08:00
Osier Yang
a9762b730b qemu: Support sgio setting for volume type disk 2013-04-08 19:10:12 +08:00
Osier Yang
464d4e559c qemu: Support shareable volume type disk
Since the source has already been translated at this point, this just adds
the check. The !disk->shared and !disk->src checks are moved to improve
performance a bit.
2013-04-08 19:08:47 +08:00
Osier Yang
60b78b33e1 qemu: Translate the pool disk source earlier
To support "shareable" for volume type disk, we have to translate
the source before trying to add the shared disk entry. To achieve
the goal, this moves the helper qemuTranslateDiskSourcePool into
src/qemu/qemu_conf.c, and introduces an internal-only member (voltype)
for struct _virDomainDiskSourcePoolDef, to record the underlying
volume type for use when building the drive string.

A later patch will support "shareable" volume type disks.
2013-04-08 19:02:34 +08:00
Osier Yang
664270b849 Support seclabels for volume type disk
"seclabels" is only valid for 'file' or 'block' type storage volume.
2013-04-08 18:59:50 +08:00
Osier Yang
43404fee37 Support startupPolicy for 'volume' disk
"startupPolicy" is only valid for file type storage volume, otherwise
it fails on starting the domain.
2013-04-08 18:54:37 +08:00
Osier Yang
db94a1d3a0 qemu: Translate the pool disk source when building drive string
This adds a new helper qemuTranslateDiskSourcePool which uses the
storage pool/vol APIs to translate the disk source before building
the drive string. Network volumes are not supported yet. Disk chains
for volume type disks may be supported later, but until I'm confident
that this doesn't break anything, they are disabled.
2013-04-08 18:54:17 +08:00
Osier Yang
4bc331c894 Introduce new XMLs to specify disk source using libvirt storage
With this patch, one can specify the disk source using libvirt
storage like:

  <disk type='volume' device='disk'>
    <driver name='qemu' type='raw' cache='none'/>
    <source pool='default' volume='fc18.img'/>
    <target dev='vdb' bus='virtio'/>
  </disk>

"seclabels" and "startupPolicy" are not supported for this new
disk type ("volume"). They will be supported in later patches.

docs/formatdomain.html.in:
  * Add documents for new XMLs
docs/schemas/domaincommon.rng:
  * Add rng for new XMLs;
src/conf/domain_conf.h:
  * New struct for 'volume' type disk source (virDomainDiskSourcePoolDef)
  * Add VIR_DOMAIN_DISK_TYPE_VOLUME for enum virDomainDiskType
src/conf/domain_conf.c:
  * New helper virDomainDiskSourcePoolDefParse to parse the 'volume'
    type disk source.
  * New helper virDomainDiskSourcePoolDefFree to free the source def
    if 'volume' type disk.
tests/qemuxml2argvdata/qemuxml2argv-disk-source-pool.xml:
tests/qemuxml2xmltest.c:
  * New test
2013-04-08 18:48:14 +08:00
Osier Yang
a05b0fc1ab conf: New helper virDomainDiskSourceDefFormat to format the disk source
The code to format disk source is long enough to have a helper.
2013-04-08 18:45:52 +08:00
Osier Yang
f5a610872a storage: Guess the parent if it's not specified for vHBA
This finds the parent for the vHBA by iterating over all the HBAs
on the host that support the vport_ops capability, and returns
the first one that is online and not saturated (vports in use
is less than max_vports).
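The selection rule can be sketched as below. The dictionary keys and the "Online" state string are assumptions made for this illustration; the real code reads these values from sysfs.

```python
def find_vhba_parent(hbas):
    """Return the name of the first usable parent HBA, or None."""
    for hba in hbas:
        if not hba["vport_ops"]:
            continue                              # cannot create vports
        if hba["state"] != "Online":
            continue                              # link is not up
        if hba["vports_in_use"] >= hba["max_vports"]:
            continue                              # saturated
        return hba["name"]
    return None


hosts = [
    {"name": "scsi_host3", "vport_ops": False, "state": "Online",
     "vports_in_use": 0, "max_vports": 0},
    {"name": "scsi_host4", "vport_ops": True, "state": "Linkdown",
     "vports_in_use": 0, "max_vports": 127},
    {"name": "scsi_host5", "vport_ops": True, "state": "Online",
     "vports_in_use": 127, "max_vports": 127},
    {"name": "scsi_host6", "vport_ops": True, "state": "Online",
     "vports_in_use": 2, "max_vports": 127},
]
print(find_vhba_parent(hosts))
```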
2013-04-08 18:41:07 +08:00
Osier Yang
34f9651005 storage: Add startPool and stopPool for scsi backend
startPool creates the vHBA if it does not exist yet; stopPool destroys
the vHBA. To support autostart, checkPool also creates the vHBA
if it does not exist yet.
2013-04-08 18:41:06 +08:00
Osier Yang
b52fbad150 util: Add helper to get the scsi host name by iterating over sysfs
The helper iterates over sysfs, to find out the matched scsi host
name by comparing the wwnn,wwpn pair. It will be used by checkPool
and refreshPool of storage scsi backend. New helper getAdapterName
is introduced in storage_backend_scsi.c, which uses the new util
helper virGetFCHostNameByWWN to get the fc_host adapter name.
2013-04-08 18:41:06 +08:00
Osier Yang
b78db1c365 phyp: Prohibit fc_host adapter for phyp driver
It's possible to support the fc_host adapter for the phyp driver too, but
at this stage I'd rather not allow it, as I'm not yet clear on
how it works.
2013-04-08 18:41:06 +08:00
Osier Yang
6cf9a5bb90 storage: Move virStorageBackendSCSIGetHostNumber into iscsi backend
It's only used by iscsi backend.
2013-04-08 18:41:06 +08:00
Osier Yang
c1f63a9bdf storage: Make the adapter name be consistent with node device driver
The node device driver names the HBA like "scsi_host5", but the storage
driver uses "host5", which could confuse the user. This
changes them to be consistent. However, for backward compatibility,
adapter names like "host5" are still supported.
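Accepting both spellings could look like the following sketch. The function name and the error text are hypothetical; only the two name forms come from the message above.

```python
import re

# Matches "scsi_host5" as well as the older "host5" spelling.
ADAPTER_NAME = re.compile(r"(?:scsi_)?host([0-9]+)")


def host_number(adapter_name):
    """Return the SCSI host number for either adapter name form."""
    match = ADAPTER_NAME.fullmatch(adapter_name)
    if match is None:
        raise ValueError("invalid adapter name '%s'" % adapter_name)
    return int(match.group(1))


print(host_number("scsi_host5"), host_number("host5"))
```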
2013-04-08 18:41:06 +08:00
Osier Yang
9f781da69d New XML attributes for storage pool source adapter
This introduces 4 new attributes for storage pool source adapter.
E.g.

<adapter type='fc_host' parent='scsi_host5' wwnn='20000000c9831b4b' wwpn='10000000c9831b4b'/>

Attribute 'type' can be either 'scsi_host' or 'fc_host', and defaults
to 'scsi_host' if attribute 'name' is specified. That is, it is optional
for a 'scsi_host' adapter, for backward compatibility, but mandatory
for an 'fc_host' adapter and any future adapter types. Attribute
'parent' specifies the parent of the fc_host adapter.

* docs/formatstorage.html.in:
  - Add documents for the 4 new attrs
* docs/schemas/storagepool.rng:
  - Add RNG schema
* src/conf/storage_conf.c:
  - Parse and format the new XMLs
* src/conf/storage_conf.h:
  - New struct virStoragePoolSourceAdapter, replace "char *adapter" with it;
  - New enum virStoragePoolSourceAdapterType
* src/libvirt_private.syms:
  - Export TypeToString and TypeFromString
* src/phyp/phyp_driver.c:
  - Replace "adapter" with "adapter.data.name", which is member of the union
    of the new struct virStoragePoolSourceAdapter now. Later patch will
    add the checking, as "adapter.data.name" is only valid for "scsi_host"
    adapter.
* src/storage/storage_backend_scsi.c:
  - Like above
* tests/storagepoolxml2xmlin/pool-scsi-type-scsi-host.xml:
* tests/storagepoolxml2xmlin/pool-scsi-type-fc-host.xml:
  - New test for 'fc_host' and "scsi_host" adapter
* tests/storagepoolxml2xmlout/pool-scsi.xml:
  - Change the expected output, as the 'type' defaults to 'scsi_host' if 'name'
    specified now
* tests/storagepoolxml2xmlout/pool-scsi-type-scsi-host.xml:
* tests/storagepoolxml2xmlout/pool-scsi-type-fc-host.xml:
  - New test
* tests/storagepoolxml2xmltest.c:
  - Include the test
2013-04-08 18:41:06 +08:00
Daniel P. Berrange
e57aaa6fcf Disable cast-align warnings in various places
There are a number of places which generate cast alignment
warnings, which are difficult or impossible to address. Use
pragmas to disable the warnings in these few places

conf/nwfilter_conf.c: In function 'virNWFilterRuleDetailsParse':
conf/nwfilter_conf.c:1806:16: warning: cast increases required alignment of target type [-Wcast-align]
         item = (nwItemDesc *)((char *)nwf + att[idx].dataIdx);
conf/nwfilter_conf.c: In function 'virNWFilterRuleDefDetailsFormat':
conf/nwfilter_conf.c:3238:16: warning: cast increases required alignment of target type [-Wcast-align]
         item = (nwItemDesc *)((char *)def + att[i].dataIdx);

storage/storage_backend_mpath.c: In function 'virStorageBackendCreateVols':
storage/storage_backend_mpath.c:247:17: warning: cast increases required alignment of target type [-Wcast-align]
         names = (struct dm_names *)(((char *)names) + next);

nwfilter/nwfilter_dhcpsnoop.c: In function 'virNWFilterSnoopDHCPDecode':
nwfilter/nwfilter_dhcpsnoop.c:994:15: warning: cast increases required alignment of target type [-Wcast-align]
         pip = (struct iphdr *) pep->eh_data;
nwfilter/nwfilter_dhcpsnoop.c:1004:11: warning: cast increases required alignment of target type [-Wcast-align]
     pup = (struct udphdr *) ((char *) pip + (pip->ihl << 2));

nwfilter/nwfilter_learnipaddr.c: In function 'procDHCPOpts':
nwfilter/nwfilter_learnipaddr.c:327:33: warning: cast increases required alignment of target type [-Wcast-align]
                 uint32_t *tmp = (uint32_t *)&dhcpopt->value;
nwfilter/nwfilter_learnipaddr.c: In function 'learnIPAddressThread':
nwfilter/nwfilter_learnipaddr.c:501:43: warning: cast increases required alignment of target type [-Wcast-align]
                     struct iphdr *iphdr = (struct iphdr*)(packet +
nwfilter/nwfilter_learnipaddr.c:538:43: warning: cast increases required alignment of target type [-Wcast-align]
                     struct iphdr *iphdr = (struct iphdr*)(packet +
nwfilter/nwfilter_learnipaddr.c:544:48: warning: cast increases required alignment of target type [-Wcast-align]
                         struct udphdr *udphdr= (struct udphdr *)

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-08 10:03:21 +01:00
Daniel P. Berrange
7e6aabc61f Copy struct inotify_event entries to avoid alignment problems
When reading the inotify FD, we get back a sequence of
struct inotify_event, each with variable length data following.
It is not safe to simply cast from the char *buf to the
struct inotify_event struct since this may violate data
alignment rules. Thus we must copy from the char *buf
into the struct inotify_event instance before accessing
the data.
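The same copy-then-read approach can be illustrated in Python, where struct.unpack_from copies the header fields out of the byte buffer at any offset. The header layout assumed here (int wd, then three 32-bit unsigned fields mask, cookie, len, followed by len bytes of name) is the Linux struct inotify_event; the function name is hypothetical.

```python
import struct

# wd, mask, cookie, len in native byte order; the name bytes follow.
EVENT_HEADER = struct.Struct("iIII")


def parse_inotify_events(buf):
    """Split a buffer read from an inotify fd into (wd, mask, cookie, name)."""
    events = []
    offset = 0
    while offset + EVENT_HEADER.size <= len(buf):
        # unpack_from copies the fields, so the offset need not be aligned.
        wd, mask, cookie, name_len = EVENT_HEADER.unpack_from(buf, offset)
        offset += EVENT_HEADER.size
        # The name is NUL padded up to name_len bytes.
        name = buf[offset:offset + name_len].split(b"\0", 1)[0]
        offset += name_len
        events.append((wd, mask, cookie, name.decode()))
    return events


sample = (EVENT_HEADER.pack(1, 0x100, 0, 8) + b"vm1\0\0\0\0\0"
          + EVENT_HEADER.pack(2, 0x200, 0, 0))
print(parse_inotify_events(sample))
```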

uml/uml_driver.c: In function 'umlInotifyEvent':
uml/uml_driver.c:327:13: warning: cast increases required alignment of target type [-Wcast-align]
         e = (struct inotify_event *)tmp;

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-08 10:03:21 +01:00
Daniel P. Berrange
c4f9edf1a1 Use VIR_ALLOC_VAR instead of VIR_ALLOC_N for creating virObject
The current way virObject instances are allocated using
VIR_ALLOC_N causes alignment warnings

util/virobject.c: In function 'virObjectNew':
util/virobject.c:195:11: error: cast increases required alignment of target type [-Werror=cast-align]

Changing to use VIR_ALLOC_VAR will avoid the need to do
the casts entirely.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-08 10:03:21 +01:00
Daniel P. Berrange
e95de74d4c Avoid casts between unsigned char * and struct nlmsghdr
The virNetlinkCommand() method takes an 'unsigned char **'
parameter to be filled with the received netlink message.
The callers then immediately cast this to 'struct nlmsghdr',
triggering (bogus) warnings about increasing alignment
requirements

util/virnetdev.c: In function 'virNetDevLinkDump':
util/virnetdev.c:1300:12: warning: cast increases required alignment of target type [-Wcast-align]
     resp = (struct nlmsghdr *)*recvbuf;
            ^
util/virnetdev.c: In function 'virNetDevSetVfConfig':
util/virnetdev.c:1429:12: warning: cast increases required alignment of target type [-Wcast-align]
     resp = (struct nlmsghdr *)recvbuf;

Since all callers cast to 'struct nlmsghdr' we can avoid
the warning problem entirely by simply changing the
signature of virNetlinkCommand to return a 'struct nlmsghdr **'
instead of 'unsigned char **'. The way we do the cast inside
virNetlinkCommand does not have any alignment issues.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-08 10:03:21 +01:00
Daniel P. Berrange
d27efd8e5d Rewrite keycode map to avoid a struct
Playing games with field offsets in a struct causes all sorts
of alignment warnings on ARM platforms

util/virkeycode.c: In function '__virKeycodeValueFromString':
util/virkeycode.c:26:7: warning: cast increases required alignment of target type [-Wcast-align]
     (*(typeof(field_type) *)((char *)(object) + field_offset))
       ^
util/virkeycode.c:91:28: note: in expansion of macro 'getfield'
         const char *name = getfield(virKeycodes + i, const char *, name_offset);
                            ^
util/virkeycode.c:26:7: warning: cast increases required alignment of target type [-Wcast-align]
     (*(typeof(field_type) *)((char *)(object) + field_offset))
       ^
util/virkeycode.c:94:20: note: in expansion of macro 'getfield'
             return getfield(virKeycodes + i, unsigned short, code_offset);
                    ^
util/virkeycode.c: In function '__virKeycodeValueTranslate':
util/virkeycode.c:26:7: warning: cast increases required alignment of target type [-Wcast-align]
     (*(typeof(field_type) *)((char *)(object) + field_offset))
       ^
util/virkeycode.c:127:13: note: in expansion of macro 'getfield'
         if (getfield(virKeycodes + i, unsigned short, from_offset) == key_value)
             ^
util/virkeycode.c:26:7: warning: cast increases required alignment of target type [-Wcast-align]
     (*(typeof(field_type) *)((char *)(object) + field_offset))
       ^
util/virkeycode.c:128:20: note: in expansion of macro 'getfield'
             return getfield(virKeycodes + i, unsigned short, to_offset);

There is no compelling reason to use a struct for the keycode
tables. It can easily just use an array of arrays instead,
avoiding all alignment problems
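As a rough picture of the array-of-arrays layout: one list per code set, all indexed by the same row number. The Python below is only a model of that idea. The two rows (Escape and the letter A) and the three code sets are illustrative picks, and the names are hypothetical.

```python
LINUX, USB, WIN32 = range(3)

# One list per code set; index i in every list describes the same key.
KEYCODES = [
    [1, 30],        # Linux input codes: KEY_ESC, KEY_A
    [41, 4],        # USB HID usage IDs for the same two keys
    [0x1B, 0x41],   # Win32 virtual key codes for the same two keys
]


def translate(from_set, to_set, value):
    """Translate a key code between code sets; -1 when unknown."""
    for row, code in enumerate(KEYCODES[from_set]):
        if code == value:
            return KEYCODES[to_set][row]
    return -1


print(translate(LINUX, USB, 30))
```

No field offsets are needed: picking a code set is an ordinary index into the outer list.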

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-08 10:03:20 +01:00
Osier Yang
fd1432c7ae qemu: Error out if the bitmap for pinning is all clear
For both "live" and "config" changes of vcpupin and emulatorpin, an
all-clear bitmap doesn't make sense, and it just causes corruption.
For example (emulatorpin behaves similarly):

% virsh vcpupin hame 0 8,^8 --config

% virsh vcpupin hame
VCPU: CPU Affinity
----------------------------------
   0:
   1: 0-63
   2: 0-63
   3: 0-63

% virsh dumpxml hame | grep cpuset
    <vcpupin vcpu='0' cpuset=''/>

% virsh start hame
error: Failed to start domain hame
error: An error occurred, but the cause is unknown
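A simplified model of why "8,^8" ends up empty and how rejecting it early avoids the state above. The parser below handles only single CPUs, ranges and single negated CPUs; the function names and the error text are assumptions for this sketch.

```python
def parse_cpuset(spec):
    """Parse a CPU list such as "0-3,^2" into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if part.startswith("^"):
            cpus.discard(int(part[1:]))          # negation clears one CPU
        elif "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def validate_pinning(spec):
    """Reject a pinning request whose resulting bitmap is all clear."""
    cpus = parse_cpuset(spec)
    if not cpus:
        raise ValueError("empty cpu list for pinning")
    return cpus


print(sorted(validate_pinning("0-3,^2")))
```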
2013-04-06 10:16:59 +08:00
Osier Yang
1acfc171da util: Add a helper to check if all bits of a bitmap are clear 2013-04-06 10:14:21 +08:00
Osier Yang
d4bf0a9378 qemu: Support multiple queue virtio-scsi
This introduces a new attribute "num_queues" (the same name
QEMU uses) for the virtio-scsi controller. An example of the XML:

<controller type='scsi' index='0' model='virtio-scsi' num_queues='8'/>

The corresponding QEMU command line:

-device virtio-scsi-pci,id=scsi0,num_queues=8,bus=pci.0,addr=0x3 \
2013-04-06 10:08:47 +08:00
Eric Blake
5899e09e61 build: check correct protocol.o file
By default, libtool builds two .o files for every .lo rule:
src/foo.o - static builds
src/.libs/foo.o - shared library builds

But since commit ad42b34b disabled static builds, src/foo.o is
no longer built by default.  On a fresh checkout, this means our
protocol check rules using pdwtags were testing a missing file,
and thanks to a lousy behavior of pdwtags happily giving no output
and 0 exit status (http://bugzilla.redhat.com/949034), we were
merely claiming that "dwarves is too old" and skipping the test.

However, if you swap between branches and do incremental builds,
such as building v0.10.2-maint and then switching back to master,
you end up with src/foo.o being leftover from its 0.10.2 state,
and then 'make check' fails because the .o file does not match
the protocol-structs file due to API additions in the meantime.

A simpler fix would be to always look in .libs for the .o to
be parsed; but since it is possible to pass ./configure options
to tell libtool to do a static-only build with no shared .o,
I went with the approach of finding the newest of the two files,
whenever both exist.
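The "newest of the two files" choice, expressed as a Python sketch rather than the Makefile rule the commit actually changes. The function name is hypothetical.

```python
import os


def newest_object(path_static, path_shared):
    """Return whichever existing .o file was modified most recently."""
    candidates = [p for p in (path_static, path_shared) if os.path.exists(p)]
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


print(newest_object("src/foo.o", "src/.libs/foo.o"))
```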

* src/Makefile.am (PDWTAGS): Ensure we test just-built file.
2013-04-05 11:23:18 -06:00
Peter Krempa
ce65b43589 qemu: Remove maximum cpu limit when setting processor count using the API
When setting the processor count for a domain using the API, libvirt enforced
a maximum processor count, even though no such limit is enforced when taking
the XML path.

This patch removes the check to match the XML.
2013-04-05 15:36:00 +02:00
Daniel P. Berrange
56f27b3bbc Don't create dirs in cgroup controllers we don't want to use
Currently when getting an instance of virCgroupPtr we will
create the path in all cgroup controllers. Only at the virt
driver layer are we attempting to filter controllers. This
is bad because the mere act of creating the dirs in the
controllers can have a functional impact on the kernel,
particularly for performance.

Update the virCgroupForDriver() method to accept a bitmask
of controllers to use. Only create dirs in the controllers
that are requested. When creating cgroups for domains,
respect the active controller list from the parent cgroup
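A sketch of the bitmask filtering, with an illustrative controller list and hypothetical helper names. Directories are only computed here, not created.

```python
CONTROLLERS = ("cpu", "cpuacct", "cpuset", "memory",
               "devices", "freezer", "blkio")


def mask_of(*names):
    """Build a bitmask with one bit per named controller."""
    mask = 0
    for name in names:
        mask |= 1 << CONTROLLERS.index(name)
    return mask


def dirs_to_create(requested_mask, parent_mask, path):
    """List the per-controller directories that should be created.

    A controller is used only if it was requested and is active in the parent.
    """
    active = requested_mask & parent_mask
    return [
        "{}/{}".format(name, path)
        for bit, name in enumerate(CONTROLLERS)
        if active & (1 << bit)
    ]


wanted = mask_of("cpu", "memory", "blkio")
parent = mask_of("cpu", "memory", "devices")
print(dirs_to_create(wanted, parent, "libvirt/qemu/vm1"))
```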

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-05 10:41:54 +01:00
Daniel P. Berrange
804a809a06 Rename virCgroupGetAppRoot to virCgroupForSelf
The virCgroupGetAppRoot is not clear in its meaning. Change
to virCgroupForSelf to highlight that this returns the
cgroup config for the caller's process

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-05 10:41:54 +01:00
Peter Krempa
8ad126e695 rpc: Fix connection close callback race condition and memory corruption/crash
Viktor's last effort to fix the race and memory corruption unfortunately
wasn't complete in the case where the close callback was not registered on a
connection. In that case, the trail of events that I'll describe later could
still happen and corrupt the memory or cause a crash of the client (including
the daemon in case of a p2p migration).

Consider the following prerequisites and trail of events:
Let's have a remote connection to a hypervisor that doesn't have a close
callback registered and the client is using the event loop. The crash happens
through the interaction of 2 threads. Thread E is the event loop and thread W
is the worker that does some stuff. R denotes the remote client.

1.) W - The client finishes everything and sheds the last reference on the client
2.) W - The virObject stuff invokes virConnectDispose that invokes doRemoteClose
3.) W - the remote close method invokes the REMOTE_PROC_CLOSE RPC method.
4.) W - The thread is preempted at this point.
5.) R - The remote side receives the close and closes the socket.
6.) E - poll() wakes up due to the closed socket and invokes the close callback
7.) E - The event loop is preempted right before remoteClientCloseFunc is called
8.) W - The worker now finishes, and frees the conn object.
9.) E - The remoteClientCloseFunc accesses the now-freed conn object in the
        attempt to retrieve pointer for the real close callback.
10.) Kaboom, corrupted memory/segfault.

This patch tries to fix this by introducing a new object that survives the
freeing of the connection object. We can't increase the reference count on the
connection object itself or the connection would never be closed, as the
connection is closed only when the reference count reaches zero.

The new object - virConnectCloseCallbackData - is a lockable object that keeps
the pointers to the real user registered callback and ensures that the
connection callback is either not called if the connection was already freed or
that the connection isn't freed while this is being called.
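
The idea of the lockable object can be sketched as follows. This is a minimal
illustration in Python rather than the C code of the patch; the class and
method names (CloseCallbackData, connection_freed, fire) are hypothetical and
only mirror the behaviour described above.

```python
import threading


class CloseCallbackData:
    """Hypothetical stand-in for virConnectCloseCallbackData."""

    def __init__(self, conn, callback):
        self._lock = threading.Lock()
        self._conn = conn          # cleared once the connection is freed
        self._callback = callback  # the user-registered close callback

    def connection_freed(self):
        # Called by the worker thread while disposing the connection.
        # Holding the lock means a callback that is already running
        # finishes before the connection goes away.
        with self._lock:
            self._conn = None
            self._callback = None

    def fire(self, reason):
        # Called by the event loop when the socket is closed.
        # Returns True only if the user callback was actually invoked.
        with self._lock:
            if self._conn is None or self._callback is None:
                return False
            self._callback(self._conn, reason)
            return True
```

Because both paths take the same lock, the event loop either sees a live
connection for the whole duration of the callback or skips the call entirely.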
2013-04-05 10:36:03 +02:00
Viktor Mihajlovski
03a43efa86 libvirt: Increase connection reference count for callbacks
By adjusting the reference count of the connection object we
prevent races between callback function and virConnectClose.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2013-04-05 10:36:02 +02:00
Peter Krempa
482e5f159c virCaps: get rid of defaultConsoleTargetType callback
This patch refactors various places to allow removing of the
defaultConsoleTargetType callback from the virCaps structure.

A new console character device target type is introduced -
VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_NONE - to mark that no type was
specified in the XML. This type is at the end converted to the standard
VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL. Other types that are
different from this default have to be processed separately in the
device post parse callback.
2013-04-04 22:42:39 +02:00
Peter Krempa
46becc18ba virCaps: get rid of macPrefix field
Use the virDomainXMLConf structure to hold this data and tweak the code
to avoid semantic change.

Without configuration the KVM mac prefix is used by default. I chose it
as it's in the privately administered segment so it should be usable for
any purposes.
2013-04-04 22:42:38 +02:00
Peter Krempa
8960d65674 virCaps: get rid of hasWideScsiBus
Use the virDomainXMLConf structure to hold this data.
2013-04-04 22:42:38 +02:00
Peter Krempa
b299084988 virCaps: get rid of defaultDiskDriverType
Use the qemu specific callback to fill this data in the qemu driver as
it's the only place where it was used and fix tests as the qemu test
capability object didn't configure the defaults for the tests.
2013-04-04 22:42:38 +02:00
Peter Krempa
b5def001cc virCaps: get rid of emulatorRequired
This patch removes the emulatorRequired field and associated
infrastructure from the virCaps object. Instead the driver specific
callbacks are used as this field isn't enforced by all drivers.

This patch implements the appropriate callbacks in the qemu and lxc
driver and moves the check to that location.
2013-04-04 22:42:38 +02:00
Peter Krempa
9ea249e7d9 virCaps: get rid of defaultDiskDriverName
This patch removes the defaultDiskDriverName from the virCaps
structure. This particular default value is used only in the qemu driver
so this patch uses the recently added callback to fill the driver name
if it's needed instead of propagating it through virCaps.
2013-04-04 22:42:38 +02:00
Peter Krempa
4750c848e9 virCaps: get rid of "defaultInitPath" value in the virCaps struct
This gets rid of the parameter in favor of using the new callback
infrastructure to do the same stuff.

This patch implements the domain adjustment callback in the openVZ
driver and moves the check from the parser to a new validation method in
the callback infrastructure.
2013-04-04 22:42:37 +02:00
Peter Krempa
a68d672667 qemu: Record the default NIC model in the domain XML
This patch implements the devices post parse callback and uses it to fill
the default qemu network card model into the XML if none is specified.

Libvirt assumes that the network card model for qemu is the "rtl8139".
Record this in the XML using the new callback to avoid user
confusion.
2013-04-04 22:41:20 +02:00
Peter Krempa
ad0d10b2b1 conf callback: Rearrange function parameters
Move the xmlopt and caps arguments to the end of the argument list.
2013-04-04 22:41:19 +02:00
Peter Krempa
43b99fc4c0 conf: Add post XML parse callbacks and prepare for cleaning of virCaps
This patch adds instrumentation that will allow hypervisor drivers to
fill and validate domain and device definitions after they are parsed by
the XML parser.

With this patch, after the XML is parsed, a callback to the driver is
issued requesting to fill and validate driver specific details of the
configuration. This allows using sensible defaults and checks on a per
driver basis at the time the XML is parsed.

Two callback pointers are stored in the new virDomainXMLConf object:
* virDomainDeviceDefPostParseCallback (devicesPostParseCallback)
  - called for a single device parsed and for every single device in a
    domain config. A virDomainDeviceDefPtr is passed along with the
    domain definition and virCaps.

* virDomainDefPostParseCallback (domainPostParseCallback)
  - A callback that is meant to process the domain config after it's
  parsed.  A virDomainDefPtr is passed along with virCaps.

Both types of callbacks support arbitrary opaque data passed for the
callback functions.

Errors may be reported in those callbacks resulting in an XML parsing
failure.
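
A rough sketch of how such callbacks could be driven, written in Python for
illustration only. XMLOption, post_parse and fill_default_nic_model are
hypothetical names, domains are modelled as plain dicts, and the order in
which the two callbacks run is an assumption, not something this message
specifies.

```python
class XMLOption:
    """Hypothetical holder for the two callbacks and their opaque data."""

    def __init__(self, domain_post_parse=None, device_post_parse=None,
                 opaque=None):
        self.domain_post_parse = domain_post_parse
        self.device_post_parse = device_post_parse
        self.opaque = opaque


def post_parse(domain, xmlopt, caps=None):
    # Assumed order: domain callback first, then every device in turn.
    # A callback reports an error by raising, which fails the whole parse.
    if xmlopt.domain_post_parse:
        xmlopt.domain_post_parse(domain, caps, xmlopt.opaque)
    if xmlopt.device_post_parse:
        for dev in domain["devices"]:
            xmlopt.device_post_parse(dev, domain, caps, xmlopt.opaque)
    return domain


def fill_default_nic_model(dev, domain, caps, opaque):
    # Example driver callback: give NICs without a model the driver default.
    if dev.get("type") == "interface" and not dev.get("model"):
        dev["model"] = opaque["default_nic_model"]
```

A driver would register its own callbacks once and the parser would call
post_parse() on every definition it produces.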
2013-04-04 22:29:48 +02:00
Peter Krempa
e84b19316a maint: Rename xmlconf to xmlopt and virDomainXMLConfig to virDomainXMLOption
This patch is the result of running:

for i in $(git ls-files | grep -v html | grep -v \.po$ ); do
  sed -i -e "s/virDomainXMLConf/virDomainXMLOption/g" -e "s/xmlconf/xmlopt/g" $i
done

and a few manual tweaks.
2013-04-04 22:18:56 +02:00
Daniel P. Berrange
8d3d05d3c1 Create fake NUMA info if libnuma isn't available
If libnuma is not compiled in, or numa_available() returns an
error, stub out fake NUMA info consisting of one NUMA cell
containing all CPUs and memory.
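
As an illustration of the fallback, here is a small Python sketch. The
function names and the dict layout are hypothetical; `probe` stands for
whatever would query libnuma and is assumed to return None when NUMA
information is unavailable.

```python
def fake_numa_topology(cpu_ids, total_memory_kib):
    # A single cell with id 0 that owns every CPU and all memory.
    return [{"id": 0, "cpus": sorted(cpu_ids), "memory_kib": total_memory_kib}]


def numa_topology(probe, cpu_ids, total_memory_kib):
    # probe is None when libnuma is not compiled in; it returns None
    # (or an empty list) when NUMA is reported as unavailable.
    cells = probe() if probe is not None else None
    return cells if cells else fake_numa_topology(cpu_ids, total_memory_kib)
```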

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-04 11:07:32 +01:00
Daniel P. Berrange
4a2891510b Cope with missing /sys/devices/system/cpu/cpu0/topology files
Not all kernel builds have any entries under the location
/sys/devices/system/cpu/cpu0/topology. We already cope with
that being missing in some cases, but not all. Update the
code which looks for thread_siblings to cope with the missing
file

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-04 11:07:32 +01:00
Daniel P. Berrange
9c29c52c5a Add armv6l architecture to list of valid arches
The Raspberry Pi runs the armv6l architecture and apparently
people are trying to run libvirt LXC on it. So we should allow
that as a valid arch

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-04 11:07:32 +01:00
Daniel P. Berrange
347081effa Implement minimal sysinfo for ARM platforms
Implement the bare minimal sysinfo for ARM platforms by
reading the CPU models from /proc/cpuinfo

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-04 11:07:31 +01:00
Daniel P. Berrange
6263fc5a5b Wire up sysinfo for LXC driver
The sysinfo code used by QEMU is trivially portable to the
LXC driver
2013-04-04 11:07:00 +01:00
Daniel P. Berrange
e2b373e6d6 Add support for SD cards in nodedev driver
The nodedev driver currently only detects harddisk, cdrom
and floppy devices. This adds support for SD cards, which
are common storage for ARM devices, eg the Google ChromeBook

<device>
  <name>block_mmcblk0_0xb1c7c08b</name>
  <parent>computer</parent>
  <capability type='storage'>
    <block>/dev/mmcblk0</block>
    <drive_type>sd</drive_type>
    <serial>0xb1c7c08b</serial>
    <size>15758000128</size>
    <logical_block_size>512</logical_block_size>
    <num_blocks>30777344</num_blocks>
  </capability>
</device>

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-04 11:07:00 +01:00
Daniel P. Berrange
edd87fa2ea Revert "lxc: Prevent shutting down the host"
This reverts commit c9c87376f2.

Now that we force all containers to have a root filesystem,
there is no way the host's /dev is ever exposed

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-04 10:51:59 +01:00
Daniel P. Berrange
c131525bec Auto-add a root <filesystem> element to LXC containers on startup
Currently the LXC container code has two codepaths, depending on
whether there is a <filesystem> element with a target path of '/'.
If we automatically add a <filesystem> device with src=/ and dst=/,
for any container which has not specified a root filesystem, then
we only need one codepath for setting up the filesystem.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-04 10:51:59 +01:00
Daniel P. Berrange
f7e8653f7e Remove support for old kernels lacking private devpts
Early on kernel support for private devpts was not widespread,
so we had compatibility codepaths. Such old kernels are not
seriously used for LXC these days, so the compat code can go
away

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-04 10:51:59 +01:00
Atsushi Kumagai
d369e50825 storage: Fix volume cloning for logical volume.
When creating a logical volume with virStorageVolCreateXMLFrom,
"qemu-img convert" is called internally if clonevol is a file volume.
Then, vol->target.format is used as output_fmt parameter but the
target.format of logical volumes is always 0 because logical volumes
do not have the volume format type element.

Fortunately, 0 was treated as RAW file format before commit f772b3d9,
so there was no problem. But now that 0 is treated as the type none,
qemu-img fails with "Unknown file format 'none'".

This patch fixes this issue by treating output block devices as RAW
file format like for input block devices.
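
A minimal Python sketch of the rule, for illustration only. The helper name
and its parameters are hypothetical; only the qemu-img "convert" subcommand
with its -f and -O options is taken as given.

```python
def qemu_img_convert_args(src, dst, in_format, out_format, out_is_block):
    # Logical volumes have no format element, so out_format arrives unset
    # ("none"). Block devices are always written as raw.
    if out_is_block:
        out_format = "raw"
    return ["qemu-img", "convert", "-f", in_format, "-O", out_format, src, dst]
```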

Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
2013-04-04 10:52:07 +02:00
Guido Günther
ea151935bb security_manager: fix comparison
otherwise we crash later on if we don't find a match like:

 #0  0xb72c2b4f in virSecurityManagerGenLabel (mgr=0xb8e42d20, vm=0xb8ef40c0) at security/security_manager.c:424
 #1  0xb18811f3 in qemuProcessStart (conn=conn@entry=0xb8eed880, driver=driver@entry=0xb8e3b1e0, vm=vm@entry=0xb8ef58f0,
     migrateFrom=migrateFrom@entry=0xb18f6088 "stdio", stdin_fd=18,
     stdin_path=stdin_path@entry=0xb8ea7798 "/var/lib/jenkins/jobs/libvirt-tck-build/workspace/tck.img", snapshot=snapshot@entry=0x0,
     vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_RESTORE, flags=flags@entry=2) at qemu/qemu_process.c:3364
 #2  0xb18d6cb2 in qemuDomainSaveImageStartVM (conn=conn@entry=0xb8eed880, driver=driver@entry=0xb8e3b1e0, vm=0xb8ef58f0, fd=fd@entry=0xb6bf3f98,
     header=header@entry=0xb6bf3fa0, path=path@entry=0xb8ea7798 "/var/lib/jenkins/jobs/libvirt-tck-build/workspace/tck.img",
     start_paused=start_paused@entry=false) at qemu/qemu_driver.c:4843
 #3  0xb18d7eeb in qemuDomainRestoreFlags (conn=conn@entry=0xb8eed880,
     path=path@entry=0xb8ea7798 "/var/lib/jenkins/jobs/libvirt-tck-build/workspace/tck.img", dxml=dxml@entry=0x0, flags=flags@entry=0)
     at qemu/qemu_driver.c:4962
 #4  0xb18d8123 in qemuDomainRestore (conn=0xb8eed880, path=0xb8ea7798 "/var/lib/jenkins/jobs/libvirt-tck-build/workspace/tck.img")
     at qemu/qemu_driver.c:4987
 #5  0xb718d186 in virDomainRestore (conn=0xb8eed880, from=0xb8ea87d8 "/var/lib/jenkins/jobs/libvirt-tck-build/workspace/tck.img") at libvirt.c:2768
 #6  0xb7736363 in remoteDispatchDomainRestore (args=<optimized out>, rerr=0xb6bf41f0, client=0xb8eedaf0, server=<optimized out>, msg=<optimized out>)
     at remote_dispatch.h:4679
 #7  remoteDispatchDomainRestoreHelper (server=0xb8e1a3e0, client=0xb8eedaf0, msg=0xb8ee72c8, rerr=0xb6bf41f0, args=0xb8ea8968, ret=0xb8ef5330)
     at remote_dispatch.h:4661
 #8  0xb720db01 in virNetServerProgramDispatchCall (msg=0xb8ee72c8, client=0xb8eedaf0, server=0xb8e1a3e0, prog=0xb8e216b0)
     at rpc/virnetserverprogram.c:439
 #9  virNetServerProgramDispatch (prog=0xb8e216b0, server=server@entry=0xb8e1a3e0, client=0xb8eedaf0, msg=0xb8ee72c8) at rpc/virnetserverprogram.c:305
 #10 0xb7206e97 in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0xb8e1a3e0) at rpc/virnetserver.c:162
 #11 virNetServerHandleJob (jobOpaque=0xb8ea7720, opaque=0xb8e1a3e0) at rpc/virnetserver.c:183
 #12 0xb70f9f78 in virThreadPoolWorker (opaque=opaque@entry=0xb8e1a540) at util/virthreadpool.c:144
 #13 0xb70f94a5 in virThreadHelper (data=0xb8e0e558) at util/virthreadpthread.c:161
 #14 0xb705d954 in start_thread (arg=0xb6bf4b70) at pthread_create.c:304
 #15 0xb6fd595e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

This unbreaks libvirt-tck's domain/100-transient-save-restore.t with
qemu:///session and selinux compiled in but disabled.

Introduced by 8d68cbeaa8
2013-04-03 22:57:31 +02:00
Eric Blake
e52a31d166 qemu: fix memory leak on -machine usage error
Commit f84b92ea introduced a memory leak on error; John Ferlan reported
that valgrind caught it during 'make check'.

* src/qemu/qemu_command.c (qemuBuildMachineArgStr): Plug leak.
2013-04-03 11:55:18 -06:00
Daniel P. Berrange
fc8c1787d8 Enable full RELRO mode
By passing the flags -z relro -z now to the linker, we can force
it to resolve all library symbols at startup, instead of on-demand.
This allows it to then make the global offset table (GOT) read-only,
which makes some security attacks harder.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-03 16:19:35 +01:00
Daniel P. Berrange
1150999ca4 Build all binaries with PIE
PIE (position independent executable) adds security to executables
by composing them entirely of position-independent code (PIC). The
.so libraries already build with -fPIC. This adds -fPIE which is
the equivalent to -fPIC, but for executables. This allows Exec
Shield to use address space layout randomization to prevent attackers
from knowing where existing executable code is during a security
attack using exploits that rely on knowing the offset of the
executable code in the binary, such as return-to-libc attacks.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-04-03 16:19:35 +01:00
Peter Krempa
24ca8fae64 qemu-blockjob: Fix limit of bandwidth for block jobs to supported value
The JSON generator is able to represent only values less than LLONG_MAX. Fix the
bandwidth limit checks when converting the value to catch overflows before they
reach the generator.
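
The kind of check meant here can be sketched in Python. This assumes the
bandwidth is given in MiB/s and converted to bytes by shifting left by 20,
which is an assumption for illustration; bandwidth_to_bytes is a hypothetical
name.

```python
LLONG_MAX = 2**63 - 1


def bandwidth_to_bytes(bandwidth_mib):
    # Reject values whose byte count would not fit into a signed 64-bit
    # integer, so the overflow is caught before JSON generation.
    if bandwidth_mib < 0 or bandwidth_mib > (LLONG_MAX >> 20):
        raise OverflowError("bandwidth must be less than %d" % (LLONG_MAX >> 20))
    return bandwidth_mib << 20
```

Comparing against the shifted limit avoids ever computing the overflowing
product, which matters in C where the multiplication itself would wrap.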
2013-04-03 16:38:51 +02:00
Michal Privoznik
8d68cbeaa8 sec_manager: Refuse to start domain with unsupported seclabel
https://bugzilla.redhat.com/show_bug.cgi?id=947387

If a user configures a domain to use a seclabel of a specific type,
but the appropriate driver is not accessible, we should refuse to
start the domain. For instance, if user requires selinux, but it is
either not present in the system, or is just disabled, we should not
start the domain. Moreover, since we are touching only those labels we
have a security driver for, the other labels may confuse libvirt when
reconnecting to a domain on libvirtd restart. In our selinux example,
when starting up a domain, missing security label is okay, as we
auto-generate one. But later, when libvirt is re-connecting to a live
qemu instance, we parse a state XML, where security label is required
and it is an error if missing:

  error : virSecurityLabelDefParseXML:3228 : XML error: security label
  is missing

This results in a qemu process left behind without any libvirt control.
2013-04-03 10:19:46 +02:00
Peter Krempa
43b6f304bc qemu: Fix crash when updating media with shared device
Mimic the fix done in 02b9097274 to fix a crash caused by
accessing an already freed structure. Also copy the comment explaining why the
pointer can't be accessed any more.
2013-04-02 23:15:00 +02:00
Peter Krempa
6bd94a1b59 Use virMacAddrFormat instead of manual mac address formatting
Format the address using the helper instead of having similar code in
multiple places.

This patch also fixes leak of the MAC address string in
ebtablesRemoveForwardAllowIn() and ebtablesAddForwardAllowIn() in
src/util/virebtables.c
2013-04-02 15:53:43 +02:00
Peter Krempa
ab4bf20ead util: Change virMacAddrFormat to lowercase hex characters
The domain XML generator creates the MAC address strings with lowercase
characters using a separate piece of code. This patch changes the formatting
helper to do the same to allow using it to normalize a string
provided by the user. After this change some of the tests that
output the MAC address will need to be changed.
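
For illustration, the normalization this enables looks roughly like the
following Python sketch. mac_addr_format and mac_addr_normalize are
hypothetical names, not the libvirt helpers themselves.

```python
def mac_addr_format(octets):
    # Format six octets as colon-separated lowercase hex.
    if len(octets) != 6:
        raise ValueError("a MAC address has exactly six octets")
    return ":".join("%02x" % b for b in octets)


def mac_addr_normalize(text):
    # Parse a user-provided string and re-emit it in the canonical form.
    return mac_addr_format(bytes(int(part, 16) for part in text.split(":")))
```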
2013-04-02 15:53:43 +02:00
Li Zhang
f84b92ea19 Optimize machine option to set more options with it
Currently, the -machine option is used only when dump-guest-core is set.

To use the options defined under -machine in newer versions of QEMU,
libvirt needs to pass -machine xxx, while staying compatible with older
versions that use -M. This patch adds the QEMU_CAPS_MACHINE_OPT capability
for newer versions which support the -machine option.

Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2013-04-02 07:02:34 -06:00
Peter Krempa
f8e3221f99 conf: Enforce ranges on cputune variables
The limits are documented at
http://libvirt.org/formatdomain.html#elementsCPUTuning . Enforce them
when going through XML parsing in addition to being enforced by the API.
2013-04-02 14:50:25 +02:00
Michal Privoznik
5e5ca84e31 test: Return Libvirt logo as domain screenshot
This is just a bare Easter Egg. Whenever a user runs virDomainScreenshot
over a domain in test driver, he'll get the Libvirt PNG logo in return.
2013-04-02 14:38:56 +02:00
Eric Blake
6f7e4ea359 smartcard: spell ccid-card-emulated qemu property correctly
Reported by Anthony Messina in
https://bugzilla.redhat.com/show_bug.cgi?id=904692
Present since introduction of smartcard support in commit f5fd9baa

* src/qemu/qemu_command.c (qemuBuildCommandLine): Match qemu spelling.
* tests/qemuxml2argvdata/qemuxml2argv-smartcard-host-certificates.args:
Fix broken test.
2013-04-02 06:23:33 -06:00
Ján Tomko
f03dcc5df1 qemu: Allow migration over IPv6
Allow migration over IPv6 by listening on [::] instead of 0.0.0.0
when QEMU supports it (QEMU_CAPS_IPV6_MIGRATION) and there is
at least one v6 address configured on the system.

Use virURIParse in qemuMigrationPrepareDirect to allow parsing
IPv6 addresses, which would cause an 'incorrect :port' error
message before.
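
Both points can be sketched in Python for illustration. The function names
are hypothetical, and urlsplit from the Python standard library stands in for
virURIParse here.

```python
from urllib.parse import urlsplit


def migration_listen_address(qemu_supports_ipv6, host_has_ipv6_address):
    # Listen on [::] only when QEMU can do it and the host has a v6 address.
    if qemu_supports_ipv6 and host_has_ipv6_address:
        return "[::]"
    return "0.0.0.0"


def parse_migration_uri(uri):
    # A real URI parser understands the bracketed IPv6 form, whereas
    # splitting on the first ':' would misread the address as a port.
    parts = urlsplit(uri)
    return parts.hostname, parts.port
```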

Move setting of migrateFrom from qemuMigrationPrepare{Direct,Tunnel}
after domain XML parsing, since we need the QEMU binary path from it
to get its capabilities.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=846013
2013-04-02 11:23:47 +02:00
John Ferlan
9a80050e52 Resolve valgrind failure
Code added by commit id '523207fe8'

TEST: qemuxml2argvtest
      ........................................ 40
      ........................................ 80
      ........................................ 120
      ........................................ 160
      ........................................ 200
      ........................................ 240
      .................................        273 OK
==30993== 39 bytes in 1 blocks are definitely lost in loss record 33 of 87
==30993==    at 0x4A0887C: malloc (vg_replace_malloc.c:270)
==30993==    by 0x41E501: fakeSecretGetValue (qemuxml2argvtest.c:33)
==30993==    by 0x427591: qemuBuildDriveURIString (qemu_command.c:2571)
==30993==    by 0x42C502: qemuBuildDriveStr (qemu_command.c:2627)
==30993==    by 0x4335FC: qemuBuildCommandLine (qemu_command.c:6443)
==30993==    by 0x41E8A0: testCompareXMLToArgvHelper (qemuxml2argvtest.c:154
==30993==    by 0x41FE8F: virtTestRun (testutils.c:157)
==30993==    by 0x418BE3: mymain (qemuxml2argvtest.c:506)
==30993==    by 0x4204CA: virtTestMain (testutils.c:719)
==30993==    by 0x38D6821A04: (below main) (in /usr/lib64/libc-2.16.so)
==30993==
==30993== 46 bytes in 1 blocks are definitely lost in loss record 64 of 87
==30993==    at 0x4A0887C: malloc (vg_replace_malloc.c:270)
==30993==    by 0x38D690A167: __vasprintf_chk (in /usr/lib64/libc-2.16.so)
==30993==    by 0x4CB28E7: virVasprintf (stdio2.h:210)
==30993==    by 0x4CB29A3: virAsprintf (virutil.c:2017)
==30993==    by 0x4275B4: qemuBuildDriveURIString (qemu_command.c:2580)
==30993==    by 0x42C502: qemuBuildDriveStr (qemu_command.c:2627)
==30993==    by 0x4335FC: qemuBuildCommandLine (qemu_command.c:6443)
==30993==    by 0x41E8A0: testCompareXMLToArgvHelper (qemuxml2argvtest.c:154
==30993==    by 0x41FE8F: virtTestRun (testutils.c:157)
==30993==    by 0x418BE3: mymain (qemuxml2argvtest.c:506)
==30993==    by 0x4204CA: virtTestMain (testutils.c:719)
==30993==    by 0x38D6821A04: (below main) (in /usr/lib64/libc-2.16.so)
==30993==
==30993== 385 (56 direct, 329 indirect) bytes in 1 blocks are definitely los
==30993==    at 0x4A06B6F: calloc (vg_replace_malloc.c:593)
==30993==    by 0x4C6B2CF: virAllocN (viralloc.c:152)
==30993==    by 0x4C9C7EB: virObjectNew (virobject.c:191)
==30993==    by 0x4D21810: virGetSecret (datatypes.c:642)
==30993==    by 0x41E5D5: fakeSecretLookupByUsage (qemuxml2argvtest.c:51)
==30993==    by 0x4D4BEC5: virSecretLookupByUsage (libvirt.c:15295)
==30993==    by 0x4276A9: qemuBuildDriveURIString (qemu_command.c:2565)
==30993==    by 0x42C502: qemuBuildDriveStr (qemu_command.c:2627)
==30993==    by 0x4335FC: qemuBuildCommandLine (qemu_command.c:6443)
==30993==    by 0x41E8A0: testCompareXMLToArgvHelper (qemuxml2argvtest.c:154
==30993==    by 0x41FE8F: virtTestRun (testutils.c:157)
==30993==    by 0x418BE3: mymain (qemuxml2argvtest.c:506)
==30993==
PASS: qemuxml2argvtest

Interesting side note is that running the test singularly via 'make -C tests
check TESTS=qemuxml2argvtest' didn't trip the valgrind error; however,
running during 'make -C tests valgrind' did cause the error to be seen.
2013-04-01 13:13:31 -04:00
Martin Kletzander
2d73f2120f storage: Avoid double virCommandFree in virStorageBackendLogicalDeletePool
When logical pool has no PVs associated with itself (user-created),
virCommandFree(cmd) is called twice with the same pointer and that
causes a segfault in daemon.
2013-03-29 11:09:32 +01:00
Ján Tomko
248371417b nodedev: invert virIsCapableFCHost return value
Both virIsCapableFCHost and virIsCapableVport return 0 when the
respective sysfs path is accessible.
2013-03-29 11:32:04 +08:00
Michal Privoznik
a1c68a1fcb security_manager.c: Append seclabel iff generated
With my previous patches, we unconditionally appended a seclabel,
even if it wasn't generated but found in array of defined seclabels.
This resulted in double free later when doing virDomainDefFree
and iterating over the array of defined seclabels.

Moreover, there was another possibility of double free, if the
seclabel was generated in the last iteration of the process of
walking through the security managers array.
2013-03-28 16:13:01 +01:00
Michal Privoznik
0e9df6bd10 virutil: Fix compilation on non-linux platforms
There has been a typo in virIsCapbleVport function name.
2013-03-28 13:23:04 +01:00
Osier Yang
5eeb56fb2a util: Fix the conflict type for virIsCapableFCHost
---
Pushed under build-breaker rule.
2013-03-28 20:17:05 +08:00
Michal Privoznik
a919e6f776 libvirt_private.syms: Correctly export seclabel APIs
One of my previous patches manipulated virSecurityLabel* APIs,
some were added to header files, and some were renamed. However,
these changes were not reflected in libvirt_private.syms.
2013-03-28 10:39:25 +01:00
Michal Privoznik
e4a28a3281 security: Don't add seclabel of type none if there's already a seclabel
https://bugzilla.redhat.com/show_bug.cgi?id=923946

The <seclabel type='none'/> should be added iff there is no other
seclabel defined within a domain. This bug can be easily reproduced:
1) configure selinux seclabel for a domain
2) disable system's selinux and restart libvirtd
3) observe <seclabel type='none'/> being appended to a domain on its
   startup
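
The intended rule, as a tiny Python sketch with seclabels modelled as a list
of dicts. ensure_seclabel is a hypothetical name used only for illustration.

```python
def ensure_seclabel(seclabels):
    # Append a type='none' label only if the domain defines no seclabel at all.
    if not seclabels:
        seclabels.append({"type": "none"})
    return seclabels
```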
2013-03-28 10:01:06 +01:00
Michal Privoznik
6c4de11614 security_manager: Don't manipulate domain XML in virDomainDefGetSecurityLabelDef
The virDomainDefGetSecurityLabelDef was modifying the domain XML.
It tried to find a seclabel corresponding to given sec driver. If the
label wasn't found, the function created one which is wrong. In fact
it's security manager which should modify this part of domain XML.
2013-03-28 10:01:06 +01:00
Guannan Ren
7a0f502119 conf: fix memory leak of class_id bitmap
When libvirtd loads active network configs from network state directory,
it should release the class_id memory block which was allocated
at the time of loading xml from network config directory.
virBitmapParse will create a new memory block of bitmap class_id which
causes a memory leak.

This happens when at least one virtual network was active beforehand.

==12234== 8,216 (24 direct, 8,192 indirect) bytes in 1 blocks are definitely \
              lost in loss record 702 of 709
==12234==    at 0x4A06B2F: calloc (vg_replace_malloc.c:593)
==12234==    by 0x37AB04D77D: virAlloc (in /usr/lib64/libvirt.so.0.1000.3)
==12234==    by 0x37AB04EF89: virBitmapNew (in /usr/lib64/libvirt.so.0.1000.3)
==12234==    by 0x37AB0BFB37: virNetworkAssignDef (in /usr/lib64/libvirt.so.0.1000.3)
==12234==    by 0x37AB0BFD31: ??? (in /usr/lib64/libvirt.so.0.1000.3)
==12234==    by 0x37AB0BFE92: virNetworkLoadAllConfigs (in /usr/lib64/libvirt.so.0.1000.3)
==12234==    by 0x10650E5A: ??? (in /usr/lib64/libvirt/connection-driver/libvirt_driver_network.so)
==12234==    by 0x37AB0EB72F: virStateInitialize (in /usr/lib64/libvirt.so.0.1000.3)
==12234==    by 0x40DE04: ??? (in /usr/sbin/libvirtd)
==12234==    by 0x37AB0832E8: ??? (in /usr/lib64/libvirt.so.0.1000.3)
==12234==    by 0x3796807D14: start_thread (in /usr/lib64/libpthread-2.16.so)
==12234==    by 0x37960F246C: clone (in /usr/lib64/libc-2.16.so)
2013-03-28 12:10:05 +08:00
Guannan Ren
02cbd8b67e uml:release config object when uml driver shutdown 2013-03-28 12:07:35 +08:00
Guannan Ren
1cb03d4e4b qemu:release qemu config object when qemu driver shutdown 2013-03-28 12:07:27 +08:00
Stefan Seyfried
e669a65903 net: use newer iptables syntax
iptables-1.4.18 removed the long deprecated "state" match.
Use "conntrack" instead in forwarding rules.
Fixes openSUSE bug https://bugzilla.novell.com/811251 #811251.
2013-03-27 16:20:03 -06:00
Viktor Mihajlovski
d0cc811ed0 remote: Don't call NULL closeFreeCallback
Check function pointer before calling.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2013-03-27 14:08:19 +01:00
Jiri Denemark
d8d4aa01d8 rpc: Fix client crash when server drops connection
Despite the comment stating virNetClientIncomingEvent handler should
never be called with either client->haveTheBuck or client->wantClose
set, there is a sequence of events that may lead to both booleans being
true when virNetClientIncomingEvent is called. However, when that
happens, we must not immediately close the socket as there are other
threads waiting for the buck and they would cause SIGSEGV once they are
woken up after the socket was closed. Another thing is we should clear
all remaining calls in the queue after closing the socket.

The situation that can lead to the crash involves three threads, one of
them running event loop and the other two calling libvirt APIs. The
event loop thread detects an event on client->sock and calls
virNetClientIncomingEvent handler. But before the handler gets a chance
to lock client, the other two threads (T1 and T2) start calling some
APIs. T1 gets the buck and detects EOF on client->sock while processing
its RPC call. Since T2 is waiting for its own call, T1 passes the buck
on to it and unlocks client. But before T2 gets the signal, the event
loop thread wakes up, does its job and closes client->sock. The crash
happens when T2 actually wakes up and tries to do its job using a closed
client->sock.
2013-03-27 09:00:38 +01:00
Jiri Denemark
a1fe02f0e9 log: Separate thread ID from timestemp in ring buffer
When we write a log message into a log, we separate thread ID from
timestamp using ": ". However, when storing the message into the ring
buffer, we omitted the separator, e.g.:

    2013-02-27 11:49:11.852+00003745: ...
2013-03-27 09:00:35 +01:00
Guannan Ren
a950f03e16 conf: fix a failure when detaching a usb device
#virsh detach-device $guest usb.xml
 error: Failed to detach device from usb2.xml
 error: operation failed: host usb device vendor=0x0951 \
 product=0x1625 not found

This regression is due to a typo in the matching function. The first
argument is always the usb device that we are checking for. If the
usb xml file provided by user contains bus and device info, we try
to search it by them, otherwise, we use vendor and product info.

The bug occurred only when detaching a usb device with no bus and
device info provided in the usb xml file.
2013-03-27 10:38:08 +08:00