Commit Graph

3391 Commits

Author SHA1 Message Date
Eric Blake
232a31bea3 blockcommit: track job type in xml
A future patch is going to wire up qemu active block commit jobs;
but as they have similar events and are canceled/pivoted in the
same way as block copy jobs, it is easiest to track all bookkeeping
for the commit job by reusing the <mirror> element.  This patch
adds domain XML to track which job was responsible for creating a
mirroring situation, and adds a job='copy' attribute to all
existing uses of <mirror>.  Along the way, it also massages the
qemu monitor backend to read the new field in order to generate
the correct type of libvirt job (even though it requires a
future patch to actually cause a qemu event that can be reported
as an active commit).  It also prepares to update persistent XML
to match changes made to live XML when a copy completes.

* docs/schemas/domaincommon.rng: Enhance schema.
* docs/formatdomain.html.in: Document it.
* src/conf/domain_conf.h (_virDomainDiskDef): Add a field.
* src/conf/domain_conf.c (virDomainBlockJobType): String conversion.
(virDomainDiskDefParseXML): Parse job type.
(virDomainDiskDefFormat): Output job type.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Distinguish
active from regular commit.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Set job type.
(qemuDomainBlockPivot, qemuDomainBlockJobImpl): Clean up job type
on completion.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-mirror-old.xml:
Update tests.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Likewise.
* tests/qemuxml2argvdata/qemuxml2argv-disk-active-commit.xml: New
file.
* tests/qemuxml2xmltest.c (mymain): Drive new test.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-30 06:32:38 -06:00
Eric Blake
febf84c26a blockjob: properly track blockcopy xml changes on disk
We were not directly saving the domain XML to file after starting
or finishing a blockcopy.  Without the startup write, a libvirtd
restart in the middle of a copy job would forget that the job was
underway.  Then at pivot, we were indirectly writing new XML in
reaction to events that occur as we stop and restart the guest CPUs.
But there was a race: since pivot is an async action, it is possible
that libvirtd is restarted before the pivot completes, so if XML
changes during the event, that change was not written.  The original
blockcopy code cleared out the <mirror> element prior to restarting
the CPUs, but this is also a race, observed if a user does an async
pivot and a dumpxml before the event occurs.  Furthermore, this race
will interfere with active commit in a future patch, because that
code will rely on the <mirror> element at the time of the qemu event
to determine whether to inform the user of a normal commit or an
active commit.

Fix things by saving state any time we modify live XML, while
delaying XML disk modifications until after the event completes.  We
still need a to teach libvirtd restarts to examine all existing
<mirror> elements to see if the job completed in the meantime (that
is, if libvirtd misses the event, the updated state still needs to be
updated in live XML), but that will be a later patch, in part because
we also need to to start taking advantage of newer qemu's ability to
keep the job around after completion rather than the current usage
where the job disappears both on error and on success.

* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Track XML change
on disk.
(qemuDomainBlockJobImpl, qemuDomainBlockPivot): Move job-end XML
rewrites...
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): ...here.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-29 15:36:30 -06:00
Eric Blake
9a212d6708 blockcopy: add more XML for state tracking
Doing a blockcopy operation across a libvirtd restart is not very
robust at the moment.  In particular, we are clearing the <mirror>
element prior to telling qemu to finish the job.  Also, thanks to the
ability to request async completion, the user can easily regain
control prior to qemu actually finishing the effort, and they should
be able to poll the domain XML to see if the job is still going.

A future patch will fix things to actually wait until qemu is done
before modifying the XML to reflect the job completion.  But since
qemu issues identical BLOCK_JOB_COMPLETE events regardless of whether
the job was cancelled (kept the original disk) or completed (pivoted
to the new disk), we have to track which of the two operations were
used to end the job.  Furthermore, we'd like to avoid attempts to
end a job where we are already waiting on an earlier request to qemu
to end the job.  Likewise, if we miss the qemu event (perhaps because
it arrived during a libvirtd restart), we still need enough state
recorded to be able to determine how to modify the domain XML once
we reconnect to qemu and manually learn whether the job still exists.

Although this patch doesn't actually fix the problem, it is a
preliminary step that makes it possible to track whether a job
has already begun steps towards completion.

* src/conf/domain_conf.h (virDomainDiskMirrorState): New enum.
(_virDomainDiskDef): Convert bool mirroring to new enum.
* src/conf/domain_conf.c (virDomainDiskDefParseXML)
(virDomainDiskDefFormat): Handle new values.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Adjust
client.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl): Likewise.
* docs/schemas/domaincommon.rng (diskMirror): Expose new values.
* docs/formatdomain.html.in (elementsDisks): Document it.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Test it.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-29 15:36:30 -06:00
Hu Tao
c5b02b6773 qemu: error out if PCI passthrough type is not supported
If PCI passthrough type is not supported, we should error out rather than
continue building the command line.

When starting a domain, the type has been already checked by
qemuPrepareHostdevPCICheckSupport() before building qemu command line,
so the problem doesn't emerge.

But when coverting a domain xml without specifying passthrough type explictly
to qemu arg, we will get a malformed command line.

the xml:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0001' bus='0x03' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>

the converted command line:

  -device ,host=0001:03:00.0,id=hostdev0,bus=pci.0,addr=0x5

After this patch, virsh gives an error message:

  virsh domxml-to-native qemu-argv /tmp/tmp.xml
  error: internal error: invalid PCI passthrough type 'default'

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
2014-07-29 15:35:08 +02:00
Michal Privoznik
3517e1b2f2 qemu: Implement ./hugepages/page/[@size, @unit, @nodeset]
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-29 12:14:52 +01:00
Michal Privoznik
136ad49740 domain: Introduce ./hugepages/page/[@size, @unit, @nodeset]
<memoryBacking>
    <hugepages>
      <page size="1" unit="G" nodeset="0-3,5"/>
      <page size="2" unit="M" nodeset="4"/>
    </hugepages>
  </memoryBacking>

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-29 12:02:34 +01:00
Michal Privoznik
725a211fc0 qemu: Utilize virFileFindHugeTLBFS
Use better detection of hugetlbfs mount points. Yes, there can be
multiple mount points each serving different huge page size.

Since we already have ability to override the mount point in the
qemu.conf file, this crazy backward compatibility code is brought in.
Now we allow multiple mount points, so the "hugetlbfs_mount" option
must take an list of strings (mount points). But previously, it was
just a string, so we must accept both types now.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-29 11:58:35 +01:00
Peter Krempa
a813d1c61b qemu: sound: Fix uninitialized model string
Commit e5f36698e3 introduces a
false-positive build failure in the sound card model handling switch.
Initialize the model to NULL although the value should never be used.
2014-07-28 11:38:35 +02:00
Peter Krempa
e5f36698e3 qemu: sound: Handle all possible sound cards in switch statement
Use correct type in the switch and handle all sound card models in it so
that the compiler tracks additions.
2014-07-28 10:46:33 +02:00
Peter Krempa
1c6999d340 conf: RNG: Always fill in default random source path for default backend
Libvirt documents that the default entropy source for the 'random'
backend of a RNG device is /dev/random. Instead of storing and
propagating NULL across our code and checking it in multiple places fill
the default in the post parse callback and use that in the other places.
2014-07-28 10:07:09 +02:00
Peter Krempa
efdb9117ee qemu: Fix starting of VMs with empty CDROM drives
Since 24e5cafba6 (thankfully unreleased)
when a VM with an empty disk drive would be started the code would call
stat() on NULL path as a check was missing from the callback rendering
machines unstartable.

Report success when the path is empty (denoting an empty drive).
2014-07-25 14:33:07 +02:00
Peter Krempa
bbddbefa2f virtio-rng: allow multiple RNG devices
qemu supports adding multiple RNG devices. This patch allows libvirt to
support this.
2014-07-25 09:34:53 +02:00
Peter Krempa
99ff49eed1 qemu: cgroup: Don't use NULL path on default backed RNGs
The "random" backend for virtio-rng can be started with no path
specified which equals to /dev/random. The cgroup code didn't consider
this and called few of the functions with NULL resulting into:

 $ virsh start rng-vm
 error: Failed to start domain rng-vm
 error: Path '(null)' is not accessible: Bad address

Problem introduced by commit c6320d3463
2014-07-25 09:34:53 +02:00
Michal Privoznik
3d968f409f qemuConnectGetDomainCapabilities: Report error on unknown arch
If user hasn't provided any @emulatorbin, the qemuCaps are
searched by @arch provided (which in fact can be guessed from the
host). However, there's no guarantee that the qemu binary for
@arch will exist.  Therefore qemu capabilities may be nonexistent
too. If that's the case, we should throw an error message prior
jumping onto 'cleanup' label as the helper lookup function
remains silent on no search result.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-24 18:01:57 +02:00
Martin Kletzander
9318121db8 remove range checking for blkiotune weight
This was changed before:

https://www.redhat.com/archives/libvir-list/2013-October/msg00525.html

but not everywhere in the code.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1100769

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-24 17:32:37 +02:00
John Ferlan
17bddc46f4 hostdev: Introduce virDomainHostdevSubsysSCSIiSCSI
Create the structures and API's to hold and manage the iSCSI host device.
This extends the 'scsi_host' definitions added in commit id '5c811dce'.
A future patch will add the XML parsing, but that code requires some
infrastructure to be in place first in order to handle the differences
between a 'scsi_host' and an 'iSCSI host' device.
2014-07-24 07:04:44 -04:00
John Ferlan
a062d1a1cc Add virConnectPtr for qemuBuildSCSIHostdevDrvStr
Add a conn for future patches to be able to grab the secret when
authenticating an iSCSI host device
2014-07-24 06:39:28 -04:00
John Ferlan
42957661dc hostdev: Introduce virDomainHostdevSubsysSCSIHost
Split virDomainHostdevSubsysSCSI further. In preparation for having
either SCSI or iSCSI data, create a union in virDomainHostdevSubsysSCSI
to contain just a virDomainHostdevSubsysSCSIHost to describe the
'scsi_host' host device
2014-07-24 06:39:28 -04:00
John Ferlan
5805621cd9 hostdev: Introduce virDomainHostdevSubsysSCSI
Create a separate typedef for the hostdev union data describing SCSI
Then adjust the code to use the new pointer
2014-07-24 06:39:27 -04:00
John Ferlan
1c8da0d44e hostdev: Introduce virDomainHostdevSubsysPCI
Create a separate typedef for the hostdev union data describing PCI.
Then adjust the code to use the new pointer
2014-07-24 06:39:27 -04:00
John Ferlan
7540d07f09 hostdev: Introduce virDomainHostdevSubsysUSB
Create a separate typedef for the hostdev union data describing USB.
Then adjust the code to use the new pointer
2014-07-24 06:39:27 -04:00
Peter Krempa
185e07a5f8 qemu: snapshot: Use storage driver to pre-create snapshot file
Move the last operation done on local files to the storage driver API.
2014-07-24 09:59:00 +02:00
Peter Krempa
24e5cafba6 qemu: Implement DAC driver chown callback to co-operate with storage drv
Use the storage driver to chown remote images.
2014-07-24 09:59:00 +02:00
Peter Krempa
7490a6d272 security: DAC: Introduce callback to perform image chown
To integrate the security driver with the storage driver we need to
pass a callback for a function that will chown storage volumes.

Introduce and document the callback prototype.
2014-07-24 09:58:59 +02:00
Michal Privoznik
12926a7c39 qemuConnectGetDomainCapabilities: Use wiser defaults
Up to now, users have to pass two arguments at least: domain virt type
('qemu' vs 'kvm') and one of emulatorbin or architecture. This is not
much user friendly. Nowadays users mostly use KVM and share the host
architecture with the guest. So now, the API (and subsequently virsh
command) can be called with all NULLs  (without any arguments).

Before this patch:
 # virsh domcapabilities
 error: failed to get emulator capabilities
 error: virttype_str in qemuConnectGetDomainCapabilities must not be NULL

 # virsh domcapabilities kvm
 error: failed to get emulator capabilities
 error: invalid argument: at least one of emulatorbin or architecture fields must be present

After:

 # virsh domcapabilities
 <domainCapabilities>
   <path>/usr/bin/qemu-system-x86_64</path>
   <domain>kvm</domain>
   <machine>pc-i440fx-2.1</machine>
   <arch>x86_64</arch>
   <vcpu max='255'/>
 </domainCapabilities>

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-24 09:19:09 +02:00
Martin Kletzander
dc8b7ce7bc numatune: finish the split from domain_conf and remove all dependencies
This patch adds back the virDomainDef typedef into domain_conf and
makes all the numatune_conf functions independent of any virDomainDef
definitions.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-24 08:40:25 +02:00
Eric Blake
60e4944059 metadata: track title edits across libvirtd restart
https://bugzilla.redhat.com/show_bug.cgi?id=1122205

Although the edits were changing in-memory XML, it was not flushed
to disk; so unless some other action changes XML, a libvirtd restart
would lose the changed information.

* src/conf/domain_conf.c (virDomainObjSetMetadata): Add parameter,
to save live status across restarts.
(virDomainSaveXML): Allow for test driver.
* src/conf/domain_conf.h (virDomainObjSetMetadata): Adjust
signature.
* src/bhyve/bhyve_driver.c (bhyveDomainSetMetadata): Adjust caller.
* src/lxc/lxc_driver.c (lxcDomainSetMetadata): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSetMetadata): Likewise.
* src/test/test_driver.c (testDomainSetMetadata): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-23 10:07:34 -06:00
Ján Tomko
3227e17d82 Introduce virTristateSwitch enum
For the values "default", "on", "off"

Replaces
virDeviceAddressPCIMulti
virDomainFeatureState
virDomainIoEventFd
virDomainVirtioEventIdx
virDomainDiskCopyOnRead
virDomainMemDump
virDomainPCIRombarMode
virDomainGraphicsSpicePlaybackCompression
2014-07-23 12:59:40 +02:00
Ján Tomko
bb018ce6c8 Introduce virTristateBool enum type
Replace all three-state (default/yes/no) enums with it:
virDomainBIOSUseserial
virDomainBootMenu
virDomainPMState
virDomainGraphicsSpiceClipboardCopypaste
virDomainGraphicsSpiceAgentFileTransfer
virNetworkDNSForwardPlainNames
2014-07-23 12:37:39 +02:00
Chen Hanxiao
1ce7c1d20c LXC: show used memory as 0 when domain is not active
Before:
virsh # dominfo chx3
State:          shut off
Max memory:     92160 KiB
Used memory:    92160 KiB

After:
virsh # dominfo container1
State:          shut off
Max memory:     92160 KiB
Used memory:    0 KiB

Similar to qemu cases.

Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>
2014-07-23 15:12:52 +08:00
Peter Krempa
1e833899ce qemu: snapshot: Forbid taking/reverting snapshots in PMSUSPENDED state
Qemu doesn't currently support them and behaves strangely. Just forbid
them.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1079162
2014-07-22 10:22:35 +02:00
Peter Krempa
c71045a9cb qemu: snapshot: Forbid taking snapshot in invalid state
Similarly to 49a3a649a8 forbid creating
snapshots in domain states impossible to reach in qemu.
2014-07-22 10:22:35 +02:00
Peter Krempa
49a3a649a8 qemu: snapshot: Reject revertion from clearly bad states
Report errors on some states snapshots done by qemu should never reach
2014-07-21 11:09:53 +02:00
Peter Krempa
aa7e76a579 qemu: snapshot: Convert if-else switch to switch statement
Convert the target snapshot state selector to a switch statement
enumerating all possible values. This points out a few mistakes in the
original selector.

The logic of the code is preserved until later patches.
2014-07-21 11:00:11 +02:00
Peter Krempa
1f4933f0f4 qemu: snapshot: Forbid snapshots of iSCSI passthrough devices
As with the local SCSI passthrough devicesm qemu can't support snapshots
on those as the block ops are handled by the device. This is also true
for iSCSI backing of the disk. Remove the check for the local block
device and just forbid snapshot when the disk is of type 'lun'.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1073368
2014-07-18 17:20:51 +02:00
Martin Kletzander
7e72ac7878 qemu: leave restricting cpuset.mems after initialization
When domain is started with numatune memory mode strict and the
nodeset does not include host NUMA node with DMA and DMA32 zones, KVM
initialization fails.  This is because cgroup restrict even kernel
allocations.  We are already doing numa_set_membind() which does the
same thing, only it does not restrict kernel allocations.

This patch leaves the userspace numa_set_membind() in place and moves
the cpuset.mems setting after the point where monitor comes up, but
before vcpu and emulator sub-groups are created.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
aa668fccf0 qemu: split out cpuset.mems setting
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
1c19d3e072 qemu: pass numa node binding preferences to qemu
Currently, we only bind the whole QEMU domain to memory nodes
specified in nodemask altogether.  That, however, doesn't make much
sense when one wants to control from where the memory for particular
guest nodes should be allocated.  QEMU allows us to do that by
specifying 'host-nodes' parameter for the 'memory-backend-ram' object,
so let's use that.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
001b9dc1dc qemu: enable disjoint numa cpu ranges
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
1a324c2f88 qemu: newer -numa parameter capability probing
When qemu switched to using OptsVisitor for -numa parameter, it did
two things in the same patch.  One of them is that the numa parameter
is now visible in "query-command-line-options", the second one is that
it enabled using disjoint cpu ranges for -numa specification.  This
will be used in later patch.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
ad064ec6e6 qemu: memory-backend-ram capability probing
The numa patch series in qemu adds "memory-backend-ram" object type by
which we can tell whether we can use such objects.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
7bc1db5a1d qemu: allow qmp probing for cmdline options without params
That can be lately achieved with by having .param == NULL in the
virQEMUCapsCommandLineProps struct.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
1a7be8c600 numatune: add support for per-node memory bindings in private APIs
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
93e82727ec numatune: Encapsulate numatune configuration in order to unify results
There were numerous places where numatune configuration (and thus
domain config as well) was changed in different ways.  On some
places this even resulted in persistent domain definition not to be
stable (it would change with daemon's restart).

In order to uniformly change how numatune config is dealt with, all
the internals are now accessible directly only in numatune_conf.c and
outside this file accessors must be used.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
e764ec7ae3 numatune: unify numatune struct and enum names
Since there was already public virDomainNumatune*, I changed the
private virNumaTune to match the same, so all the uses are unified and
public API is kept:

s/vir\(Domain\)\?Numa[tT]une/virDomainNumatune/g

then shrunk long lines, and mainly functions, that were created after
that:

sed -i 's/virDomainNumatuneMemPlacementMode/virDomainNumatunePlacement/g'

And to cope with the enum name, I haad to change the constants as
well:

s/VIR_NUMA_TUNE_MEM_PLACEMENT_MODE/VIR_DOMAIN_NUMATUNE_PLACEMENT/g

Last thing I did was at least a little shortening of already long
name:

s/virDomainNumatuneDef/virDomainNumatune/g

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
293d5f21b6 numatune: create new module for numatune
There are many places with numatune-related code that should be put
into special numatune_conf and this patch creates a basis for that.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
992000e6d8 conf, schema: add 'id' field for cells
In XML format, by definition, order of fields should not matter, so
order of parsing the elements doesn't affect the end result.  When
specifying guest NUMA cells, we depend only on the order of the 'cell'
elements.  With this patch all older domain XMLs are parsed as before,
but with the 'id' attribute they are parsed and formatted according to
that field.  This will be useful when we have tuning settings for
particular guest NUMA node.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
92ff464bbb qemu: remove useless error check
Excerpt from the virCommandAddArgBuffer() description: "Correctly
transfers memory errors or contents from buf to cmd."

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
cee22001d3 qemu: purely a code movement
to ease the review of commits to follow.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Michele Paolino
a14abd463a support for QEMU vhost-user
This patch adds support for the QEMU vhost-user feature to libvirt.
vhost-user enables the communication between a QEMU virtual machine
and other userspace process using the Virtio transport protocol.
It uses a char dev (e.g. Unix socket) for the control plane,
while the data plane based on shared memory.

The XML looks like:

<interface type='vhostuser'>
    <mac address='52:54:00:3b:83:1a'/>
    <source type='unix' path='/tmp/vhost.sock' mode='server'/>
    <model type='virtio'/>
</interface>

Signed-off-by: Michele Paolino <m.paolino@virtualopensystems.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-16 18:44:57 +02:00