Commit Graph

475 Commits

Author SHA1 Message Date
Peter Krempa
e9a4506963 qemu: monitor: Rename and improve qemuMonitorGetPtyPaths
To unify future additions that require information from "query-chardev"
rename qemuMonitorGetPtyPaths and friends to qemuMonitorGetChardevInfo
and move the allocation of the returned hash into the top level
function.
2014-11-21 11:00:10 +01:00
Peter Krempa
6692ba731b qemu: process: report useful error if alias formatting fails
When retrieving the paths for PTY devices the alias gets formatted into
a static string. If it doesn't fit we wouldn't report an error.
2014-11-21 11:00:10 +01:00
Peter Krempa
7e130e8b35 storage: qemu: Fix security labelling of new image chain elements
When creating a disk image snapshot the libvirt code would blindly copy
the parents label to the newly created image. This runs into problems
when you start a VM from an image hosted on NFS (or other storage system
that doesn't support selinux labels) and the snapshot destination is on
a storage system that does support selinux labels. Libvirt's code in
that case generates a different security label for the image hosted on
NFS. This label is valid only for NFS images and doesn't allow access in
case of a locally stored image.

To fix this issue libvirt needs to refrain from copying security
information in cases where the default domain seclabel is a better
choice.

This patch repurposes the now unused @force argument of
virStorageSourceInitChainElement to denote whether a copy of the
security labelling stuff should be attempted or not. This allows to
fine-control the copy operation for cases where we need to keep the
label of the old disk vs. the cases where we need to keep the label
unset to use the default domain imagelabel.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1151718
2014-11-21 09:28:26 +01:00
Anirban Chakraborty
22cff52a2b network: Add network bandwidth support to ethernet interfaces
Ethernet interfaces in libvirt currently do not support bandwidth setting.
For example, following xml file for an interface will not apply these
settings to corresponding qdiscs.

    <interface type="ethernet">
      <mac address="02:36:1d:18:2a:e4"/>
      <model type="virtio"/>
      <script path=""/>
      <target dev="tap361d182a-e4"/>
      <bandwidth>
        <inbound average="984" peak="1024" burst="64"/>
        <outbound average="2000" peak="2048" burst="128"/>
      </bandwidth>
    </interface>

Signed-off-by: Anirban Chakraborty <abchak@juniper.net>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-19 10:36:49 +01:00
Martin Kletzander
5cca4cd16f Remove unnecessary curly brackets in src/qemu/
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-11-14 17:13:01 +01:00
Pavel Hrdina
41127244fb nwfilter: fix deadlock caused updating network device and nwfilter
Commit 6e5c79a1 tried to fix deadlock between nwfilter{Define,Undefine}
and starting of guest, but this same deadlock exists for
updating/attaching network device to domain.

The deadlock was introduced by removing global QEMU driver lock because
nwfilter was counting on this lock and ensure that all driver locks are
locked inside of nwfilter{Define,Undefine}.

This patch extends usage of virNWFilterReadLockFilterUpdates to prevent
the deadlock for all possible paths in QEMU driver. LXC and UML drivers
still have global lock.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1143780

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2014-11-13 10:45:19 +01:00
Michal Privoznik
54ddc08ddb qemuPrepareNVRAM: Save domain conf only if domain's persistent
In one of my previous patches (3a3c3780b) I've tried to fix the
problem of nvram path disappearing on a domain that's been
started and shut down again. I fixed this by explicitly saving
domain's config file.  However, I did a bit of clumsy without
realizing we have a transient domains for which we don't save the
config file. Hence, any domain using UEFI became persistent.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-13 09:35:25 +01:00
Michal Privoznik
6ea54769ba qemu: Update fsfreeze status on domain state transitions
https://bugzilla.redhat.com/show_bug.cgi?id=1160084

As of b6d4dad1 (1.2.5) libvirt keeps track if domain disks have been
frozen. However, this falls into that set of information which don't
survive domain restart. Therefore, we need to clear the flag upon some
state transitions. Moreover, once we clear the flag we must update the
status file too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-11-06 15:20:01 +01:00
Martin Kletzander
c63ef0452b numa: split util/ and conf/ and support non-contiguous nodesets
This is a reaction to Michal's fix [1] for non-NUMA systems that also
splits out conf/ out of util/ because libvirt_util shouldn't require
libvirt_conf if it is the other way around.  This particular use case
worked, but we're trying to avoid it as mentioned [2], many times.

The only functions from virnuma.c that needed numatune_conf were
virDomainNumatuneNodesetIsAvailable() and virNumaSetupMemoryPolicy().
The first one should be in numatune_conf as it works with
virDomainNumatune, the second one just needs nodeset and mode, both of
which can be passed without the need of numatune_conf.

Apart from fixing that, this patch also fixes recently added
code (between commits d2460f85^..5c8515620) that doesn't support
non-contiguous nodesets.  It uses new function
virNumaNodesetIsAvailable(), which doesn't need a stub as it doesn't use
any libnuma functions, to check if every specified nodeset is available.

[1] https://www.redhat.com/archives/libvir-list/2014-November/msg00118.html
[2] http://www.redhat.com/archives/libvir-list/2011-June/msg01040.html

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-11-06 15:13:55 +01:00
Martin Kletzander
11a48758a7 qemu: make advice from numad available when building commandline
Particularly in qemuBuildNumaArgStr(), there was a need for the advice
due to memory backing, which needs to know the nodeset it will be pinned
to.  With newer qemu this caused the following error when starting
domain:

  error: internal error: Advice from numad is needed in case of
  automatic numa placement

even when starting perfectly valid domain, e.g.:

  ...
  <vcpu placement='auto'>4</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>
  <cpu>
    <numa>
      <cell id='0' cpus='0' memory='524288'/>
      <cell id='1' cpus='1' memory='524288'/>
    </numa>
  </cpu>
  ...

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138545

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-11-03 16:43:22 +01:00
weiwei li
be598c5ff8 qemu: Release nbd port from migrationPorts instead of remotePorts
commit 3e1e16aa8d (Use a port from the
migration range for NBD as well) changed ndb port allocation from
remotePorts to migrationPorts, but did not change the port releasing
process, which makes an error when migrating several times (above 64):
error: internal error: Unable to find an unused port in range
'migration' (49152-49215)

https://bugzilla.redhat.com/show_bug.cgi?id=1159245

Signed-off-by: Weiwei Li <nuonuoli@tencent.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-10-31 12:20:06 +01:00
Zhou yimin
411cea638f qemu: move setting emulatorpin ahead of monitor showing up
If VM is configured with many devices(including passthrough devices)
and large memory, libvirtd will take seconds(in the worst case) to
wait for monitor. In this period the qemu process may run on any
PCPU though I intend to pin emulator to the specified PCPU in xml
configuration.

Actually qemu process takes high cpu usage during vm startup.
So this is not the strict CPU isolation in this case.

Signed-off-by: Zhou yimin <zhouyimin@huawei.com>
2014-10-21 12:26:38 +02:00
Laine Stump
b6bdda458a qemu: setup infrastructure to handle NIC_RX_FILTER_CHANGED event
NIC_RX_FILTER_CHANGED is sent by qemu any time a NIC driver in the
guest modified the NIC's RX Filter (for example, if the MAC address of
the NIC is changed by the guest).

This patch doesn't do anything useful with that event; it just sets up
all the plumbing to get news of the event into a worker thread with
all proper locking/reference counting, and provide an easy place to
add in desired functionality.

See src/qemu/EVENTHANDLERS.txt for information/instructions on adding
a libvirt-internal handler for a qemu event (using
NIC_RX_FILTER_CHANGED as an example).
2014-10-06 13:50:57 -04:00
Guido Günther
4882618ed1 qemu: use systemd's TerminateMachine to kill all processes
If we don't properly clean up all processes in the
machine-<vmname>.scope systemd won't remove the cgroup and subsequent vm
starts fail with

  'CreateMachine: File exists'

Additional processes can e.g. be added via

  echo $PID > /sys/fs/cgroup/systemd/machine.slice/machine-${VMNAME}.scope/tasks

but there are other cases like

  http://bugs.debian.org/761521

Invoke TerminateMachine to be on the safe side since systemd tracks the
cgroup anyway. This is a noop if all processes have terminated already.
2014-10-01 20:17:46 +02:00
Ján Tomko
ec5f817f2e Don't verify CPU features with host-passthrough
Commit fba6bc4 introduced the non-migratable invtsc feature,
breaking save/migration with host-model and host-passthrough.

On hosts with this feature present it was automatically included
in the CPU definition, regardless of QEMU support.

Commit de0aeaf stopped including it by default for host-model,
but failed to fix host-passthrough.

This commit ignores checking of CPU features with host-passthrough,
since we don't pass them to QEMU (only -cpu host is passed),
allowing domains using host-passthrough that were saved with
the broken version of libvirtd to be restored.

https://bugzilla.redhat.com/show_bug.cgi?id=1147584
2014-09-30 10:47:02 +02:00
Michal Privoznik
3a3c3780b4 qemuPrepareNVRAM: Save domain after NVRAM path generation
On a domain startup, the variable store path is generated if needed.
The path is intended to be generated only once. However, the updated
domain definition is not saved into config dir rather than state XML
only. So later, whenever the domain is destroyed and the daemon is
restarted, the generated path is forgotten and the file may be left
behind on virDomainUndefine() call.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-09-26 10:14:34 +02:00
Peter Krempa
639a00984a qemu: Report better errors from broken backing chains
Request erroring out from the backing chain traveller and drop qemu's
internal backing chain integrity tester.

The backing chain traveller reports errors by itself with possibly more
detail than qemuDiskChainCheckBroken ever could.

We also need to make sure that we reconnect to existing qemu instances
even at the cost of losing the backing chain info (this really should be
stored in the XML rather than reloaded from disk, but that needs some
work).
2014-09-24 10:18:47 +02:00
John Ferlan
74eaa0918b qemu: Process the hostdev "rawio" setting
Mimic the "Disk" processing for 'rawio', but for a scsi_host hostdev
lun device.
2014-09-19 07:49:06 -04:00
John Ferlan
320825b4ca domain_conf: Change virDomainDiskDef 'rawio' to use virTristateBool
Adjust disk definition for 'rawio' to use the TristateBool logic
2014-09-19 05:59:36 -04:00
John Ferlan
8921d48868 qemu: Add missing goto on rawio
Commit id '9a2f36ec' added a build conditional of CAP_SYS_RAWIO
in order to determine whether or not a disk definition using rawio
should be allowed on platforms without CAP_SYS_RAWIO. If one was
found, virReportError was used but the code didn't goto cleanup.

This patch adds the goto.
2014-09-19 05:54:00 -04:00
Pavel Hrdina
da7799d879 Move the FIPS detection from capabilities
We are not detecting the presence of FIPS from QEMU, but from procfs and
that means it's not QEMU capability. It was decided that we will pass
this flag to QEMU even if it's not supported by old QEMU binaries.

This patch also reverts changes done by commit a21cfb0f to
qemucapabilitestest and implements a new test case in qemuxml2argvtest.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1135431

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2014-09-19 09:08:23 +02:00
Ján Tomko
c1480871bb Fixes for domains with no iothreads
Plug a memory leak and silence a warning.
2014-09-18 14:49:01 +02:00
Ján Tomko
b20d39a56f Wire up the interface backend options
Pass the user-specified tun path down when creating tap device
when called from the qemu driver.

Also honor the vhost device path specified by user.
2014-09-16 16:02:34 +02:00
John Ferlan
76a81b1d31 qemu: Need to check for capability before query
Prior to trying the query-iothreads call - check if the qemu has
the capability

Signed-off-by: John Ferlan <jferlan@redhat.com>
2014-09-16 06:08:20 -04:00
John Ferlan
500c91c57d qemu_cgroup: Adjust spacing around incrementor
Change "i+1" to "i + 1"
2014-09-15 21:05:46 -04:00
John Ferlan
b66c950fb9 qemu: Fix iothreads issue
If there are no iothreads, then return from qemuProcessDetectIOThreadPIDs
without error; otherwise, the following occurs:

error: Failed to start domain $dom
error: An error occurred, but the cause is unknown
2014-09-15 21:05:46 -04:00
John Ferlan
9bef96ec50 qemu: Allow pinning specific IOThreads to a CPU
Modify qemuProcessStart() in order to allowing setting affinity to
specific CPU's for IOThreads. The process followed is similar to
that for the vCPU's.

This involves adding a function to fetch the IOThread id's via
qemuMonitorGetIOThreads() and adding them to iothreadpids[] list.
Then making sure all the cgroup data has been properly set up and
finally assigning affinity.
2014-09-15 13:18:56 -04:00
John Ferlan
35a50ea8c7 qemu: Resolve Coverity NEGATIVE_RETURNS
In qemuProcessInitPCIAddresses() if qemuMonitorGetAllPCIAddresses()
returns a negative (or zero) value, then no need to call the
qemuProcessDetectPCIAddresses().

Signed-off-by: John Ferlan <jferlan@redhat.com>
2014-09-11 08:10:14 -04:00
Michal Privoznik
742b08e30f qemu: Automatically create NVRAM store
When using split UEFI image, it may come handy if libvirt manages per
domain _VARS file automatically. While the _CODE file is RO and can be
shared among multiple domains, you certainly don't want to do that on
the _VARS file. This latter one needs to be per domain. So at the
domain startup process, if it's determined that domain needs _VARS
file it's copied from this master _VARS file. The location of the
master file is configurable in qemu.conf.

Temporary, on per domain basis the location of master NVRAM file can
be overridden by this @template attribute I'm inventing to the
<nvram/> element. All it does is holding path to the master NVRAM file
from which local copy is created. If that's the case, the map in
qemu.conf is not consulted.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
2014-09-10 09:38:07 +02:00
Jiri Denemark
eaee338ae6 qemu: Recompute downtime and total time when migration completes
Total time of a migration and total downtime transfered from a source to
a destination host do not count with the transfer time to the
destination host and with the time elapsed before guest CPUs are
resumed. Thus, source libvirtd remembers when migration started and when
guest CPUs were paused. Both timestamps are transferred to destination
libvirtd which uses them to compute total migration time and total
downtime. Obviously, this requires the time to be synchronized between
the two hosts. The reported times are useless otherwise but they would
be equally useless if we didn't do this recomputation so don't lose
anything by doing it.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2014-09-10 09:37:34 +02:00
Jiri Denemark
03890605dc qemu: Propagate QEMU errors during incoming migrations
When QEMU fails during incoming migration after we successfully started
it (i.e., during Perform or Finish phase), we report a rather unhelpful
message

    Unable to read from monitor: Connection reset by peer

We already have a code that takes error messages from QEMU's error
output but we disable it once QEMU successfully starts. This patch
postpones this until the end of Finish phase during incoming migration
so that we can report a much better error message:

    internal error: early end of file from monitor: possible problem:
    Unknown savevm section or instance '0000:00:05.0/virtio-balloon' 0
    load of migration failed

https://bugzilla.redhat.com/show_bug.cgi?id=1090093

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2014-09-08 13:33:44 +02:00
Eric Blake
44e30277d8 maint: use consistent if-else braces in qemu
I'm about to add a syntax check that enforces our documented
HACKING style of always using matching {} on if-else statements.

This commit focuses on the qemu driver.

* src/qemu/qemu_command.c (qemuParseISCSIString)
(qemuParseCommandLineDisk, qemuParseCommandLine)
(qemuBuildSmpArgStr, qemuBuildCommandLine)
(qemuParseCommandLineDisk, qemuParseCommandLineSmp): Correct use
of {}.
* src/qemu/qemu_capabilities.c (virQEMUCapsProbeCPUModels):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainCoreDumpWithFormat)
(qemuDomainRestoreFlags, qemuDomainGetInfo)
(qemuDomainMergeBlkioDevice): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Likewise.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextCreateSnapshot)
(qemuMonitorTextLoadSnapshot, qemuMonitorTextDeleteSnapshot):
Likewise.
* src/qemu/qemu_process.c (qemuProcessStop): Likewise.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-09-04 08:53:21 -06:00
Wang Rui
4f2ad084bc qemu_process: Resolve Coverity RESOURCE_LEAK
If virSecurityManagerClearSocketLabel() fails, 'agent' won't
be freed before jumping to cleanup.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
2014-09-03 15:00:19 -04:00
Chunyan Liu
0e1a1a8c47 qemu: ensure sane umask for qemu process
Add umask to _virCommand, allow user to set umask to command.
Set umask(002) to qemu process to overwrite the default umask
of 022 set by many distros, so that unix sockets created for
virtio-serial has expected permissions.

Fix problem reported here:
https://sourceware.org/bugzilla/show_bug.cgi?id=13078#c11
https://bugzilla.novell.com/show_bug.cgi?id=888166

To use virtio-serial device, unix socket created for chardev with
default umask(022) has insufficient permissions.
e.g.:
-device virtio-serial \
-chardev socket,path=/tmp/foo,server,nowait,id=foo \
-device virtserialport,chardev=foo,name=org.fedoraproject.port.0

srwxr-xr-x 1 qemu qemu 0 21. Jul 14:19 /tmp/somefile.sock

Other users in the same group (like real user, test engines, etc)
cannot write to this socket.

Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2014-09-03 05:58:15 -06:00
Erik Skultety
36a0993a15 qemu: min_guarantee: Parameter 'min_guarantee' not supported
The 'min_guarantee' is used by VMware ESX and OpenVZ drivers,
with qemu however, libvirt should report error when starting a domain,
because this element is not used.
Resolves https://bugzilla.redhat.com/show_bug.cgi?id=1122455
2014-08-22 16:33:18 +02:00
Roman Bogorodskiy
8c170c9fe6 storage: make disk source pool translation generic
Currently, qemu driver uses qemuTranslateDiskSourcePool()
to translate disk volume information. This function is
general enough and could be used for other drivers as well,
so move it to conf/domain_conf.c along with its helpers.

 - qemuTranslateDiskSourcePool: move to storage/storage_driver.c
   and rename to virStorageTranslateDiskSourcePool,
 - qemuAddISCSIPoolSourceHost: move to storage/storage_driver.c
   and rename to virStorageAddISCSIPoolSourceHost,
 - qemuTranslateDiskSourcePoolAuth: move to storage/storage_driver.c
   and rename to virStorageTranslateDiskSourcePoolAuth,
 - Update users of qemuTranslateDiskSourcePool to use a
   new name.
2014-08-19 20:50:12 +04:00
Peter Krempa
482f4e596f qemu: process: Pin on per-vcpu basis instead of per-vcpupin element
Pin existing vcpus rather than existing vcpu pinning infos. This
increases the complexity of the lookup, but avoids pinning cpus that are
not enabled actually.
2014-08-18 17:43:05 +02:00
Peter Krempa
a821f1f028 qemu: process: Remove unnecessary argument and rename function
We set just one affinity of the emulator and the virConnectPtr isn't
needed for that function.
2014-08-18 17:43:05 +02:00
Erik Skultety
9b1759bbe9 qemu: Redundant listen address entry in quest xml
When editing guest's XML (on QEMU), it was possible to add multiple
listen elements into graphics parent element. However QEMU does not
support listening on multiple addresses. Configuration is tested for
multiple 'listen address' and if positive, an error is raised.

https://bugzilla.redhat.com/show_bug.cgi?id=1119212
2014-08-18 14:45:37 +02:00
Pavel Hrdina
0c35a415f7 qemu_process: fix memleak found by coverity
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2014-08-14 19:33:06 +02:00
Sam Bobroff
f0f9eed843 qemu: Tidy up job handling during live migration
During a QEMU live migration several warning messages about job
handling could be written to syslog on the destination host:

"entering monitor without asking for a nested job is dangerous"

The messages are written because the job handling during migration
uses hard coded asyncJob values in several places that are incorrect.

This patch passes the required asyncJob value around and prevents
the warnings as well as any issues that the warnings may be referring
to.

https://bugzilla.redhat.com/show_bug.cgi?id=1130089

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2014-08-14 12:12:42 +02:00
Peter Krempa
e3f5af6a5f qemu: process: Fix header format of qemuProcessSetVcpuAffinities
Fix header alignment and remove the unused conn parameter.
2014-08-12 17:24:34 +02:00
Eric Blake
232a31bea3 blockcommit: track job type in xml
A future patch is going to wire up qemu active block commit jobs;
but as they have similar events and are canceled/pivoted in the
same way as block copy jobs, it is easiest to track all bookkeeping
for the commit job by reusing the <mirror> element.  This patch
adds domain XML to track which job was responsible for creating a
mirroring situation, and adds a job='copy' attribute to all
existing uses of <mirror>.  Along the way, it also massages the
qemu monitor backend to read the new field in order to generate
the correct type of libvirt job (even though it requires a
future patch to actually cause a qemu event that can be reported
as an active commit).  It also prepares to update persistent XML
to match changes made to live XML when a copy completes.

* docs/schemas/domaincommon.rng: Enhance schema.
* docs/formatdomain.html.in: Document it.
* src/conf/domain_conf.h (_virDomainDiskDef): Add a field.
* src/conf/domain_conf.c (virDomainBlockJobType): String conversion.
(virDomainDiskDefParseXML): Parse job type.
(virDomainDiskDefFormat): Output job type.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Distinguish
active from regular commit.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Set job type.
(qemuDomainBlockPivot, qemuDomainBlockJobImpl): Clean up job type
on completion.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-mirror-old.xml:
Update tests.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Likewise.
* tests/qemuxml2argvdata/qemuxml2argv-disk-active-commit.xml: New
file.
* tests/qemuxml2xmltest.c (mymain): Drive new test.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-30 06:32:38 -06:00
Eric Blake
febf84c26a blockjob: properly track blockcopy xml changes on disk
We were not directly saving the domain XML to file after starting
or finishing a blockcopy.  Without the startup write, a libvirtd
restart in the middle of a copy job would forget that the job was
underway.  Then at pivot, we were indirectly writing new XML in
reaction to events that occur as we stop and restart the guest CPUs.
But there was a race: since pivot is an async action, it is possible
that libvirtd is restarted before the pivot completes, so if XML
changes during the event, that change was not written.  The original
blockcopy code cleared out the <mirror> element prior to restarting
the CPUs, but this is also a race, observed if a user does an async
pivot and a dumpxml before the event occurs.  Furthermore, this race
will interfere with active commit in a future patch, because that
code will rely on the <mirror> element at the time of the qemu event
to determine whether to inform the user of a normal commit or an
active commit.

Fix things by saving state any time we modify live XML, while
delaying XML disk modifications until after the event completes.  We
still need a to teach libvirtd restarts to examine all existing
<mirror> elements to see if the job completed in the meantime (that
is, if libvirtd misses the event, the updated state still needs to be
updated in live XML), but that will be a later patch, in part because
we also need to to start taking advantage of newer qemu's ability to
keep the job around after completion rather than the current usage
where the job disappears both on error and on success.

* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Track XML change
on disk.
(qemuDomainBlockJobImpl, qemuDomainBlockPivot): Move job-end XML
rewrites...
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): ...here.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-29 15:36:30 -06:00
Eric Blake
9a212d6708 blockcopy: add more XML for state tracking
Doing a blockcopy operation across a libvirtd restart is not very
robust at the moment.  In particular, we are clearing the <mirror>
element prior to telling qemu to finish the job.  Also, thanks to the
ability to request async completion, the user can easily regain
control prior to qemu actually finishing the effort, and they should
be able to poll the domain XML to see if the job is still going.

A future patch will fix things to actually wait until qemu is done
before modifying the XML to reflect the job completion.  But since
qemu issues identical BLOCK_JOB_COMPLETE events regardless of whether
the job was cancelled (kept the original disk) or completed (pivoted
to the new disk), we have to track which of the two operations were
used to end the job.  Furthermore, we'd like to avoid attempts to
end a job where we are already waiting on an earlier request to qemu
to end the job.  Likewise, if we miss the qemu event (perhaps because
it arrived during a libvirtd restart), we still need enough state
recorded to be able to determine how to modify the domain XML once
we reconnect to qemu and manually learn whether the job still exists.

Although this patch doesn't actually fix the problem, it is a
preliminary step that makes it possible to track whether a job
has already begun steps towards completion.

* src/conf/domain_conf.h (virDomainDiskMirrorState): New enum.
(_virDomainDiskDef): Convert bool mirroring to new enum.
* src/conf/domain_conf.c (virDomainDiskDefParseXML)
(virDomainDiskDefFormat): Handle new values.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Adjust
client.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl): Likewise.
* docs/schemas/domaincommon.rng (diskMirror): Expose new values.
* docs/formatdomain.html.in (elementsDisks): Document it.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Test it.

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-07-29 15:36:30 -06:00
Michal Privoznik
136ad49740 domain: Introduce ./hugepages/page/[@size, @unit, @nodeset]
<memoryBacking>
    <hugepages>
      <page size="1" unit="G" nodeset="0-3,5"/>
      <page size="2" unit="M" nodeset="4"/>
    </hugepages>
  </memoryBacking>

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-29 12:02:34 +01:00
Michal Privoznik
725a211fc0 qemu: Utilize virFileFindHugeTLBFS
Use better detection of hugetlbfs mount points. Yes, there can be
multiple mount points each serving different huge page size.

Since we already have ability to override the mount point in the
qemu.conf file, this crazy backward compatibility code is brought in.
Now we allow multiple mount points, so the "hugetlbfs_mount" option
must take an list of strings (mount points). But previously, it was
just a string, so we must accept both types now.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-07-29 11:58:35 +01:00
Ján Tomko
3227e17d82 Introduce virTristateSwitch enum
For the values "default", "on", "off"

Replaces
virDeviceAddressPCIMulti
virDomainFeatureState
virDomainIoEventFd
virDomainVirtioEventIdx
virDomainDiskCopyOnRead
virDomainMemDump
virDomainPCIRombarMode
virDomainGraphicsSpicePlaybackCompression
2014-07-23 12:59:40 +02:00
Martin Kletzander
7e72ac7878 qemu: leave restricting cpuset.mems after initialization
When domain is started with numatune memory mode strict and the
nodeset does not include host NUMA node with DMA and DMA32 zones, KVM
initialization fails.  This is because cgroup restrict even kernel
allocations.  We are already doing numa_set_membind() which does the
same thing, only it does not restrict kernel allocations.

This patch leaves the userspace numa_set_membind() in place and moves
the cpuset.mems setting after the point where monitor comes up, but
before vcpu and emulator sub-groups are created.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
93e82727ec numatune: Encapsulate numatune configuration in order to unify results
There were numerous places where numatune configuration (and thus
domain config as well) was changed in different ways.  On some
places this even resulted in persistent domain definition not to be
stable (it would change with daemon's restart).

In order to uniformly change how numatune config is dealt with, all
the internals are now accessible directly only in numatune_conf.c and
outside this file accessors must be used.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00