Commit Graph

19082 Commits

Author SHA1 Message Date
Marek Marczykowski
c374353ca0 conf: support backend domain name in disk and network devices
At least Xen supports backend drivers in another domain (aka "driver
domain"). This patch introduces an XML config option for specifying the
backend domain name for <disk> and <interface> devices.  E.g.

  <disk>
    <backenddomain name='diskvm'/>
    ...
  </disk>
  <interface type='bridge'>
    <backenddomain name='netvm'/>
    ...
  </interface>

In the future, same option will be needed for USB devices (hostdev
objects), but for now libxl doesn't have support for PVUSB.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
2015-02-20 14:50:24 -07:00
Laine Stump
8f8e581a17 network: allow <pf> together with <interface>/<address> in network status
The function that parses the <forward> subelement of a network used to
fail/log an error if the network definition contained both a <pf>
element as well as at least one <interface> or <address> element. That
check was present because the configuration of a network should have
either one <pf>, one or more <interface>, or one or more <address>,
but never combinations of multiple kinds.

This caused a problem when libvirtd was restarted with a network
already active - when a network with a <pf> element is started, the
referenced PF (Physical Function of an SRIOV-capable network card) is
checked for VFs (Virtual Functions), and the <forward> is filled in
with a list of all VFs for that PF either in the form of their PCI
addresses (a list of <address>) or their netdev names (a list of
<interface>); the <pf> element is not removed though. When libvirtd is
restarted, it parses the network status and finds both the original
<pf> from the config, as well as the list of either <address> or
<interface>, fails the parse, and the network is not added to the
active list. This failure is often obscured because the network is
marked as autostart so libvirt immediately restarts it.

It seems odd to me that <interface> and <address> are stored in the
same array rather than keeping two separate arrays, and having
separate arrays would have made the check much simpler. However,
changing to use two separate arrays would have required changes in
more places, potentially creating more conflicts and (more
importantly) more possible regressions in the event of a backport, so
I chose to keep the existing data structure in order to localize the
change.

It appears that this problem has been in the code ever since support
for <pf> was added (0.9.10), but until commit
34cc3b2f10 (first in libvirt 1.2.4)
networks with interface pools were not properly marked as active on
restart anyway, so there is no point in backporting this patch any
further than that.
2015-02-20 15:06:30 -05:00
Peter Krempa
103707d4b7 qemu: caps: Add capability bit for the "pc-dimm" device
The pc-dimm device represents a RAM memory module.
2015-02-20 19:25:09 +01:00
Peter Krempa
e5c7864cfc conf: Hoist validation of memory size into the post parse callback
Later patches will need to access the full definition to do check the
memory size and thus the checking needs to be done after the whole
definition including devices is known.
2015-02-20 19:25:09 +01:00
Peter Krempa
b98596a717 conf: numa: Check ABI stability of NUMA configuration
Add helper to compare initial sizes of indivitual NUMA nodes and the map
of belonging vCPUs. Other configuration is not ABI.
2015-02-20 19:23:38 +01:00
Peter Krempa
e431c3c092 conf: ABI: Hugepage backing definition is not guest ABI
The backing of the vm's memory isn't influencing the guest ABI thus
shouldn't be checked.
2015-02-20 18:19:59 +01:00
Peter Krempa
181742d43f conf: Move all NUMA configuration to virDomainNuma
For historical reasons data regarding NUMA configuration were split
between the CPU definition and numatune. We cannot do anything about the
XML still being split, but we certainly can at least store the relevant
data in one place.

This patch moves the NUMA stuff to the right place.
2015-02-20 17:50:08 +01:00
Peter Krempa
b9ddb25822 conf: numa: Add setter/getter for NUMA node memory size
Add the helpers and refactor places where the value is accessed without
them.
2015-02-20 17:50:08 +01:00
Peter Krempa
7800d473f5 conf: numa: Add accessor to NUMA node's memory access mode 2015-02-20 17:50:08 +01:00
Peter Krempa
d9a779a36e conf: numa: Add accessor for the NUMA node cpu mask
Add virDomainNumaGetNodeCpumask() and refactor a few places that would
get the cpu mask without the helper.
2015-02-20 17:50:08 +01:00
Peter Krempa
be22d07315 conf: numa: Add helper to get guest NUMA node count and refactor users
Add an accessor so that a later refactor is simpler.
2015-02-20 17:50:07 +01:00
Peter Krempa
ba2183a331 qemu: command: Unify retrieval of NUMA cell count in qemuBuildNumaArgStr
The function uses the cell count in 6 places. Add a temp variable to
hold the count as it will greatly simplify the refactor.
2015-02-20 17:50:07 +01:00
Peter Krempa
b83543c563 conf: numa: Don't pass double pointer to virDomainNumatuneParseXML
virDomainNumatuneParseXML now doesn't allocate the def->numa object any
longer so we don't need to pass a double pointer.
2015-02-20 17:50:07 +01:00
Peter Krempa
fa9930720b numa: conf: Tweak parameters of virDomainNumatuneSet
As virDomainNumatuneSet now doesn't allocate the virDomainNuma object
any longer it's not necessary to pass the pointer to a pointer to store
the object as it will not change any longer.

While touching the parameter definitions I've also changed the name of
the parameter to "numa".
2015-02-20 17:50:07 +01:00
Peter Krempa
21008c013c conf: numa: Always allocate the NUMA config
Since our formatter now handles well if the config is allocated and not
filled we can safely always-allocate the NUMA config and remove the
ad-hoc allocation code.

This will help in later patches as the parser will be refactored to just
fill the data.
2015-02-20 17:48:48 +01:00
Peter Krempa
c03411199e conf: Allocate domain definition with the new helper
Use the virDomainDefNew() helper to allocate the definition instead of
doing it via VIR_ALLOC.
2015-02-20 17:43:05 +01:00
Peter Krempa
61e43ce9df conf: Separate helper for creating domain objects
Move the existing virDomainDefNew to virDomainDefNewFull as it's setting
a few things in the conf and re-introduce virDomainDefNew as a function
without parameters for common use.
2015-02-20 17:43:05 +01:00
Peter Krempa
121cde4726 conf: numa: Format <numatune> XML only if necessary
Do a content-aware check if formatting of the <numatune> element is
necessary. Later on the def->numa structure will be always present so we
cannot decide only on the basis whether it's allocated.
2015-02-20 17:43:04 +01:00
Peter Krempa
638e3d270f conf: numa: Refactor logic in virDomainNumatuneParseXML
Shuffling around the logic will allow to simplify the code quite a bit.
As an additional bonus the change in the logic now reports an error if
automatic placement is selected and individual placement is configured.
2015-02-20 17:43:04 +01:00
Peter Krempa
67bd807c4d conf: numa: Reformat virDomainNumatuneParseXML
Collapse few of the conditions so that the program flow is more clear.
2015-02-20 17:43:04 +01:00
Peter Krempa
60a2ce4962 conf: numa: Improve error message in case a numa node doesn't have cpus
Currently the code would exit without reporting an error as
virBitmapParse reports one only if it fails to parse the bitmap, whereas
the code was jumping to the error label even in case 0 cpus were
correctly parsed in the map.
2015-02-20 17:43:04 +01:00
Peter Krempa
6b6166329f conf: numa: Recalculate rather than remember total NUMA cpu count
It's easier to recalculate the number in the one place it's used as
having a separate variable to track it. It will also help with moving
the NUMA code to the separate module.
2015-02-20 17:43:04 +01:00
Peter Krempa
a3673b225d conf: Move enum virMemAccess to the NUMA code and rename it
Name it virNumaMemAccess and add it to conf/numa_conf.[ch]

Note that to avoid a circular dependency the type of the NUMA cell
memAccess variable was changed to int. It will be turned back later
after the circular dependency will not exist.
2015-02-20 17:43:04 +01:00
Peter Krempa
6bc80fa86d conf: numa: Rename virDomainNumatune to virDomainNuma
The structure will gradually become the only place for NUMA related
config, thus rename it appropriately.
2015-02-20 17:43:04 +01:00
Peter Krempa
456268d46b conf: Move NUMA cell formatter to numa_conf
Move the code that formats the /domain/cpu/numa element to numa_conf as
it belongs there.
2015-02-20 17:43:04 +01:00
Peter Krempa
2562141f19 conf: numa: Don't duplicate NUMA cell cpumask
The mask was stored both as a bitmap and as a string. The string is used
for XML output only. Remove the string, as it can be reconstructed from
the bitmap.

The test change is necessary as the bitmap formatter doesn't "optimize"
using the '^' operator.
2015-02-20 17:43:03 +01:00
Peter Krempa
34a1dd73b8 conf: Refactor virDomainNumaDefCPUParseXML
Rewrite the function to save a few local variables and reorder the code
to make more sense.

Additionally the ncells_max member of the virCPUDef structure is used
only for tracking allocation when parsing the numa definition, which can
be avoided by switching to VIR_ALLOC_N as the array is not resized
after initial allocation.
2015-02-20 17:43:03 +01:00
Peter Krempa
5bba61fd58 conf: Move NUMA cell parsing code from cpu conf to numa conf
For weird historical reasons NUMA cells are added as a subelement of
<cpu> while the actual configuration is done in <numatune>.

This patch splits out the cell parser code from cpu config to NUMA
config. Note that the changes to the code are minimal just to make it
work and the function will be refactored in the next patch.
2015-02-20 17:43:03 +01:00
Peter Krempa
fcee64e73c conf: Move numatune_conf to numa_conf
For a while now there are two places that gather information about NUMA
related guest configuration. While the XML can't be changed we can at
least store the data in one place in the definition.

Rename the numatune_conf.[ch] files to numa_conf as later patches will
move the rest of the definitions from the cpu definition to this one.
2015-02-20 17:43:03 +01:00
Pavel Hrdina
81dd81e475 virsh: fix vcpupin info
The "virDomainGetInfo" will get for running domain only live info and for
offline domain only config info. There was no way how to get config info
for running domain. We will use "vshCPUCountCollect" instead to get the
correct cpu count that we need to pass to "virDomainGetVcpuPinInfo".

Also cleanup some unnecessary variables and checks that are done by
drivers.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1160559

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-02-20 16:17:19 +01:00
Michal Privoznik
af20423264 virQEMUCapsCacheLookupCopy: Filter qemuCaps based on machineType
Not all machine types support all devices, device properties, backends,
etc. So until we create a matrix of [machineType, qemuCaps], lets just
filter out some capabilities before we return them to the consumer
(which is going to make decisions based on them straight away).
Currently, as qemu is unable to tell which capabilities are (not)
enabled for given machine types, it's us who has to hardcode the matrix.
One day maybe the hardcoding will go away and we can create the matrix
dynamically on the fly based on a few monitor calls.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-20 13:28:04 +01:00
Michal Privoznik
37cf163ab2 virQEMUCapsCacheLookupCopy: Pass machine type
It will come handy in the near future when we will filter some
capabilities based on it.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-20 13:27:59 +01:00
Martin Kletzander
1bb1de83b2 virsh-edit: Make force editing usable
When editing a domain with 'virsh edit' and failing validation, the
usual message pops up:

  Failed. Try again? [y,n,f,?]:

Turning off validation can be useful, mainly for testing (but other
purposes too), so this patch adds support for relaxing definition in
virsh-edit and makes 'virsh edit <domain>' more usable.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-02-20 07:46:54 +01:00
Mikhail Feoktistov
675fa6b360 parallels: Set the first HDD from XML as bootable
1. Delete all boot devices for VM instance
2. Find the first HDD from XML and set it as bootable

Now we support only one boot device and it should be HDD.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-19 16:32:49 +01:00
Mikhail Feoktistov
6783cf63b0 parallels: Use IS_CT() macro instead of STREQ("exe") 2015-02-19 16:32:49 +01:00
Mikhail Feoktistov
0268eabd7d parallels: code aligment 2015-02-19 16:32:49 +01:00
Jiri Denemark
bc6e206322 Search for schemas and cpu_map.xml in source tree
Not all files we want to find using virFileFindResource{,Full} are
generated when libvirt is built, some of them (such as RNG schemas) are
distributed with sources. The current API was not able to find source
files if libvirt was built in VPATH.

Both RNG schemas and cpu_map.xml are distributed in source tarball.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2015-02-19 15:25:04 +01:00
Michal Privoznik
80c5f10e86 qemuMigrationDriveMirror: Listen to events
https://bugzilla.redhat.com/show_bug.cgi?id=1179678

When migrating with storage, libvirt iterates over domain disks and
instruct qemu to migrate the ones we are interested in (shared, RO and
source-less disks are skipped). The disks are migrated in series. No
new disk is transferred until the previous one hasn't been quiesced.
This is checked on the qemu monitor via 'query-jobs' command. If the
disk has been quiesced, it practically went from copying its content
to mirroring state, where all disk writes are mirrored to the other
side of migration too. Having said that, there's one inherent error in
the design. The monitor command we use reports only active jobs. So if
the job fails for whatever reason, we will not see it anymore in the
command output. And this can happen fairly simply: just try to migrate
a domain with storage. If the storage migration fails (e.g. due to
ENOSPC on the destination) we resume the host on the destination and
let it run on partly copied disk.

The proper fix is what even the comment in the code says: listen for
qemu events instead of polling. If storage migration changes state an
event is emitted and we can act accordingly: either consider disk
copied and continue the process, or consider disk mangled and abort
the migration.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-19 14:12:38 +01:00
Michal Privoznik
76c61cdca2 qemuProcessHandleBlockJob: Take status into account
Upon BLOCK_JOB_COMPLETED event delivery, we check if the job has
completed (in qemuMonitorJSONHandleBlockJobImpl()). For better image,
the event looks something like this:

"timestamp": {"seconds": 1423582694, "microseconds": 372666}, "event":
"BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len":
8412790784, "offset": 409993216, "speed": 8796093022207, "type":
"mirror", "error": "No space left on device"}}

If "len" does not equal "offset" it's considered an error, and we can
clearly see "error" field filled in. However, later in the event
processing this case was handled no differently to case of job being
aborted via separate API. It's time that we start differentiate these
two because of the future work.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-19 14:12:38 +01:00
Michal Privoznik
c37943a068 qemuProcessHandleBlockJob: Set disk->mirrorState more often
Currently, upon BLOCK_JOB_* event, disk->mirrorState is not updated
each time. The callback code handling the events checks if a blockjob
was started via our public APIs prior to setting the mirrorState.
However, some block jobs may be started internally (e.g. during
storage migration), in which case we don't bother with setting
disk->mirror (there's nothing we can set it to anyway), or other
fields. But it will come handy if we update the mirrorState in these
cases too. The event wasn't delivered just for fun - we've started the
job after all.

So, in this commit, the mirrorState is set to whatever job status
we've obtained. Of course, there are some actions on some statuses
that we want to perform. But instead of if {} else if {} else {} ...
enumeration, let's move to switch().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-19 14:12:38 +01:00
Peter Krempa
0df2f0404f qemu: Exit job on error path of qemuDomainSetVcpusFlags()
Commit e105dc9814 moved some code but
didn't adjust the jump labels so that the job would be terminated.
2015-02-18 18:17:54 +01:00
Pavel Hrdina
5c756e580f daemon: Fix segfault by reloading daemon right after start
Libvirt could crash with segfault if user issue "service reload" right
after "service start". One possible way to crash libvirt is to run reload
during initialization of QEMU driver.

It could happen when qemu driver will initialize qemu_driver_lock but
don't have a time to set it's "config" and the SIGHUP arrives. The
reload handler tries to get qemu_drv->config during "virStorageAutostart"
and dereference it which ends with segfault.

Let's ignore all reload requests until all drivers are initialized. In
addition set driversInitialized before we enter virStateCleanup to
ignore reload request while we are shutting down.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1179981

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-02-18 17:51:54 +01:00
Ján Tomko
794235e813 docs: clarify nat range behavior
All the addresses from the range are used, not just those
that are in use on the host.

https://bugzilla.redhat.com/show_bug.cgi?id=1079917
2015-02-18 15:36:45 +01:00
Daniel P. Berrange
b4c7f7eaea docs: add page about virtlockd setup
Introduce some basic docs describing the virtlockd setup.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-02-17 13:59:54 +00:00
Daniel P. Berrange
575c839b6e docs: split out sanlock setup docs
In preparation for adding docs about virtlockd, split out
the sanlock setup docs into a separate page.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-02-17 13:59:19 +00:00
Pavel Hrdina
77a9dc0b8d qemu_cgroup: initialize mem_mask to NULL
If 'virNumaGetHostNodeset()' fails then the error path will try to free
uninitialized pointer mem_mask. Introduced by commit af2a1f058.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-02-17 14:22:50 +01:00
Prerna Saxena
5e4f49ab8a PowerPC : Forbid NULL CPU model with 'host-model' mode.
PowerPC : Forbid NULL CPU model with 'host-model' mode in qemu command line.

This ensures that an XML such as following:
...
  <cpu mode='host-model'>
    <model fallback='allow'/>
  </cpu>
...

will not generate a '-cpu host,compat=(null)' command line with qemu-system-ppc64.

Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
2015-02-17 12:20:40 +01:00
Prerna Saxena
bdbe723fcd PowerPC : Make 'qemu-system-ppc64' the default emulator on ppc64[le].
PowerPC : Explicitly associate 'qemu-system-ppc64' as the
 default emulator for all 64-bit PowerPC guests ( both Big & Little Endian )

Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
2015-02-17 12:20:40 +01:00
Luyao Huang
337265bb52 qemu: fix vm deadlock when try to use numatune in session mode
https://bugzilla.redhat.com/show_bug.cgi?id=1126762

Commit 43b67f introduced a deadlock issue when we use numatune
to change numa settings to a vm in session mode.

Jump to endjob instead of jump to cleanup.

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2015-02-17 11:08:00 +01:00
Michal Privoznik
7832fac847 qemuBuildMemoryBackendStr: Report backend requirement more appropriately
So, when building the '-numa' command line, the
qemuBuildMemoryBackendStr() function does quite a lot of checks to
chose the best backend, or to check if one is in fact needed. However,
it returned that backend is needed even for this little fella:

  <numatune>
    <memory mode="strict" nodeset="0,2"/>
  </numatune>

This can be guaranteed via CGroups entirely, there's no need to use
memory-backend-ram to let qemu know where to get memory from. Well, as
long as there's no <memnode/> element, which explicitly requires the
backend. Long story short, we wouldn't have to care, as qemu works
either way. However, the problem is migration (as always). Previously,
libvirt would have started qemu with:

  -numa node,memory=X

in this case and restricted memory placement in CGroups. Today, libvirt
creates more complicated command line:

  -object memory-backend-ram,id=ram-node0,size=X
  -numa node,memdev=ram-node0

Again, one wouldn't find anything wrong with these two approaches.
Both work just fine. Unless you try to migrated from the older libvirt
into the newer one. These two approaches are, unfortunately, not
compatible. My suggestion is, in order to allow users to migrate, lets
use the older approach for as long as the newer one is not needed.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-02-17 09:07:09 +01:00