Commit Graph

16035 Commits

Author SHA1 Message Date
Michal Privoznik
d1a7102389 virStringListLength: Ensure const correctness
The virStringListLength function does not ever modify the passed
string list. It merely counts the items in it. Make sure that we
reflect this bit in the function header.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>

(crobinso: fix up spacing and squash in sheepdog bit suggested
 by Andrea)
2016-02-09 15:44:58 -05:00
Michal Privoznik
73b70b403d virDomainFormatSchedDef: Initialize @priority
Older gcc fails to see that the variable is set iff @hasPriority
== true in which case the former is set a value. Initialize the
value while declaring it to make the compiler shut up.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-02-09 17:00:25 +01:00
Martin Kletzander
ea913d185d util: Get rid of virStringListLen()
It does exactly the same thing as virStringListLength() and it's used in
one place only.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-02-09 16:46:14 +01:00
Ján Tomko
99a6f30db0 leaseshelper: swap two parameters of virLeaseNew
My commit e11aa74 messed up the parameter order.

Reported by John Ferlan.
2016-02-09 13:15:59 +01:00
Ján Tomko
6951ab6881 vboxDumpDisplay: realign variable initializations
Remove the extra spaces, do not align them on '='.
2016-02-09 10:11:56 +01:00
Ján Tomko
c5972df7d5 vboxDumpDisplay: remove suspicious strlen
The return type of strlen is 'size_t', which is unsigned and therefore
never less than zero.

Use STREQ to make the check obvious.
2016-02-09 10:11:05 +01:00
Ján Tomko
5a16197459 vboxDumpDisplay: reuse the keyUtf16 variable
We free the key right after calling the API.

Reuse a single variable to remove the typo.
2016-02-09 10:11:02 +01:00
Ján Tomko
2ab95531ca vboxDumpDisplay: use VIR_APPEND_ELEMENT
Instead of open-coding it.
2016-02-09 10:10:25 +01:00
Ján Tomko
ec74a9da7a vboxDumpDisplay: check return of virDomainGraphicsListenSetAddress
Error out if the allocation failed.
2016-02-09 10:10:24 +01:00
Ján Tomko
fcecbb37bf vboxDumpDisplay: clean up VIR_STRDUP usage
Two VIR_STRDUP calls are redundant - just steal the string
converted by VBOX_UTF16_TO_UTF8.

Report an error when the third one fails.
2016-02-09 10:09:41 +01:00
Ján Tomko
8f8c473a98 vboxDumpDisplay: fill out the graphics structure earlier
Remove the need to track what type of graphics were present
by temporary variables.
2016-02-09 10:09:23 +01:00
Ján Tomko
026bcfdcad vboxDumpDisplay: allocate the graphics structure upfront
Allocate it as soon as we know we will need it.

Add it to def->ngraphics if it's allocated, removing the need
to use the addDesktop and totalPresent variables to track this.
2016-02-09 10:09:19 +01:00
Ján Tomko
ef98d93bed vboxDumpDisplay: split out def->graphics allocation
Separate allocation of the def->graphics array from the allocation
and initialization of its first element.

Note that the only possible values of totalPresent at this point
are 0 or 1, because it equals to guiPresent + sdlPresent.
2016-02-09 10:08:39 +01:00
Ján Tomko
2f2a0b2925 vboxDumpDisplay: remove extra virReportOOMError
VIR_ALLOC* already reported an error.
2016-02-09 10:08:11 +01:00
Ján Tomko
56886d5fdd vboxDumpDisplay: add addDesktop bool
When FRONTEND/Type is not any of "sdl", "gui", "vrdp", we add a DESKTOP.
Use a bool to track this, instead of checking that both
totalPresent ("sdl" or "gui" present) and vrdpPresent are zero.
2016-02-09 10:08:00 +01:00
Ján Tomko
bf1691e388 vboxDumpDisplay: more indentation reducing
VRDxEnabled is initialized to false. Put the if (VRDxEnabled)
on the top level to reduce nesting.
2016-02-09 10:07:57 +01:00
Ján Tomko
5cb926f90d vboxDumpDisplay: reduce indentation level
Use STREQ_NULLABLE instead of deep nesting.
2016-02-09 10:07:37 +01:00
Ján Tomko
2ea694053f Check return value of vboxDumpVideo
Error out on allocation failures instead of creating an incomplete
definition.

Fixes a possible crash when def->nvideos is 1, but def->videos is NULL.
2016-02-09 10:06:58 +01:00
Ján Tomko
e11aa74933 leaseshelper: split out virLeaseNew
For the actions ADD and OLD, split out creating the new lease object,
as well as getting the environment variables that do not affect
the parsing of command line arguments.
2016-02-09 08:48:14 +01:00
Peter Krempa
4f3db09bd5 qemu: iothread: Reuse qemuProcessSetupIOThread in iothread hotplug
Since majority of the steps is shared, the function can be reused to
simplify code.

Similarly to previous path doing this same for vCPUs this also fixes the
a similar bug (which is not tracked).
2016-02-08 17:05:00 +01:00
Peter Krempa
1dcc4c7ffd qemu: iothread: Aggregate code to set IOThread tuning
Rather than iterating 3 times for various settings this function
aggregates all the code into single place. One of the other advantages
is that it can then be reused for properly setting IOThread info on
hotplug.
2016-02-08 17:05:00 +01:00
Peter Krempa
c6bd15026b qemu: vcpu: Reuse qemuProcessSetupVcpu in vcpu hotplug
Since majority of the steps is shared, the function can be reused to
simplify code.

Additionally this resolves
https://bugzilla.redhat.com/show_bug.cgi?id=1244128 since the cpu
bandwidth limiting with cgroups would not be set on the hotplug path.

Additionally the code now sets the thread affinity and honors autoCpuset
as in the regular startup code path.
2016-02-08 17:05:00 +01:00
Peter Krempa
56971667ee qemu: vcpu: Aggregate code to set vCPU tuning
Rather than iterating 3 times for various settings this function
aggregates all the code into single place. One of the other advantages
is that it can then be reused for properly setting vCPU info on hotplug.

With this approach autoCpuset is also used when setting the process
affinity rather than just via cgroups.
2016-02-08 17:05:00 +01:00
Joao Martins
d9c57ca9f9 remote: enforce VIR_TYPED_PARAM_STRING_OKAY flag on client side serialization
Commit 8cd1d54 consolidates both daemon and remote driver typed param
serialization functions. The consolidation now enforces client to use
VIR_TYPED_PARAM_STRING_OKAY flag to properly serialize string parameters, which
server has used for quite some time now. And this caused an issue, since the
commit had not adjusted client remote calls appropriately, thus causing a
failure in blkiotune, numatune and migration APIs (as per Xen CI tests). This
patch adjusts both remote_driver.c and gendispatch.pl to properly address this
issue.

http://lists.xenproject.org/archives/html/xen-devel/2016-02/msg01012.html

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
2016-02-08 14:59:54 +01:00
Michal Privoznik
a0aa92a24b vircgroup: Update virCgroupGetPercpuStats stump
In the commit 7938b533 we've changed the function signature,
however forgot to update stump that's used on systems without
CGroups causing a build failure.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-02-08 14:06:30 +01:00
Peter Krempa
6dfb4507f5 conf: Fix how iothread scheduler info is stored
Similarly to previous commit change the way how iothread scheduler info
is stored and clean up a lot of unnecessary code.
2016-02-08 09:51:34 +01:00
Peter Krempa
99c5fe0e7c conf: Don't store vcpusched orthogonally to other vcpu info
Due to bad design the vcpu sched element is orthogonal to the way how
the data belongs to the corresponding objects. Now that vcpus are a
struct that allow to store other info too, let's convert the data to the
sane structure.

The helpers for the conversion are made universal so that they can be
reused for iothreads too.

This patch also resolves https://bugzilla.redhat.com/show_bug.cgi?id=1235180
since with the correct storage approach you can't have dangling data.
2016-02-08 09:51:34 +01:00
Peter Krempa
e1fa2571c5 conf: Extract code that formats <cputune>
virDomainDefFormatInternal is growing rather large. Extract the cputune
formatter into a separate function.
2016-02-08 09:51:34 +01:00
Peter Krempa
cc715e9391 conf: remove unused cpu pinning helpers and data structures
Now that the pinning info is stored elsewhere we can delete all the
obsolete code.
2016-02-08 09:51:34 +01:00
Peter Krempa
d2a6fc79e3 conf: Store cpu pinning data in def->vcpus
Now with the new struct the data can be stored in a much saner place.
2016-02-08 09:51:34 +01:00
Peter Krempa
856f254eef conf: Don't copy def->cpumask into cpu pinning info
This step can be omitted, so that drivers can decide what to do when the
user requests to use default vcpu pinning.
2016-02-08 09:51:34 +01:00
Peter Krempa
d0d341a30b qemu: Reuse qemuDomainDetectVcpuPids in cpu hot(un)plug
Now that qemuDomainDetectVcpuPids is able to refresh the vCPU pid
information it can be reused in the hotplug and hotunplug code paths
rather than open-coding a very similar algorithm.

A slight algorithm change is necessary for unplug since the vCPU needs
to be marked offline prior to calling the thread detector function and
eventually rolled back if something fails.
2016-02-08 09:51:34 +01:00
Peter Krempa
207e17031a qemu: Differentiate error codes when VM exits in qemuDomainDetectVcpuPids
Some callers will need to behave differently when the detection failed
and when the VM crashed during the redetection. Return -2 if it crashed.
2016-02-08 09:51:34 +01:00
Peter Krempa
7938b533d5 cgroup: Prepare for sparse vCPU topologies in virCgroupGetPercpuStats
Pass a bitmap of enabled guest vCPUs to virCgroupGetPercpuStats so that
non-continuous vCPU topologies can be used.
2016-02-08 09:51:34 +01:00
Peter Krempa
e84ab7938d conf: Move and optimize disk target duplicity checking
Move the logic from virDomainDiskDefDstDuplicates into
virDomainDiskDefCheckDuplicateInfo so that we don't have to loop
multiple times through the array of disks. Since the original function
was called in qemuBuildDriveDevStr, it was actually called for every
single disk which was quite wasteful.

Additionally the target uniqueness check needed to be duplicated in
the disk hotplug case, since the disk was inserted into the domain
definition after the device string was formatted and thus
virDomainDiskDefDstDuplicates didn't do anything in that case.
2016-02-08 09:35:01 +01:00
Peter Krempa
c07bc2cc7d qemu: process: Extract pre-start checks into a function
When starting a qemu process there are certain checks done to ensure
that the configuration makes sense. Extract them into a separate
function so that they can be reused in the test code.
2016-02-08 09:19:48 +01:00
Peter Krempa
c3e170647e qemu: process: Reorder operations on early VM startup
Retrieval of the driver capabilities as well as emulator capabilities
does not require the complete qemuProcessStop to be executed on
failure.
2016-02-08 09:08:38 +01:00
Peter Krempa
4f1324aa48 qemu: hotplug: Check duplicate disk serial/wwn on hotplug too
We do the check on VM start, but the user could still hotplug a disk
with a conflicting serial or WWN. Reuse the checker function to fix the
issue.
2016-02-08 09:08:38 +01:00
Peter Krempa
e76a848e3d conf: Extract code that checks disk serial/wwn conflict
Put it into a separate function that can be called on two disk def
pointers.
2016-02-08 09:08:38 +01:00
Peter Krempa
9e92a0b4c0 qemu: hotplug: Extract common code to qemuDomainAttachDeviceDiskLive
Target uniqueness check was duplicated in all of the three workers
called from it. Extract it to the parent.
2016-02-08 09:08:38 +01:00
Peter Krempa
43d9a14a21 qemu: hotplug: Use more common 'cleanup' label in qemuDomainAttachDeviceDiskLive 2016-02-08 09:08:38 +01:00
Peter Krempa
fab859d11f qemu: hotplug: Break up if/else statement into switch 2016-02-08 09:08:38 +01:00
Peter Krempa
99f9506a66 qemu: hotplug: Remove unnecessary variable 2016-02-08 09:08:38 +01:00
Peter Krempa
f8fee9337b qemu: hotplug: Use typecasted switch
Remove the default case since all cases are covered.
2016-02-08 09:08:38 +01:00
Peter Krempa
986831a8d4 qemu: snapshot: Avoid infinite loop if vCPUs can't be resumed
In b3d2a42e I've refactored the code and moved the 'cleanup' label.
Unfortunately the code that was originally in the 'endjob' label and
wanted to jump to cleanup is now in the cleanup label. Remove the jump
and let the function finish.
2016-02-08 08:50:00 +01:00
Peter Krempa
a9839fe044 qemu: snapshot: Don't overwrite existing errors when thawing filesystems
If we are attempting to thaw the filesystems on error, the code would
overwrite the error code that caused the snapshot to fail with the error
of thawing the filesystem. Since the thawing function allows control of
error reporting behavior we can use this feature.
2016-02-08 08:50:00 +01:00
Roman Bogorodskiy
6450c9e4cf nodedev: stub nodeDeviceSysfsGetPCIRelatedDevCaps
Add a stub for nodeDeviceSysfsGetPCIRelatedDevCaps() for non-Linux
platforms. It allows nodedev driver to work on non-Linux platoforms
that, however, have HAL.
2016-02-07 02:24:55 +03:00
John Ferlan
b8c0f18654 util: Fix virCgroupNewMachine ATTRIBUTE_NONNULL args
Commit id 'c3bd0019c0' removed arg3, but forgot to adjust the numbers
for NONNULL - caused build failure for coverity
2016-02-06 06:45:46 -05:00
Roman Bogorodskiy
dcb3d87d78 bhyve: fix preprocessor indentation
Syntax-check fails with:

cppi: src/bhyve/bhyve_driver.h: line 26: not properly indented
cppi: src/bhyve/bhyve_driver.h: line 27: not properly indented
maint.mk: incorrect preprocessor indentation

Fix by properly indenting '#include's.

Pushed as trivial.
2016-02-06 05:26:51 +03:00
Michal Privoznik
5147f4f3a3 bhyve: Fix the build
After 1036ddadb2 we use bhyveDriverGetCapabilities from other
sources too, not only from bhyve_driver.c. However, the function
was static so not properly expose to other files. In order to
expose it, we need to move couple of #include-s too.
Then, there has been a copy paste error in
virBhyveProcessReconnect: s/privconn/data->driver/.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-02-05 22:58:02 +01:00
Wido den Hollander
2aed051d0d rbd: Use %zu for uint64_t instead of casting to unsigned long long
This was only used in debugging messages and not in any real code.

Ceph/RBD uses uint64_t for sizes internally and they can be printed
with %zu without any need for casting.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2016-02-05 14:29:24 -05:00
Wido den Hollander
f4981ebf5d rbd: Code styling cleanup
Through the years the RBD storage pool code hasn't maintained the
same or correct coding standard which applies to libvirt.

This patch doesn't change any logic in the code, it only applies
the proper coding standards to the code where possible without
making large changes.

This way the code style used in this storage pool is consistent
throughout the whole file.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2016-02-05 14:29:24 -05:00
Michal Privoznik
a3b168d01a virSystemdGetMachineNameByPID: Initialize @reply
I've noticed that variable @reply is not initialized and if
something at the beginning of the function fails, e.g.
virDBusGetSystemBus(), the control jump straight to cleanup label
where dbus_message_unref() is then called over this uninitialized
variable.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-02-05 17:17:45 +01:00
Michal Privoznik
065054daa7 virnetdevbandwidth: Compute quantum value
I've noticed couple of warning in dmesg while debugging
something:

[ 9683.973754] HTB: quantum of class 10001 is big. Consider r2q change.
[ 9683.976460] HTB: quantum of class 10002 is big. Consider r2q change.

I've read the HTB documentation and linux kernel code to find out
what's wrong. Basically we need to pass another argument
"quantum" to our tc cmd line because the default computed by HTB
does not always work in which case the warning message is printed
out.

You can read more details here:

http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm#sharing

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-02-05 16:43:19 +01:00
Peter Krempa
173054ceea conf: Extract code for parsing thread resource scheduler info
As the scheduler info elements are represented orthogonally to how it
makes sense to actually store the information, the extracted code will
be later used when converting between XML and internal definitions.
2016-02-05 16:21:45 +01:00
Peter Krempa
e992aa21f7 conf: Add helper to return a bitmap of active iothread ids 2016-02-05 16:21:45 +01:00
Peter Krempa
9479642fd3 util: bitmap: Introduce bitmap subtraction
Performs binary subtraction of two bitmaps. Stores result in the first
operand.
2016-02-05 16:21:45 +01:00
Martin Kletzander
c3bd0019c0 systemd: Modernize machine naming
So, systemd-machined has this philosophy that machine names are like
hostnames and hence should follow the same rules.  But we always allowed
international characters in domain names.  Thus we need to modify the
machine name we are passing to systemd.

In order to change some machine names that we will be passing to systemd,
we also need to call TerminateMachine at the end of a lifetime of a
domain.  Even for domains that were started with older libvirt.  That
can be achieved thanks to virSystemdGetMachineNameByPID().  And because
we can change machine names, we can get rid of the inconsistent and
pointless escaping of domain names when creating machine names.

So this patch modifies the naming in the following way.  It creates the
name as <drivername>-<id>-<name> where invalid hostname characters are
stripped out of the name and if the resulting name is longer, it
truncates it to 64 characters.  That way we can start domains we
couldn't start before.  Well, at least on systemd.

To make it work all together, the machineName (which is needed only with
systemd) is saved in domain's private data.  That way the generation is
moved to the driver and we don't need to pass various unnecessary
arguments to cgroup functions.

The only thing this complicates a bit is the scope generation when
validating a cgroup where we must check both old and new naming, so a
slight modification was needed there.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-02-05 16:11:50 +01:00
Joao Martins
b8b03f64e1 conf: add caps to virDomainSnapshotDefFormat
The virDomainSnapshotDefFormat calls into virDomainDefFormat,
so should be providing a non-NULL virCapsPtr instance. On the
qemu driver we change qemuDomainSnapshotWriteMetadata to also
include caps since it calls virDomainSnapshotDefFormat.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2016-02-05 10:57:39 +00:00
Daniel P. Berrange
1036ddadb2 conf: add caps to virDomainObjFormat/SaveStatus
The virDomainObjFormat and virDomainSaveStatus methods
both call into virDomainDefFormat, so should be providing
a non-NULL virCapsPtr instance.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-02-05 10:57:08 +00:00
Roman Bogorodskiy
02a34d2af4 bhyve: fix build
Fix build fail introduced as a side effect of commit d239a54.

Pushed under the build breaker rule.
2016-02-05 05:36:26 +03:00
Nikolay Shirokovskiy
e29990c5a4 qemu migration: factor out setting migration option
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
2016-02-04 16:35:19 +01:00
Peter Krempa
41c987b72d Fix build after recent patches
Few build breaking mistakes in less-popular parts of our code.
2016-02-04 16:34:28 +01:00
John Ferlan
7de8b442ff logical: Clarify pieces of lvs regex
Rather than have a unwieldy regex string - split it up into its components
each having it's own #define and then combine in a different #define

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-02-04 10:15:30 -05:00
Joao Martins
a040ba9ed4 libxl: set net device prefix
Use the newly added virCapabilitiesSetNetPrefix to set
the network prefix for the driver. This in return will
be use by NetDefFormat() and NetDefParseXML() routines
to free any interface name that start with the registered
prefix.

Acked-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2016-02-04 12:47:42 +00:00
Joao Martins
cd57b7c742 conf: add caps to virDomainSaveConfig
virDomainSaveConfig calls virDomainDefFormat which was setting the caps
to NULL, thus keeping the old behaviour (i.e. not looking at
netprefix). This patch adds the virCapsPtr to the function and allows
the configuration to be saved and skipping interface names that were
registered with virCapabilitiesSetNetPrefix().

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2016-02-04 12:38:27 +00:00
Joao Martins
d239a5427f conf: add caps to virDomainDefFormat*
And use the newly added caps->host.netprefix (if it exists) for
interface names that match the autogenerated target names.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2016-02-04 12:38:26 +00:00
Joao Martins
481e9bd0f6 conf: add prefix in virDomainNetDefParseXML
And use the newly added caps->host.netprefix for free interface
names that match the autogenerated target names.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2016-02-04 11:15:51 +00:00
Joao Martins
819d1d9438 conf: add net device prefix to capabilities
In the reverted commit d2e5538b1, the libxl driver was changed to copy
interface names autogenerated by libxl to the corresponding network def
in the domain's virDomainDef object. The copied name is freed when the
domain transitions to the shutoff state. But when migrating a domain,
the autogenerated name is included in the XML sent to the destination
host.  It is possible an interface with the same name already exists on
the destination host, causing migration to fail.

This patch defines a new capability for setting the network device
prefix that will be used in the driver. Valid prefixes are
VIR_NET_GENERATED_PREFIX or the one announced by the driver.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2016-02-04 11:15:51 +00:00
Roman Bogorodskiy
c94f6d4dff storage: zfs: flexible use of 'volmode' option
There are slight differences in various ZFS implementations.
Specifically, ZFS on FreeBSD requires to set value of 'volmode'
option to 'dev' to expose volumes as raw disk device (that's what
we need) rather than geom provides, for example.

With ZFS on Linux, however, such option is not available and
volumes exposed like we need by default.

To make our implementation more flexible, only pass 'volmode'
when it's supported. Support is checked by parsing usage
information of the 'zfs get' command.
2016-02-04 03:16:50 +03:00
Erik Skultety
8cd1d546e6 util: Export remoteSerializeTypedParameters internally via util
Same as for deserializer, this method might get handy for admin one day.
The major reason for this patch is to stay consistent with idea, i.e.
when deserializer can be shared, why not serializer as well. The only
problem to be solved was that the daemon side serializer uses a code
snippet which handles sparse arrays returned by some APIs as well as
removes any string parameters that can't be returned to older clients.
This patch makes of the new virTypedParameterRemote datatype introduced
by one of the pvious patches.
2016-02-03 15:46:45 +01:00
Erik Skultety
9afc115f73 util: Export remoteFreeTypedParameters internally via util
Since the method is static to remote_driver, it can't even be used by our
daemon. Other than that, it would be useful to be able to use it with admin as
well. This patch uses the new virTypedParameterRemote datatype introduced in
one of previous patches.
2016-02-03 15:46:45 +01:00
Erik Skultety
0472cef685 util: Export remoteDeserializeTypedParameters internally via util
Currently, the deserializer is hardcoded into remote_driver which makes
it impossible for admin to use it. One way to achieve a shared implementation
(besides moving the code to another module) would be pass @ret_params_val as a
void pointer as opposed to the remote_typed_param pointer and add a new extra
argument specifying which of those two protocols is being used and typecast
the pointer at the function entry. An example from remote_protocol:

struct remote_typed_param_value {
        int type;
        union {
                int i;
                u_int ui;
                int64_t l;
                uint64_t ul;
                double d;
                int b;
                remote_nonnull_string s;
        } remote_typed_param_value_u;
};
typedef struct remote_typed_param_value remote_typed_param_value;

struct remote_typed_param {
        remote_nonnull_string field;
        remote_typed_param_value value;
};

That would leave us with a bunch of if-then-elses that needed to be used across
the method. This patch takes the other approach using the new datatype
introduced in one of earlier commits.
2016-02-03 15:46:45 +01:00
Erik Skultety
41a459947f util: Introduce virTypedParameterRemote datatype
Both admin and remote protocols define their own types
(remote_typed_param vs admin_typed_param). Because of the naming convention,
admin typed params wouldn't be able to reuse the serialization/deserialization
methods, which are tailored for use by remote protocol, even if those method
were exported properly. In that case, introduce a new internal data type
structurally copying both admin and remote protocols which, eventually, would
allow serializer and deserializer to be used in a more generic way.
2016-02-03 15:46:45 +01:00
Nikolay Shirokovskiy
1e93470df0 qemu: qemuDomainRename and virDomainObjListNumOfDomains ABBA deadlock fix
A pretty nasty deadlock occurs while trying to rename a VM in parallel
with virDomainObjListNumOfDomains.
The short description of the problem is as follows:

Thread #1:

qemuDomainRename:
    ------> aquires domain lock by qemuDomObjFromDomain
       ---------> waits for domain list lock in any of the listed functions:
          - virDomainObjListFindByName
          - virDomainObjListRenameAddNew
          - virDomainObjListRenameRemove

Thread #2:

virDomainObjListNumOfDomains:
    ------> aquires domain list lock
       ---------> waits for domain lock in virDomainObjListCount

Introduce generic virDomainObjListRename function for renaming domains.
It aquires list lock in right order to avoid deadlock. Callback is used
to make driver specific domain updates.

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-02-03 15:20:11 +01:00
Martin Kletzander
92757d4d2d systemd: Add virSystemdGetMachineNameByPID
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-02-03 14:57:43 +01:00
Martin Kletzander
9ba2646291 Revert "systemd: Escape only needed characters for machined"
This reverts commit 0e0149ce91.

That commit was added to comply with systemd rules that were changed in
the meantime, so this patch is pointless.
2016-02-03 14:44:37 +01:00
Ján Tomko
e1d7273f24 Simplify virDomainParseMemory
Do not store the return value of virDomainParseScaledValue,
it was overwritten anyway.

Delete the cleanup label, there is nothing to clean up.
2016-02-03 13:22:43 +01:00
Peter Krempa
598927a5bc conf: Split out logic to determine whether cpupin was provided 2016-02-03 13:10:04 +01:00
Peter Krempa
451b955d62 qemu: domain: Prepare qemuDomainDetectVcpuPids for reuse
Free the old vcpupids array in case when this function is called again
during the run of the VM. It will be later reused in the vCPU hotplug
code. The function now returns the number of detected VCPUs.
2016-02-03 13:10:04 +01:00
Peter Krempa
e97d1d20b1 qemu: Move and rename qemuProcessDetectVcpuPIDs to qemuDomainDetectVcpuPids
Future patches will tweak and reuse the function in different places so
move it separately first.
2016-02-03 13:10:04 +01:00
Peter Krempa
a190744aa9 qemu: cpu hotplug: Set vcpu state directly in the new structure
Avoid using virDomainDefSetVcpus when we can set it directly in the
structure.
2016-02-03 13:10:04 +01:00
Peter Krempa
9bf284daa9 conf: Add helper to retrieve bitmap of active vcpus for a definition
In some cases it may be better to have a bitmap representing state of
individual vcpus rather than iterating the definition. The new helper
creates a bitmap representing the state from the domain definition.
2016-02-03 13:10:04 +01:00
Peter Krempa
58578f83bc cgroup: Clean up virCgroupGetPercpuStats
Use 'ret' for return variable name, clarify use of 'param_idx' and avoid
unnecessary 'success' label. No functional changes. Also document the
function.
2016-02-03 13:10:04 +01:00
Martin Kletzander
1794a0103a qemu: Don't crash when create fails early
Since commit 7140807917 we are generating
socket path later than before -- when starting a domain.  That makes one
particular inconsistent state of a chardev, which was not possible
before, currently valid.  However, SELinux security driver forgot to
guard the main restoring function by a check for NULL-paths.  So make it
no-op for NULL paths, as in the DAC driver.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1300532

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-02-03 11:01:42 +01:00
Dmitry Andreev
d2dabff3a0 qemuDomainResume: allow to resume domain with guest panicked
In case of guest panicked, preserved crashed domain has stopped CPUs.
It's not possible to use tools like WinDbg for the problem investigation
until we start CPUs back.
2016-02-03 10:33:48 +01:00
Nikolay Shirokovskiy
4a67b044fb qemu: return -1 on error paths in qemuDomainSaveImageStartVM
Error paths after sending the event that domain is started written as if ret = -1
which is set at the beginning of the function. It's common idioma to keep 'ret'
equal to -1 until the end of function where it is set to 0. But here we use ret
to keep result of restore operation too and thus breaks the idioma and its users :)

Let's use different variable to hold restore result.

Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
2016-02-03 10:27:35 +01:00
John Ferlan
6ec319b84f logical: Clean up allocation when building regex on the fly
Rather than a loop reallocating space to build the regex, just allocate
it once up front, then if there's more than 1 nextent, append a comma and
another regex_unit string.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-02-02 13:13:05 -05:00
John Ferlan
c6d526f33f logical: Use 'stripes' value for mirror/raid segtype
The 'stripes' value is described as the "Number of stripes or mirrors in
a logical volume". So add "mirror" and anything that starts with "raid"
to the list of segtypes that can have an 'nextents' value greater than one.
Use of raid segtypes (raid1, raid4, raid5*, raid6*, and raid10) is favored
over mirror in more recent lvm code.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-02-02 13:13:01 -05:00
John Ferlan
69267756d0 logical: Use VIR_APPEND_ELEMENT instead of VIR_REALLOC_N
Rather than preallocating a set number of elements, then walking through
the extents and adjusting the specific element in place, use the APPEND
macros to handle that chore.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-02-02 13:12:57 -05:00
Cole Robinson
92549b3b8a qemu: Mark some functions as static 2016-02-01 10:33:25 -05:00
Michal Privoznik
c779bf8f62 fdstream: Realign
Some lines in this file are misaligned which fires up my OCD.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-02-01 16:12:22 +01:00
Boris Fiuczynski
f73ad5d47e qemu: Align dump options for watchdog and on_crash events
Having on_crash set to either coredump-destroy or coredump-restart
creates core dumps with option memory-only in the directory specified
by auto_dump_path. When a watchdog is triggered with the action dump
the core dump is also placed into the directory specified by auto_dump_path
but is created without the option memory-only.

This patch sets the option memory-only also for core dumps created by the
watchdog event.

Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
2016-02-01 13:47:56 +01:00
John Ferlan
63e15ad5e0 logical: Create helper virStorageBackendLogicalParseVolExtents
Create a helper routine in order to parse any extents information
including the extent size, length, and the device string contained
within the generated 'lvs' output string.

A future patch would then be able to avoid the code more cleanly

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-01-29 14:13:14 -05:00
Wido den Hollander
84678267e4 rbd: Open in Read-Only mode when refreshing a volume
By opening a RBD volume in Read-Only we do not register a
watcher on the header object inside the Ceph cluster.

Refreshing a volume only calls rbd_stat() which is a operation
which does not write to a RBD image.

This allows us to use a cephx user which has no write
permissions if we would want to use the libvirt storage pool
for informational purposes only.

It also saves us a write into the Ceph cluster which should
speed up refreshing a RBD pool.

rbd_open_read_only() is available in all librbd versions which
also support rbd_open().

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2016-01-29 14:09:34 -05:00
Wido den Hollander
0b15f92032 rbd: Implement buildVolFrom using RBD cloning
RBD supports cloning by creating a snapshot, protecting it and create
a child image based on that snapshot afterwards.

The RBD storage driver will try to find a snapshot with zero deltas between
the current state of the original volume and the snapshot.

If such a snapshot is found a clone/child image will be created using
the rbd_clone2() function from librbd.

rbd_clone2() is available in librbd since Ceph version Dumpling (0.67) which
dates back to August 2013.

It will use the same features, strip size and stripe count as the parent image.

This implementation will only create a single snapshot on the parent image if
never changes. This reduces the amount of snapshots created for that RBD image
which benefits the performance of the Ceph cluster.

During build the decision will be made to use either rbd_diff_iterate() or
rbd_diff_iterate2().

The latter is faster, but only available on Ceph versions after 0.94 (Hammer).

Cloning is only supported if RBD format 2 is used. All images created by libvirt
are already format 2.

If a RBD format 1 image is used as the original volume the backend will report
a VIR_ERR_OPERATION_UNSUPPORTED error.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2016-01-29 11:11:51 -05:00
Wido den Hollander
34872ca461 rbd: Add support for wiping RBD volumes using TRIM.
Using VIR_STORAGE_VOL_WIPE_ALG_TRIM a RBD volume can be trimmed down
to 0 bytes using rbd_discard()

Effectively all the data on the volume will be lost/gone, but the volume
remains available for use afterwards.

Starting at offset 0 the storage pool will call rbd_discard() in stripe
size * count increments which is usually 4MB. Stripe size being 4MB and
count 1.

rbd_discard() is available since Ceph version Dumpling (0.67) which dates
back to August 2013.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2016-01-29 11:11:32 -05:00
Wido den Hollander
63cdc92f04 storage: Add TRIM algorithm to storage volume API
This new algorithm adds support for wiping volumes using TRIM.

It does not overwrite all the data in a volume, but it tells the
backing storage pool/driver that all bytes in a volume can be
discarded.

It depends on the backing storage pool how this is handled.

A SCSI backend might send UNMAP commands to remove all data present
on a LUN.

A Ceph backend might use rbd_discard() to instruct the Ceph cluster
that all data on that RBD volume can be discarded.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2016-01-29 11:09:14 -05:00
Wido den Hollander
f226ecbfbb rbd: Add support for wiping RBD volumes
When wiping the RBD image will be filled with zeros started
at offset 0 and until the end of the volume.

This will result in the RBD volume growing to it's full allocation
on the Ceph cluster. All data on the volume will be overwritten
however, making it unavailable.

It does NOT take any RBD snapshots into account. The original data
might still be in a snapshot of that RBD volume.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2016-01-29 10:42:36 -05:00
Wido den Hollander
69535c6124 storage: Adjust fix virStorageBackendVolWipeLocal switch
Use the cast of (virStorageVolWipeAlgorithm) adding the missing case:'s
(VIR_STORAGE_VOL_WIPE_ALG_ZERO and VIR_STORAGE_VOL_WIPE_ALG_LAST).

Additionally, the old code would also still run the SCRUB command on
default since it didn't go to cleanup when a invalid flag was supplied.
We now go to cleanup and exit if a invalid flag would be provided.

Signed-off-by: Wido den Hollander <wido@widodh.nl>
2016-01-29 10:24:20 -05:00
John Ferlan
680030c42b logical: Fix comment examples for virStorageBackendLogicalFindLVs
When commit id '82c1740a' made changes to the output format (changing from
using a ',' separator to '#'), the examples in the lvs output from the
comments weren't changed.

Additionally, the two new fields added ('segtype' and 'stripes') were
not included in the output, leaving it well confusing.

This patch fixes the sample output, adds a 'striped' example, and makes
other comment related adjustments for long line and spacing between followup
'NB' remarks (while I'm there).

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-01-28 16:50:46 -05:00
Andrea Bolognani
11ef5869fb pci: Use bool return type for some virPCIDeviceGet*() functions
The affected functions are:

  virPCIDeviceGetManaged()
  virPCIDeviceGetUnbindFromStub()
  virPCIDeviceGetRemoveSlot()
  virPCIDeviceGetReprobe()

Change their return type from unsigned int to bool: the corresponding
members in struct _virPCIDevice are defined as bool, and even the
corresponding virPCIDeviceSet*() functions take a bool value as input
so there's no point in these functions having unsigned int as return
type.

Suggested-by: John Ferlan <jferlan@redhat.com>
2016-01-28 17:27:58 +01:00
Michal Privoznik
3f3f7a824c gendispatch: Don't output spaces on empty line
In our generator for some code we put empty lines in the output
to separate blocks of code. However, in some cases we put couple
of spaces on the empty line too. It's not bug, it just isn't
nice.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-28 17:10:54 +01:00
Andrea Bolognani
171607296d pci: Add debug messages when unbinding from stub driver
Unbinding a PCI device from the stub driver can require several steps,
and it can be useful for debugging to be able to trace which of these
steps are performed and which are skipped for each device.
2016-01-28 12:20:53 +01:00
Andrea Bolognani
771eaeb2b3 pci: Phase out virPCIDeviceReattachInit()
The name is confusing, and there are just two uses: one is a test case,
and the other will be removed as part of an upcoming refactoring of
the hostdev code.
2016-01-28 11:31:28 +01:00
Peter Krempa
d773b57d22 qemu: don't iterate vcpus using priv->nvcpupids in qemuProcessSetSchedParams
This should be the last offender.
2016-01-28 09:58:24 +01:00
Peter Krempa
763941749e conf: disallow empty cpuset for emulatorpin
It's disallowed in the API.
2016-01-27 17:27:54 +01:00
Peter Krempa
31b782a147 conf: disallow empty cpusets for vcpu pinning when parsing XML
They are disallowed in the pinning API and as default cpuset.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1293241
2016-01-27 17:27:54 +01:00
Peter Krempa
414b7eeae9 qemu: Don't use priv->ncpus to iterate cgroup setting
Iterate over all cpus skipping inactive ones.
2016-01-27 17:27:54 +01:00
Andrea Bolognani
d87f0c0052 virnetdevopenvswitch: Don't call strlen() twice on the same string
Commit 871e10f fixed a memory corruption error, but called strlen()
twice on the same string to do so. Even though the compiler is
probably smart enough to optimize the second call away, having a
single invocation makes the code slightly cleaner.

Suggested-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-27 13:01:24 +01:00
Michal Privoznik
720bc953f8 virnetdevmacvlan: Provide stubs for build without macvtap
In 370608b4c7 we have introduced two new internal APIs.
However, there are no stubs for build without macvtap. Therefore
build on systems lacking macvtap support (e.g. mingw or freebds)
fails when trying to link.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-27 10:07:46 +01:00
Jason J. Herne
871e10fc95 Fix libvirtd free() segfault when migrating guest with deleted open vswitch port
libvirtd crashes on free()ing portData for an open vswitch port if that port
was deleted.  To reproduce:

ovs-vsctl del-port vnet0
virsh migrate --live kvm1 qemu+ssh://dstHost/system

Error message:
libvirtd: *** Error in `/usr/sbin/libvirtd': free(): invalid pointer: 0x000003ff90001e20 ***

The problem is that virCommandRun can return an empty string in the event that
the port being queried does not exist. When this happens then we are
unconditionally overwriting a newline character at position strlen()-1. When
strlen is 0, we overwrite memory that does not belong to the string.

The fix: Only overwrite the newline if the string is not empty.

Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
2016-01-27 10:01:58 +01:00
Laine Stump
370608b4c7 util: keep/use a bitmap of in-use macvtap devices
This patch creates two bitmaps, one for macvlan device names and one
for macvtap. The bitmap position is used to indicate that libvirt is
currently using a device with the name macvtap%d/macvlan%d, where %d
is the position in the bitmap. When requested to create a new
macvtap/macvlan device, libvirt will now look for the first clear bit
in the appropriate bitmap and derive the device name from that rather
than just starting at 0 and counting up until one works.

When libvirtd is restarted, the qemu driver code that reattaches to
active domains calls the appropriate function to "re-reserve" the
device names as it is scanning the status of running domains.

Note that it may seem strange that the retry counter now starts at
8191 instead of 5. This is because we now don't do a "pre-check" for
the existence of a device once we've reserved it in the bitmap - we
move straight to creating it; although very unlikely, it's possible
that someone has a running system where they have a large number of
network devices *created outside libvirt* named "macvtap%d" or
"macvlan%d" - such a setup would still allow creating more devices
with the old code, while a low retry max in the new code would cause a
failure. Since the objective of the retry max is just to prevent an
infinite loop, and it's highly unlikely to do more than 1 iteration
anyway, having a high max is a reasonable concession in order to
prevent lots of new failures.
2016-01-26 12:20:04 -05:00
Leno Hou
8c70d04bab util: increase libnl buffer size
In the following cases nl_recv() was returning the error "No buffer
space available":

* When switching CPUs to offline/online in a system more than 128 cpus
* When using virsh to destroy domain in a system with many interfaces

This patch sets the buffer size for all netlink sockets created by
libnl to 128K and turns on message peeking for nl_recv(). This
eliminates the "No buffer space available" errors seen in the cases
above, and also preempts other future errors the smaller buffers could
have caused.

Signed-off-by: Leno Hou <houqy@linux.vnet.ibm.com>
Signed-off-by: Laine Stump <laine@laine.org>
2016-01-26 12:20:04 -05:00
Pavel Hrdina
36785c7e77 device: cleanup input device code
The current code was a little bit odd.  At first we've removed all
possible implicit input devices from domain definition to add them later
back if there was any graphics device defined while parsing XML
description.  That's not all, while formating domain definition to XML
description we at first ignore any input devices with bus different to
USB and VIRTIO and few lines later we add implicit input devices to XML.

This seems to me as a lot of code for nothing.  This patch may look
to be more complicated than original approach, but this is a preferred
way to modify/add driver specific stuff only in those drivers and not
deal with them in common parsing/formating functions.

The update is to add those implicit input devices into config XML to
follow the real HW configuration visible by guest OS.

There was also inconsistence between our behavior and QEMU's in the way,
that in QEMU there is no way how to disable those implicit input devices
for x86 architecture and they are available always, even without graphics
device.  This applies also to XEN hypervisor.  VZ driver already does its
part by putting correct implicit devices into live XML.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2016-01-26 17:53:33 +01:00
Michal Privoznik
c7f5e26b5f vircgroup: Finish renaming of virCgroupIsolateMount
In dc576025c3 we renamed virCgroupIsolateMount function to
virCgroupBindMount. However, we forgot about one occurrence in
section of the code which provides stubs for platforms without
support for CGroups like *BSD for instance.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-26 17:39:47 +01:00
Daniel P. Berrange
dc576025c3 lxc: don't try to hide parent cgroups inside container
On the host when we start a container, it will be
placed in a cgroup path of

   /machine.slice/machine-lxc\x2ddemo.scope

under /sys/fs/cgroup/*

Inside the containers' namespace we need to setup
/sys/fs/cgroup mounts, and currently will bind
mount /machine.slice/machine-lxc\x2ddemo.scope on
the host to appear as / in the container.

While this may sound nice, it confuses applications
dealing with cgroups, because /proc/$PID/cgroup
now does not match the directory in /sys/fs/cgroup

This particularly causes problems for systems and
will make it create repeated path components in
the cgroup for apps run in the container eg

  /machine.slice/machine-lxc\x2ddemo.scope/machine.slice/machine-lxc\x2ddemo.scope/user.slice/user-0.slice/session-61.scope

This also causes any systemd service that uses
sd-notify to fail to start, because when systemd
receives the notification it won't be able to
identify the corresponding unit it came from.
In particular this break rabbitmq-server startup

Future kernels will provide proper cgroup namespacing
which will handle this problem, but until that time
we should not try to play games with hiding parent
cgroups.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-01-26 16:11:32 +00:00
Daniel P. Berrange
511e7c5bba qemu: add reporting of vCPU wait time
The VIR_DOMAIN_STATS_VCPU flag to virDomainListGetStats
enables reporting of stats about vCPUs. Currently we
only report the cumulative CPU running time and the
execution state.

This adds reporting of the wait time - time the vCPU
wants to run, but the host scheduler has something else
running ahead of it.

The data is reported per-vCPU eg

$ virsh domstats --vcpu demo
 Domain: 'demo'
   vcpu.current=4
   vcpu.maximum=4
   vcpu.0.state=1
   vcpu.0.time=1420000000
   vcpu.0.wait=18403928
   vcpu.1.state=1
   vcpu.1.time=130000000
   vcpu.1.wait=10612111
   vcpu.2.state=1
   vcpu.2.time=110000000
   vcpu.2.wait=12759501
   vcpu.3.state=1
   vcpu.3.time=90000000
   vcpu.3.wait=21825087

In implementing this I notice our reporting of CPU execute
time has very poor granularity, since we are getting it
from /proc/$PID/stat. As a future enhancement we should
prefer to get CPU execute time from /proc/$PID/schedstat
or /proc/$PID/sched (if either exist on the running kernel)

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-01-26 14:34:23 +00:00
Peter Krempa
356e28b35e util: buffer: Sanitize comment for virBufferAddBuffer
Idioms are usually weird and obscure when translated literally.
2016-01-25 17:53:08 +01:00
Peter Krempa
7141fc7a27 test: Touch up error message when attempting to pin invalid vCPU
Report
error: invalid argument: requested vcpu '100' is not present in the domain
instead of
error: invalid argument: requested vcpu is higher than allocated vcpus
2016-01-25 17:53:08 +01:00
Peter Krempa
51f07d8f0f (qemu|lxc)DomainGetCPUStats: Clean up
Remove unnecessary condition and variable.
2016-01-25 17:45:09 +01:00
Peter Krempa
68ee703bfe vz: Fix invalid iteration of def->cputune.vcpupin
The array doesn't necessarily have the same cardinality as the count of
vCPUs for a domain. Iterating it can cause access beyond the end of the
array.
2016-01-25 17:45:09 +01:00
Peter Krempa
b3c91b8a50 qemu: process: Disallow VMs with 0 vcpus
Counterintuitively the user would end up with a VM with maximum number
of vCPUs available.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1290324
2016-01-25 17:45:09 +01:00
Peter Krempa
adca15cf15 qemu: process: refactor and rename qemuValidateCpuMax to qemuValidateCpuCount
Next patch will add minimum checking, so use a more generic name.
Refactor return values to the commonly used semantics.
2016-01-25 17:45:09 +01:00
Michal Privoznik
35c3aab44d vmx: Adapt to emptyBackingString for cdrom-image
https://bugzilla.redhat.com/show_bug.cgi?id=1266088

We are missing this value for cdrom-image device. It seems like
there's no added value to extend this to other types of disk
devices [1].

1: https://www.redhat.com/archives/libvir-list/2016-January/msg01038.html

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-25 08:34:23 +01:00
Peter Krempa
4ac14cde9a qemu: snapshot: Correctly report qemu error on 'savevm'
Since 'savevm' was not converted to QMP libvirt has to parse for error
strings in the text monitor output. One of the unhandled errors is
produced when qemu treats a device as unmigratable.

As current qemu actually does support AHCI migration this bug is
applicable only to older versions of qemu.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1293899
2016-01-25 07:21:25 +01:00
Peter Krempa
0c1b0d83bb qemu: monitor: Refactor error handling for 'savevm'
Unify few error conditions into a single error reporting case.
2016-01-25 07:21:25 +01:00
Roman Bogorodskiy
ef01addb38 bhyve: bhyveload: respect boot dev and boot order
Make bhyveload respect boot order as specified by os.boot section of the
domain XML or by "boot order" for specific devices. As bhyve does not
support a real boot order specification right now, it's just about
choosing a single device to boot from.
2016-01-25 04:19:33 +03:00
Roman Bogorodskiy
318ae9f3be conf: expose virDomainBootType(From|To)String
These functions are going to be used by the Bhyve driver.
2016-01-25 03:54:07 +03:00
Laine Stump
29cc45cb79 util: reset MAC address of macvtap passthrough physdev after disassociate
libvirt always resets the MAC address of the physdev used for macvtap
passthrough when the guest is finished with it. This was happening
prior to the 802.1Qb[gh] DISASSOCIATE command, and was quite often
failing, presumably because the driver wouldn't allow the MAC address
to be reset while the association was still active, with a log message
like this:

virNetDevSetMAC:168 : Cannot set interface MAC to 00:00:00:00:00:00 on 'eth13': Cannot assign requested address

This patch changes the order - we now do the 802.1Qb[gh] disassociate
and delete the macvtap interface first, then and reset the MAC
address.
2016-01-22 13:16:24 -05:00
Cole Robinson
81da8bc73b lxc: fuse: Stub out Slab bits in /proc/meminfo
'free' on fedora23 wants to use the Slab field for calculated used
memory. The equation is:

used = MemTotal - MemFree - (Cached + Slab) - Buffers

We already set Cached and Buffers to 0, do the same for Slab and its
related values

https://bugzilla.redhat.com/show_bug.cgi?id=1300781
2016-01-22 08:32:00 -05:00
Cole Robinson
c7be484d11 lxc: fuse: Fill in MemAvailable for /proc/meminfo
'free' on Fedora 23 will use MemAvailable to calculate its 'available'
field, but we are passing through the host's value. Set it to match
MemFree, which is what 'free' will do for older linux that don't have
MemAvailable

https://bugzilla.redhat.com/show_bug.cgi?id=1300781
2016-01-22 08:32:00 -05:00
Cole Robinson
8418245a7e lxc: fuse: Fix /proc/meminfo size calculation
We virtualize bits of /proc/meminfo by replacing host values with
values specific to the container.

However for calculating the final size of the returned data, we are
using the size of the original file and not the altered copy, which
could give garbelled output.
2016-01-22 08:32:00 -05:00
Cole Robinson
f65dcfcd14 lxc: fuse: Unindent meminfo logic
Reverse the conditional at the start so we aren't stuffing all the logic
in an 'if' block
2016-01-22 08:32:00 -05:00
Ian Campbell
daeace5c5d libxl: Support cmdline= in xl config files
... and consolidate the cmdline/extra/root parsing to facilitate doing
so.

The logic is the same as xl's parse_cmdline from the current xen.git master
branch (e6f0e099d2c17de47fd86e817b1998db903cab61).

On the formatting side switch to producing cmdline= instead of extra=.

Update a few tests and add serveral more.
  - test-cmdline is added to test the exclusive use of cmdline.
  - test-fullvirt-direct-kernel-boot.cfg is updated due to the switch
    on the formatting side and now tests the exclusive use of cmdline=.
  - Tests are added for both paravirt and fullvirt where the .cfg uses
    extra= and (paravirt only) root=. These are format (xl->xml) only
    since the inverse will generate cmdline= hence is not a round trip
    (which was already true if using root=, which used to generate
    extra= on the way back).
  - Tests are added for both paravirt and fullvirt where the .cfg
    declares cmdline= as well as bogus extra= and (paravirt only) root=
    entries which should be ignored. Again these are format only tests
    since the inverse won't include the bogus lines.

The last two bullets here required splitting the DO_TEST macro into
two halves, as is done in the xmconfigtest.c case.

In order to introduce a use of VIR_WARN for logging I had to add
virerror.h and VIR_LOG_INIT.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
2016-01-21 10:48:44 -07:00
Joao Martins
d18d6a85f9 libxl: dispose libxl_dominfo after libxl_domain_info()
As suggested in a previous thread [0] this patch adds some missing calls
to libxl_dominfo_{init,dispose} when doing some of the libxl_domain_info
operations which would otherwise lead to memory leaks.

[0]
https://www.redhat.com/archives/libvir-list/2015-September/msg00519.html

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2016-01-21 09:49:57 -07:00
Jim Fehlig
8c3c32f16a Xen: add XENXL to virErrorDomain enum
Add "Xen XL Config" to the virErrorDomain enum and use it in
src/xenconfig/xen_xl.c.
2016-01-21 09:31:39 -07:00
Jim Fehlig
7d3698b47c Xen: VIR_FROM_THIS cleanup
The virErrorDomain enum has VIR_FROM_XEN, VIR_FROM_XEND,
VIR_FROM_XENSTORE, VIR_FROM_SEXPR, and VIR_FROM_XENXM. Use
these elements in the corresponding .c files. While at it,
remove the VIR_FROM_THIS define in src/xenconfig/xenxs_private.h.
2016-01-21 09:31:39 -07:00
Jiri Denemark
56635345ad qemu: Add support for migration iteration event
The corresponding event in QEMU is called MIGRATION_PASS.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-01-21 16:36:08 +01:00
Jiri Denemark
0b50f4a025 Introduce migration iteration event
The VIR_DOMAIN_EVENT_ID_MIGRATION_ITERATION event will be triggered
whenever VIR_DOMAIN_JOB_MEMORY_ITERATION changes its value, i.e.,
whenever a new iteration over guest memory pages is started during
migration.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-01-21 16:36:08 +01:00
Dmitry Andreev
e2b86f580c qemuDomainReboot: use fakeReboot=true only for acpi mode
When acpi is used to reboot/shutdown qemu domain, qemu emits
SHUTDOWN event. Libvirt uses fakeReboot variable in order to
differentiate reboot or shutdown. fakeReboot value is reseted
to false after domain restart/reset.

When mode=agent is used to reboot qemu domain, qemu doesn't emit
SHUTDOWN event and libvirt doesn't reset fakeReboot value to false.
In this case next 'shutdown -h now' performs reboot. That's why
we don't need to set fakeReboot=true for mode=agent.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-01-21 15:03:56 +01:00
Cole Robinson
a1edb05c60 build: predictably generate systemtap tapsets (bz 1173641)
The generated output is dependent on perl hashtable ordering, which
gives different results for i686 and x86_64. Fix this by sorting
the hash keys before iterating over them

https://bugzilla.redhat.com/show_bug.cgi?id=1173641
2016-01-20 10:26:02 -05:00
Ján Tomko
ce9085eba1 leaseshelper: reduce indentation level in virLeaseReadCustomLeaseFile
Instead of nested ifs, jump out early.

Mostly whitespace changes.
2016-01-20 10:01:52 +01:00
Ján Tomko
d7049a67b6 leaseshelper: remove useless comparison
We do not care if the mac was specified in the delete section,
we are going to delete the record anyway.
2016-01-20 09:35:50 +01:00
Ján Tomko
99569948d3 leaseshelper: move comment about adding IPv6 leases
The comment is relevant to the ADD action, not DEL.
2016-01-20 09:34:47 +01:00
Ján Tomko
21fb379549 leaseshelper: split out virLeasePrintLeases
Introduce a function for printing the leases on the 'init' operation.
2016-01-20 09:33:44 +01:00
Ján Tomko
7f9c425bfb leaseshelper: split out custom leases file read
Introduce virLeaseReadCustomLeaseFile which will populate
the new leases array with all the leases, except for expired
ones and the ones matching 'ip_to_delete'.

This removes five variables from main().
2016-01-20 09:33:44 +01:00
Ján Tomko
9e7e7662bf leaseshelper: store server_duid as an allocated string
We either use the value from the environment variable, or learn it from
the existing lease file.

In the second case, the pointer would be pointing into the JSON object
of the first lease with a DUID, owned by leases_array, then
leases_array_new.

Always allocate the string instead, making obvious who should free the
string.
2016-01-20 09:33:44 +01:00
Ján Tomko
df9fe124d6 leaseshelper: fix crash when no mac is specified
If dnsmasq specified DNSMASQ_IAID (so we're dealing with an IPv6
lease) but no DNSMASQ_MAC, we skip creation of the new lease object.

Also skip adding it to the leases array.

https://bugzilla.redhat.com/show_bug.cgi?id=1202350
2016-01-20 09:32:59 +01:00
John Ferlan
020135dc85 storage: Add new flag for libvirt_parthelper
https://bugzilla.redhat.com/show_bug.cgi?id=1265694

In order to be able to process disk storage pool's using a multipath
device to handle the partitions, libvirt_parthelper will need a way to
not automatically add a partition separator "p" to the generated device
name for each partition found. This is designed to mimic the multipath
features known as 'user_friendly_names' and custom 'alias' name.

If the part_separator attribute is set to "no", then generation of the
multipath partition name will not include the "p" partition separator
unless the source device path name ends with a number. The generated
partition names that get passed back to libvirt are processed in order
to find the device mapper multipath (dm-#) path device.

For example, device path "/dev/mapper/mpatha" would create partitions
"/dev/mapper/mpatha1", "/dev/mapper/mpatha2", etc. instead of
"/dev/mapper/mpathap1", "/dev/mapper/mpathap2", etc. If the device
path ends with a number "/dev/mapper/mpatha1", then the algorithm
to generate names "/dev/mapper/mpatha1p1", "/dev/mapper/mpatha1p2", etc.
would be utilized.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-01-19 13:02:59 -05:00
John Ferlan
4f84617078 conf: Add storage pool device attribute part_separator
Add a new storage pool source device attribute 'part_separator=[yes|no]'
in order to allow a 'disk' storage pool using a device mapper multipath
device to not add the "p" partition separator to the generated device
name when libvirt_parthelper is run.

This will allow libvirt to find device mapper multipath devices which were
configured in /etc/multipath.conf to use 'user_friendly_names' or custom
'alias' names for the LUN.
2016-01-19 13:02:59 -05:00
Michal Privoznik
c03fbecc7c virLogManagerDomainReadLogFile: Don't do dummy allocs
Since we pass dummy variables @fdout and @fdoutlen into
virNetClientProgramCall() we make it alloc @fdout array (even
though it's an array of 0 elements since vitlogd can hardly pass
us some FDs at this stage). Nevertheless, it's an allocation not
followed by free():

==29385== 0 bytes in 60 blocks are definitely lost in loss record 2 of 1,009
==29385==    at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==29385==    by 0x54B99EF: virAllocN (viralloc.c:191)
==29385==    by 0x56821B1: virNetClientProgramCall (virnetclientprogram.c:359)
==29385==    by 0x563B304: virLogManagerDomainReadLogFile (log_manager.c:272)
==29385==    by 0x217CD613: qemuDomainLogContextRead (qemu_domain.c:2485)
==29385==    by 0x217EDC76: qemuProcessReadLog (qemu_process.c:1660)
==29385==    by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==29385==    by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==29385==    by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)
==29385==    by 0x217F71A4: qemuProcessStart (qemu_process.c:5152)
==29385==    by 0x21846582: qemuDomainObjStart (qemu_driver.c:7396)
==29385==    by 0x218467DE: qemuDomainCreateWithFlags (qemu_driver.c:7450)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-18 17:14:16 +01:00
Michal Privoznik
105b51f42e qemuProcessReadLog: Fix memmove arguments
So I can observe this crasher that with freshly started daemon
(and virtlogd enabled) I am trying to startup a domain that
immediately dies (because it's said to use huge pages but I
haven't allocated a single one in the pool). Hardly reproducible
with -O0 or under valgrind. But I just got lucky:

==20469== Invalid write of size 8
==20469==    at 0x4C2E99B: memcpy@GLIBC_2.2.5 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20469==    by 0x217EDD07: qemuProcessReadLog (qemu_process.c:1670)
==20469==    by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==20469==    by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==20469==    by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)
==20469==    by 0x217F71A4: qemuProcessStart (qemu_process.c:5152)
==20469==    by 0x21846582: qemuDomainObjStart (qemu_driver.c:7396)
==20469==    by 0x218467DE: qemuDomainCreateWithFlags (qemu_driver.c:7450)
==20469==    by 0x21846845: qemuDomainCreate (qemu_driver.c:7468)
==20469==    by 0x5611CD0: virDomainCreate (libvirt-domain.c:6753)
==20469==    by 0x125D9A: remoteDispatchDomainCreate (remote_dispatch.h:3613)
==20469==    by 0x125CB7: remoteDispatchDomainCreateHelper (remote_dispatch.h:3589)
==20469==  Address 0x27a52ad0 is 0 bytes after a block of size 5,584 alloc'd
==20469==    at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20469==    by 0x9B8D1DB: xdr_string (in /lib64/libc-2.21.so)
==20469==    by 0x563B39C: xdr_virLogManagerProtocolNonNullString (log_protocol.c:24)
==20469==    by 0x563B6B7: xdr_virLogManagerProtocolDomainReadLogFileRet (log_protocol.c:123)
==20469==    by 0x164B34: virNetMessageDecodePayload (virnetmessage.c:407)
==20469==    by 0x5682360: virNetClientProgramCall (virnetclientprogram.c:379)
==20469==    by 0x563B30E: virLogManagerDomainReadLogFile (log_manager.c:272)
==20469==    by 0x217CD613: qemuDomainLogContextRead (qemu_domain.c:2485)
==20469==    by 0x217EDC76: qemuProcessReadLog (qemu_process.c:1660)
==20469==    by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==20469==    by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==20469==    by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)

This points to memmove() in qemuProcessReadLog(). Imagine we just
read the following string from qemu:

"abc\n2016-01-18T09:40:44.022744Z qemu-system-x86_64: Error\n"

After the first pass of the while() loop in the
qemuProcessReadLog() (in which we have taken the false branch in
the if) @buf still points to the beginning of the string,
@filter_next points to the beginning of the second line.  So we
start second iteration because there is yet another newline
character at the end. In this iteration @eol points to it
actually. Now, the control gets inside true branch of if(). Just
to remind you:

got = 58
filter_next = buf + 5,
eol = buf + 58.

Therefore skip = 54 which is correct. The message we want to skip
is 54 bytes long. However:

memmove(filter_next, eol + 1, (got - skip) +1);

which is

memmove(filter_next, eol + 1, 5)

is obviously wrong as there is only one byte we can access, not 5!

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-18 17:14:16 +01:00
Martin Kletzander
4b47f9b82c Fix make check with gcc version 5
When building with gcc-5 (particularly gcc-5.3.0 now) and having pdwtags
installed (package dwarves) make check fails with the following error:

  $ make lock_protocol-struct
  GEN      lock_protocol-struct
  --- lock_protocol-structs	2016-01-13 15:04:59.318809607 +0100
  +++ lock_protocol-struct-t3	2016-01-13 15:05:17.703501234 +0100
  @@ -26,10 +26,6 @@
           virLockSpaceProtocolNonNullString name;
           u_int                      flags;
   };
  -enum virLockSpaceProtocolAcquireResourceFlags {
  -        VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED = 1,
  -        VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE = 2,
  -};
   struct virLockSpaceProtocolAcquireResourceArgs {
           virLockSpaceProtocolNonNullString path;
           virLockSpaceProtocolNonNullString name;
  Makefile:10415: recipe for target 'lock_protocol-struct' failed
  make: *** [lock_protocol-struct] Error 1

That happens because without any specific options gcc doesn't keep enum
information in the resulting binary object.  I managed to isolate the
parameters of gcc that caused this issue to disappear, however I
remember that they influenced the resulting binaries quite a bit and
were definitely not something we would want to add as mandatory to the
build process.

So to deal with this cleanly, let's take that enum and separate it out
to its own header file.  Since it is only used in the lockd driver and
the protocol, lock_driver_lockd.h feels like a suitable name.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-01-18 15:19:21 +01:00
Wido den Hollander
a5a383adc1 rbd: Set r variable so it can be returned should an error occur
This was reported in bug #1298024 where r would be filled with the
return code of rbd_open().

Should rbd_snap_unprotect() fail for any reason the virReportSystemError
call would return 'Success' since rbd_open() succeeded.

https://bugzilla.redhat.com/show_bug.cgi?id=1298024
Signed-off-by: Wido den Hollander <wido@widodh.nl>
2016-01-18 14:06:24 +01:00
Jiri Denemark
8f0a15727f security: Do not restore labels on device tree binary
A device tree binary file specified by /domain/os/dtb element is a
read-only resource similar to kernel and initrd files. We shouldn't
restore its label when destroying a domain to avoid breaking other
domains configure with the same device tree.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-01-15 16:34:37 +01:00
Jiri Denemark
68acc701bd security: Do not restore kernel and initrd labels
Kernel/initrd files are essentially read-only shareable images and thus
should be handled in the same way. We already use the appropriate label
for kernel/initrd files when starting a domain, but when a domain gets
destroyed we would remove the labels which would make other running
domains using the same files very unhappy.

https://bugzilla.redhat.com/show_bug.cgi?id=921135

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2016-01-15 10:55:58 +01:00
Yaniv Kaul
c1e0df918b qemu: Print better warning in qemuAgentNotifyEvent
We have this function qemuAgentNotifyEvent() which is supposed to
be called from thread pool responsible for processing qemu
monitor events. The function then should wake up other thread
that is waiting for a guest to shutdown or reboot. However, if we
have received a different error a warning is printed out. This
warning lacks info on which event is expected.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-15 08:33:49 +01:00
John Ferlan
d6d7e2885b cgroup: Fix possible bug as a result of code motion for vcpu cgroup setup
Commit id '90b721e43' moved where the virCgroupAddTask was made until
after the check for the vcpupin checks. However, in doing so it missed
an option where if the cpumap didn't exist, then the code would continue
back to the top of the current vcpu loop. The results was that the
virCgroupAddTask wouldn't be called.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-01-14 11:02:53 -05:00
John Ferlan
344d480611 Revert "lxc_cgroup: Add check for NULL cgroup before AddTask call"
This reverts commit ae09988eb7.

Since commit id '71ce4759' has been reverted, this one is no
longer necessary.
2016-01-14 11:01:50 -05:00
John Ferlan
d41bd09596 Revert "util: cgroups do not implicitly add task to new machine cgroup"
This reverts commit 71ce475967.

Since commit id 'a41c00b47' has been reverted, this no longer is
necessary
2016-01-14 11:00:25 -05:00
John Ferlan
f8f6907284 Revert "qemu: do not put a task into machine cgroup"
This reverts commit a41c00b472.

After much testing and upstream discussion this has been deemed to be
the incorrect operation since it means we no longer have any guarantee
about which resource controllers the QEMU processes in general are in.
2016-01-14 10:56:53 -05:00
Cédric Bosdonnat
c726af2d5a virt-aa-helper: don't deny writes to readonly mounts
There is no need to deny writes on a readonly mount: write still
won't be accepted, even if the user remounts the folder as RW in
the guest as qemu sets the 9p mount as ro.

This deny rule was leading to problems for example with readonly /:
The qemu process had to write to a bunch of files in / like logs,
sockets, etc. This deny rule was also preventing auditing of these
denials, making it harder to debug.
2016-01-14 15:42:05 +01:00
John Ferlan
3e2d637458 conf: Initialize 'deflate' for balloon parse XML
Commit id '7bf3198df' neglected to initialize deflate leading to a
possibility if model allocation/checks fail, then the VIR_FREE(deflate)
would be erroneous. Noted by Jan Tomko.
2016-01-14 05:54:58 -05:00
Michal Privoznik
e988ba94aa qemuProcessCleanupChardevDevice: Don't unlink NULL paths
So, you try to start a domain, but before we even get to the part
where chardev part of qemu command line is generated (and
possibly missing path to unix sockets is made up) an error occurs
which results in calling qemuProcessStop. This will then try to
clean up the mess and possibly ends up calling unlink(NULL).

==8085== Thread 3:
==8085== Syscall param unlink(pathname) points to unaddressable byte(s)
==8085==    at 0xA85EA57: unlink (in /lib64/libc-2.21.so)
==8085==    by 0x213D3C24: qemuProcessCleanupChardevDevice (qemu_process.c:2866)
==8085==    by 0x558D6B1: virDomainChrDefForeach (domain_conf.c:22924)
==8085==    by 0x213DA9AE: qemuProcessStop (qemu_process.c:5326)
==8085==    by 0x213DA2F2: qemuProcessStart (qemu_process.c:5190)
==8085==    by 0x2142957F: qemuDomainObjStart (qemu_driver.c:7396)
==8085==    by 0x214297DB: qemuDomainCreateWithFlags (qemu_driver.c:7450)
==8085==    by 0x21429842: qemuDomainCreate (qemu_driver.c:7468)
==8085==    by 0x5611B95: virDomainCreate (libvirt-domain.c:6753)
==8085==    by 0x125D9A: remoteDispatchDomainCreate (remote_dispatch.h:3613)
==8085==    by 0x125CB7: remoteDispatchDomainCreateHelper (remote_dispatch.h:3589)
==8085==    by 0x568BF41: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==8085==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==8085==

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-13 11:30:38 +01:00
Jim Fehlig
71daae9671 xenconfig: check return value of regcomp
Commit ec63000a missed checking the return value of regcomp(),
which coverity promptly identified.
2016-01-12 14:22:54 -07:00
Jim Fehlig
6564de5e95 Xen: use correct domctl version in domaininfolist union
Commmit fd2e3c4c used the domctl version 8 structure for version 9
in the xen_getdomaininfolist union, resulting in insufficient buffer
size (and subsequent memory corruption) for the GETDOMAININFOLIST
ioctl.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2016-01-12 10:37:56 -07:00
Dmitry Andreev
981c01d419 qemu: add support of optional 'autodeflate' attribute
Autodeflate can be enabled/disabled for memballon device
of model 'virtio'.

xml:
<devices>
  <memballoon model='virtio' autodeflate='on'/>
</devices>

qemu:
qemu -device virtio-balloon-pci,...,deflate-on-oom=on

Autodeflate cannot be enabled/disabled for running domain.
2016-01-12 10:48:21 -05:00
Dmitry Andreev
3522a311ea qemu: add capability check for memballoon 'deflate-on-oom' feature
Add appropriate capability check and new virQEMUCaps flag for the new
virtio balloon feature. QEMU commit with the complete feature description:
http://git.qemu.org/?p=qemu.git;a=commit;h=e3816255bf4b6377bb405331e2ee0dc14d841b80
2016-01-12 10:48:21 -05:00
Dmitry Andreev
7bf3198df6 conf: introduce 'autodeflate' attribute for memballoon device
Excessive memory balloon inflation can cause invocation of OOM-killer,
when Linux is under severe memory pressure. QEMU memballoon device
has a feature to release some memory at the last moment before some
process will be get killed by OOM-killer.

Introduce a new optional balloon device attribute 'autodeflate' to
enable or disable this feature.
2016-01-12 10:48:21 -05:00
Cole Robinson
2eb7a97575 rpc: socket: Don't repeatedly attempt to launch daemon
On every socket connect(2) attempt we were re-launching session
libvirtd, up to 100 times in 5 seconds.

This understandably caused some weird load races and intermittent
qemu:///session startup failures

https://bugzilla.redhat.com/show_bug.cgi?id=1271183
2016-01-12 10:45:45 -05:00
Cole Robinson
8da02d5280 rpc: socket: Explicitly error if we exceed retry count
When we autolaunch libvirtd for session URIs, we spin in a retry
loop waiting for the daemon to start and the connect(2) to succeed.

However if we exceed the retry count, we don't explicitly raise an
error, which can yield a slew of different error messages elsewhere
in the code.

Explicitly raise the last connect(2) failure if we run out of retries.
2016-01-12 10:45:45 -05:00
Cole Robinson
f102c7146e rpc: socket: Minor cleanups
- Add some debugging
- Make the loop dependent only on retries
- Make it explicit that connect(2) success exits the loop
- Invert the error checking logic
2016-01-12 10:45:45 -05:00
Roman Bogorodskiy
bc451c4980 Add missing virxdrdefs.h include to log_protocol
Commit 2b6f6ad introduced the virxdrdefs.h header with
common definitions to be included in the protocol files,
but logging/log_protocol.x was missed, so add it there as well.

Hopefully this fixes build on OS X.
2016-01-12 18:15:09 +03:00
Ben Gray
133c511b52 rpc: Don't rewrite msg->fds on every read dispatch
When we are receiving data in smaller chunks it might happen that
virNetServerClientDispatchRead() will be called multiple times.  And as
that happens, if it is a message that also transfer headers, we decode
the number of them every single time and, unfortunately, also allocate
the memory for them.  That causes a leak, in the best scenario.

Best viewed with '-w'.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-01-12 08:56:50 +01:00
Laine Stump
21e63916dc util: eliminate bogus error log in virNetDevVPortProfileGetStatus
if instanceId is NULL

When virNetDevVPortProfileGetStatus() was called with instanceId =
NULL (which is the case for all DISASSOCIATE requests in 802.1Qbh) it
would log the following error:

   Could not find netlink response with expected parameters

even though the disassociate had been successfully completely. Then,
due to the fortunate coincidence of status having been initialized to
0 and then not changed when the "failure" was encountered, it would
still return a status of 0 (PORT_VDP_RESPONSE_SUCCESS), so the caller
would assume a successful operation.

This would result in a spurious log message though, and would fill in
LastErrorMessage, so that the API would return that error if it
happened during cleanup from some other error. That, in turn, would
lead to an incorrect supposition that the response to the port profile
disassociate was the cause of the failure.

During debugging, I noticed that the VF in question usually had *no
uuid* associated with it (big surprise)by the time the disassociate
completed, so the solution is *not* to send the previous instanceId
down.

This patch fixes virNetDevVPortProfileGetStatus() to only check the
VF's uuid in the status if it was given an instanceId to check against
when originally called. Otherwise it only checks that the particular
VF is present (it will be).

This does cause a slight difference in behavior - rather than
returning with status unchanged (and thus always 0) it will actually
get the IFLA_PORT_RESPONSE. This could lead to revelation of error
conditions we were previously ignoring. Or not. So far "not".
2016-01-11 17:09:28 -05:00
Laine Stump
47b830370a qemu: use enum when setting PCI "multi" value, not 0 or 1
Use the VIR_TRISTATE_SWITCH_* enums appropriately.

No functional change.
2016-01-11 15:13:54 -05:00
Laine Stump
bd04ad42e7 qemu: auto-add a USB2 controller set for Q35 machines
Use virDomainDefAddUSBController() to add an EHCI1+UHCI1+UHCI2+UHCI3
controller set to newly defined Q35 domains that don't have any USB
controllers defined.
2016-01-11 13:21:10 -05:00
Laine Stump
8ebca27bb7 qemu: define virDomainDevAddUSBController()
This new function will add a single controller of the given model,
except the case of ich9-usb-ehci1 (the master controller for a USB2
controller set) in which case a set of related controllers will be
added (EHCI1, UHCI1, UHCI2, UHCI3). These controllers will not be
given PCI addresses, but should be otherwise ready to use.

"-1" is allowed for controller model, and means "default for this
machinetype". This matches the existing practice in
qemuDomainDefPostParse(), which always adds the default controller
with model = -1, and relies on the commandline builder to set a model
(that is wrong, but will be fixed later).
2016-01-11 13:16:51 -05:00
Laine Stump
ed64d92bea conf: add virDomainDefAddController()
We need a virDomainDefAddController() that doesn't check for an
existing controller at the same index (since USB2 controllers must be
added in sets of 4 that are all at the same index), so rather than
duplicating the code in virDomainDefMaybeAddController(), split it
into two functions, in the process eliminating existing duplicated
code that loops through the controller list by calling
virDomainControllerFind(), which does the same thing).
2016-01-11 13:08:26 -05:00
Laine Stump
163338ec28 qemu: prefer 00:1D.x and 00:1A.x for USB2 controllers on Q35
The real Q35 machine puts the first USB controller set (EHCI+(UHCIx4))
on bus 0 slot 0x1D, and the 2nd USB controller set on bus 0 slot 0x1A,
so let's attempt to make the virtual machine match that for
controllers with auto-assigned addresses when possible.

Three test cases were added to assure that the proper addresses are
assigned - one with a single set of unaddressed USB controllers, one
with 3 (to grab both preferred slots plus one more), and one with the
order of the controller definitions reordered, to assure that the
auto-assignment isn't mixed up by order.
2016-01-11 13:04:17 -05:00
Laine Stump
7dbb5fce06 qemu: don't assume slot 0 is unused/reserved.
When qemuAssignDevicePCISlots() is looking for companion controllers
for a USB controller that has no PCI address specified, it initializes
a virDevicePCIAddress to 0000:00:00.0, fills it in with the
companion's address if one is found, then checks whether or not there
was a find based on slot == 0. On a system with a single PCI bus, that
is a valid way to check, because slot 0 is reserved, but on most other
PCI buses, slot 0 is not reserved, and is open for use by any
device. This patch adds a separate bool that is set when a companion
is found rather than relying on the faulty information provided with
"slot == 0".
2016-01-11 12:58:40 -05:00
Jasper Lievisse Adriaanse
2b6f6ad64b Unify int types handling in protocol files
Some of the protocol files already include handing of the missing int
types such as xdr_uint64_t, some don't. To fix it everywhere, move out
of the appropriate defines to the utils/virxdrdefs.h file and include
it where needed.

Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
2016-01-11 19:56:06 +03:00
Jasper Lievisse Adriaanse
91b423beb7 Use struct sockpeercred when available
OpenBSD uses 'struct sockpeercred' instead of 'struct ucred'. Add a
configure check that detects its presence and use if in the code that
could be compiled on OpenBSD.

Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
2016-01-11 19:56:06 +03:00
Jasper Lievisse Adriaanse
1b60f1b401 cgroup: don't include sys/mount.h if not needed
As cgroup implementation only works on Linux, it does not
make much sense to include sys/mount.h if other requirements are
not met, such as HAVE_MNTENT_H and HAVE_GETMNTENT_R.

Also, it fixes build on OpenBSD that requires to include sys/param.h
along with sys/mount.h.

Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
2016-01-11 19:56:06 +03:00
Michal Privoznik
0a84286d8f qemu: Introduce QEMU_CAPS_VSERPORT_CHANGE
This capability tells if qemu is capable of vserport_change
events.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-11 17:17:52 +01:00
Michal Privoznik
d5762cc034 qemu: change qemuFindAgentConfig return type
While this is no functional change, whole channel definition is
going to be needed very soon. Moreover, while touching this obey
const correctness rule in qemuAgentOpen() - so far it was passed
regular pointer to channel config even though the function is
expected to not change pointee at all. Pass const pointer
instead.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-11 17:17:52 +01:00
Michal Privoznik
2f50445537 qemu: Set virtio channel state sooner
In qemu driver we listen to virtio channel events like an agent
connected to or disconnected from the guest part of socket.
However, with a little exception - when we find out that the
socket in question is the guest agent one, we connect or
disconnect guest agent which is done prior setting new state in
internal structure. Due to a bug in our code it may happen that
we got the event but failed to set it in internal structure
representing the channel.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-11 17:16:29 +01:00
Martin Kletzander
6dc0e4f171 Fix LSB requirements in service script and sync them
Commit b22344f328 mistakenly reordered
Default-* lines.  Thanks to that I noticed that we are very inconsistent
with our init scripts, so I took the liberty of synchronizing them,
updating them and making them all look shiny and new.  So apart from
fixing the LSB requirements, I also fixed the ordering, specified
runlevels and fix the link to the reference specification.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-01-11 15:49:13 +01:00
Michal Privoznik
506e9d6c2d virDomainGetTime: Deny on RO connections
We have a policy that if API may end up talking to a guest agent
it should require RW connection. We don't obey the rule in
virDomainGetTime().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-11 13:36:19 +01:00
Michal Privoznik
95c370f0ee virDomainInterfaceAddresses: Allow API on RO connection too
This API does not change domain state. However, we have a policy
that an API talking to a guest agent requires RW access. But that
happens only if source == VIR_DOMAIN_INTERFACE_ADDRESSES_SRC_AGENT.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-11 13:36:19 +01:00
Martin Kletzander
8223bd22ed Don't clear libvirt-internal paths when parsing status XML
Earlier commit 7140807917 forgot to deal
properly with status XMLs where we want the libvirt-internal paths to be
kept in place and not cleared, otherwise we could end up copying a NULL
string and segfaulting th daemon.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-01-11 10:54:50 +01:00
Martin Kletzander
93103da84b Provide parse flags to PostParse functions
This way both Domain and Device PostParse functions can act based on the
flags.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-01-11 10:54:50 +01:00
Cole Robinson
fde937bda0 qemu: command: wire up usage of q35/ich9 disable s3/s4
If the q35 specific disable s3/s4 setting isn't supported, fallback to
specifying the PIIX setting, which is the previous behavior. It doesn't
have any effect, but qemu will just warn about it rather than error:

  qemu-system-x86_64: Warning: global PIIX4_PM.disable_s3=1 not used
  qemu-system-x86_64: Warning: global PIIX4_PM.disable_s4=1 not used

Since it doesn't error, I don't think we should either, since there
may be configs in the wild that already have q35 + disable_s3/4 (via
virt-manager)
2016-01-10 15:16:38 -05:00
Cole Robinson
c77fd89000 qemu: caps: check for q35/ICH9 disable S3/S4
Update test data to match
2016-01-10 14:59:53 -05:00
Cole Robinson
5900356efb qemu: caps: Rename CAPS_DISABLE_S[34] to CAPS_PIIX_DISABLE_S[34]
These settings are specific to PIIX, so clarify it
2016-01-10 14:59:53 -05:00
Cole Robinson
ab963449dc qemu: capabilities: s/Pixx/Piix/g
The chipset is called PIIX; the functions are misnamed
2016-01-10 14:59:53 -05:00
Michal Privoznik
b7fac9f77f virDomainMigrateUnmanagedParams: Don't blindly dereference @dconnuri
This function may be called with @dconnuri == NULL, e.g. from
virDomainMigrateToURI3() if the flags are missing
VIR_MIGRATE_PEER2PEER flag. Moreover, all later functions called
from here do wrap it into NULLSTR() so why not do the same here?

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2016-01-09 18:44:44 +01:00
Martin Kletzander
8156493d8d Fix USB model defaults for ppc64
The condition was checking for UHCI (and OHCI for ppc64) availability so
that it can specify the proper device instead of legacy usb.  However,
for ppc64, we don't need to check both OHCI and UHCI, but only OHCI as
that is the legacy default.  The condition is so big that it was just a
matter of time when someone will make a mistake there, so let's use more
lines so that it is visible what the condition checks for.

This fixes usage of -device instead of -usb for ppc64 that supports
pci-usb-ohci and does not support piix3-usb-uhci.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1297020

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-01-09 18:39:17 +01:00
Jim Fehlig
f988ecfb34 libxl: support vif outgoing bandwidth QoS
The libxl_device_nic structure supports specifying an outgoing rate
limit based on a time interval and bytes allowed per interval. In xl
config a rate limit is specified as "<RATE>/s@<INTERVAL>". INTERVAL
is optional and defaults to 50ms.

libvirt expresses outgoing limits by average (required), peak, burst,
and floor attributes in units of KB/s. This patch supports the outgoing
bandwidth limit by converting the average KB/s to bytes per interval
based on the same default interval (50ms) used by xl.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2016-01-08 18:56:00 -07:00