Commit Graph

15678 Commits

Author SHA1 Message Date
Jim Fehlig
5b74103b0b Xen: support maxvcpus in xm and xl config
From: Ian Campbell <ian.campbell@citrix.com>

xend prior to 4.0 understands vcpus as maxvcpus and vcpu_avail
as a bit map of which cpus are online (default is all).

xend from 4.0 onwards understands maxvcpus as maxvcpus and
vcpus as the number which are online (from 0..N-1). The
upstream commit (68a94cf528e6 "xm: Add maxvcpus support")
claims that if maxvcpus is omitted then the old behaviour
(i.e. obeying vcpu_avail) is retained, but AFAICT it was not,
in this case vcpu==maxcpus==online cpus. This is good for us
because handling anything else would be fiddly.

This patch changes parsing of the virDomainDef maxvcpus and vcpus
entries to use the corresponding 'maxvcpus' and 'vcpus' settings
from xm and xl config. It also drops use of the old Xen 3.x
'vcpu_avail' setting.

The change also removes the maxvcpus limit of MAX_VIRT_VCPUS (since
maxvcpus is simply a count, not a bit mask), which is particularly
crucial on ARM where MAX_VIRT_CPUS == 1 (since all guests are
expected to support vcpu placement, and therefore only the boot
vcpu's info lives in the shared info page).

Existing tests adjusted accordingly, and new tests added for the
'maxvcpus' setting.
2015-12-18 17:52:00 -07:00
John Ferlan
7d792b99b8 libvirt: Add virStorageVolDeleteFlags to virStorageVolDelete
Although they've been present for quite a while, they weren't added
to the API definition, so add them there to make it clearer.

Currently only the RBD backend even checks for any flags.
2015-12-18 10:51:08 -05:00
John Ferlan
be783825af storage: Add virCheckFlags to virStorageBackendRBDDeleteVol
The initial commit '74951eade' did not include the proper check for whether
any flags are supported by the driver.

Even though the driver doesn't support VIR_STORAGE_VOL_DELETE_ZEROED,
it still checks and allows the processing to continue

Also add the new VIR_STORAGE_VOL_DELETE_WITH_SNAPSHOTS since it is handled
as of commit id '3c7590e0a'.
2015-12-18 10:51:08 -05:00
John Ferlan
ae09988eb7 lxc_cgroup: Add check for NULL cgroup before AddTask call
Commit id '71ce4759' altered the cgroup processing with respect to the
call to virCgroupAddTask being moved out from lower layers into the calling
layers especially for qemu processing of emulator and vcpu threads. The
movement affected lxc insomuch as it is possible for a code path to
return a NULL cgroup *and* a 0 return status via virCgroupNewPartition
failure when virCgroupNewIgnoreError succeeded when virCgroupNewMachineManual
returns. Coverity pointed out that would cause virCgroupAddTask to core.

This patch will check for a NULL cgroup as well as the negative return
and just return the NULL cgroup to the caller (as it would have previously)
2015-12-18 08:59:34 -05:00
Jim Fehlig
be08842e47 Xen: remove xendConfigVersion from driver private struct
xendConfigVersion is no longer used, so remove it from the
xenUnifiedPrivate struct.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2015-12-17 21:28:48 -07:00
Jim Fehlig
60ac2aa821 Xen: xenconfig: remove xendConfigVersion from public sexpr functions
Remove use of xendConfigVersion in the s-expresion config formatter/parser
in src/xenconfig/. Adjust callers in the xen and libxl drivers accordingly.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2015-12-17 21:28:48 -07:00
Jim Fehlig
b6fa951897 Xen: xend: remove use of XEND_CONFIG_VERSION
Remove use of XEND_CONFIG_VERSION_* in xend_internal.c

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2015-12-17 21:28:47 -07:00
Jim Fehlig
64d14daa4b Xen: xen_driver: remove use of XEND_CONFIG_VERSION
Remove use of XEND_CONFIG_VERSION_* in the Xen unified driver.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2015-12-17 21:28:29 -07:00
Jim Fehlig
5ff2e370af Xen: xenconfig: remove use of XEND_CONFIG_VERSION in xen_sxpr
Remove use of XEND_CONFIG_VERSION_* in s-expression parser/formatter.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2015-12-17 21:22:34 -07:00
Jim Fehlig
bec993d60a Xen: xenconfig: remove disks from '(image)' sexpr
It has been quite some time since xend required specifying cdroms
and fds in '(image (hvm ...))'. Remove the code from the parsing
and formatting functions and fixup the associated tests.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2015-12-17 21:22:34 -07:00
Jim Fehlig
7d7c08be1c Xen: xenconfig: remove xendConfigVersion from public functions
Remove use of xendConfigVersion in the xm and xl config formatter/parsers
in src/xenconfig/. Adjust callers in the xen and libxl drivers accordingly.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2015-12-17 21:22:33 -07:00
Jim Fehlig
0f58db3092 Xen: xenconfig: remove use of XEND_CONFIG_VERSION in xen_xm
Remove use of XEND_CONFIG_VERSION_* in xm parser/formatter.

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2015-12-17 21:22:33 -07:00
Jim Fehlig
4796d7b34b Xen: xenconfig: remove XEND_CONFIG_VERSION in common code
Remove use of XEND_CONFIG_VERSION_* from xenconfig/xen_common.c

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
2015-12-17 21:22:33 -07:00
John Ferlan
aeb1078ab5 storage: Add flags to allow building pool during create processing
https://bugzilla.redhat.com/show_bug.cgi?id=830056

Add flags handling to the virStoragePoolCreate and virStoragePoolCreateXML
API's which will allow the caller to provide the capability for the storage
pool create API's to also perform a pool build during creation rather than
requiring the additional buildPool step. This will allow transient pools
to be defined, built, and started.

The new flags are:

    * VIR_STORAGE_POOL_CREATE_WITH_BUILD
      Perform buildPool without any flags passed.

    * VIR_STORAGE_POOL_CREATE_WITH_BUILD_OVERWRITE
      Perform buildPool using VIR_STORAGE_POOL_BUILD_OVERWRITE flag.

    * VIR_STORAGE_POOL_CREATE_WITH_BUILD_NO_OVERWRITE
      Perform buildPool using VIR_STORAGE_POOL_BUILD_NO_OVERWRITE flag.

It is up to the backend to handle the processing of build flags. The
overwrite and no-overwrite flags are mutually exclusive.

NB:
This patch is loosely based upon code originally authored by Osier
Yang that were not reviewed and pushed, see:

https://www.redhat.com/archives/libvir-list/2012-July/msg01328.html
2015-12-17 11:56:18 -05:00
Ján Tomko
22f9754f1b mark virDomainVirtioSerialAddrSetAddController as static.
This function is no longer used outside domain_addr.c
2015-12-17 16:57:25 +01:00
Ján Tomko
36d7a36158 Remove dead code from qemuDomainAttachControllerDevice
We only support hotplugging SCSI controllers.
The USB and virtio-serial related code was never reachable because
this function was only called for VIR_DOMAIN_CONTROLLER_TYPE_SCSI
controllers.

This reverts commit ee0d97a and parts of commits 16db8d2
and d6d54cd1.
2015-12-17 16:57:25 +01:00
Ján Tomko
aaa42d905a qemu_hotplug: remove qemuDomainAttachDeviceControllerLive
This function calls qemuDomainAttachControllerDevice for SCSI
controllers and reports an error for all other controllers.

Move the error inside qemuDomainAttachControllerDevice and delete this
wrapper.
2015-12-17 16:57:25 +01:00
Cédric Bosdonnat
bec787ee9d Allow building lxc without virt-login-shell
Add a configure option to disable virt-login-shell build even if lxc is
enabled.
2015-12-17 15:49:06 +01:00
John Ferlan
8c865052b9 storage: Fix startup issue for logical pool
Commit id '71b803ac' assumed that the storage pool source device path
was required for a 'logical' pool. This resulted in a failure to start
a pool without any device path defined.

So, adjust the virStorageBackendLogicalMatchPoolSource logic to
return success if at least the pool name matches the vgs output
when no pool source device path is/are provided.
2015-12-17 08:20:22 -05:00
John Ferlan
5efcfb9695 qemu: Fix event generated for qemuDomainRevertToSnapshot (pause->run)
A closer review of the code shows that for the transition from paused to
running which was supposed to emit the VIR_DOMAIN_EVENT_RESUMED - no event
would be generated. Rather the event is generated when going from running
to running.

Following the 'was_running' boolean shows it is set when the domain obj
is active and the domain obj state is VIR_DOMAIN_RUNNING. So rather than
using was_running to generate the RESUMED event, use !was_running
2015-12-17 08:04:02 -05:00
John Ferlan
80ca86e54d storage: Attempt to refresh volume after successful wipe volume
https://bugzilla.redhat.com/show_bug.cgi?id=1270709

When a volume wipe is successful, perform a volume refresh afterwards to
update any volume data that may be used in future volume commands, such as
volume resize.  For a raw file volume, a wipe could truncate the file and
a followup volume resize the capacity may fail because the volume target
allocation isn't updated to reflect the wipe activity.
2015-12-17 07:30:03 -05:00
Ján Tomko
f61770a169 virStorageBackendWipeLocal: remove bytes_wiped argument
It is not used by the caller.
2015-12-17 12:44:35 +01:00
Ján Tomko
c3f7371c5e storage: drop 'Extent' from virStorageBackendWipeExtentLocal
The only caller always passes 0 for the extent start.
Drop the 'extent_start' parameter, as well as the mention of extents
from the function name.

Change off_t extent_length to unsigned long long wipe_len, as well as the
'remain' variable.
2015-12-17 12:44:35 +01:00
Ján Tomko
4bccdf0ceb storage: move buffer allocation inside virStorageBackendWipeExtentLocal
We do not need to pass a zero-filled buffer as an argument,
the function can allocate its own.
2015-12-17 12:44:35 +01:00
Ján Tomko
09cbfc0481 storage: fix return values of virStorageBackendWipeExtentLocal
Return -1:
* on all failures of fdatasync. Instead of propagating -errno
  all the way up to the virStorageVolWipe API, which is documented
  to return 0 or -1.
* after a partial wipe. If safewrite failed, we would re-use the
  non-negative return value of lseek (which should be 0 in this case,
  because that's the only offset we seek to).
2015-12-17 12:44:02 +01:00
Andrea Bolognani
242e3ea4e3 qemu: Replace Mlock with MemLock in function names
MemLock is already used in other modules and, while still an
abbreviation, is not ambiguous.
2015-12-17 10:12:47 +01:00
Andrea Bolognani
afbe1d4c56 qemu: Allow qemuDomainAdjustMaxMemLock() to restore previous value
When the function changes the memory lock limit for the first time,
it will retrieve the current value and store it inside the
virDomainObj for the domain.

When the function is called again, if memory locking is no longer
needed, it will be able to restore the memory locking limit to its
original value.
2015-12-17 10:12:47 +01:00
Andrea Bolognani
b583e80cb8 qemu: Reduce memlock limit after detaching PCI hostdev
We increase the limit before plugging in a PCI hostdev or a memory
module because some memory might need to be locked due to eg. VFIO.

Of course we should do the opposite after unplugging a device: this
was already the case for memory modules, but not for PCI hostdevs.
2015-12-17 10:12:47 +01:00
Andrea Bolognani
65909c7996 qemu: Use qemuDomainAdjustMaxMemLock()
Replace all uses of the qemuDomainRequiresMlock/virProcessSetMaxMemLock
combination with the equivalent qemuDomainAdjustMaxMemLock() call.
2015-12-17 10:12:47 +01:00
Andrea Bolognani
ac7e4df4f4 qemu: Add qemuDomainAdjustMaxMemLock()
This function detects whether a domain needs RLIMIT_MEMLOCK
to be set, and if so, uses an appropriate value.
2015-12-17 10:12:47 +01:00
Andrea Bolognani
bbefc9cc2e process: Add virProcessGetMaxMemLock()
This function can be used to retrieve the current locked memory
limit for a process, so that the setting can be later restored.

Add a configure check for getrlimit(), which we now use.
2015-12-17 10:12:47 +01:00
Andrea Bolognani
c2f797544f process: Allow virProcessPrLimit() to get current limit
The prlimit() function allows both getting and setting limits for
a process; expose the same functionality in our wrapper.

Add the const modifier for new_limit, in accordance with the
prototype for prlimit().
2015-12-17 10:12:47 +01:00
Martin Kletzander
68d4245d21 qemu: Search all nodes for shared memory access
In commit 686eb7a24f, the break was not considered part of the
condition, hence breaking after first node when searching.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-12-16 13:02:33 +01:00
Andrea Bolognani
7743454165 pci: Use virPCIDeviceAddress in virPCIDevice
Instead of replicating the information (domain, bus, slot, function)
inside the virPCIDevice structure, use the already-existing
virPCIDeviceAddress structure.

For users of the module, this means that the object returned by
virPCIDeviceGetAddress() can no longer be NULL and must no longer
be freed by the caller.
2015-12-16 09:07:25 +01:00
Joao Martins
b7b439196c libxl: implement virDomainGetJobStats
Introduces support for domainGetJobStats which has the same
info as domainGetJobInfo but in a slightly different format.
Another difference is that virDomainGetJobStats can also
retrieve info on the most recently completed job. Though so
far this is only used in the source node to know if the
migration has been completed. But because we don't support
completed jobs we will deliver an error.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2015-12-15 15:21:38 -07:00
Joao Martins
ad71665104 libxl: implement virDomainGetJobInfo
Introduce support for domainGetJobInfo to get info about the
ongoing job. If the job is active it will update the
timeElapsed which is computed with the "started" field added to
struct libxlDomainJobObj.  For now we support just the very basic
info and all jobs have VIR_DOMAIN_JOB_UNBOUNDED (i.e. no completion
time estimation) plus timeElapsed computed.

Openstack Kilo uses the Job API to monitor live-migration
progress which is currently nonexistent in libxl driver and
therefore leads to a crash in the nova compute node. Right
now, migration doesn't use jobs in the source node and will
return VIR_DOMAIN_JOB_NONE. Though nova handles this case and
will migrate it properly instead of crashing.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
2015-12-15 15:21:37 -07:00
John Ferlan
71b803ac9a storage: Add helper to compare logical pool def against pvs output
https://bugzilla.redhat.com/show_bug.cgi?id=1025230

Add a new helper virStorageBackendLogicalMatchPoolSource to compare the
pool's source name against the output from a 'pvs' command to list all
volume group physical volume data on the host.  In addition, compare the
pool's source device list against the particular volume group's device
list to ensure the source device(s) listed for the pool match what the
was listed for the volume group.

Then for pool startup or check API's we need to call this new API in
order to ensure that the pool we're about to start or declare active
during checkPool has a valid definition vs. the running host.
2015-12-15 14:33:05 -05:00
John Ferlan
ae5519f7f8 storage: Create helper for virStorageBackendLogicalFindPoolSources
Rework virStorageBackendLogicalFindPoolSources a bit to create a
helper virStorageBackendLogicalGetPoolSources that will make the
pvs call in order to generate a list of associated pv_name and vg_name's.

A future patch will make use of this for start/check processing to
ensure the storage pool source definition matches expectations.
2015-12-15 14:33:04 -05:00
John Ferlan
dae7007d6e storage: Check FS pool source during virStorageBackendFileSystemIsMounted
https://bugzilla.redhat.com/show_bug.cgi?id=1025230

When determining whether a FS pool is mounted, rather than assuming that
the FS pool is mounted just because the target.path is in the mount list,
let's make sure that the FS pool source matches what is mounted
2015-12-15 14:33:04 -05:00
John Ferlan
61c29fe56f storage: Refactor virStorageBackendFileSystemGetPoolSource
Refactor code to use standard return functioning with respect to setting
a ret value and going to cleanup.
2015-12-15 14:33:04 -05:00
John Ferlan
1d1330f37e storage: Create helper to generate FS pool source value
Refactor the code that builds the pool source string during the FS
storage pool mount to be a separate helper.

A future patch will use the helper in order to validate the mounted
FS matches the pool's expectation during poolCheck processing
2015-12-15 14:33:00 -05:00
Laine Stump
a8e3247e65 qemu: add bootindex option to hostdev network interface commandline
when appropriate, of course. If the config for a domain specifies boot
order with <boot dev='blah'/> elements, e.g.:

     <os>
       ...
       <boot dev='hd'/>
       <boot dev='network'/>
     </os>

Then the first disk device in the config will have ",bootindex=1"
appended to its qemu commandline -device options, and the first (and
*only* the first) network interface device will get ",bootindex=2".

However, if the first network interface device is a "hostdev" device
(an SRIOV Virtual Function (VF) being assigned to the domain with
vfio), then the bootindex option will *not* be appended. This happens
because the bootindex=n option corresponding to the order of "<boot
dev='network'/>" is added to the -device for the first network device
when network device commandline args are constructed, but if it's a
hostdev network device, its commandline arg is instead constructed in
the loop for hostdevs.

This patch fixes that omission by noticing (in bootHostdevNet) if the
first network device was a hostdev, and if so passing on the proper
bootindex to the commandline generator for hostdev devices - the
result is that ",bootindex=2" will be properly appended to the first
"network" device in the config even if it is really a hostdev
(including if it is assigned from a libvirt network pool). (note that
this is only the case if there is no <bootmenu enabled='yes'/> element
in the config ("-boot menu-on" in qemu) , since the two are mutually
exclusive - when the bootmenu is enabled, the individual per-device
bootindex options can't be used by qemu, and we revert to using "-boot
order=xyz" instead).

If a greater level of control over boot order is desired (e.g., more
than one network device should be tried, or a network device other
than the first one encountered in the config), then <boot
dev='network'/> in the <os> element should not be used; instead, the
individual device elements in the config should be given a "<boot
order='n'/>

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1278421
2015-12-15 10:57:27 -05:00
Ján Tomko
077bdba5c2 security_stack: remove extra Security from function names
Many of the functions follow the pattern:
virSecurity.*Security.*Label

Remove the second 'Security' from the names, it should be
obvious that the virSecurity* functions deal with security
labels even without it.
2015-12-15 16:06:08 +01:00
Ján Tomko
ba9285b3a3 security_selinux: remove extra Security from function names
Many of the functions follow the pattern:
virSecurity.*Security.*Label

Remove the second 'Security' from the names, it should be obvious
that the virSecurity* functions deal with security labels even
without it.
2015-12-15 16:06:08 +01:00
Ján Tomko
be33e96533 security_dac: remove extra Security from function names
Many of the functions follow the pattern:
virSecurity.*Security.*Label

Remove the second 'Security' from the names, it should be obvious
that the virSecurity* functions deal with security labels even
without it.
2015-12-15 16:06:08 +01:00
Pavel Hrdina
cbd3d06541 qemuMonitorJSONEjectMedia: don't stringify the replay at all
Commit 256496e1 introduced a detection if "is locked" in error replay
from qemu monitor. Commit c4073657 fixed a memory leak, but it was
pointed out by Peter, that this could be done cleaner without
stringifing the replay.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-12-15 12:18:27 +01:00
Andrea Bolognani
90791fbf96 pci: Use 'addr' instead of 'dev' for virPCIDeviceAddressPtr
The name 'dev' is more appropriate for virPCIDevicePtr.
2015-12-15 11:19:17 +01:00
Michal Privoznik
c407365769 qemuMonitorJSONEjectMedia: Don't leak stringified reply
The return value of virJSONValueToString() should be freed when
no longer needed. This is not the case after 256496e1.

==26902== 138 bytes in 2 blocks are definitely lost in loss record 1,051 of 1,239
==26902==    at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==26902==    by 0xAA5F599: strdup (in /lib64/libc-2.21.so)
==26902==    by 0x552BAD9: virStrdup (virstring.c:726)
==26902==    by 0x54F60A7: virJSONValueToString (virjson.c:1790)
==26902==    by 0x1DF6EBB9: qemuMonitorJSONEjectMedia (qemu_monitor_json.c:2225)
==26902==    by 0x1DF57A4C: qemuMonitorEjectMedia (qemu_monitor.c:1985)
==26902==    by 0x1DF1EF2D: qemuDomainChangeEjectableMedia (qemu_hotplug.c:199)
==26902==    by 0x1DF90314: qemuDomainChangeDiskLive (qemu_driver.c:7985)
==26902==    by 0x1DF90476: qemuDomainUpdateDeviceLive (qemu_driver.c:8030)
==26902==    by 0x1DF91ED7: qemuDomainUpdateDeviceFlags (qemu_driver.c:8677)
==26902==    by 0x561785F: virDomainUpdateDeviceFlags (libvirt-domain.c:8559)
==26902==    by 0x134210: remoteDispatchDomainUpdateDeviceFlags (remote_dispatch.h:10966)

==26902== 106 bytes in 1 blocks are definitely lost in loss record 1,033 of 1,239
==26902==    at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==26902==    by 0xAA5F599: strdup (in /lib64/libc-2.21.so)
==26902==    by 0x552BAD9: virStrdup (virstring.c:726)
==26902==    by 0x54F60A7: virJSONValueToString (virjson.c:1790)
==26902==    by 0x1DF6EC0C: qemuMonitorJSONEjectMedia (qemu_monitor_json.c:2227)
==26902==    by 0x1DF57A4C: qemuMonitorEjectMedia (qemu_monitor.c:1985)
==26902==    by 0x1DF1EF2D: qemuDomainChangeEjectableMedia (qemu_hotplug.c:199)
==26902==    by 0x1DF90314: qemuDomainChangeDiskLive (qemu_driver.c:7985)
==26902==    by 0x1DF90476: qemuDomainUpdateDeviceLive (qemu_driver.c:8030)
==26902==    by 0x1DF91ED7: qemuDomainUpdateDeviceFlags (qemu_driver.c:8677)
==26902==    by 0x561785F: virDomainUpdateDeviceFlags (libvirt-domain.c:8559)
==26902==    by 0x134210: remoteDispatchDomainUpdateDeviceFlags (remote_dispatch.h:10966)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-12-15 10:27:19 +01:00
Henning Schild
90b721e43e qemu cgroups: move new threads to new cgroup after cpuset is set up
Moving tasks to cgroups implied sched_setaffinity. Changing the cpus in
a set implies the same for all tasks in the group.
The old code put the the thread into the cpuset inherited from the
machine cgroup, which allowed it to run outside of vcpupin for a short
while.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
2015-12-14 15:58:05 -05:00
Henning Schild
a41c00b472 qemu: do not put a task into machine cgroup
The machine cgroup is a superset, a parent to the emulator and vcpuX
cgroups. The parent cgroup should never have any tasks directly in it.
In fact the parent cpuset might contain way more cpus than the sum of
emulatorpin and vcpupins. So putting tasks in the superset will allow
them to run outside of <cputune>.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
2015-12-14 15:48:05 -05:00
Henning Schild
71ce475967 util: cgroups do not implicitly add task to new machine cgroup
virCgroupNewMachine used to add the pidleader to the newly created
machine cgroup. Do not do this implicit anymore.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
2015-12-14 15:43:29 -05:00
Michal Privoznik
65e3451ea9 virNetDevMacVLanTapSetup: Drop @multiqueue argument
Firstly, there's a bug (or typo) in the only place where we call
this function: @multiqueue is set whenever @tapfdSize is greater
than zero, while in fact the condition should have been 'greater
than one'.
Then, secondly, since the condition depends on just one
variable, that we are even passing down to the function, we can
move the condition into the function and drop useless argument.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-12-14 15:58:18 +01:00
Martin Kletzander
686eb7a24f qemu: Warn when using vhost-user without shared memory
When user configures vhost-user interface and forgets to also configure
any shared memory, the search for the root cause of non-operational
interface might take unpleasantly long time.  Let's enhance user
experience by emitting a warning in the logs.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1266982

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-12-14 08:54:19 +01:00
Michal Privoznik
ec93cc25ec virNetDevMacVLanTapSetup: Work around older systems
Some older systems, e.g. RHEL-6 do not have IFF_MULTI_QUEUE flag
which we use to enable multiqueue feature. Therefore one gets the
following compile error there:

  CC     util/libvirt_util_la-virnetdevmacvlan.lo
util/virnetdevmacvlan.c: In function 'virNetDevMacVLanTapSetup':
util/virnetdevmacvlan.c:338: error: 'IFF_MULTI_QUEUE' undeclared (first use in this function)
util/virnetdevmacvlan.c:338: error: (Each undeclared identifier is reported only once
util/virnetdevmacvlan.c:338: error: for each function it appears in.)
make[3]: *** [util/libvirt_util_la-virnetdevmacvlan.lo] Error 1

So, whenever user wants us to enable the feature on such systems,
we will just throw a runtime error instead.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-12-13 08:35:46 +01:00
Eric Blake
034e47c338 CVE-2015-5313: storage: don't allow '/' in filesystem volume names
The libvirt file system storage driver determines what file to
act on by concatenating the pool location with the volume name.
If a user is able to pick names like "../../../etc/passwd", then
they can escape the bounds of the pool.  For that matter,
virStoragePoolListVolumes() doesn't descend into subdirectories,
so a user really shouldn't use a name with a slash.

Normally, only privileged users can coerce libvirt into creating
or opening existing files using the virStorageVol APIs; and such
users already have full privilege to create any domain XML (so it
is not an escalation of privilege).  But in the case of
fine-grained ACLs, it is feasible that a user can be granted
storage_vol:create but not domain:write, and it violates
assumptions if such a user can abuse libvirt to access files
outside of the storage pool.

Therefore, prevent all use of volume names that contain "/",
whether or not such a name is actually attempting to escape the
pool.

This changes things from:

$ virsh vol-create-as default ../../../../../../etc/haha --capacity 128
Vol ../../../../../../etc/haha created
$ rm /etc/haha

to:

$ virsh vol-create-as default ../../../../../../etc/haha --capacity 128
error: Failed to create vol ../../../../../../etc/haha
error: Requested operation is not valid: volume name '../../../../../../etc/haha' cannot contain '/'

Signed-off-by: Eric Blake <eblake@redhat.com>
2015-12-11 16:34:53 -07:00
John Ferlan
afe73ed468 util: Fixup virnetdevmacvlan.h ATTRIBUTE_NONNULL's
Commit id '56e2171c6' removed a variable from the argument list, but
neglected to update the ATTRIBUTE_NONNULL values, so when commit id
'08da97bfb' added a couple of arguments, the values were off.
2015-12-11 07:16:16 -05:00
Peter Krempa
ace1ee225f test: qemuxml2argv: Mock virMemoryMaxValue to remove 32/64 bit difference
Always return LLONG_MAX even on 32 bit systems. The limitation
originates from our use of "unsigned long" in several APIs. The internal
data type is unsigned long long. Make the test suite deterministic by
removing the architecture difference.

Flaw was introduced in 645881139b where
I've added a test that uses too large numbers.
2015-12-11 12:23:38 +01:00
Michal Privoznik
81a110edc7 qemu: Enable multiqueue for macvtaps
https://bugzilla.redhat.com/show_bug.cgi?id=1240439

Ta-da! Now that we know how to open a macvtap device multiple
times, we can finally enable the multiqueue feature. Everything
else is already prepared (e.g. command line generation) from the
previous iteration where the feature was implemented for
TUN/TAP devices.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-12-11 08:44:44 +01:00
Michal Privoznik
08da97bfb9 virNetDevMacVLanCreateWithVPortProfile: Rework to support multiple FDs
For the multiqueue on macvtaps we are going to need to open
the device multiple times. Currently, this is not supported.
Rework the function, so that upper layers can be reworked too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-12-11 08:44:43 +01:00
Michal Privoznik
1e90c744d5 virNetDevMacVLanTapSetup: Allow enabling of IFF_MULTI_QUEUE
Like we are doing for TUN/TAP devices, we should do the same for
macvtaps. Although, it's not as critical as in that case, we
should do it for the consistency.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-12-11 08:44:39 +01:00
Michal Privoznik
136fe2f7cc virNetDevMacVLanTapSetup: Rework to support multiple FDs
For the multiqueue on macvtaps we are going to need to open
the device multiple times. Currently, this is not supported.
Rework the function, so that upper layers can be reworked too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-12-11 08:42:50 +01:00
Michal Privoznik
d36897c765 virNetDevMacVLanTapOpen: Rework to support multiple FDs
For the multiqueue on macvtaps we are going to need to open
the device multiple times. Currently, this is not supported.
Rework the function, so that upper layers can be reworked too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-12-11 08:42:50 +01:00
Michal Privoznik
025a87065f virNetDevMacVLanTapOpen: Slightly rework
There are few outdated things. Firstly, we don't need to undergo
the torture of fopen, fscanf and fclose just to get the interface
index when we have nice wrapper over that: virNetDevGetIndex.
Secondly, we don't need to have statically allocated buffer for
the path we are opening.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-12-11 08:42:49 +01:00
Michal Privoznik
56e2171c6f virNetDevMacVLanCreateWithVPortProfile: Turn vnet_hdr into flag
So yet again one of integer arguments that we use as a boolean.
Since the argument count of the function is unbearably long
enough, lets turn those booleans into flags.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-12-11 08:42:49 +01:00
Daniel P. Berrange
1ce929603b log: include hostname in initial log message
On the very first log message we send to any output, we include
the libvirt version number and package string. In some bug reports
we have been given libvirtd.log files that came from a different
host than the corresponding /var/log/libvirt/qemu log files. So
extend the initial log message to include the hostname too.

eg on first log message we would now see:

 $ libvirtd
 2015-12-04 17:35:36.610+0000: 20917: info : libvirt version: 1.3.0
 2015-12-04 17:35:36.610+0000: 20917: info : hostname: dhcp-1-180.lcy.redhat.com
 2015-12-04 17:35:36.610+0000: 20917: error : qemuMonitorIO:687 : internal error: End of file from monitor

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-10 18:05:49 +00:00
John Ferlan
a523770c32 storage: Ignore block devices that fail format detection
https://bugzilla.redhat.com/show_bug.cgi?id=1276198

Prior to commit id '98322052' failure to saferead the block device would
cause an error to be logged and the device to be skipped while attempting
to discover/create a stable target path for a new LUN (NPIV).

This was because virStorageBackendSCSIFindLUs ignored errors from
processLU and virStorageBackendSCSINewLun.

Ignoring the failure allowed a multipath device with an "active" and
"ghost" to be present on the host with the "ghost" block device being
ignored. This patch will return a -2 to the caller indicating the desire
to ignore the block device since it cannot be used directly rather than
fail the pool startup.
2015-12-09 16:31:15 -05:00
John Ferlan
b3df72c4dd storage: Add debug message
I found this useful while processing a volume that wouldn't end up
showing up in the resulting list of block volumes. In this case, the
partition type wasn't found in the disk_types table.
2015-12-09 16:31:14 -05:00
John Ferlan
1bc84b0a08 storage: Handle readflags errors
Similar to the openflags VIR_STORAGE_VOL_OPEN_NOERROR processing, if some
read processing operation fails, check the readflags for the corresponding
error flag being set. If so, rather then causing an error - use VIR_WARN
to flag the error, but return -2 which some callers can use to perform
specific actions. Use a new VIR_STORAGE_VOL_READ_NOERROR flag in a new
VolReadErrorMode enum.
2015-12-09 16:31:14 -05:00
John Ferlan
1edfce9b18 storage: Set ret = -1 on failures in virStorageBackendUpdateVolTargetInfo
While processing the volume for lseek, virFileReadHeaderFD, and
virStorageFileGetMetadataFromBuf - failure would cause an error,
but ret would not be set. That would result in an error message being
sent, but successful status being returned.
2015-12-09 16:31:14 -05:00
John Ferlan
af4028dccd storage: Add comments for backend APIs
Just so it's clearer what to expect upon input and what types of return
values could be generated.  These were loosely copied from existing
virStorageBackendUpdateVolTargetInfoFD.
2015-12-09 16:31:14 -05:00
John Ferlan
22346003dc storage: Add readflags for backend error processing
Similar to the openflags which allow VIR_STORAGE_VOL_OPEN_NOERROR to be
passed to avoid open errors, add a 'readflags' variable so that in the
future read failures could also be ignored.
2015-12-09 16:31:14 -05:00
Peter Krempa
8715120e4d qemu: cgroup: Don't use priv->ncpupids to iterate domain vCPUs
Use the proper data structures for the iteration since ncpupids will be
made private later.
2015-12-09 14:57:12 +01:00
Peter Krempa
ce43cca0eb qemu: driver: Refactor qemuDomainHelperGetVcpus
Change some of the control structures and switch to using the new vcpu
structure.
2015-12-09 14:57:12 +01:00
Peter Krempa
e6b36736a8 qemu: Add helper to retrieve vCPU pid
Instead of directly accessing the array add a helper to do this.
2015-12-09 14:57:12 +01:00
Peter Krempa
220a2d51de qemu: Replace checking for vcpu<->pid mapping availability with a helper
Add qemuDomainHasVCpuPids to do the checking and replace in place checks
with it.

We no longer need checking whether the thread contains fake data
(vcpupids[0] == vm->pid) as in b07f3d821d
and 65686e5a81 this was removed.
2015-12-09 14:57:12 +01:00
Peter Krempa
e4bf9a3bcc qemu: Drop checking vcpu threads in emulator bandwidth getter/setter
The vCPU threads make sense in the counterparts that set the vCPU
bandwidth/quota, not in the emulator one. The emulator tunables are set
all the time anyways.

Drop the extra check and remove the now unneeded vm argument.
2015-12-09 14:57:12 +01:00
Peter Krempa
6ba02c21ac qemu: cgroup: Remove now unreachable check
Since commit 0c04906fa the check for priv->cgroup doesn't make sense as
the calls to virCgroupHasController return the same information. Remove
it and move it's comment partially to the new check.

The already spurious check was also later copied to the iothreads code.
2015-12-09 14:57:12 +01:00
Peter Krempa
233c3ac861 conf: Add helper to get pointer to a certain vCPU definition
Once more stuff will be moved into the vCPU data structure it will be
necessary to get a specific one in some ocasions. Add a helper that will
simplify this task.
2015-12-09 14:57:12 +01:00
Peter Krempa
24a7beea5a conf: ABI: Split up and improve vcpu info ABI checking
Extract the checking code into a separate function and prepare the
infrastructure for checking the new structure type.
2015-12-09 14:57:12 +01:00
Peter Krempa
4e86838d89 conf: turn def->vcpus into a structure
To allow collecting all relevant data at one place let's make def->vcpus
a structure and then we can start moving stuff into it.
2015-12-09 14:57:12 +01:00
Peter Krempa
9d5ac29eef qemu: refactor qemuDomainHotunplugVcpus
Refactor the code flow so that 'exit_monitor:' can be removed.

This patch moves the auditing functions into places where it's certain
that hotunplug was or was not successful and reports errors from
qemuMonitorGetCPUInfo properly.
2015-12-09 14:57:12 +01:00
Peter Krempa
de3db7d27f qemu: Refactor qemuDomainHotplugVcpus
Refactor the code flow so that 'exit_monitor:' can be removed.

This patch also moves the auditing and setting of the new vCPU count
right to the place where the hotplug happens, since it's possible that
the hotplug succeeds and adds a cpu while other stuff fails.

Lastly, failures of qemuMonitorGetCPUInfo are now reported rather than
ignored. The function retuns 0 if it "successfully" detected 0 threads.
2015-12-09 14:57:12 +01:00
Peter Krempa
3b3b98056d qemu: cpu hotplug: Move loops to qemuDomainSetVcpusFlags
qemuDomainHotplugVcpus/qemuDomainHotunplugVcpus are complex enough in
regards of adding one CPU. Additionally it will be desired to reuse
those functions later with specific vCPU hotplug.

Move the loops for adding vCPUs into qemuDomainSetVcpusFlags so that the
helpers can be made simpler and more straightforward.
2015-12-09 14:57:12 +01:00
Peter Krempa
7912d87920 qemu: monitor: Remove weird return values from qemuMonitorSetCPU
Let the function report errors internally and change it to return
standard return codes.
2015-12-09 14:57:12 +01:00
Peter Krempa
8cf65dabf2 qemu: cpu hotplug: Fix error handling logic
The cpu hotplug helper functions used negative error handling in a part
of them, although some code that was added later didn't properly set the
error codes in some cases. This would cause improper error messages in
cases where we couldn't modify the numa cpu mask and a few other cases.

Fix the logic by converting it to the regularly used pattern.
2015-12-09 14:57:12 +01:00
Peter Krempa
bb1d8d7a84 qemu: Split up vCPU hotplug and hotunplug
There's only very little common code among the two operations. Split the
functions so that the internals are easier to understand and refactor
later.
2015-12-09 14:57:12 +01:00
Peter Krempa
2642a36db5 qemu: qemuDomainSetVcpusAgent: re-check agent before calling it the again
With a very unfortunate timing, the agent might vanish before we do the
second call while the locks were down. Re-check that the agent is
available before attempting it again.
2015-12-09 14:57:12 +01:00
Peter Krempa
da6620ffac qemu: Extract vCPU onlining/offlining via agent into a separate function
Separate the code so that qemuDomainSetVcpusFlags contains only code
relevant to hardware hotplug/unplug.
2015-12-09 14:57:12 +01:00
Peter Krempa
31fea86564 qemu: domain: Add helper to access vm->privateData->agent
As in commit 88dc7e0c2f, the helper can be used in cases where the
function actually does not access anyting in the private data besides
the agent.
2015-12-09 14:57:12 +01:00
Peter Krempa
80a59b4aae conf: Turn def->maxvcpus into size_t
Later on this will also be used to track size of the vcpu data array.
Use size_t so that we can utilize the memory allocation helpers.
2015-12-09 14:57:12 +01:00
Peter Krempa
71c89ac9df conf: Replace read accesses to def->vcpus with accessor 2015-12-09 14:57:12 +01:00
Peter Krempa
80a8ac3c1e conf: Move vcpu count check into helper 2015-12-09 14:57:12 +01:00
Peter Krempa
957d597330 conf: Replace writes to def->vcpus with accessor 2015-12-09 14:57:12 +01:00
Peter Krempa
d1dda68777 conf: Replace read access to def->maxvcpus with accessor
Finalize the refactor by adding the 'virDomainDefGetVCpusMax' getter and
reusing it accross libvirt.
2015-12-09 14:57:12 +01:00
Peter Krempa
c970c4a5ea conf: Add helper to check whether domain has offline vCPUs
The new helper will simplify checking whether the domain config contains
inactive vCPUs.
2015-12-09 14:57:12 +01:00
Peter Krempa
4a194c55af conf: Extract update of vcpu count if maxvcpus is decreased
The code can be unified into the new accessor rather than being
scattered accross the drivers.
2015-12-09 14:57:12 +01:00
Peter Krempa
310992c344 conf: Use local copy of maxvcpus in virDomainVcpuParse
Use the local variable rather than getting it all the time from the
struct. This will simplify further refactors.
2015-12-09 14:57:12 +01:00
Peter Krempa
4e187169f0 conf: Replace writes to def->maxvcpus with accessor
To support further refactors replace all write access to def->maxvcpus
with a accessor function.
2015-12-09 14:57:12 +01:00
Pavel Hrdina
6aff1e9ca5 libxl: copy persistent domain definition while starting a guest
We should make a copy of current definition to preserve a persistent
definition, because we later update the definition with live changes.
The live definition is discarded on domain shutdown and replaced by the
copy we make before starting the domain.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-12-09 13:00:23 +01:00
Pavel Hrdina
5cc2d8f6d5 xen: fix timer bug found by updated test
Only 'tsc' timer allows set mode and track is valid only for 'rtc' and
'platform' timers.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-12-09 12:59:27 +01:00