In dc576025c3 we renamed the virCgroupIsolateMount function to
virCgroupBindMount. However, we forgot about one occurrence in the
section of the code that provides stubs for platforms without
CGroups support, such as *BSD.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On the host when we start a container, it will be
placed in a cgroup path of
/machine.slice/machine-lxc\x2ddemo.scope
under /sys/fs/cgroup/*
Inside the container's namespace we need to set up the
/sys/fs/cgroup mounts, and currently we bind mount the host's
/machine.slice/machine-lxc\x2ddemo.scope so that it appears as /
in the container.
While this may sound nice, it confuses applications dealing with
cgroups, because /proc/$PID/cgroup now does not match the directory
in /sys/fs/cgroup. This particularly causes problems for systemd,
making it create repeated path components in the cgroup for apps
run in the container, e.g.
/machine.slice/machine-lxc\x2ddemo.scope/machine.slice/machine-lxc\x2ddemo.scope/user.slice/user-0.slice/session-61.scope
This also causes any systemd service that uses
sd-notify to fail to start, because when systemd
receives the notification it won't be able to
identify the corresponding unit it came from.
In particular this breaks rabbitmq-server startup.
Future kernels will provide proper cgroup namespacing
which will handle this problem, but until that time
we should not try to play games with hiding parent
cgroups.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The VIR_DOMAIN_STATS_VCPU flag to virDomainListGetStats
enables reporting of stats about vCPUs. Currently we
only report the cumulative CPU running time and the
execution state.
This adds reporting of the wait time - the time the vCPU wanted to
run, but the host scheduler had something else running ahead of
it.
The data is reported per vCPU, e.g.:
$ virsh domstats --vcpu demo
Domain: 'demo'
vcpu.current=4
vcpu.maximum=4
vcpu.0.state=1
vcpu.0.time=1420000000
vcpu.0.wait=18403928
vcpu.1.state=1
vcpu.1.time=130000000
vcpu.1.wait=10612111
vcpu.2.state=1
vcpu.2.time=110000000
vcpu.2.wait=12759501
vcpu.3.state=1
vcpu.3.time=90000000
vcpu.3.wait=21825087
In implementing this I noticed that our reporting of CPU execute
time has very poor granularity, since we are getting it from
/proc/$PID/stat. As a future enhancement we should prefer to get
the CPU execute time from /proc/$PID/schedstat or /proc/$PID/sched
(if either exists on the running kernel).
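A rough sketch of such a reader (an illustration only, not the
libvirt implementation; it assumes the documented schedstat format
of three fields: nanoseconds on CPU, nanoseconds waiting on a
runqueue, number of timeslices):

  #include <stdio.h>
  #include <sys/types.h>

  /* Read on-CPU time and run-queue wait time (both in ns) for a task. */
  static int
  read_schedstat(pid_t pid, unsigned long long *cputime,
                 unsigned long long *waittime)
  {
      char path[64];
      unsigned long long slices;
      FILE *fp;
      int ret = -1;

      snprintf(path, sizeof(path), "/proc/%d/schedstat", (int)pid);
      if (!(fp = fopen(path, "r")))
          return -1;
      if (fscanf(fp, "%llu %llu %llu", cputime, waittime, &slices) == 3)
          ret = 0;
      fclose(fp);
      return ret;
  }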
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
After commit 57177f1, the cpu-stats command output format changed
to:
CPU0:
cpu_time 14401.507878990 seconds
vcpu_time 14378732785511
vcpu_time is not user-friendly. After this patch, it changes back
to:
CPU0:
cpu_time 14401.507878990 seconds
vcpu_time 14378.732785511 seconds
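The conversion is plain integer division and remainder; a minimal
standalone illustration (printf only, not the actual virsh code):

  #include <stdio.h>

  int main(void)
  {
      unsigned long long ns = 14378732785511ULL; /* value shown above */

      /* 14378732785511 ns -> 14378.732785511 seconds */
      printf("vcpu_time %llu.%09llu seconds\n",
             ns / 1000000000ULL, ns % 1000000000ULL);
      return 0;
  }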
https://bugzilla.redhat.com/show_bug.cgi?id=1301807
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Report
error: invalid argument: requested vcpu '100' is not present in the domain
instead of
error: invalid argument: requested vcpu is higher than allocated vcpus
A future patch will refactor the storage of the pinning information in a
way where the ordering will be lost. Order them numerically to avoid
changing the tests later.
virDomainGetCPUStats doesn't support any flags, so there's no need
to carry the 'flags' variable around. Additionally, since the API
is poorly designed, I doubt it will ever be extended.
From the code it seems to me that we need a user namespace only if
one is configured in the domain XML; otherwise we don't use it at
all. However, our tool is stricter about that. Fix this
discrepancy.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since the introduction of the virt-host-validate tool, the set of
cgroup controllers we use has changed, so the tool is checking for
some controllers that we don't need (e.g. net_cls, although I doubt
we have ever used that one) and is not checking for those we
actually use (e.g. cpuset).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1250331
It all works like this. The change-media command dumps the domain
XML, finds the cdrom device whose media we want to change and
returns it as an xmlNodePtr. This way we don't have to bother with
keeping all the subelements or attributes that we don't care about
in the XML that is fed back to libvirt for the update API.
Now, the problem is that we try to be clever here and detect
whether the disk already has a source (indicated by a <source/>
subelement). However, the bare fact that the element is present
does not mean the disk has a source. Make our clever check better.
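As a purely hypothetical illustration (not taken from the patch),
a cdrom with no media inserted may still carry an empty <source/>
element, so its mere presence proves nothing:

  <!-- no media, yet <source/> is present -->
  <disk type='file' device='cdrom'>
    <source/>
    <target dev='hdc' bus='ide'/>
    <readonly/>
  </disk>

  <!-- media really attached -->
  <disk type='file' device='cdrom'>
    <source file='/var/lib/libvirt/images/install.iso'/>
    <target dev='hdc' bus='ide'/>
    <readonly/>
  </disk>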
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since 'savevm' was never converted to QMP, libvirt has to parse
the text monitor output for error strings. One of the unhandled
errors is produced when qemu treats a device as unmigratable.
As current qemu actually does support AHCI migration, this bug
applies only to older versions of qemu.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1293899
Make bhyveload respect the boot order specified by the os.boot
section of the domain XML or by the per-device "boot order"
setting. As bhyve does not support a real boot order specification
right now, this amounts to choosing a single device to boot from.
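For illustration only (hypothetical domain XML using the usual
libvirt boot elements), the order can come from the OS section or
from a per-device element:

  <os>
    <type>hvm</type>
    <boot dev='cdrom'/>
    <boot dev='hd'/>
  </os>

  <!-- or, on a specific device: -->
  <disk type='file' device='cdrom'>
    <source file='/path/to/install.iso'/>
    <target dev='sda' bus='sata'/>
    <boot order='1'/>
  </disk>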
libvirt always resets the MAC address of the physdev used for macvtap
passthrough when the guest is finished with it. This was happening
prior to the 802.1Qb[gh] DISASSOCIATE command, and was quite often
failing, presumably because the driver wouldn't allow the MAC address
to be reset while the association was still active, with a log message
like this:
virNetDevSetMAC:168 : Cannot set interface MAC to 00:00:00:00:00:00 on 'eth13': Cannot assign requested address
This patch changes the order - we now do the 802.1Qb[gh]
disassociate and delete the macvtap interface first, and only then
reset the MAC address.
'free' on Fedora 23 wants to use the Slab field when calculating
used memory. The equation is:
used = MemTotal - MemFree - (Cached + Slab) - Buffers
We already set Cached and Buffers to 0; do the same for Slab and
its related values.
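A rough sketch of the idea (the related fields are presumably
SReclaimable and SUnreclaim; this illustrates the approach rather
than the exact code in the LXC fuse handler):

  #include <string.h>

  /* When rewriting a /proc/meminfo line for the container, zero the
   * host-wide slab accounting, as is already done for Cached/Buffers. */
  static unsigned long long
  filter_meminfo_value(const char *field, unsigned long long host_value)
  {
      if (strcmp(field, "Slab") == 0 ||
          strcmp(field, "SReclaimable") == 0 ||
          strcmp(field, "SUnreclaim") == 0)
          return 0;
      return host_value;
  }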
https://bugzilla.redhat.com/show_bug.cgi?id=1300781
'free' on Fedora 23 will use MemAvailable to calculate its 'available'
field, but we are passing through the host's value. Set it to match
MemFree, which is what 'free' does on older Linux kernels that
don't have MemAvailable.
https://bugzilla.redhat.com/show_bug.cgi?id=1300781
We virtualize bits of /proc/meminfo by replacing host values with
values specific to the container.
However, when calculating the final size of the returned data we
were using the size of the original file rather than that of the
altered copy, which could give garbled output.
... and consolidate the cmdline/extra/root parsing to facilitate doing
so.
The logic is the same as xl's parse_cmdline from the current xen.git master
branch (e6f0e099d2c17de47fd86e817b1998db903cab61).
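A standalone sketch of that combining logic (hypothetical helper
name, plain libc; the real code uses libvirt's string helpers):

  #define _GNU_SOURCE
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  /* Prefer cmdline= verbatim; otherwise assemble it from root= and extra=. */
  static char *
  build_cmdline(const char *cmdline, const char *root, const char *extra)
  {
      char *ret = NULL;

      if (cmdline && *cmdline) {
          if ((root && *root) || (extra && *extra))
              fprintf(stderr, "ignoring root= and extra= in favour of cmdline=\n");
          ret = strdup(cmdline);
      } else if (root && *root) {
          if (asprintf(&ret, "root=%s %s", root, extra ? extra : "") < 0)
              ret = NULL;
      } else if (extra && *extra) {
          ret = strdup(extra);
      }
      return ret;
  }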
On the formatting side switch to producing cmdline= instead of extra=.
Update a few tests and add several more.
- test-cmdline is added to test the exclusive use of cmdline.
- test-fullvirt-direct-kernel-boot.cfg is updated due to the switch
on the formatting side and now tests the exclusive use of cmdline=.
- Tests are added for both paravirt and fullvirt where the .cfg uses
extra= and (paravirt only) root=. These are format (xl->xml) only,
since the inverse would generate cmdline= and hence is not a round
trip (which was already true when using root=, which used to
generate extra= on the way back).
- Tests are added for both paravirt and fullvirt where the .cfg
declares cmdline= as well as bogus extra= and (paravirt only) root=
entries which should be ignored. Again these are format only tests
since the inverse won't include the bogus lines.
The last two bullets here required splitting the DO_TEST macro into
two halves, as is done in the xmconfigtest.c case.
In order to introduce a use of VIR_WARN for logging I had to add
virerror.h and VIR_LOG_INIT.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
As suggested in a previous thread [0], this patch adds some
missing calls to libxl_dominfo_{init,dispose} around some of the
libxl_domain_info operations, which would otherwise lead to memory
leaks.
[0]
https://www.redhat.com/archives/libvir-list/2015-September/msg00519.html
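A minimal sketch of the pattern being added (assuming the standard
libxl API; @ctx and @domid stand in for the driver's actual
variables):

  libxl_dominfo d_info;

  libxl_dominfo_init(&d_info);
  if (libxl_domain_info(ctx, &d_info, domid) != 0)
      goto cleanup;
  /* ... consume d_info.cpu_time, d_info.current_memkb, ... */

 cleanup:
  libxl_dominfo_dispose(&d_info);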
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
The virErrorDomain enum has VIR_FROM_XEN, VIR_FROM_XEND,
VIR_FROM_XENSTORE, VIR_FROM_SEXPR, and VIR_FROM_XENXM. Use
these elements in the corresponding .c files. While at it,
remove the VIR_FROM_THIS define in src/xenconfig/xenxs_private.h.
The VIR_DOMAIN_EVENT_ID_MIGRATION_ITERATION event will be triggered
whenever VIR_DOMAIN_JOB_MEMORY_ITERATION changes its value, i.e.,
whenever a new iteration over guest memory pages is started during
migration.
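A hedged sketch of how a client might consume the new event (the
callback signature is assumed to follow the usual libvirt event
pattern: connection, domain, iteration number, opaque pointer):

  #include <stdio.h>
  #include <libvirt/libvirt.h>

  static void
  migIterationCb(virConnectPtr conn, virDomainPtr dom,
                 int iteration, void *opaque)
  {
      printf("domain %s started memory iteration %d\n",
             virDomainGetName(dom), iteration);
  }

  /* ... later, after connecting and looking up @dom ... */
  virConnectDomainEventRegisterAny(conn, dom,
                                   VIR_DOMAIN_EVENT_ID_MIGRATION_ITERATION,
                                   VIR_DOMAIN_EVENT_CALLBACK(migIterationCb),
                                   NULL, NULL);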
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When ACPI is used to reboot or shut down a qemu domain, qemu emits
a SHUTDOWN event. Libvirt uses the fakeReboot variable to
differentiate between a reboot and a shutdown; the fakeReboot value
is reset to false after the domain restart/reset.
When mode=agent is used to reboot a qemu domain, qemu doesn't emit
a SHUTDOWN event and libvirt doesn't reset the fakeReboot value to
false. In that case the next 'shutdown -h now' performs a reboot.
That's why we don't need to set fakeReboot=true for mode=agent.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We get the list of domains and then iterate over it, trying to get
the status of each domain and hoping that domains which disappeared
in the meantime get skipped. However, this solution to the race is
bogus - a domain may disappear right after we have checked its
state and before we execute another API on it (e.g.
virDomainHasManagedSaveImage()). Also, when printing just names or
UUIDs (list --name / --uuid) we issue APIs to obtain those values,
even though they require no RPC call, as all the requested info is
already in the virDomain object that the client has.
Therefore, move the status fetching to the only place that really
needs it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The generated output is dependent on perl hashtable ordering, which
gives different results for i686 and x86_64. Fix this by sorting
the hash keys before iterating over them
https://bugzilla.redhat.com/show_bug.cgi?id=1173641
When generating docs in a VPATH build we get a failure to
create a file due to the 'internals' subdir not existing:
Generating internals/locking.html.tmp
/bin/sh: line 3: internals/locking.html.tmp: No such file or directory
rm: cannot remove ‘internals/locking.html.tmp’: No such file or directory
Makefile:2229: recipe for target 'internals/locking.html.tmp' failed
make: *** [internals/locking.html.tmp] Error 1
For some reason, make has decided to run the target
%.html.tmp: %.html.in site.xsl page.xsl sitemap.html.in $(acl_generated)
instead of the target
internals/%.html.tmp: internals/%.html.in subsite.xsl page.xsl sitemap.html.in
Removing '$(acl_generated)' from the first target inexplicably
causes make to run the correct target for the internals/ files.
Rather than figure this out, let's just combine the two targets
into one.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce virLeaseReadCustomLeaseFile which will populate
the new leases array with all the leases, except for expired
ones and the ones matching 'ip_to_delete'.
This removes five variables from main().
We either use the value from the environment variable, or learn it from
the existing lease file.
In the second case, the pointer would be pointing into the JSON object
of the first lease with a DUID, owned by leases_array, then
leases_array_new.
Always allocate the string instead, making it obvious who should
free it.
If dnsmasq specified DNSMASQ_IAID (so we're dealing with an IPv6
lease) but no DNSMASQ_MAC, we skip creation of the new lease object.
Also skip adding it to the leases array.
https://bugzilla.redhat.com/show_bug.cgi?id=1202350
https://bugzilla.redhat.com/show_bug.cgi?id=1265694
In order to be able to process disk storage pools using a
multipath device to handle the partitions, libvirt_parthelper needs
a way to not automatically add the partition separator "p" to the
generated device name for each partition found. This is designed to
mimic the multipath features known as 'user_friendly_names' and
custom 'alias' names.
If the part_separator attribute is set to "no", then generation of the
multipath partition name will not include the "p" partition separator
unless the source device path name ends with a number. The generated
partition names that get passed back to libvirt are processed in order
to find the device mapper multipath (dm-#) path device.
For example, device path "/dev/mapper/mpatha" would create partitions
"/dev/mapper/mpatha1", "/dev/mapper/mpatha2", etc. instead of
"/dev/mapper/mpathap1", "/dev/mapper/mpathap2", etc. If the device
path ends with a number "/dev/mapper/mpatha1", then the algorithm
to generate names "/dev/mapper/mpatha1p1", "/dev/mapper/mpatha1p2", etc.
would be utilized.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a new storage pool source device attribute 'part_separator=[yes|no]'
in order to allow a 'disk' storage pool using a device mapper multipath
device to not add the "p" partition separator to the generated device
name when libvirt_parthelper is run.
This will allow libvirt to find device mapper multipath devices which were
configured in /etc/multipath.conf to use 'user_friendly_names' or custom
'alias' names for the LUN.
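An illustrative pool definition (example names and paths; the
attribute sits on the source <device> element):

  <pool type='disk'>
    <name>mpath-example</name>
    <source>
      <device path='/dev/mapper/mpatha' part_separator='no'/>
      <format type='gpt'/>
    </source>
    <target>
      <path>/dev/mapper</path>
    </target>
  </pool>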
Since we pass dummy variables @fdout and @fdoutlen into
virNetClientProgramCall() we make it allocate the @fdout array
(even though it's an array of 0 elements, since virtlogd can hardly
pass us any FDs at this stage). Nevertheless, it's an allocation
that is never followed by a free():
==29385== 0 bytes in 60 blocks are definitely lost in loss record 2 of 1,009
==29385== at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==29385== by 0x54B99EF: virAllocN (viralloc.c:191)
==29385== by 0x56821B1: virNetClientProgramCall (virnetclientprogram.c:359)
==29385== by 0x563B304: virLogManagerDomainReadLogFile (log_manager.c:272)
==29385== by 0x217CD613: qemuDomainLogContextRead (qemu_domain.c:2485)
==29385== by 0x217EDC76: qemuProcessReadLog (qemu_process.c:1660)
==29385== by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==29385== by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==29385== by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)
==29385== by 0x217F71A4: qemuProcessStart (qemu_process.c:5152)
==29385== by 0x21846582: qemuDomainObjStart (qemu_driver.c:7396)
==29385== by 0x218467DE: qemuDomainCreateWithFlags (qemu_driver.c:7450)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I can observe this crash with a freshly started daemon (and
virtlogd enabled) when I try to start a domain that immediately
dies (because it's configured to use huge pages but I haven't
allocated a single one in the pool). It's hardly reproducible with
-O0 or under valgrind, but I just got lucky:
==20469== Invalid write of size 8
==20469== at 0x4C2E99B: memcpy@GLIBC_2.2.5 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20469== by 0x217EDD07: qemuProcessReadLog (qemu_process.c:1670)
==20469== by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==20469== by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==20469== by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)
==20469== by 0x217F71A4: qemuProcessStart (qemu_process.c:5152)
==20469== by 0x21846582: qemuDomainObjStart (qemu_driver.c:7396)
==20469== by 0x218467DE: qemuDomainCreateWithFlags (qemu_driver.c:7450)
==20469== by 0x21846845: qemuDomainCreate (qemu_driver.c:7468)
==20469== by 0x5611CD0: virDomainCreate (libvirt-domain.c:6753)
==20469== by 0x125D9A: remoteDispatchDomainCreate (remote_dispatch.h:3613)
==20469== by 0x125CB7: remoteDispatchDomainCreateHelper (remote_dispatch.h:3589)
==20469== Address 0x27a52ad0 is 0 bytes after a block of size 5,584 alloc'd
==20469== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20469== by 0x9B8D1DB: xdr_string (in /lib64/libc-2.21.so)
==20469== by 0x563B39C: xdr_virLogManagerProtocolNonNullString (log_protocol.c:24)
==20469== by 0x563B6B7: xdr_virLogManagerProtocolDomainReadLogFileRet (log_protocol.c:123)
==20469== by 0x164B34: virNetMessageDecodePayload (virnetmessage.c:407)
==20469== by 0x5682360: virNetClientProgramCall (virnetclientprogram.c:379)
==20469== by 0x563B30E: virLogManagerDomainReadLogFile (log_manager.c:272)
==20469== by 0x217CD613: qemuDomainLogContextRead (qemu_domain.c:2485)
==20469== by 0x217EDC76: qemuProcessReadLog (qemu_process.c:1660)
==20469== by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==20469== by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==20469== by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)
This points to memmove() in qemuProcessReadLog(). Imagine we just
read the following string from qemu:
"abc\n2016-01-18T09:40:44.022744Z qemu-system-x86_64: Error\n"
After the first pass of the while() loop in qemuProcessReadLog()
(in which we took the false branch of the if) @buf still points to
the beginning of the string and @filter_next points to the
beginning of the second line. We then start a second iteration,
because there is yet another newline character at the end; in this
iteration @eol actually points to it. Now control gets inside the
true branch of the if(). Just to recap:
got = 58
filter_next = buf + 5,
eol = buf + 58.
Therefore skip = 54, which is correct: the message we want to skip
is 54 bytes long. However:
memmove(filter_next, eol + 1, (got - skip) +1);
which is
memmove(filter_next, eol + 1, 5)
is obviously wrong as there is only one byte we can access, not 5!
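For illustration only (not necessarily the committed fix), the
amount that may legally be moved is bounded by what actually
remains after @eol:

  /* bytes left after the filtered-out line (excluding the NUL) */
  size_t remaining = (buf + got) - (eol + 1);

  /* move them, plus the terminating NUL, to where the line started */
  memmove(filter_next, eol + 1, remaining + 1);
  got -= skip;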
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When building with gcc-5 (particularly gcc-5.3.0 now) with pdwtags
installed (from the dwarves package), make check fails with the
following error:
$ make lock_protocol-struct
GEN lock_protocol-struct
--- lock_protocol-structs 2016-01-13 15:04:59.318809607 +0100
+++ lock_protocol-struct-t3 2016-01-13 15:05:17.703501234 +0100
@@ -26,10 +26,6 @@
virLockSpaceProtocolNonNullString name;
u_int flags;
};
-enum virLockSpaceProtocolAcquireResourceFlags {
- VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED = 1,
- VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE = 2,
-};
struct virLockSpaceProtocolAcquireResourceArgs {
virLockSpaceProtocolNonNullString path;
virLockSpaceProtocolNonNullString name;
Makefile:10415: recipe for target 'lock_protocol-struct' failed
make: *** [lock_protocol-struct] Error 1
That happens because, without any specific options, gcc doesn't
keep enum information in the resulting binary object. I managed to
isolate the gcc parameters that made this issue disappear; however,
I remember that they influenced the resulting binaries quite a bit
and were definitely not something we would want to make mandatory
for the build process.
So to deal with this cleanly, let's take that enum and separate it out
to its own header file. Since it is only used in the lockd driver and
the protocol, lock_driver_lockd.h feels like a suitable name.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This was reported in bug #1298024, where r would still hold the
return code of rbd_open(). Should rbd_snap_unprotect() fail for any
reason, the virReportSystemError call would report 'Success', since
rbd_open() had succeeded.
https://bugzilla.redhat.com/show_bug.cgi?id=1298024
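A hedged sketch of the intended pattern (hypothetical variable
names; librbd returns a negative errno while virReportSystemError
expects a positive one):

  r = rbd_snap_unprotect(image, snap_name);
  if (r < 0) {
      virReportSystemError(-r,
                           _("failed to unprotect snapshot '%s' of image '%s'"),
                           snap_name, vol->name);
      goto cleanup;
  }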
Signed-off-by: Wido den Hollander <wido@widodh.nl>