This patch creates two bitmaps, one for macvlan device names and one
for macvtap. The bitmap position is used to indicate that libvirt is
currently using a device with the name macvtap%d/macvlan%d, where %d
is the position in the bitmap. When requested to create a new
macvtap/macvlan device, libvirt will now look for the first clear bit
in the appropriate bitmap and derive the device name from that rather
than just starting at 0 and counting up until one works.
When libvirtd is restarted, the qemu driver code that reattaches to
active domains calls the appropriate function to "re-reserve" the
device names as it is scanning the status of running domains.
Note that it may seem strange that the retry counter now starts at
8191 instead of 5. This is because we now don't do a "pre-check" for
the existence of a device once we've reserved it in the bitmap - we
move straight to creating it; although very unlikely, it's possible
that someone has a running system where they have a large number of
network devices *created outside libvirt* named "macvtap%d" or
"macvlan%d" - such a setup would still allow creating more devices
with the old code, while a low retry max in the new code would cause a
failure. Since the objective of the retry max is just to prevent an
infinite loop, and it's highly unlikely to do more than 1 iteration
anyway, having a high max is a reasonable concession in order to
prevent lots of new failures.
The current code was a little bit odd. At first we've removed all
possible implicit input devices from domain definition to add them later
back if there was any graphics device defined while parsing XML
description. That's not all, while formating domain definition to XML
description we at first ignore any input devices with bus different to
USB and VIRTIO and few lines later we add implicit input devices to XML.
This seems to me as a lot of code for nothing. This patch may look
to be more complicated than original approach, but this is a preferred
way to modify/add driver specific stuff only in those drivers and not
deal with them in common parsing/formating functions.
The update is to add those implicit input devices into config XML to
follow the real HW configuration visible by guest OS.
There was also inconsistence between our behavior and QEMU's in the way,
that in QEMU there is no way how to disable those implicit input devices
for x86 architecture and they are available always, even without graphics
device. This applies also to XEN hypervisor. VZ driver already does its
part by putting correct implicit devices into live XML.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The VIR_DOMAIN_STATS_VCPU flag to virDomainListGetStats
enables reporting of stats about vCPUs. Currently we
only report the cumulative CPU running time and the
execution state.
This adds reporting of the wait time - time the vCPU
wants to run, but the host scheduler has something else
running ahead of it.
The data is reported per-vCPU eg
$ virsh domstats --vcpu demo
Domain: 'demo'
vcpu.current=4
vcpu.maximum=4
vcpu.0.state=1
vcpu.0.time=1420000000
vcpu.0.wait=18403928
vcpu.1.state=1
vcpu.1.time=130000000
vcpu.1.wait=10612111
vcpu.2.state=1
vcpu.2.time=110000000
vcpu.2.wait=12759501
vcpu.3.state=1
vcpu.3.time=90000000
vcpu.3.wait=21825087
In implementing this I notice our reporting of CPU execute
time has very poor granularity, since we are getting it
from /proc/$PID/stat. As a future enhancement we should
prefer to get CPU execute time from /proc/$PID/schedstat
or /proc/$PID/sched (if either exist on the running kernel)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since 'savevm' was not converted to QMP libvirt has to parse for error
strings in the text monitor output. One of the unhandled errors is
produced when qemu treats a device as unmigratable.
As current qemu actually does support AHCI migration this bug is
applicable only to older versions of qemu.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1293899
When acpi is used to reboot/shutdown qemu domain, qemu emits
SHUTDOWN event. Libvirt uses fakeReboot variable in order to
differentiate reboot or shutdown. fakeReboot value is reseted
to false after domain restart/reset.
When mode=agent is used to reboot qemu domain, qemu doesn't emit
SHUTDOWN event and libvirt doesn't reset fakeReboot value to false.
In this case next 'shutdown -h now' performs reboot. That's why
we don't need to set fakeReboot=true for mode=agent.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
So I can observe this crasher that with freshly started daemon
(and virtlogd enabled) I am trying to startup a domain that
immediately dies (because it's said to use huge pages but I
haven't allocated a single one in the pool). Hardly reproducible
with -O0 or under valgrind. But I just got lucky:
==20469== Invalid write of size 8
==20469== at 0x4C2E99B: memcpy@GLIBC_2.2.5 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20469== by 0x217EDD07: qemuProcessReadLog (qemu_process.c:1670)
==20469== by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==20469== by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==20469== by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)
==20469== by 0x217F71A4: qemuProcessStart (qemu_process.c:5152)
==20469== by 0x21846582: qemuDomainObjStart (qemu_driver.c:7396)
==20469== by 0x218467DE: qemuDomainCreateWithFlags (qemu_driver.c:7450)
==20469== by 0x21846845: qemuDomainCreate (qemu_driver.c:7468)
==20469== by 0x5611CD0: virDomainCreate (libvirt-domain.c:6753)
==20469== by 0x125D9A: remoteDispatchDomainCreate (remote_dispatch.h:3613)
==20469== by 0x125CB7: remoteDispatchDomainCreateHelper (remote_dispatch.h:3589)
==20469== Address 0x27a52ad0 is 0 bytes after a block of size 5,584 alloc'd
==20469== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20469== by 0x9B8D1DB: xdr_string (in /lib64/libc-2.21.so)
==20469== by 0x563B39C: xdr_virLogManagerProtocolNonNullString (log_protocol.c:24)
==20469== by 0x563B6B7: xdr_virLogManagerProtocolDomainReadLogFileRet (log_protocol.c:123)
==20469== by 0x164B34: virNetMessageDecodePayload (virnetmessage.c:407)
==20469== by 0x5682360: virNetClientProgramCall (virnetclientprogram.c:379)
==20469== by 0x563B30E: virLogManagerDomainReadLogFile (log_manager.c:272)
==20469== by 0x217CD613: qemuDomainLogContextRead (qemu_domain.c:2485)
==20469== by 0x217EDC76: qemuProcessReadLog (qemu_process.c:1660)
==20469== by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==20469== by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==20469== by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)
This points to memmove() in qemuProcessReadLog(). Imagine we just
read the following string from qemu:
"abc\n2016-01-18T09:40:44.022744Z qemu-system-x86_64: Error\n"
After the first pass of the while() loop in the
qemuProcessReadLog() (in which we have taken the false branch in
the if) @buf still points to the beginning of the string,
@filter_next points to the beginning of the second line. So we
start second iteration because there is yet another newline
character at the end. In this iteration @eol points to it
actually. Now, the control gets inside true branch of if(). Just
to remind you:
got = 58
filter_next = buf + 5,
eol = buf + 58.
Therefore skip = 54 which is correct. The message we want to skip
is 54 bytes long. However:
memmove(filter_next, eol + 1, (got - skip) +1);
which is
memmove(filter_next, eol + 1, 5)
is obviously wrong as there is only one byte we can access, not 5!
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We have this function qemuAgentNotifyEvent() which is supposed to
be called from thread pool responsible for processing qemu
monitor events. The function then should wake up other thread
that is waiting for a guest to shutdown or reboot. However, if we
have received a different error a warning is printed out. This
warning lacks info on which event is expected.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit id '90b721e43' moved where the virCgroupAddTask was made until
after the check for the vcpupin checks. However, in doing so it missed
an option where if the cpumap didn't exist, then the code would continue
back to the top of the current vcpu loop. The results was that the
virCgroupAddTask wouldn't be called.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This reverts commit a41c00b472.
After much testing and upstream discussion this has been deemed to be
the incorrect operation since it means we no longer have any guarantee
about which resource controllers the QEMU processes in general are in.
So, you try to start a domain, but before we even get to the part
where chardev part of qemu command line is generated (and
possibly missing path to unix sockets is made up) an error occurs
which results in calling qemuProcessStop. This will then try to
clean up the mess and possibly ends up calling unlink(NULL).
==8085== Thread 3:
==8085== Syscall param unlink(pathname) points to unaddressable byte(s)
==8085== at 0xA85EA57: unlink (in /lib64/libc-2.21.so)
==8085== by 0x213D3C24: qemuProcessCleanupChardevDevice (qemu_process.c:2866)
==8085== by 0x558D6B1: virDomainChrDefForeach (domain_conf.c:22924)
==8085== by 0x213DA9AE: qemuProcessStop (qemu_process.c:5326)
==8085== by 0x213DA2F2: qemuProcessStart (qemu_process.c:5190)
==8085== by 0x2142957F: qemuDomainObjStart (qemu_driver.c:7396)
==8085== by 0x214297DB: qemuDomainCreateWithFlags (qemu_driver.c:7450)
==8085== by 0x21429842: qemuDomainCreate (qemu_driver.c:7468)
==8085== by 0x5611B95: virDomainCreate (libvirt-domain.c:6753)
==8085== by 0x125D9A: remoteDispatchDomainCreate (remote_dispatch.h:3613)
==8085== by 0x125CB7: remoteDispatchDomainCreateHelper (remote_dispatch.h:3589)
==8085== by 0x568BF41: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==8085== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==8085==
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Autodeflate can be enabled/disabled for memballon device
of model 'virtio'.
xml:
<devices>
<memballoon model='virtio' autodeflate='on'/>
</devices>
qemu:
qemu -device virtio-balloon-pci,...,deflate-on-oom=on
Autodeflate cannot be enabled/disabled for running domain.
Use virDomainDefAddUSBController() to add an EHCI1+UHCI1+UHCI2+UHCI3
controller set to newly defined Q35 domains that don't have any USB
controllers defined.
The real Q35 machine puts the first USB controller set (EHCI+(UHCIx4))
on bus 0 slot 0x1D, and the 2nd USB controller set on bus 0 slot 0x1A,
so let's attempt to make the virtual machine match that for
controllers with auto-assigned addresses when possible.
Three test cases were added to assure that the proper addresses are
assigned - one with a single set of unaddressed USB controllers, one
with 3 (to grab both preferred slots plus one more), and one with the
order of the controller definitions reordered, to assure that the
auto-assignment isn't mixed up by order.
When qemuAssignDevicePCISlots() is looking for companion controllers
for a USB controller that has no PCI address specified, it initializes
a virDevicePCIAddress to 0000:00:00.0, fills it in with the
companion's address if one is found, then checks whether or not there
was a find based on slot == 0. On a system with a single PCI bus, that
is a valid way to check, because slot 0 is reserved, but on most other
PCI buses, slot 0 is not reserved, and is open for use by any
device. This patch adds a separate bool that is set when a companion
is found rather than relying on the faulty information provided with
"slot == 0".
While this is no functional change, whole channel definition is
going to be needed very soon. Moreover, while touching this obey
const correctness rule in qemuAgentOpen() - so far it was passed
regular pointer to channel config even though the function is
expected to not change pointee at all. Pass const pointer
instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In qemu driver we listen to virtio channel events like an agent
connected to or disconnected from the guest part of socket.
However, with a little exception - when we find out that the
socket in question is the guest agent one, we connect or
disconnect guest agent which is done prior setting new state in
internal structure. Due to a bug in our code it may happen that
we got the event but failed to set it in internal structure
representing the channel.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Earlier commit 7140807917 forgot to deal
properly with status XMLs where we want the libvirt-internal paths to be
kept in place and not cleared, otherwise we could end up copying a NULL
string and segfaulting th daemon.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If the q35 specific disable s3/s4 setting isn't supported, fallback to
specifying the PIIX setting, which is the previous behavior. It doesn't
have any effect, but qemu will just warn about it rather than error:
qemu-system-x86_64: Warning: global PIIX4_PM.disable_s3=1 not used
qemu-system-x86_64: Warning: global PIIX4_PM.disable_s4=1 not used
Since it doesn't error, I don't think we should either, since there
may be configs in the wild that already have q35 + disable_s3/4 (via
virt-manager)
This function may be called with @dconnuri == NULL, e.g. from
virDomainMigrateToURI3() if the flags are missing
VIR_MIGRATE_PEER2PEER flag. Moreover, all later functions called
from here do wrap it into NULLSTR() so why not do the same here?
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The condition was checking for UHCI (and OHCI for ppc64) availability so
that it can specify the proper device instead of legacy usb. However,
for ppc64, we don't need to check both OHCI and UHCI, but only OHCI as
that is the legacy default. The condition is so big that it was just a
matter of time when someone will make a mistake there, so let's use more
lines so that it is visible what the condition checks for.
This fixes usage of -device instead of -usb for ppc64 that supports
pci-usb-ohci and does not support piix3-usb-uhci.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1297020
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
memory_dirty_rate corresponds to dirty-pages-rate in QEMU and
memory_iteration is what QEMU reports in dirty-sync-count.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The structure actually contains migration statistics rather than just
the status as the name suggests. Renaming it as
qemuMonitorMigrationStats removes the confusion.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
A migration is in "setup" state after it was "inactive" and before it
becomes "active". Let's reflect this in our migration status enum.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
My commit 674afcb09e moved computing the
default listen address from qemuMigrationPrepareAny to
qemuMigrationPrepareIncoming. However, I didn't notice listenAddress was
later passed to qemuMigrationStartNBDServer. Thus, it would be called
with the original value of listenAddress (NULL).
Let's add the updated listen address to qemuProcessIncomingDef and use
it when starting NBD servers.
Reported-by: Michael Chapman <mike@very.puzzling.org>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
If user defines a virtio channel with UNIX socket backend and doesn't
care about the path for the socket (e.g. qemu-agent channel), we still
generate it into the persistent XML. Moreover when then user renames
the domain, due to its persistent socket path saved into the per-domain
directory, it will not start. So let's forget about old generated paths
and also stop putting them into the persistent definition.
https://bugzilla.redhat.com/show_bug.cgi?id=1278068
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Just recently, qemu forbade specifying format for sourceless
disks (qemu commit 39c4ae941ed992a3bb5). It kind of makes sense.
If there's no file to open, why specify its format. Anyway, I
have a domain like this:
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<target dev='hda' bus='ide'/>
<readonly/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
and obviously I am unable to start it. Therefore, a fix on our
side is needed too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For completeness, use the VIR_TRISTATE_SWITCH_ABSENT for data.file.append
comparisons. Commit ids '70ffa02f' and '53a15aed' just went with the non
zero comparison.