This reverts commit 372a14c67365660cf03260bfcacd01e9ce600ebb.
We shoul not have cherry-picked 28ae4ff0 without also cherry-picking
2223ea98, but the latter is too complex for a stable branch.
https://bugzilla.redhat.com/show_bug.cgi?id=816662 pointed out
that attempting 'virsh blockpull' on an offline domain gave a
misleading error message about qemu lacking support for the
operation, even when qemu was specifically updated to support it.
The real problem is that we have several capabilities that are
only determined when starting a domain, and therefore are still
clear when first working with an inactive domain (namely, any
capability set by qemuMonitorJSONCheckCommands).
While this patch was able to hoist an existing check in one of the
three culprits, it had to add redundant checks in the other two
places (because you always have to check for an active domain after
obtaining a VM job lock, but the capability bits were being checked
prior to obtaining the job lock).
Someday it would be nice to patch libvirt to cache the set of
capabilities per qemu binary (as determined by inode and timestamp),
rather than re-probing the binary every time a domain is started,
and to teach the cache how to query the monitor during the one
time the probe is made rather than having to wait until a guest
is started; then, a capability probe would succeed even for offline
guests because it just refers to the cache, and the single check for
an active domain after grabbing the job lock would be sufficient.
But since that will involve a lot more coding, I'm happy to go
with this simpler solution for an immediate solution.
* src/qemu/qemu_driver.c (qemuDomainPMSuspendForDuration)
(qemuDomainSnapshotCreateXML, qemuDomainBlockJobImpl): Check for
offline state before checking an online-only cap.
Conflicts:
src/qemu/qemu_driver.c
This patch addresses the following coverity findings:
/libvirt/src/conf/nwfilter_params.c:390:
var_assigned: Assigning: "varValue" = null return value from "virHashLookup".
/libvirt/src/conf/nwfilter_params.c:392:
dereference: Dereferencing a pointer that might be null "varValue" when calling "virNWFilterVarValueGetNthValue".
/libvirt/src/conf/nwfilter_params.c:399:
dereference: Dereferencing a pointer that might be null "tmp" when calling "virNWFilterVarValueGetNthValue".
This patch addresses the following coverity findings:
/libvirt/src/conf/nwfilter_params.c:157:
deref_parm: Directly dereferencing parameter "val".
/libvirt/src/conf/nwfilter_params.c:473:
negative_returns: Using variable "iterIndex" as an index to array "res->iter".
/libvirt/src/nwfilter/nwfilter_ebiptables_driver.c:2891:
unchecked_value: No check of the return value of "virAsprintf(&protostr, "-d 01:80:c2:00:00:00 ")".
/libvirt/src/nwfilter/nwfilter_ebiptables_driver.c:2894:
unchecked_value: No check of the return value of "virAsprintf(&protostr, "-p 0x%04x ", l3_protocols[protoidx].attr)".
/libvirt/src/nwfilter/nwfilter_ebiptables_driver.c:3590:
var_deref_op: Dereferencing null variable "inst".
Some of the error messages in this function should have been
virReportSystemError (since they have an errno they want to log), but
were mistakenly written as netlinkError, which expects a libvirt error
code instead. The result was that when one of the errors was
encountered, "No error message provided" would be printed instead of
something meaningful (see
https://bugzilla.redhat.com/show_bug.cgi?id=816465 for an example).
Once qemu monitor reports migration has completed, we just closed our
end of the pipe and let migration tunnel die. This generated bogus error
in case we did so before the thread saw EOF on the pipe and migration
was aborted even though it was in fact successful.
With this patch we first wake up the tunnel thread and once it has read
all data from the pipe and finished the stream we close the
filedescriptor.
A small additional bonus of this patch is that real errors reported
inside qemuMigrationIOFunc are not overwritten by virStreamAbort any
more.
When QEMU reported failed or canceled migration, we correctly detected
it but didn't really consider it as an error condition and migration
protocol just went on. Luckily, some of the subsequent steps eventually
failed end we reported an (unrelated and mostly random) error back to
the caller.
Currently, non-blocking calls are either sent immediately or discarded
in case sending would block. This was implemented based on the
assumption that the non-blocking keepalive call is not needed as there
are other calls in the queue which would keep the connection alive.
However, if those calls are no-reply calls (such as those carrying
stream data), the remote party knows the connection is alive but since
we don't get any reply from it, we think the connection is dead.
This is most visible in tunnelled migration. If it happens to be longer
than keepalive timeout (30s by default), it may be unexpectedly aborted
because the connection is considered to be dead.
With this patch, we only discard non-blocking calls when the last call
with a thread is completed and thus there is no thread left to keep
sending the remaining non-blocking calls.
In some cases (spotted with broken connection during tunneled migration)
we were overwriting the original error with worse or even misleading
errors generated when we were cleaning up after failed migration.
This patch resolves https://bugzilla.redhat.com/show_bug.cgi?id=815270
The function virNetDevMacVLanVPortProfileRegisterCallback() takes an
arg "virtPortProfile", and was checking it for non-NULL before using
it. However, the prototype for
virNetDevMacVLanPortProfileRegisterCallback had marked that arg with
ATTRIBUTE_NONNULL(). Contrary to what one may think,
ATTRIBUTE_NONNULL() does not provide any guarantee that an arg marked
as such really is always non-null; the only effect to the code
generated by gcc, is that gcc *assumes* it is non-NULL; this results
in, for example, the check for a non-NULL value being optimized out.
(Unfortunately, this code removal only occurs when optimization is
enabled, and I am in the habit of doing local builds with optimization
off to ease debugging, so the bug didn't show up in my earlier local
testing).
In general, virPortProfile might always be NULL, so it shouldn't be
marked as ATTRIBUTE_NONNULL. One other function prototype made this
same error, so this patch fixes it as well.
vboxArray is not castable to a COM item type. vboxArray is a
wrapper around the XPCOM and MSCOM specific array handling.
In this case we can avoid passing NULL as an empty array to
IMachine::Delete by passing a dummy IMedium* array with a single
NULL item.
virThreadSelf tries to access the virThreadPtr stored in TLS for the
current thread via TlsGetValue. When virThreadSelf is called on a thread
that was not created via virThreadCreate (e.g. the main thread) then
TlsGetValue returns NULL as TlsAlloc initializes TLS slots to NULL.
virThreadSelf can be called on the main thread via this call chain from
virsh
vshDeinit
virEventAddTimeout
virEventPollAddTimeout
virEventPollInterruptLocked
virThreadIsSelf
triggering a segfault as virThreadSelf unconditionally dereferences the
return value of TlsGetValue.
Fix this by making virThreadSelf check the TLS slot value for NULL and
setting the given virThreadPtr accordingly.
Reported by Marcel Müller.
POSIX says that sa_sigaction is only safe to use if sa_flags
includes SA_SIGINFO; conversely, sa_handler is only safe to
use when flags excludes that bit. Gnulib doesn't guarantee
an implementation of SA_SIGINFO, but does guarantee that
if SA_SIGINFO is undefined, we can safely define it to 0 as
long as we don't dereference the 2nd or 3rd argument of
any handler otherwise registered via sa_sigaction.
Based on a report by Wen Congyang.
* src/rpc/virnetserver.c (SA_SIGINFO): Stub for mingw.
(virNetServerSignalHandler): Avoid bogus dereference.
(virNetServerFatalSignal, virNetServerNew): Set flags properly.
(virNetServerAddSignalHandler): Drop unneeded #ifdef.
https://bugzilla.redhat.com/show_bug.cgi?id=617711 reported that
even with my recent patched to allow <memory unit='G'>1</memory>,
people can still get away with trying <memory>1G</memory> and
silently get <memory unit='KiB'>1</memory> instead. While
virt-xml-validate catches the error, our C parser did not.
Not to mention that it's always fun to fix bugs while reducing
lines of code. :)
* src/conf/domain_conf.c (virDomainParseMemory): Check for parse error.
(virDomainDefParseXML): Avoid strtoll.
* src/conf/storage_conf.c (virStorageDefParsePerms): Likewise.
* src/util/xml.c (virXPathLongBase, virXPathULongBase)
(virXPathULongLong, virXPathLongLong): Likewise.
Commit 78345c68 makes at least gcc 4.1.2 on RHEL 5 complain:
cc1: warnings being treated as errors
In file included from vbox/vbox_V4_0.c:13:
vbox/vbox_tmpl.c: In function 'vboxDomainUndefineFlags':
vbox/vbox_tmpl.c:5298: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
* src/vbox/vbox_tmpl.c (vboxDomainUndefineFlags): Use union to
avoid compiler warning.
Currently upon a migration a callback is created when a 802.1qbg link
is set to PREASSOCIATE, this should not happen because this is a no-op
on most switches, and does not lead to an ASSOCIATE state. This patch
only creates callbacks when CREATE or RESTORE is requested. Migration
and libvirtd restart scenarios are already handled elsewhere.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
The below patch fixes the following memory leak.
==20624== 24 bytes in 2 blocks are definitely lost in loss record 532 of 1,867
==20624== at 0x4A05E46: malloc (vg_replace_malloc.c:195)
==20624== by 0x38EC27FC01: strdup (strdup.c:43)
==20624== by 0x4EB6BA3: virDomainChrSourceDefCopy (domain_conf.c:1122)
==20624== by 0x495D76: qemuProcessFindCharDevicePTYs (qemu_process.c:1497)
==20624== by 0x498321: qemuProcessWaitForMonitor (qemu_process.c:1258)
==20624== by 0x49B5F9: qemuProcessStart (qemu_process.c:3652)
==20624== by 0x468B5C: qemuDomainObjStart (qemu_driver.c:4753)
==20624== by 0x469171: qemuDomainStartWithFlags (qemu_driver.c:4810)
==20624== by 0x4F21735: virDomainCreate (libvirt.c:8153)
==20624== by 0x4302BF: remoteDispatchDomainCreateHelper (remote_dispatch.h:852)
==20624== by 0x4F72C14: virNetServerProgramDispatch (virnetserverprogram.c:416)
==20624== by 0x4F6D690: virNetServerHandleJob (virnetserver.c:164)
==20624== by 0x4E8F43D: virThreadPoolWorker (threadpool.c:144)
==20624== by 0x4E8EAB5: virThreadHelper (threads-pthread.c:161)
==20624== by 0x38EC606CCA: start_thread (pthread_create.c:301)
==20624== by 0x38EC2E0C2C: clone (clone.S:115)
So that a domain xml which doesn't have "placement" specified, but
"cpuset" is specified, could be parsed. And in this case, the
"placement" mode will be set as "static".
If console[0] is an alias for serial[0], do not enforce the former to
have a PTY source type. This breaks serial consoles on stdio and makes
no sense.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
When using the xm/xend stack to manage instances there is a bug
that causes the emulated interfaces to be unusable when the vif
config contains type=ioemu.
The current code already has a special quirk to not use this
keyword if no specific model is given for the emulated NIC
(defaulting to rtl8139).
Essentially it works because regardless of the type argument,i
the Xen stack always creates emulated and paravirt interfaces and
lets the guest decide which one to use. So neither xl nor xm stack
actually require the type keyword for emulated NICs.
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
lvcreate want's the parent pool's name, not the pool path
lvchange and lvremove want lv specified as $vgname/$lvname
This largely worked before because these commands strip off a
starting /dev. But https://bugzilla.redhat.com/show_bug.cgi?id=714986
is from a user using a 'nested VG' that was having problems.
I couldn't find any info on nested LVM and the reporter never responded,
but I reproduced with XML that specified a valid source name, and
set target path to a symlink.
As explained in previous patch, numad will balance the affinity
dynamically, so reflecting the cpuset from numad at the first
time doesn't make much case, and may just could cause confusion.
(cherry picked from commit 8fb2164cfff35ce0b87f1d513a0f3ca5111d7880)
Instead of returning a CPUs list, numad returns NUMA node
list instead, this patch is to convert the node list to
cpumap before affinity setting. Otherwise, the domain
processes will be pinned only to CPU[$numa_cell_num],
which will cause significiant performance losses.
Also because numad will balance the affinity dynamically,
reflecting the cpuset from numad back doesn't make much
sense then, and it may just could produce confusion for
the users. Thus the better way is not to reflect it back
to XML. And in this case, it's better to ignore the cpuset
when parsing XML.
The codes to update the cpuset is removed in this patch
incidentally, and there will be a follow up patch to ignore
the manually specified "cpuset" if "placement" is "auto",
and document will be updated too.
(cherry picked from commit ccf80e36301d538505c5c053cf369a61d4671831)
The linux-2.6.32 kernel header does not yet define IFLA_VF_MAX and others,
which breaks compiling a new libvirt on old systems like Debian Squeeze.
(I also have to add --without-macvtap --disable-werror --without-virtualport to
./configure to get it to compile.)
Signed-off-by: Philipp Hahn <hahn@univention.de>
(cherry picked from commit d7451bddc55eabbe2c7a9070312ab3920cb93200)
Although it should be harmless to do:
disk = disk = def->disks[i]
some not-so-wise compilers may fool around.
Besides, such assignment is useless here.
(cherry picked from commit e14d6571c1185133d15161bb4b18ec9b11192358)
If placement mode is AUTO, on some return paths char *cpumap or
char *nodeset are leaked.
(cherry picked from commit 354e6d4ed0b71cc90084faea32c360ce86a85b42)
On newer xend (v3.x and after) there is no state and domid reported
for inactive domains. When initially creating connections this is
handled in various places by assigning domain->id = -1.
But once an instance has been running, the id is set to the current
domain id. And it does not change when the instance is shut down.
So when querying the domain info, the hypervisor driver, which gets
asked first will indicate it cannot find information, then the
xend driver is asked and will set the status to NOSTATE because it
checks for the -1 domain id.
Checking domain/status for 0 seems to be more reliable for that.
One note: I am not sure whether the domain->id also should get set
back to -1 whenever any sub-driver thinks the instance is no longer
running.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=746007
BugLink: http://bugs.launchpad.net/bugs/929626
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
(cherry picked from commit 26e9ef476239e8cb79f819092c5aac4afdd47d0d)
This patch adds a netlink callback when migrating a VEPA enabled
virtual machine. It fixes a Bug where a VM would not request a port
association when it was cleared by lldpad.
This patch requires the latest git version of lldpad to work.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
(cherry picked from commit 997366ca7d174524725d6f6dfa8b14d6d2838ef0)
If dynamic_ownership is off and we are creating a file on NFS
we force chown. This will fail as chown/chmod are not supported
on NFS. However, with no dynamic_ownership we are not required
to do any chown.
(cherry picked from commit b1256816ff34ca94675ef62eccfa66f2a7daa0fc)
The new safe console handling introduced a possibility to deadlock the
qemu driver when a new console connection forcibly disconnects a
previous console stream that belongs to an already closed connection.
The virStreamFree function calls subsequently a the virReleaseConnect
function that tries to lock the driver while discarding the connection,
but the driver was already locked in qemuDomainOpenConsole.
Backtrace of the deadlocked thread:
0 0x00007f66e5aa7f14 in __lll_lock_wait () from /lib64/libpthread.so.0
1 0x00007f66e5aa3411 in _L_lock_500 () from /lib64/libpthread.so.0
2 0x00007f66e5aa322a in pthread_mutex_lock () from/lib64/libpthread.so.0
3 0x0000000000462bbd in qemudClose ()
4 0x00007f66e6e178eb in virReleaseConnect () from/usr/lib64/libvirt.so.0
5 0x00007f66e6e19c8c in virUnrefStream () from /usr/lib64/libvirt.so.0
6 0x00007f66e6e3d1de in virStreamFree () from /usr/lib64/libvirt.so.0
7 0x00007f66e6e09a5d in virConsoleHashEntryFree () from/usr/lib64/libvirt.so.0
8 0x00007f66e6db7282 in virHashRemoveEntry () from/usr/lib64/libvirt.so.0
9 0x00007f66e6e09c4e in virConsoleOpen () from /usr/lib64/libvirt.so.0
10 0x00000000004526e9 in qemuDomainOpenConsole ()
11 0x00007f66e6e421f1 in virDomainOpenConsole () from/usr/lib64/libvirt.so.0
12 0x00000000004361e4 in remoteDispatchDomainOpenConsoleHelper ()
13 0x00007f66e6e80375 in virNetServerProgramDispatch () from/usr/lib64/libvirt.so.0
14 0x00007f66e6e7ae11 in virNetServerHandleJob () from/usr/lib64/libvirt.so.0
15 0x00007f66e6da897d in virThreadPoolWorker () from/usr/lib64/libvirt.so.0
16 0x00007f66e6da7ff6 in virThreadHelper () from/usr/lib64/libvirt.so.0
17 0x00007f66e5aa0c5c in start_thread () from /lib64/libpthread.so.0
18 0x00007f66e57e7fcd in clone () from /lib64/libc.so.6
* src/qemu/qemu_driver.c: qemuDomainOpenConsole()
-- unlock the qemu driver right after acquiring the domain
object
(cherry picked from commit 3d3de46a6772baabb6c203a7961a790af6e8d08c)
I noticed these compiler warnings when building for the s390 architecture.
* src/node_device/node_device_udev.c (udevDeviceMonitorStartup):
Mark unused variable.
* src/nodeinfo.c (linuxNodeInfoCPUPopulate): Avoid unused variable.
(cherry picked from commit 9011a494acf0abb8da8fb45dbfb2ac8d7f2a994d)
Leak introduced in commit 0436d32. If we allocate an actions array,
but fail early enough to never consume it with the qemu monitor
transaction call, we leaked memory.
But our semantics of making the transaction command free the caller's
memory is awkward; avoiding the memory leak requires making every
intermediate function in the call chain check for error. It is much
easier to fix things so that the function that allocates also frees,
while the call chain leaves the caller's data intact. To do that,
I had to hack our JSON data structure to make it easy to protect a
portion of an arbitrary JSON tree from being freed.
* src/util/json.h (virJSONType): Name the enum.
(_virJSONValue): New field.
* src/util/json.c (virJSONValueFree): Use it to protect a portion
of an array.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONTransaction): Avoid
freeing caller's data.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive):
Free actions array on failure.
(cherry picked from commit 1413560966eb07732b56dc2b83215462533656e5)
We can tell qemuDomainSnapshotFSThaw if we want it to report errors or
not. However, if we don't want to and an error has been already set by
previous qemuReportError() we must keep copy of that error not just a
pointer to it. Otherwise, it get overwritten if FSThaw reports an error.
(cherry picked from commit 650da0e99c56f4142273f4e369ad02bd0c008765)
This causes an implicit vkbd device to be added which takes
6min to finally fail being initialized in the guest.
http://lists.xen.org/archives/html/xen-devel/2012-04/msg00409.html
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
(cherry picked from commit fb98da005071f7f9a5d222b3829901682732900c)
Detected by valgrind. Leaks are introduced in commit b22eaa7.
* src/conf/domain_conf.c (virDomainDiskDefParseXML): fix memory leaks.
How to reproduce?
% make && make -C tests check TESTS=qemuxml2argvtest
% cd tests && valgrind -v --leak-check=full ./qemuxml2argvtest
actual result:
==2143== 12 bytes in 2 blocks are definitely lost in loss record 74 of 179
==2143== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==2143== by 0x39D90A67DD: xmlStrndup (xmlstring.c:45)
==2143== by 0x4F5EC0: virDomainDiskDefParseXML (domain_conf.c:3438)
==2143== by 0x502F00: virDomainDefParseXML (domain_conf.c:8304)
==2143== by 0x505FE3: virDomainDefParseNode (domain_conf.c:9080)
==2143== by 0x5069AE: virDomainDefParse (domain_conf.c:9030)
==2143== by 0x41CBF4: testCompareXMLToArgvHelper (qemuxml2argvtest.c:105)
==2143== by 0x41E5DD: virtTestRun (testutils.c:145)
==2143== by 0x416FA3: mymain (qemuxml2argvtest.c:399)
==2143== by 0x41DCB7: virtTestMain (testutils.c:700)
==2143== by 0x39CF01ECDC: (below main) (libc-start.c:226)
Signed-off-by: Alex Jia <ajia@redhat.com>
(cherry picked from commit 80d476a92ff0b5c2db00fdf37842757f45e30560)
https://bugzilla.redhat.com/show_bug.cgi?id=809895
Basically, openvz dropped strict version numbering (3.1 vs 3.1.0),
which caused parsing to fail.
(cherry picked from commit 37075dfe6ccdb5a257cdd64194b5fdd51458fa8b)
This symbol is used in the test suites
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 06180ca433de3db9e544213afc3d87270aaa8901)
If the daemon is restarted it will lose list of active
USB devices assigned to active domains. Therefore we need
to rebuild this list on qemuProcessReconnect().
(cherry picked from commit ea3bc548aca7b4c448b48863120ad35a7337c127)
To prevent assigning one USB device to two domains,
we keep a list of assigned USB devices. On domain
startup - qemuProcessStart() - we insert devices
used by domain into the list but remove them only
on detach-device. Devices are, however, released
on qemuProcessStop() as well.
(cherry picked from commit e2f5dd6134ebeb6846450c7d7782273d3d274859)
Originally, qemuDomainCheckEjectableMedia was entering monitor with qemu
driver lock. Commit 2067e31bf97667eab9f111b496f5e9a44e827c5b, which I
made to fix that, revealed another issue we had (but didn't notice it
since the driver was locked): we didn't set nested job when
qemuDomainCheckEjectableMedia is called during migration. Thus the
original fix I made was wrong.
XenD-3.1 introduced managed domains. HV-domains have rtc_timeoffset
(hgd24f37b31030 from 2007-04-03), which tracks the offset between the
hypervisors clock and the domains RTC, and is persisted by XenD.
In combination with localtime=1 this had a bug until XenD-3.4
(hg5d701be7c37b from 2009-04-01) (I'm not 100% sure how that bug
manifests, but at least for me in TZ=Europe/Berlin I see the previous
offset relative to utc being applied to localtime again, which manifests
in an extra hour being added)
XenD implements the following variants for clock/@offset:
- PV domains don't have a RTC → 'localtime' | 'utc'
- <3.1: no managed domains → 'localtime' | 'utc'
- ≥3.1: the offset is tracked for HV → 'variable'
due to the localtime=1 bug → 'localtime' | 'utc'
- ≥3.4: the offset is tracked for HV → 'variable'
Current libvirtd still thinks XenD only implements <clock offset='utc'/>
and <clock offset='localtime'/>, which is wrong, since the semantic of
'utc' and 'localtime' specifies, that the offset will be reset on
domain-restart, while with 'variable' the offset is kept. (keeping the
offset over "virsh edit" is important, since otherwise the clock might
jump, which confuses certain guest OSs)
xendConfigVersion was last incremented to 4 by the xen-folks for
xen-3.1.0. I know of no way to reliably detect the version of XenD
(user space tools), which may be different from the version of the
hypervisor (kernel) version! Because of this only the change from
'utc'/'localtime' to 'variable' in XenD-3.1 is handled, not the buggy
behaviour of XenD-3.1 until XenD-3.4.
For backward compatibility with previous versions of libvirt Xen-HV
still accepts 'utc' and 'localtime', but they are returned as 'variable'
on the next read-back from Xend to libvirt, since this is what XenD
implements: The RTC is NOT reset back to the specified time on next
restart, but the previous offset is kept.
This behaviour can be turned off by adding the additional attribute
adjustment='reset', in which case libvirt will report an error instead
of doing the conversion. The attribute can also be used as a shortcut to
offset='variable' with basis='...'.
With these changes, it is also necessary to adjust the xen tests:
"localtime = 0" is always inserted, because otherwise on updates the
value is not changed within XenD.
adjustment='reset' is inserted for all cases, since they're all <
XEND_CONFIG_VERSION_3_1_0, only 3.1 introduced persistent
rtc_timeoffset.
Some statements change their order because code was moved around.
Signed-off-by: Philipp Hahn <hahn@univention.de>