Spawn the compressor ourselves, instead of requiring the shell.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Spawn
compression helper process when needed.
SELinux labeling and cgroup ACLs aren't required if we hand a
pre-opened fd to qemu. All the more reason to love fd: migration.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Skip steps
that are irrelevant in fd migration.
This points out that core dumps (still) don't work for root-squash
NFS, since the fd is not opened correctly. This patch should not
introduce any functionality change, it is just a refactoring to
avoid duplicated code.
* src/qemu/qemu_migration.h (qemuMigrationToFile): New prototype.
* src/qemu/qemu_migration.c (qemuMigrationToFile): New function.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag, doCoreDump): Use
it.
Direct access to an open file is so much simpler than passing
everything through a pipe!
* src/qemu/qemu_driver.c (qemudOpenAsUID)
(qemudDomainSaveImageClose): Delete.
(qemudDomainSaveImageOpen): Rename...
(qemuDomainSaveImageOpen): ...and drop read_pid argument. Use
virFileOpenAs instead of qemudOpenAsUID.
(qemudDomainSaveImageStartVM, qemudDomainRestore)
(qemudDomainObjRestore): Rename...
(qemuDomainSaveImageStartVM, qemuDomainRestore)
(qemDomainObjRestore): ...and simplify accordingly.
(qemudDomainObjStart, qemuDriver): Update callers.
This patch intentionally doesn't change indentation, in order to
make it easier to review the real changes.
* src/util/util.h (VIR_FILE_OP_RETURN_FD, virFileOperationHook):
Delete.
(virFileOperation): Rename...
(virFileOpenAs): ...and reduce parameters.
* src/util/util.c (virFileOperationNoFork, virFileOperation):
Rename and simplify.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Adjust caller.
* src/storage/storage_backend.c (virStorageBackendCreateRaw):
Likewise.
* src/libvirt_private.syms: Reflect rename.
Currently, the hook function in virFileOperation is extremely limited:
it must be async-signal-safe, and cannot modify any memory in the
parent process. It is much handier to return a valid fd and operate
on it in the parent than to deal with hook restrictions.
* src/util/util.h (VIR_FILE_OP_RETURN_FD): New flag.
* src/util/util.c (virFileOperationNoFork, virFileOperation):
Honor new flag.
This allows direct saves (no compression, no root-squash NFS) to use
the more efficient fd: migration, which in turn avoids a race where
qemu exec: migration can sometimes fail because qemu does a generic
waitpid() that conflicts with the pclose() used by exec:. Further
patches will solve compression and root-squash NFS.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Use new function
when there is no compression.
Latent bug introduced in commit 2d6a581960 (Aug 2009), but not exposed
until commit 1859939a (Jan 2011). Basically, when virExec creates a
pipe, it always marks libvirt's side as cloexec. If libvirt then
wants to hand that pipe to another child process, things work great if
the fd is dup2()'d onto stdin or stdout (as with stdin: or exec:
migration), but if the pipe is instead used as-is (such as with fd:
migration) then qemu sees EBADF because the fd was closed at exec().
This is a minimal fix for the problem at hand; it is slightly racy,
but no more racy than the rest of libvirt fd handling, including the
case of uncompressed save images. A more invasive fix, but ultimately
safer at avoiding leaking unintended fds, would be to _always and
atomically_ open all fds as cloexec in libvirt (thanks to primitives
like open(O_CLOEXEC), pipe2(), accept4(), ...), then teach virExec to
clear that bit for all fds explicitly marked to be handed to the child
only after forking.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Clear cloexec
flag.
* tests/qemuxml2argvtest.c (testCompareXMLToArgvFiles): Tweak test.
* src/util/logging.c (virLogStartup, virLogSetBufferSize):
Over-allocate, so that a debugger can just print the circular
buffer. Suggested by Daniel Veillard.
* src/Makefile.am (remote_protocol-structs): Flatten tabs.
* src/remote_protocol-structs: Likewise. Also add a hint to emacs
to make it easier to keep spaces in the file.
Diego reported a bug where virsh tries to initialize a readline
history directory during 'make check' run as root, but fails
because /root was read-only.
It turns out that I could reproduce this as non-root, by using:
mv ~/.virsh{,.bak}
chmod a-w ~
make check -C tests TESTS=int-overflow
chmod u+w ~
mv ~/.virsh{.bak,}
* tests/int-overflow: Don't trigger interactive mode.
Reported by Diego Elio Pettenò.
Otherwise, if something like doStopVcpus fails after the first
restore, a second restore is attempted and throws a useless
warning.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Avoid second
restore of state label.
The Open Nebula driver has been unmaintained since it was first
introduced. The only commits have been for tree-wide cleanups.
It also has a major design flaw, in that it only knows about guests
that it has created itself, which makes it of very limited use.
Discussions wrt evolution of the VMWare ESX driver, concluded that
it should limit itself to single-node ESX operation and not try to
manage the multi-node architecture of VirtualCenter. Open Nebula
is a cluster like Virtual Center, not a single node system, so
the same reasoning applies.
The DeltaCloud project includes an Open Nebula driver and is a much
better fit architecturally, since it is explicitly targetting the
distributed multihost cluster scenario.
Thus this patch deletes the libvirt Open Nebula driver with the
recommendation that people use DeltaCloud for managing it instead.
* configure.ac: Remove probe for xmlrpc & --with-one arg
* daemon/Makefile.am, daemon/libvirtd.c, src/Makefile.am: Remove
ONE driver build
* src/opennebula/one_client.c, src/opennebula/one_client.h,
src/opennebula/one_conf.c, src/opennebula/one_conf.h,
src/opennebula/one_driver.c, src/opennebula/one_driver.c: Delete
files
* autobuild.sh, libvirt.spec.in, mingw32-libvirt.spec.in: Remove
build rules for Open Nebula
* docs/drivers.html.in, docs/sitemap.html.in: Remove reference
to OpenNebula
* docs/drvone.html.in: Delete file
Some things to note in this patch:
- we do extend libvirt scope beyond purely managing domains, there is
already a number of blocks which are here as helpr functions to
manage the resources on the host.
- we are expanding in the direction of libvirt being sufficient to do
most of the management on the Host (but within the limits of the need
for virtualization, e.g. managing users on the host is out of scope)
- we don't require anymore APIs to be supported by multiple
hypervisors to get in, it's already the case in practice, but we
should still make sure the semantic of those APIs are clear. We
added quite a bit for QEmu, but for example I saw on IRC that VBox
could emulate a network unplug/replug on a domain interface, and
that would be a good addition even if a priori no other hypervisor
supports it.
- Make clear that all libvirt APIs are available remotely, which is
key to use libvirt for building management tools.
- link the goal page from the project main page
As for libvirt project directions, I think it just reflects the natural
evolution in the last couple of years. We are less hypervisor agnostic
and extending in the Host management. Clearly there is interest in
making sure libvirt is complete in term of features for the hypervisors
supported, especially the ones like KVM or LXC which don't really have
integrated management library.
* docs/goals.html.in: update the goals page
* docs/index.html.in: link it from the top page
Add missing open curly brace between function declaration of non-linux
variant of qemudDomainInterfaceStats() and its body.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Commit f44bfb7 was supposed to make sure no additional libvirt API (esp.
*Free) is called before remoteDispatchConnError() is called on error.
However, the patch missed two instances.
Sometimes, an asynchronous helper is started (such as a compressor
or iohelper program), but a later error means that we want to
abort that child. Make this easier.
Note that since daemons and virCommandRunAsync can't mix, the only
time virCommandFree can reap a process is if someone did
virCommandRunAsync for a non-daemon and didn't stash the pid.
* src/util/command.h (virCommandAbort): New prototype.
* src/util/command.c (_virCommand): Add new field.
(virCommandRunAsync, virCommandWait): Track whether pid was used.
(virCommandFree): Reap child if caller did not request pid.
(virCommandAbort): New function.
* src/libvirt_private.syms (command.h): Export it.
* tests/commandtest.c (test19): New test.
It doesn't make sense to run a daemon without synchronously
waiting for the child process to reply whether the daemon has
been kicked off and pidfile written yet.
* src/util/command.c (VIR_EXEC_RUN_SYNC): New constant.
(virCommandRun): Set temporary flag.
(virCommandRunAsync): Use it to prevent async runs of intermediate
child when spawning asynchronous daemon grandchild.
Child processes don't always reach _exit(); if they die from a
signal, then any messages should still be accurate. Most users
either expect a 0 status (thankfully, if status==0, then
WIFEXITED(status) is true and WEXITSTATUS(status)==0 for all
known platforms) or were filtering on WIFEXITED before printing
a status, but a few were missing this check. Additionally,
nwfilter_ebiptables_driver was making an assumption that works
on Linux (where WEXITSTATUS shifts and WTERMSIG just masks)
but fails on other platforms (where WEXITSTATUS just masks and
WTERMSIG shifts).
* src/util/command.h (virCommandTranslateStatus): New helper.
* src/libvirt_private.syms (command.h): Export it.
* src/util/command.c (virCommandTranslateStatus): New function.
(virCommandWait): Use it to also diagnose status from signals.
* src/security/security_apparmor.c (load_profile): Likewise.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Likewise.
* src/util/util.c (virExecDaemonize, virRunWithHook)
(virFileOperation, virDirCreate): Likewise.
* daemon/remote.c (remoteDispatchAuthPolkit): Likewise.
* src/nwfilter/nwfilter_ebiptables_driver.c (ebiptablesExecCLI):
Likewise.
Hotpluging host usb device by text mode will fail, because the monitor
command 'device_add' outputs 'husb: using...' if it succeeds, but we
think the command should not output anything.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Add the compiler attribute to ensure we don't introduce any more
ref bugs like were just patched in commit 9741f34, then explicitly
mark the remaining places in code that are safe.
* src/qemu/qemu_monitor.h (qemuMonitorUnref): Mark
ATTRIBUTE_RETURN_CHECK.
* src/conf/domain_conf.h (virDomainObjUnref): Likewise.
* src/conf/domain_conf.c (virDomainObjParseXML)
(virDomainLoadStatus): Fix offenders.
* src/openvz/openvz_conf.c (openvzLoadDomains): Likewise.
* src/vmware/vmware_conf.c (vmwareLoadDomains): Likewise.
* src/qemu/qemu_domain.c (qemuDomainObjBeginJob)
(qemuDomainObjBeginJobWithDriver)
(qemuDomainObjExitRemoteWithDriver): Likewise.
* src/qemu/qemu_monitor.c (QEMU_MONITOR_CALLBACK): Likewise.
Suggested by Daniel P. Berrange.
This simplifies several callers that were repeating checks already
guaranteed by util.c, and makes other callers more robust to now
reject directories. remote_driver.c was over-strict - access(,R_OK)
is only needed to execute a script file; a binary only needs
access(,X_OK) (besides, it's unusual to see a file with x but not
r permissions, whether script or binary).
* cfg.mk (sc_prohibit_access_xok): New syntax-check rule.
(exclude_file_name_regexp--sc_prohibit_access_xok): Exempt one use.
* src/network/bridge_driver.c (networkStartRadvd): Fix offenders.
* src/qemu/qemu_capabilities.c (qemuCapsProbeMachineTypes)
(qemuCapsInitGuest, qemuCapsInit, qemuCapsExtractVersionInfo):
Likewise.
* src/remote/remote_driver.c (remoteFindDaemonPath): Likewise.
* src/uml/uml_driver.c (umlStartVMDaemon): Likewise.
* src/util/hooks.c (virHookCheck): Likewise.
Bug https://bugzilla.redhat.com/show_bug.cgi?id=689374 reported libvirtd
crash during error dispatch.
The reason is that libvirtd uses remoteDispatchConnError() with non-NULL
conn parameter which means that virConnGetLastError() is used instead of
its thread safe replacement virGetLastError().
So when several libvirtd threads are reporting errors at the same time,
the errors can get mixed or corrupted or in case of bad luck libvirtd
itself crashes.
Since Daniel B. is going to rewrite this code from scratch on top of his
RPC infrastructure, I tried to come up with a minimal fix. Thus,
remoteDispatchConnError() now just ignores its conn argument and always
calls virGetLastError(). However, several callers had to be touched as
well, since no libvirt API is allowed to be called before dispatching
the error. Doing so would reset the error and we would have nothing to
dispatch. As a result of that, the code is not very nice but that
doesn't really make daemon/remote.c worse than it is now :-) And it will
all die soon, which is good.
The bug report also contains a reproducer in C which detects both mixed
up error messages and libvirtd crash. Before this patch, I was able to
crash libvirtd in about 20 seconds up to 3 minutes depending on number
of CPU cores (more is better) and luck.
Steps to reproduce this bug:
1. virsh attach-disk domain --source diskimage --target sdb --sourcetype file --driver qemu --subdriver qcow2
error: Failed to attach disk
error: operation failed: adding scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1 device failed: Property 'scsi-disk.drive' can't find value 'drive-scsi0-0-1'
2. service libvirtd restart
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]
3. virsh attach-disk domain --source diskimage --target sdb --sourcetype file --driver qemu --subdriver raw
error: Failed to attach disk
error: operation failed: adding lsi,id=scsi0,bus=pci.0,addr=0x6 device failed: Duplicate ID 'scsi0' for device
The reason is that we create a new scsi controller but we do not update
/var/run/libvirt/qemu/domain.xml.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Not every day you see a patch that nukes 27 files!
* .gnulib: Update to latest, for maint.mk improvements
* bootstrap: Resync to gnulib.
* bootstrap.conf (ACLOCAL): Swap the secondary aclocal include
directory, now that bootstrap picks up gnulib/m4 instead of m4.
* Makefile.am (syntax_check_exceptions, EXTRA_DIST): No longer
worry about nuked files.
* cfg.mk (sc_x_sc_dist_check): Delete dead rule.
(VC_LIST_ALWAYS_EXCLUDE_REGEX): Add HACKING.
(exclude_file_name_regexp--sc_*): Inline and simplify contents...
* .x-sc_*: ...from here, then delete the files.
The ref count was assigned to 1 at creation, then never modified again
until it was decremented just before freeing the object.
* src/conf/domain_conf.h (_virDomainSnapshotObj): Delete unused
field.
(virDomainSnapshotObjUnref): Delete unused prototype.
* src/libvirt_private.syms: Likewise.
* src/conf/domain_conf.c (virDomainSnapshotObjNew)
(virDomainSnapshotObjListDataFree): Update users.
(virDomainSnapshotObjUnref): Delete.
Among others, the missing radvd dependency showed up as:
error: Failed to start network ipv6net
error: Cannot find radvd - Possibly the package isn't installed: No such file
or directory
even when radvd was installed, because the RADVD preprocessor
symbol was missing at configure time.
* libvirt.spec.in (with_network): Add BuildRequires for radvd,
iptables, and ip6tables.
(BuildRequires): Add libxslt and augeas for docs and test.
(with_libvirtd): Add module-init-tools for modprobe.
(with_nwfilter): Add BuildRequires for ebtables.
(with_esx): Fix esx build on RHEL 5, thanks to curl-devel rename.
Problem:
"parser.head" is not NULL even if it's free'ed by "virJSONValueFree",
returning "parser.head" when "virJSONValueFromString" fails will cause
unexpected errors (libvirtd will crash sometimes), e.g.
In function "qemuMonitorJSONArbitraryCommand":
if (!(cmd = virJSONValueFromString(cmd_str)))
goto cleanup;
if (qemuMonitorJSONCommand(mon, cmd, &reply) < 0)
goto cleanup;
......
cleanup:
virJSONValueFree(cmd);
It will continues to send command to monitor even if "virJSONValueFromString"
is failed, and more worse, it trys to free "cmd" again.
Crash example:
{"error":{"class":"QMPBadInputObject","desc":"Expected 'execute' in QMP input","data":{"expected":"execute"}}}
{"error":{"class":"QMPBadInputObject","desc":"Expected 'execute' in QMP input","data":{"expected":"execute"}}}
error: server closed connection:
error: unable to connect to '/var/run/libvirt/libvirt-sock', libvirtd may need to be started: Connection refused
error: failed to connect to the hypervisor
This fix is to:
1) return NULL for failure of "virJSONValueFromString",
2) and it seems "virJSONValueFree" uses incorrect loop index for type
of "VIR_JSON_TYPE_OBJECT", fix it together.
* src/util/json.c
Steps to reproduce this bug:
# cat usb.xml
<hostdev mode='subsystem' type='usb'>
<source>
<address bus='0x001' device='0x003'/>
</source>
</hostdev>
# virsh attach-device vm1 usb.xml
error: Failed to attach device from usb.xml
error: server closed connection:
The reason of this bug is that we set data.cgroup to NULL, and this will cause
libvirtd crashed.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
A future patch will change reference counting idioms; consolidating
this pattern now makes the next patch smaller (touch only the new
macro rather than every caller).
* src/qemu/qemu_monitor.c (QEMU_MONITOR_CALLBACK): New helper.
(qemuMonitorGetDiskSecret, qemuMonitorEmitShutdown)
(qemuMonitorEmitReset, qemuMonitorEmitPowerdown)
(qemuMonitorEmitStop, qemuMonitorEmitRTCChange)
(qemuMonitorEmitWatchdog, qemuMonitorEmitIOError)
(qemuMonitorEmitGraphics): Use it to reduce duplication.
This patch introduces PREASSOCIATE-RR during incoming VM migration on the
destination host. This is similar to the usage of PREASSOCIATE during
migration in 8021qbg libvirt code today. PREASSOCIATE-RR is a VDP operation.
With the latest at IEEE, 8021qbh will need to support VDP operations.
A corresponding enic driver patch to support PREASSOCIATE-RR for 8021qbh
will be posted for net-next-2.6 inclusion soon.
THe veth setup in LXC had a couple of flaws, first brInit did
not report any error when it failed. Second vethCreate() did
not correctly initialize the variable containing the return
code, so could report failure even when it succeeded.
* src/lxc/lxc_driver.c: Report error when brInit fails
* src/lxc/veth.c: Fix uninitialized variable