Commit Graph

7441 Commits

Author SHA1 Message Date
Wen Congyang
d96431f910 avoid deleting vm if qemuConnectMonitor fails
Steps to reproduce this bug:
1. service libvirtd start
2. virsh start <domain>
3. kill -STOP $(cat /var/run/libvirt/qemu/<domain>.pid)
4. service libvirtd restart
5. kill -9 $(cat /var/run/libvirt/qemu/<domain>.pid)

Then libvirtd will dump core or deadlock.

Make sure that json is built into libvirt and the version
of qemu is newer than 0.13.0.

libvirtd dumps core for the following reason:
We increment vm->refs when we allocate the memory, and decrement it
in the function qemuHandleMonitorEOF() in another thread.

We also increment vm->refs in the function qemuConnectMonitor() and
decrement it when the vm is inactive.

libvirtd will block in the function qemuMonitorSetCapabilities()
because the vm is stopped by SIGSTOP. At this point vm->refs is 2.

Then we kill the vm with SIGKILL. The function
qemuMonitorSetCapabilities() fails, and we then decrement vm->refs
in the function qemuMonitorClose().
In another thread, mon->fd is broken and the function
qemuHandleMonitorEOF() is called.

If qemuHandleMonitorEOF() decrements vm->refs before qemuConnectMonitor()
returns, vm->refs drops to 0 and the memory is freed.

We call qemudShutdownVMDaemon() because qemuConnectMonitor() failed.
The memory has already been freed, so calling qemudShutdownVMDaemon() is unsafe.

We then dereference a NULL pointer in the function virDomainConfVMNWFilterTeardown():
=============
void
virDomainConfVMNWFilterTeardown(virDomainObjPtr vm) {
    int i;

    if (nwfilterDriver != NULL) {
        for (i = 0; i < vm->def->nnets; i++)
            virDomainConfNWFilterTeardown(vm->def->nets[i]);
    }
}
============
vm->def->nnets is not 0 but vm->def->nets is NULL (we don't set vm->def->nnets
to 0 when we free vm).

We should take an extra reference on vm to prevent it from being deleted if
qemuConnectMonitor() fails.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
2011-01-27 13:38:29 -07:00
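The race described in commit d96431f910 above can be illustrated with a minimal reference-counting sketch. The obj_t type and the obj_ref()/obj_unref() helpers are hypothetical stand-ins, not libvirt's actual domain object API, and a "freed" flag replaces a real free() so that the outcome can be inspected. The two threads are also simulated sequentially, which is only one possible interleaving.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted domain object. */
typedef struct {
    int refs;    /* number of outstanding references */
    bool freed;  /* set instead of calling free(), so the state stays inspectable */
} obj_t;

static void obj_ref(obj_t *o)
{
    assert(!o->freed);
    o->refs++;
}

/* Drop one reference; returns true if the object was released. */
static bool obj_unref(obj_t *o)
{
    assert(o->refs > 0);
    if (--o->refs == 0)
        o->freed = true;
    return o->freed;
}

/*
 * Simulate a failed monitor connection. The monitor close path and the EOF
 * handler (another thread in the real daemon) each drop one reference.
 * Returns true if the object is still alive when cleanup code would run.
 */
static bool alive_during_cleanup(bool hold_extra_ref)
{
    obj_t vm = { .refs = 1, .freed = false };  /* reference taken at allocation */
    bool alive;

    if (hold_extra_ref)
        obj_ref(&vm);          /* the extra reference the commit proposes */
    obj_ref(&vm);              /* reference taken while connecting the monitor */

    obj_unref(&vm);            /* monitor close after the failure */
    obj_unref(&vm);            /* EOF handler running concurrently */

    alive = !vm.freed;         /* cleanup would touch vm at this point */
    if (hold_extra_ref)
        obj_unref(&vm);        /* caller drops its extra reference last */
    return alive;
}
```

Calling alive_during_cleanup(false) models the bug: both references are gone before cleanup runs. With the extra reference held, the object survives until the caller releases it.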
Jiri Denemark
cedf97e75a tests: Fix virtio channel tests
As noticed by Eric, commit 8e28c5d402,
which fixed generation of virtio-serial port numbers, forgot to adjust
the test files, which resulted in a make check failure.
2011-01-27 19:00:36 +01:00
Justin Clift
4282efcc76 docs: expand the man page text for virsh setmaxmem
Addresses BZ # 622534:

  https://bugzilla.redhat.com/show_bug.cgi?id=622534
2011-01-28 03:32:23 +11:00
Eric Blake
a7483a5631 event: fix event-handling allocation crash
Regression introduced in commit e6b68d7 (Nov 2010).

Prior to that point, handlesAlloc was always a multiple of
EVENT_ALLOC_EXTENT (10), and was an int (so even if the subtraction
had been able to wrap, a negative value would be less than the count
and would not try to free the handles array).  But after that point,
VIR_RESIZE_N made handlesAlloc grow geometrically (with a pattern of
10, 20, 30, 45 for the handles array) but still freed in multiples of
EVENT_ALLOC_EXTENT; and the count changed to size_t.  This means that
after 31 handles have been created, then 30 handles destroyed,
handlesAlloc is 5 while handlesCount is 1, and since (size_t)(1 - 5)
is indeed greater than 1, this then tried to free 10 elements, which
had the awful effect of nuking the handles array while there were
still live handles.

Nuking live handles puts libvirtd in an inconsistent state, and was
easily reproducible by starting and then stopping 60 faqemu guests.

* daemon/event.c (virEventCleanupTimeouts, virEventCleanupHandles):
Avoid integer wrap-around causing us to delete the entire array
while entries are still active.
* tests/eventtest.c (mymain): Expose the bug.
2011-01-27 09:12:36 -07:00
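The wrap-around described in commit a7483a5631 above can be reproduced in isolation. This is a minimal sketch, not the code from daemon/event.c: the two function names are hypothetical, and the "safe" variant is just one way to phrase the comparison so that no unsigned subtraction occurs (it assumes count is nowhere near SIZE_MAX).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EVENT_ALLOC_EXTENT 10

/* Unsigned subtraction: wraps to a huge value when alloc < EVENT_ALLOC_EXTENT. */
static bool should_shrink_wrapping(size_t alloc, size_t count)
{
    return alloc - EVENT_ALLOC_EXTENT > count;
}

/* Same intent, rearranged as an addition so nothing can go below zero. */
static bool should_shrink_safe(size_t alloc, size_t count)
{
    return count + EVENT_ALLOC_EXTENT < alloc;
}
```

With alloc = 5 and count = 1, the wrapping variant reports that there is room to shrink even though fewer than EVENT_ALLOC_EXTENT slots are allocated, while the rearranged comparison does not.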
Justin Clift
6014485cdb docs: fix incorrect XML element mentioned by setmem text 2011-01-27 22:57:18 +11:00
Osier Yang
31242565ae remote: Add extra parameter pkipath for URI
This new parameter allows the user to specify where the x509 client
certificate, client key, and CA certificate are located, instead of
hardcoding the location. If 'pkipath' is not specified and the user
is not root, try to find the files in $HOME/.pki/libvirt; if any
one of the client certificate, client key, or CA certificate cannot
be found there, use the default global location (LIBVIRT_CACERT,
LIBVIRT_CLIENTCERT, LIBVIRT_CLIENTKEY, see
src/remote/remote_driver.h).

Example of use:

[root@Osier client]# virsh -c qemu+tls://10.66.93.111/system?pkipath=/tmp/pki/client
error: Cannot access CA certificate '/tmp/pki/client/cacert.pem': No such file
or directory
error: failed to connect to the hypervisor
[root@Osier client]# ls -l
total 24
-rwxrwxr-x. 1 root root 6424 Jan 24 21:35 a.out
-rw-r--r--. 1 root root 1245 Jan 23 19:04 clientcert.pem
-rw-r--r--. 1 root root  132 Jan 23 19:04 client.info
-rw-r--r--. 1 root root 1679 Jan 23 19:04 clientkey.pem

[root@Osier client]# cp /tmp/cacert.pem .
[root@Osier client]# virsh -c qemu+tls://10.66.93.111/system?pkipath=/tmp/pki/client
Welcome to virsh, the virtualization interactive terminal.

Type:  'help' for help with commands
'quit' to quit

virsh #

* src/remote/remote_driver.c: adds support for the new pkipath URI parameter
2011-01-27 16:34:54 +08:00
Osier Yang
6002e0406c storage: Round up capacity for LVM volume creation
If vol->capacity is odd, the capacity is rounded down by the
division. This patch rounds it up instead, which is safer in case
someone writes to the volume up to the size that was requested
when it was created.

- src/storage/storage_backend_logical.c: make sure size is not rounded down
2011-01-27 16:28:19 +08:00
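The rounding behaviour fixed by commit 6002e0406c above comes down to integer division. The following is a minimal sketch of the usual round-up idiom; the helper name and the 1024-byte unit in the usage note are assumptions for illustration, not taken from storage_backend_logical.c.

```c
#include <assert.h>
#include <stdint.h>

/* Divide and round up; assumes unit > 0 and bytes + unit - 1 does not overflow. */
static uint64_t div_round_up(uint64_t bytes, uint64_t unit)
{
    assert(unit > 0);
    return (bytes + unit - 1) / unit;
}
```

For a request of 1025 bytes expressed in units of 1024, plain division yields 1 unit (too small to hold the data), whereas div_round_up(1025, 1024) yields 2.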
Daniel Veillard
28eae66a3a Update localization files from Fedora l10n 2011-01-27 13:33:45 +08:00
David Allan
8e28c5d402 Do not use virtio-serial port 0 for generic ports
Per the discussion in:

https://bugzilla.redhat.com/show_bug.cgi?id=670394

The port numbering should start from 1, not 0.  We assign maxport + 1,
so start maxport at 0.
2011-01-26 23:02:40 -05:00
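The numbering rule in commit 8e28c5d402 above ("assign maxport + 1, so start maxport at 0") can be sketched as follows. The function name and the flat array of used ports are hypothetical; the real code walks the domain's channel definitions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the next port number: one past the highest port in use.
 * Starting the running maximum at 0 means the first port handed out is 1,
 * so port 0 is never assigned to a generic port.
 */
static unsigned int next_port(const unsigned int *used, size_t n)
{
    unsigned int maxport = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (used[i] > maxport)
            maxport = used[i];
    return maxport + 1;
}
```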
Laine Stump
c9c794b52b Manually kill gzip if restore fails before starting qemu
If a guest image is saved in compressed format, and the restore fails
in some way after the intermediate process used to uncompress the
image has been started, but before qemu has been started to hook up to
the uncompressor, libvirt will endlessly wait for the uncompressor to
finish, but it never will because it's still waiting to have something
hooked up to drain its output.

The solution is to close the pipes on both sides of the uncompressor,
then send a SIGTERM before calling waitpid on it (only if the restore
has failed, of course).
2011-01-26 10:13:43 -05:00
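The cleanup sequence from commit c9c794b52b above (close both pipe ends, send SIGTERM, then waitpid) can be sketched with plain POSIX calls. This is an illustration under assumptions: the child merely calls pause() as a stand-in for an uncompressor whose output nobody drains, and the function name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start a child that stands in for the uncompressor, then abandon it the way
 * the commit describes: close both pipe ends, send SIGTERM, and reap it.
 * Returns true if the child was reaped and had been terminated by SIGTERM.
 */
static bool abort_helper_process(void)
{
    int fds[2];
    int status = 0;
    pid_t pid;

    if (pipe(fds) < 0)
        return false;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    if (pid == 0) {
        /* Child: would normally write decompressed data into fds[1]. */
        close(fds[0]);
        for (;;)
            pause();          /* wait forever, like a writer nobody drains */
    }

    /* Parent: the restore failed, so nothing will ever read the pipe. */
    close(fds[0]);
    close(fds[1]);
    kill(pid, SIGTERM);       /* without this, waitpid() below could block forever */

    if (waitpid(pid, &status, 0) != pid)
        return false;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM;
}
```

The important ordering is that the signal is sent before waitpid(); waiting first is what left libvirt blocked indefinitely.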
Daniel P. Berrange
3493f1bcec Fix setup of lib directory with autogen.sh --system
On x86_64 hosts, /usr/lib64 must be used instead of /usr/lib.
Rather than attempt to whitelist architectures, just check
for the existence of /usr/lib64.

* autogen.sh: Fix to use /usr/lib64 if it exists
2011-01-26 14:54:23 +00:00
Daniel P. Berrange
e0e4e4de7a Add check for poll error events in monitor
Handle poll errors in the same way as hangup events.

* src/qemu/qemu_monitor.c: Handle error events
2011-01-26 14:54:23 +00:00
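Treating poll errors like hangups, as commit e0e4e4de7a above describes, amounts to testing both flags together. A minimal sketch using the standard poll(2) event bits; the function name is hypothetical and this is not the code in src/qemu/qemu_monitor.c.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>

/* Treat an error condition on the descriptor the same as a hangup. */
static bool monitor_is_gone(short revents)
{
    return (revents & (POLLHUP | POLLERR)) != 0;
}
```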
Daniel P. Berrange
b8786c0641 Filter out certain expected error messages from libvirtd
Add a hook to the error reporting APIs to allow specific
error messages to be filtered out. Wire up libvirtd to
remove VIR_ERR_NO_DOMAIN & similar error codes from the
logs. They are still logged at DEBUG level.

* daemon/libvirtd.c: Filter VIR_ERR_NO_DOMAIN and friends
* src/libvirt_private.syms, src/util/virterror.c,
  src/util/virterror_internal.h: Hook for changing error
  reporting level
2011-01-26 14:54:23 +00:00
Daniel P. Berrange
dbfca3ff70 Revert all previous error log priority hacks
This reverts the additions in commit

  abff683f78

taking us back to a state where all errors are fully logged
in both libvirtd and normal clients.

The intent was to stop VIR_ERR_NO_DOMAIN (No such domain
with UUID XXXX) messages from client apps polluting syslog.
The change affected all error codes, but more seriously,
it also impacted errors from internal libvirtd infrastructure.
For example, guest autostart no longer logged errors. The
libvirtd network code no longer logged some errors. This
makes debugging incredibly hard.

* daemon/libvirtd.c: Remove error log priority filter
* src/util/virterror.c, src/util/virterror_internal.h: Remove
  callback for overriding log priority
2011-01-26 14:54:23 +00:00
Daniel P. Berrange
2b7ac8838d Cleanup code style in logging APIs
Remove the use of brackets around values following return statements.
Fix the indentation of two switch statements.
2011-01-26 14:54:23 +00:00
Laine Stump
34a19dda1c Set SELinux context label of pipes used for qemu migration
This patch is a partial resolution to the following bug:

   https://bugzilla.redhat.com/show_bug.cgi?id=667756

(to complete the fix, an updated selinux-policy package is required,
to add the policy that allows libvirt to set the context of a fifo,
which was previously not allowed).

Explanation: When an incoming migration is over a pipe (for example,
if the image was compressed and is being fed through gzip, or was on a
root-squash nfs server, so needed to be opened by a child process
running as a different uid), qemu cannot read it unless the selinux
context label for the pipe has been set properly.

The solution is to check the fd used as the source of the migration
just before passing it to qemu; if it's a fifo (implying that it's a
pipe), we call the newly added virSecurityManagerSetFDLabel() function
to set the context properly.
2011-01-26 09:03:21 -05:00
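The "is this fd a fifo?" test mentioned in commit 34a19dda1c above can be done with fstat() and S_ISFIFO. The sketch below shows only that check; both function names are hypothetical, and the labelling step via virSecurityManagerSetFDLabel() is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return true if fd refers to a FIFO (a pipe), false otherwise or on error. */
static bool fd_is_fifo(int fd)
{
    struct stat sb;

    if (fstat(fd, &sb) < 0)
        return false;
    return S_ISFIFO(sb.st_mode) != 0;
}

/* Small self-check: both ends of a pipe are FIFOs, an invalid descriptor is not. */
static bool fifo_check_demo(void)
{
    int fds[2];
    bool ok;

    if (pipe(fds) < 0)
        return false;
    ok = fd_is_fifo(fds[0]) && fd_is_fifo(fds[1]) && !fd_is_fifo(-1);
    close(fds[0]);
    close(fds[1]);
    return ok;
}
```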
Laine Stump
d89608f994 Add a function to the security driver API that sets the label of an open fd.
A need was found to set the SELinux context label on an open fd (a
pipe, as a matter of fact). This patch adds a function to the security
driver API that will set the label on an open fd to secdef.label. For
all drivers other than the SELinux driver, it's a NOP. For the SELinux
driver, it calls fsetfilecon().

If the call fails, it only returns an error to the caller if
1) the desired label is different from the existing label, 2) the
destination fd is of a type that supports setting the SELinux context,
and 3) SELinux is in enforcing mode. Otherwise it will return
success. This follows the pattern of the existing function
SELinuxSetFilecon().
2011-01-26 09:03:11 -05:00
Justin Clift
413c88e773 docs: add a link to the bindings page under the downloads menu item
This helps people who are looking to download the language bindings
but don't know they're under the "Docs" area.
2011-01-26 16:00:33 +11:00
Michal Privoznik
cee47aace1 virsh: require --mac to avoid detach-interface ambiguity
bugfix for https://bugzilla.redhat.com/show_bug.cgi?id=671050

virsh now simply refuses to run detach-interface when multiple
interfaces are attached and --mac is not specified.
2011-01-25 10:47:28 -07:00
Wen Congyang
75da8b8505 dispatch error before return
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
2011-01-25 10:05:03 -07:00
Osier Yang
dbd63c4d63 qemu: Error prompt when managed save a shutoff domain
The problem was introduced by commit 4303c91, which removed the check
of the domain state; this patch fixes it.

Otherwise, a misleading error is reported, e.g.

error: Failed to save domain rhel6 state
error: cannot resolve symlink /var/lib/libvirt/qemu/save/rhel6.save: No such
file or directory
2011-01-25 09:51:26 -07:00
Eric Blake
6cbab7c159 build: avoid corrupted gnulib/tests/Makefile
Running 'make check' can sometimes fail in the gnulib/tests
subdirectory, when doing an incremental build, because
./bootstrap generates a Makefile.am that tries to refer to
../../.. instead of ../.., and gets lost.

This may be an upstream gnulib bug, where a more elegant
solution will present itself in the future:
http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/24898

But in the meantime, I was able to reproduce both the issue,
and this solution to work around it.

* bootstrap.conf (bootstrap_epilogue): Ensure that no stray
../../.. components remain in gnulib/tests/Makefile.in.
Reported by Serge Hallyn.
2011-01-24 17:19:25 -07:00
Cole Robinson
6cabc0b0d0 qemu: sound: Support intel 'ich6' model
In QEMU, the card itself is a PCI device, but it requires a codec
(either -device hda-output or -device hda-duplex) to actually output
sound. Specifying <sound model='ich6'/> gives us -device intel-hda
-device hda-duplex. I think it's important that a simple <sound model='ich6'/>
sets up a useful codec, to have consistent behavior with all other sound cards.

This is basically Dan's proposal of

    <sound model='ich6'>
        <codec type='output' slot='0'/>
        <codec type='duplex' slot='3'/>
    </sound>

without the codec bits implemented.

The important thing is to keep a consistent API here; we don't want some
<sound> devices to require tweaking codecs but not others. Steps I see to
accomplishing this:

    - every <sound> device has a <codec type='default'/> (unless codecs are
        manually specified)
    - <codec type='none'/> is required to specify 'no codecs'
    - new audio settings like mic=on|off could then be exposed in
        <sound> or <codec> in a consistent manner for all sound models

v2:
    Use model='ich6'

v3:
    Use feature detection, from eblake
    Set codec id, bus, and cad values

v4:
    intel-hda isn't supported if -device isn't available

v5:
    Comment spelling fixes
2011-01-24 13:11:52 -05:00
Matthias Bolte
4a267912bf vmx: Use VIR_ERR_CONFIG_UNSUPPORTED when appropriate 2011-01-22 00:26:52 +01:00
Eric Blake
a11bd2e6cc event: fix event-handling data race
This bug has been present since before the time that commit
f8a519 (Dec 2008) tried to make the dispatch loop re-entrant.

Dereferencing eventLoop.handles outside the lock risks crashing, since
any other thread could have reallocated the array in the meantime.
It's a narrow race window, however, and one that would have most
likely resulted in passing bogus data to the callback rather than
actually causing a segv, which is probably why it has gone undetected
this long.

* daemon/event.c (virEventDispatchHandles): Cache data while
inside the lock, as the array might be reallocated once outside.
2011-01-21 15:54:54 -07:00
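The fix pattern named in commit a11bd2e6cc above ("cache data while inside the lock") can be sketched with a pthread mutex. All types and names here are hypothetical simplifications of an event loop, not the structures in daemon/event.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef void (*handle_cb)(int fd, void *opaque);

struct handle {
    int fd;
    handle_cb cb;
    void *opaque;
};

/* Hypothetical event-loop state; the array may be reallocated by other threads. */
struct loop {
    pthread_mutex_t lock;
    struct handle *handles;
    size_t count;
};

/*
 * Dispatch entry i. The fields are copied while the lock is held, and only
 * the copy is used after unlocking, so a concurrent realloc of
 * loop->handles cannot leave us reading freed memory.
 * Returns 0 if a callback was invoked, -1 otherwise.
 */
static int dispatch_one(struct loop *loop, size_t i)
{
    struct handle copy;

    pthread_mutex_lock(&loop->lock);
    if (i >= loop->count) {
        pthread_mutex_unlock(&loop->lock);
        return -1;
    }
    copy = loop->handles[i];        /* cache while protected */
    pthread_mutex_unlock(&loop->lock);

    if (copy.cb == NULL)
        return -1;
    copy.cb(copy.fd, copy.opaque);  /* callback runs without the lock */
    return 0;
}

/* Example callback for illustration: stores the fd it was called with. */
static void record_fd(int fd, void *opaque)
{
    *(int *)opaque = fd;
}
```

Running the callback outside the lock keeps the loop re-entrant; only the read of the shared array needs protection.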
Eric Blake
ae0cdd4710 build: fix 'make check' with older git
* .gnulib: Update to latest, for maintainer-makefile fix.
Reported by Matthias Bolte.
2011-01-21 15:40:46 -07:00
Cole Robinson
1e1f7a8950 Push unapplied fixups for previous patch
- Add augeas tests
- Clarify vnc_auto_unix_socket precedence in qemu.conf
2011-01-21 16:18:54 -05:00
Cole Robinson
a942ea0692 qemu: Add conf option to auto setup VNC unix sockets
If vnc_auto_unix_socket is enabled, any VNC devices without a hardcoded
listen or socket value will be set up to serve over a unix socket in
/var/lib/libvirt/qemu/$vmname.vnc.

We store the generated socket path in the transient VM definition at
CLI build time.
2011-01-21 16:03:05 -05:00
Cole Robinson
1d9c0a08d9 qemu: Allow serving VNC over a unix domain socket
QEMU supports serving VNC over a unix domain socket rather than traditional
TCP host/port. This is specified with:

<graphics type='vnc' socket='/foo/bar/baz'/>

This provides better security access control than VNC listening on
127.0.0.1, but will cause issues with tools that rely on the lax security
(virt-manager in fedora runs as regular user by default, and wouldn't be
able to access a socket owned by 'qemu' or 'root').

This is also not currently supported by any clients, though I have patches for
virt-manager, and virt-viewer should be simple to update.

v2:
    schema: Make listen vs. socket a <choice>
2011-01-21 16:03:04 -05:00
Cole Robinson
cb4c2694f1 qemu: Set domain def transient at beginning of startup process
This will allow us to record transient runtime state in vm->def, like
default VNC parameters. Accomplish this by adding an extra 'live' parameter
to SetDefTransient, with similar semantics to the 'live' flag for
AssignDef.
2011-01-21 16:03:03 -05:00
Eric Blake
125978fe3b maint: support --no-git option during autogen.sh
https://bugzilla.redhat.com/show_bug.cgi?id=562743

Also, fixes gnulib bug in dealing with strerror_r from glibc 2.13.

* .gnulib: Update to latest, for improved bootstrap.
* bootstrap: Resync from gnulib.
* autogen.sh (bootstrap): Add --bootstrap-sync, to make it easier
to keep bootstrap up-to-date.  Pass optional --no-git through.
Reported by Aleksey Avdeev.
2011-01-21 09:45:37 -07:00
Jim Fehlig
4301b95af7 [v2] qemu: Retry JSON monitor cont cmd on MigrationExpected error
When restoring a saved qemu instance via JSON monitor, the vm is
left in a paused state.  Turns out the 'cont' cmd was failing with
"MigrationExpected" error class and "An incoming migration is
expected before this command can be executed" error description
due to migration (restore) not yet complete.

Detect if 'cont' cmd fails with "MigrationExpected" error class and
retry 'cont' cmd.

V2: Fix potential double-free noted by Laine Stump
2011-01-21 09:35:57 -07:00
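The retry behaviour from commit 4301b95af7 above can be sketched as a bounded loop. The commit message does not state a retry limit or delay, so max_tries, the result enum, and the fake monitor below are all assumptions made for illustration.

```c
#include <assert.h>

enum cmd_result { CMD_OK, CMD_MIGRATION_EXPECTED, CMD_FAILED };

typedef enum cmd_result (*cont_fn)(void *opaque);

/*
 * Issue the command until it stops reporting "migration expected", giving up
 * after max_tries attempts. A real caller would sleep between attempts.
 */
static enum cmd_result cont_with_retry(cont_fn send_cont, void *opaque,
                                       int max_tries)
{
    enum cmd_result res = CMD_FAILED;
    int i;

    for (i = 0; i < max_tries; i++) {
        res = send_cont(opaque);
        if (res != CMD_MIGRATION_EXPECTED)
            break;
    }
    return res;
}

/* Fake monitor for illustration: fails *remaining times, then succeeds. */
static enum cmd_result fake_cont(void *opaque)
{
    int *remaining = opaque;

    if (*remaining > 0) {
        (*remaining)--;
        return CMD_MIGRATION_EXPECTED;
    }
    return CMD_OK;
}
```

Only the specific "migration expected" result triggers another attempt; any other failure is returned to the caller immediately.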
Osier Yang
af268f2a36 qemu: report more proper error for unsupported graphics
Report VIR_ERR_CONFIG_UNSUPPORTED instead of VIR_ERR_INTERNAL_ERROR,
as it's valid in our domain schema, just unsupported by hypervisor
here.

* src/qemu/qemu_command.c
2011-01-21 09:27:15 -07:00
Daniel P. Berrange
87a183f698 Fix startup with VNC password expiry on old QEMU
The code which set VNC passwords correctly had a fallback for
the set_password command, but was lacking one for the
expire_password command. This made it impossible to start
a guest. It also failed to check whether QEMU was still
running after the initial 'set_password' command completed.

* src/qemu/qemu_hotplug.c: Fix error handling when
  password expiry fails
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_text.c: Fix
  return code for missing expire_password command
2011-01-21 16:24:13 +00:00
Daniel P. Berrange
f0bbf96047 Fix error reporting when machine type probe fails
Avoid overwriting the real error message with a generic
OOM failure message, when machine type probe fails

* src/qemu/qemu_driver.c: Don't overwrite error
2011-01-21 16:08:28 +00:00
Daniel P. Berrange
31c698d76d Avoid crash in security driver if model is NULL
If the XML security model is NULL, it is assumed that the current
model will be used with dynamic labelling. The verify step is
meaningless and potentially crashes if dereferencing NULL

* src/security/security_manager.c: Skip NULL model on verify
2011-01-21 16:07:04 +00:00
Wen Congyang
bda57661b8 qemu: Fix a possible deadlock in p2p migration
The function virUnrefConnect() may call virReleaseConnect() to release
the dest connection, and the function virReleaseConnect() will call
conn->driver->close().

So the function virUnrefConnect() should be surrounded by
qemuDomainObjEnterRemoteWithDriver() and
qemuDomainObjExitRemoteWithDriver() to prevent possible deadlock between
two communicating libvirt daemons.

See commit f0c8e1cb37 for further details.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
2011-01-21 08:21:12 -07:00
Eric Blake
3703c3fed4 docs: document <controller> element
* docs/formatdomain.html.in: Talk about <controller> and <address>
throughout.
2011-01-20 08:58:40 -07:00
Justin Clift
e23842856c docs: move the apps page to the top level as it's good promo 2011-01-20 17:17:41 +11:00
Justin Clift
ab3a43200c docs: added new entries to apps page, plus adjusted a few existing
Added new entries for Hudson, LCFG, Tivoli Provisioning Manager,
virt-what, and Zenoss.  Adjusted the existing entries for BuildBot
and vmware2libvirt.
2011-01-20 12:57:10 +11:00
Jiri Denemark
15e7865893 qemu: Avoid sending STOPPED event twice
In some circumstances, libvirtd would issue two STOPPED events after it
stopped a domain. This was because an EOF event can arrive after a qemu
process is killed but before qemuMonitorClose() is called.

qemuHandleMonitorEOF() should ignore EOF when the domain is not running.

I wasn't able to reproduce this bug directly, only after adding an
artificial sleep() into qemudShutdownVMDaemon().
2011-01-19 15:01:52 +01:00
Jiri Denemark
45c02ee06f qemu: Fail if per-device boot is used but deviceboot is not supported 2011-01-19 15:01:52 +01:00
Jiri Denemark
b9c1a9cbff spec: Start libvirt-guests only if it's on in current runlevel 2011-01-19 15:01:52 +01:00
Daniel P. Berrange
f10d209585 Remove redundant brackets around return values
A large number of return values used 'return (0)' instead
of simply 'return 0'. Remove all these redundant brackets
so the style is consistent throughout the file

* src/libvirt.c: Remove redundant brackets
2011-01-19 12:42:23 +00:00
Daniel P. Berrange
921b3812e2 Increase size of driver table to make UML work again
The driver table only has 10 slots, but there are potentially
11 drivers that need activating. Improve the error message
when driver registration fails

* src/libvirt.c: Increase driver table size & improve errors
2011-01-19 12:42:23 +00:00
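The failure mode behind commit 921b3812e2 above is a fixed-size registration table that silently runs out of slots. A minimal sketch of such a table with an explicit error on overflow; MAX_DRIVERS, the table layout, and the message text are illustrative and not the values used in src/libvirt.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_DRIVERS 20   /* illustrative size, not the value used by libvirt */

static const char *driver_table[MAX_DRIVERS];
static size_t driver_count;

/* Register a driver by name; returns its slot, or -1 if the table is full. */
static int register_driver(const char *name)
{
    if (driver_count >= MAX_DRIVERS) {
        fprintf(stderr, "Too many drivers, cannot register %s\n", name);
        return -1;
    }
    driver_table[driver_count] = name;
    return (int)driver_count++;
}
```

Naming the driver in the error message makes it clear which registration was dropped when the table needs to grow.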
Daniel P. Berrange
19d931d290 Turn libvirt.c error reporting functions into macros
The virLibConnError() function (and related ones) do not correctly
report line number info. Turn them all into macros so line numbers
are reported correctly. Drop the connection object in all of them
since it is no longer used.

Also remove the virLibConnWarning() equivalents completely. Now
that the Xen driver is running 100% inside libvirtd, those
codepaths for secondary drivers cannot be reached.

* src/libvirt.c: Replace error functions with macros
2011-01-19 12:42:18 +00:00
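The reason commit 19d931d290 above switches from functions to macros is that __LINE__ expands where it is written. The sketch below contrasts the two; every name in it is hypothetical and unrelated to the real virLibConnError() definitions.

```c
#include <assert.h>
#include <stdio.h>

static int last_reported_line;

/* Shared backend: records and prints the location it is given. */
static void report_error_at(const char *func, int line, const char *msg)
{
    last_reported_line = line;
    fprintf(stderr, "%s:%d: %s\n", func, line, msg);
}

/* Function wrapper: __LINE__ is always the line inside this wrapper. */
static void report_error_func(const char *msg)
{
    report_error_at(__func__, __LINE__, msg);
}

/* Macro wrapper: __func__ and __LINE__ expand at each call site. */
#define REPORT_ERROR(msg) report_error_at(__func__, __LINE__, (msg))

static int function_site_a(void)
{
    report_error_func("via function, site a");
    return last_reported_line;
}

static int function_site_b(void)
{
    report_error_func("via function, site b");
    return last_reported_line;
}

static int macro_site_a(void)
{
    REPORT_ERROR("via macro, site a");
    return last_reported_line;
}

static int macro_site_b(void)
{
    REPORT_ERROR("via macro, site b");
    return last_reported_line;
}
```

Both function-based call sites report the same line (the wrapper's own), while the macro-based call sites report distinct lines that point at the actual callers.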
Eric Blake
3c99896388 docs: document <sysinfo> and <smbios> elements
* docs/formatdomain.html.in: Talk about <sysinfo> throughout.
2011-01-18 15:35:41 -07:00
Eric Blake
c5b11b3cc4 build: use more gnulib modules for simpler code
* .gnulib: Update to latest, for sigpipe and sigaction modules.
* bootstrap.conf (gnulib_modules): Add sigaction, sigpipe, strerror_r.
* tools/virsh.c (vshSetupSignals) [!SIGPIPE]: Delete, now that
gnulib guarantees it.
(SA_SIGINFO): Define for mingw fallback.
* src/util/virterror.c (virStrerror): Simplify, now that gnulib
guarantees the POSIX interface.
* configure.ac (AC_CHECK_FUNCS_ONCE): Drop redundant check.
(AM_PROG_CC_STDC): Move earlier, to keep autoconf happy.
2011-01-18 15:35:41 -07:00
Matthias Bolte
915bc7421e Remove two unused PATH_MAX-sized char arrays from the stack 2011-01-18 23:14:37 +01:00
Matthias Bolte
e065e1ea04 Use VIR_ERR_OPERATION_INVALID when appropriate
VIR_ERR_OPERATION_INVALID means that the operation is not valid
for the current state of the involved object.
2011-01-18 23:14:37 +01:00