Commit Graph

11144 Commits

Author SHA1 Message Date
Cole Robinson
f8085a5471 qemu: Don't report error on successful media eject
If we are just ejecting media, ret == -1 even after the retry loop
determines that the tray is open, as requested. This means media
disconnect always report's error.

Fix it, and fix some other mini issues:

- Don't overwrite the 'eject' error message if the retry loop fails
- Move the retries decrement inside the loop, otherwise the final loop
  might succeed, yet retries == 0 and we will raise error
- Setting ret = -1 in the disk->src check is unneeded
- Fix comment typos

cc: mprivozn@redhat.com
(cherry picked from commit 406d8a9809)
2013-06-12 16:30:31 -04:00
Michal Privoznik
a93f2757e2 qemuDomainChangeEjectableMedia: Unlock domain while waiting for event
In 84c59ffa I've tried to fix changing ejectable media process. The
process should go like this:

1) we need to call 'eject' on the monitor
2) we should wait for 'DEVICE_TRAY_MOVED' event
3) now we can issue 'change' command

However, while waiting in step 2) the domain monitor was locked. So
even if qemu reported the desired event, the proper callback was not
called immediately. The monitor handling code needs to lock the
monitor in order to read the event. So that's the first lock we must
not hold while waiting. The second one is the domain lock. When
monitor handling code reads an event, the appropriate callback is
called then. The first thing that each callback does is locking the
corresponding domain as a domain or its device is about to change
state. So we need to unlock both monitor and VM lock. Well, holding
any lock while sleep()-ing is not the best thing to do anyway.
(cherry picked from commit 543af79a14)

Conflicts:
	src/qemu/qemu_hotplug.c
2013-06-12 16:29:12 -04:00
Michal Privoznik
7c4b6833bb qemu_hotplug: Rework media changing process
https://bugzilla.redhat.com/show_bug.cgi?id=892289

It seems like with new udev within guest OS, the tray is locked,
so we need to:
- 'eject'
- wait for tray to open
- 'change'

Moreover, even when doing bare 'eject', we should check for
'tray_open' as guest may have locked the tray. However, the
waiting phase shouldn't be unbounded, so I've chosen 10 retries
maximum, each per 500ms. This should give enough time for guest
to eject a media and open the tray.
(cherry picked from commit 84c59ffaec)
2013-06-12 15:46:36 -04:00
Stefan Berger
0305c271fc nwfilter: grab driver lock earlier during init (bz96649)
This patch is in relation to Bug 966449:

https://bugzilla.redhat.com/show_bug.cgi?id=966449

This is a patch addressing the coredump.

Thread 1 must be calling  nwfilterDriverRemoveDBusMatches(). It does so with
nwfilterDriverLock held. In the patch below I am now moving the
nwfilterDriverLock(driverState) further up so that the initialization, which
seems to either take a long time or is entirely stuck, occurs with the lock
held and the shutdown cannot occur at the same time.

Remove the lock in virNWFilterDriverIsWatchingFirewallD to avoid
double-locking.

(cherry picked from commit 0ec376c20a)
2013-06-12 15:22:43 -04:00
Christophe Fergeau
364bbdc4cc storage: Ensure 'qemu-img resize' size arg is a 512 multiple
qemu-img resize will fail with "The new size must be a multiple of 512"
if libvirt doesn't round it first.
This fixes rhbz#951495

Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
(cherry picked from commit 9a8f39d097)
2013-06-12 15:11:45 -04:00
Daniel P. Berrange
69838772cc Tweak EOF handling of streams
Typically when you get EOF on a stream, poll will return
POLLIN|POLLHUP at the same time. Thus when we deal with
stream reads, if we see EOF during the read, we can then
clear the VIR_STREAM_EVENT_HANGUP & VIR_STREAM_EVENT_ERROR
event bits.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit e7b0382945)
2013-06-11 18:28:23 -04:00
Eric Blake
742dd7b75f smartcard: spell ccid-card-emulated qemu property correctly
Reported by Anthony Messina in
https://bugzilla.redhat.com/show_bug.cgi?id=904692
Present since introduction of smartcard support in commit f5fd9baa

* src/qemu/qemu_command.c (qemuBuildCommandLine): Match qemu spelling.
* tests/qemuxml2argvdata/qemuxml2argv-smartcard-host-certificates.args:
Fix broken test.
(cherry picked from commit 6f7e4ea359)
2013-06-11 16:51:18 -04:00
Eric Blake
05bfbdba00 cgroup: be robust against cgroup movement races, part 2
The previous commit was an incomplete backport of commit 83e4c775,
and as a result made any attempt to start a domain when cgroups
are enabled go into an infinite loop.  This fixes the botched
backport.

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-05-25 06:00:56 -06:00
Eric Blake
39dcf00e72 cgroup: be robust against cgroup movement races
https://bugzilla.redhat.com/show_bug.cgi?id=965169 documents a
problem starting domains when cgroups are enabled; I was able
to reliably reproduce the race about 5% of the time when I added
hooks to domain startup by 3 seconds (as that seemed to be about
the length of time that qemu created and then closed a temporary
thread, probably related to aio handling of initially opening
a disk image).  The problem has existed since we introduced
virCgroupMoveTask in commit 9102829 (v0.10.0).

There are some inherent TOCTTOU races when moving tasks between
kernel cgroups, precisely because threads can be created or
completed in the window between when we read a thread id from the
source and when we write to the destination.  As the goal of
virCgroupMoveTask is merely to move ALL tasks into the new
cgroup, it is sufficient to iterate until no more threads are
being created in the old group, and ignoring any threads that
die before we can move them.

It would be nicer to start the threads in the right cgroup to
begin with, but by default, all child threads are created in
the same cgroup as their parent, and we don't want vcpu child
threads in the emulator cgroup, so I don't see any good way
of avoiding the move.  It would also be nice if the kernel were
to implement something like rename() as a way to atomically move
a group of threads from one cgroup to another, instead of forcing
a window where we have to read and parse the source, then format
and write back into the destination.

* src/util/vircgroup.c (virCgroupAddTaskStrController): Ignore
ESRCH, because a thread ended between read and write attempts.
(virCgroupMoveTask): Loop until all threads have moved.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 83e4c77547)

Conflicts:
	src/util/cgroup.c - refactoring in commit 56f27b3bb is too big
to take in entirety; but I did inline its changes to the cleanup label
2013-05-21 14:03:23 -06:00
Daniel P. Berrange
69ef9b78c8 Avoid spamming logs with cgroups warnings
The code for putting the emulator threads in a separate cgroup
would spam the logs with warnings

2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 3
2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 4
2013-02-27 16:08:26.732+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 6

This is because it has only created child cgroups for 3 of the
controllers, but was trying to move the processes from all the
controllers. The fix is to only try to move threads in the
controllers we actually created. Also remove the warning and
make it return a hard error to avoid such lazy callers in the
future.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 279336c5d8)
2013-05-21 14:03:13 -06:00
Daniel P. Berrange
bf95285541 Don't try to add non-existant devices to ACL
The QEMU driver has a list of devices nodes that are whitelisted
for all guests. The kernel has recently started returning an
error if you try to whitelist a device which does not exist.
This causes a warning in libvirt logs and an audit error for
any missing devices. eg

2013-02-27 16:08:26.515+0000: 29625: warning : virDomainAuditCgroup:451 : success=no virt=kvm resrc=cgroup reason=allow vm="vm031714" uuid=9d8f1de0-44f4-a0b1-7d50-e41ee6cd897b cgroup="/sys/fs/cgroup/devices/libvirt/qemu/vm031714/" class=path path=/dev/kqemu rdev=? acl=rw

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 7f544a4c8f)
2013-05-21 14:02:58 -06:00
Cole Robinson
4f9e72c38f Prep for release 0.10.2.5 2013-05-19 18:18:26 -04:00
Daniel P. Berrange
3b7180f1d4 Fix TLS tests with gnutls 3
When given a CA cert with basic constraints to set non-critical,
and key usage of 'key signing', this should be rejected. Version
of GNUTLS < 3 do not rejecte it though, so we never noticed the
test case was broken

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 0204d6d7a0)
2013-05-19 17:47:58 -04:00
Ján Tomko
0f2eda0da9 daemon: fix leak after listing all volumes
CVE-2013-1962

remoteDispatchStoragePoolListAllVolumes wasn't freeing the pool.
The pool also held a reference to the connection, preventing it from
getting freed and closing the netcf interface driver, which held two
sockets open.
(cherry picked from commit ca697e90d5)
2013-05-16 16:00:14 +02:00
Eric Blake
fd00ec8f92 spec: proper soft static allocation of qemu uid
https://bugzilla.redhat.com/show_bug.cgi?id=924501 tracks a
problem that occurs if uid 107 is already in use at the time
libvirt is first installed.  In response that problem, Fedora
packaging guidelines were recently updated.  This fixes the
spec file to comply with the new guidelines:
https://fedoraproject.org/wiki/Packaging:UsersAndGroups

* libvirt.spec.in (daemon): Follow updated Fedora guidelines.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit a2584d58f6)

Conflicts:
	libvirt.spec.in - no backport of c8f79c9b %if reindents
2013-05-06 14:57:38 -06:00
Jiri Denemark
11f56fc7a8 spec: Fix minor changelog issues
When a changelog entry references an RPM macro, % needs to be escaped so
that it does not appear expanded in package changelog.

Fri Mar  4 2009 is incorrect since Mar 4 was Wednesday. Since
libvirt-0.6.1 was released on Mar 4 2009, we should change Fri to Wed.
(cherry picked from commit 53657a0abe)
2013-05-06 14:55:31 -06:00
Jiri Denemark
8ac4f9ce74 spec: Avoid using makeinstall relic
The macro was made to help installing broken packages that did not use
DESTDIR correctly by overriding individual path variables (prefix,
sysconfdir, ...). Newer rpm provides fixed make_install macro that calls
make install with just the correct DESTDIR, however it is not available
everywhere (e.g., RHEL 5 does not have it). On the other hand the
make_install macro is simple and straightforward enough for us to use
its expansion directly.
(cherry picked from commit d45066a55f)
2013-05-06 14:55:17 -06:00
Eric Blake
0b0ecdfc6d audit: properly encode device path in cgroup audit
https://bugzilla.redhat.com/show_bug.cgi?id=922186

Commit d04916fa introduced a regression in audit quality - even
though the code was computing the proper escaped name for a
path, it wasn't feeding that escaped name on to the audit message.
As a result, /var/log/audit/audit.log would mention a pair of
fields class=path path=/dev/hpet instead of the intended
class=path path="/dev/hpet", which in turn caused ausearch to
format the audit log with path=(null).

* src/conf/domain_audit.c (virDomainAuditCgroupPath): Use
constructed encoding.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 31c6bf35b9)
2013-04-22 14:51:03 -06:00
Atsushi Kumagai
610aadd635 storage: Fix lvcreate parameter for backingStore.
When virStorageBackendLogicalCreateVol() creates a snapshot for a
logical volume with backingStore element, it fails with the message
below:

  2013-01-17 03:10:18.869+0000: 1967: error : virCommandWait:2345 :
  internal error Child process (/sbin/lvcreate --name lvm-snapshot -L 51200K
  -s=/dev/lvm-pool/lvm-volume) unexpected exit status 3: /sbin/lvcreate:
  invalid option -- '='  Error during parsing of command line.

This is because virCommandAddArgPair() uses '=' to connect the two
parameters, it's unsuitable for -s option of the lvcreate.

Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
(cherry picked from commit ffee627a4a)
2013-04-22 19:54:59 +02:00
Cole Robinson
8e72362ec8 Prep for release 0.10.2.4 2013-04-01 15:35:31 -04:00
Matthias Bolte
f50a9b3be1 esx: Fix and improve esxListAllDomains function
Avoid requesting information such as identity or power state when it
is not necessary.

Lookup virtual machine list with the required fields (configStatus,
name, and config.uuid) to make esxVI_GetVirtualMachineIdentity work.

No need to call esxVI_GetNumberOfSnapshotTrees. rootSnapshotTreeList
can be tested for emptiness by checking it for NULL.

esxVI_LookupRootSnapshotTreeList already does the error reporting,
don't overwrite it.

Check if autostart is enabled at all before looking up the individual
autostart setting of a virtual machine.

Reorder VIR_EXPAND_N(doms, ndoms, 1) to avoid leaking the result of
the call to virGetDomain if VIR_EXPAND_N fails.

Replace VIR_EXPAND_N by VIR_RESIZE_N to avoid quadratic scaling, as in
the Hyper-V version of the function.

If virGetDomain fails it already reports an error, don't overwrite it
with an OOM error.

All items in doms up to the count-th one are valid, no need to double
check before freeing them.

Finally, don't leak autoStartDefaults and powerInfoList.
(cherry picked from commit 5fc663d8be)
2013-04-01 10:41:06 -04:00
Daniel P. Berrange
6f290666dc Fix parsing of SELinux ranges without a category
Normally libvirtd should run with a SELinux label

  system_u:system_r:virtd_t:s0-s0:c0.c1023

If a user manually runs libvirtd though, it is sometimes
possible to get into a situation where it is running

  system_u:system_r:init_t:s0

The SELinux security driver isn't expecting this and can't
parse the security label since it lacks the ':c0.c1023' part
causing it to complain

  internal error Cannot parse sensitivity level in s0

This updates the parser to cope with this, so if no category
is present, libvirtd will hardcode the equivalent of c0.c1023.

Now this won't work if SELinux is in Enforcing mode, but that's
not an issue, because the user can only get into this problem
if in Permissive mode. This means they can now start VMs in
Permissive mode without hitting that parsing error

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 1732c1c629)

Conflicts:
	src/security/security_selinux.c
2013-04-01 10:41:04 -04:00
Daniel P. Berrange
afb32d4a2b Separate MCS range parsing from MCS range checking
Pull the code which parses the current process MCS range
out of virSecuritySELinuxMCSFind and into a new method
virSecuritySELinuxMCSGetProcessRange.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 4a92fe4413)

Conflicts:
	src/security/security_selinux.c
2013-04-01 10:40:21 -04:00
Daniel P. Berrange
d4e0e86c49 Fix memory leak on OOM in virSecuritySELinuxMCSFind
The body of the loop in virSecuritySELinuxMCSFind would
directly 'return NULL' on OOM, instead of jumping to the
cleanup label. This caused a leak of several local vars.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit f2d8190cfb)
2013-04-01 09:48:04 -04:00
Michal Privoznik
2bcf1522ff qemu: Set migration FD blocking
Since we switched from direct host migration scheme to the one,
where we connect to the destination and then just pass a FD to a
qemu, we have uncovered a qemu bug. Qemu expects migration FD to
block. However, we are passing a nonblocking one which results in
cryptic error messages like:

  qemu: warning: error while loading state section id 2
  load of migration failed

The bug is already known to Qemu folks, but we should workaround
already released Qemus. Patch has been originally proposed by Stefan
Hajnoczi <stefanha@gmail.com>
(cherry picked from commit ceb31795af)
2013-03-26 18:36:40 -04:00
Eric Blake
2c7638fdb4 build: further fixes for broken if_bridge.h
Commit c308a9ae was incomplete; it resolved the configure failure,
but not a later build failure.

* src/util/virnetdevbridge.c: Include pre-req header.
* configure.ac (AC_CHECK_HEADERS): Prefer standard in.h over
non-standard ip6.h.
(cherry picked from commit 1bf661caf4)
2013-03-26 18:36:34 -04:00
Cole Robinson
879f28a9f3 build: work around broken kernel header
I got this scary warning during ./configure on rawhide:

checking linux/if_bridge.h usability... no
checking linux/if_bridge.h presence... yes
configure: WARNING: linux/if_bridge.h: present but cannot be compiled
configure: WARNING: linux/if_bridge.h:     check for missing prerequisite headers?
configure: WARNING: linux/if_bridge.h: see the Autoconf documentation
configure: WARNING: linux/if_bridge.h:     section "Present But Cannot Be Compiled"
configure: WARNING: linux/if_bridge.h: proceeding with the compiler's result
configure: WARNING:     ## ------------------------------------- ##
configure: WARNING:     ## Report this to libvir-list@redhat.com ##
configure: WARNING:     ## ------------------------------------- ##
checking for linux/if_bridge.h... no

* configure.ac (AC_CHECK_HEADERS): Provide struct in6_addr, since
linux/if_bridge.h uses it without declaring it.
(cherry picked from commit c308a9ae15)
(cherry picked from commit 7ae53f1593)
2013-03-26 18:36:14 -04:00
Daniel P. Berrange
119c079c5b Fix SELinux security label test
If securityselinuxtest was run on a system with newer SELinux
policy it would fail, due to using svirt_tcg_t instead of
svirt_t. Fixing the domain type to be KVM avoids this issue.
(cherry picked from commit 32df483f1d)
2013-02-22 09:25:41 -07:00
Jim Fehlig
e91f69ed89 libxl: Fix setting of disk backend
The libxl driver was setting the backend field of libxl_device_disk
structure to LIBXL_DISK_BACKEND_TAP when the driver element of disk
configuration was not specified.  This needlessly forces the use of
blktap driver, which may not be loaded in dom0

https://bugzilla.redhat.com/show_bug.cgi?id=912488

Ian Campbell suggested that LIBXL_DISK_BACKEND_UNKNOWN is a better
default in this case

https://www.redhat.com/archives/libvir-list/2013-February/msg01126.html
(cherry picked from commit 567779e51a)
2013-02-22 09:00:57 -07:00
Jiri Denemark
96a49bd8bb util: Fix mask for 172.16.0.0 private address range
https://bugzilla.redhat.com/show_bug.cgi?id=905708

Only the first 12 bits should be set in the mask for this range. All
addresses between 172.16.0.0 and 172.31.255.255 are private.
(cherry picked from commit 6405713f2a)
2013-02-02 20:45:45 -05:00
Laine Stump
50a1a57e4c conf: don't fail to parse <boot> when parsing a single device
This resolves:

  https://bugzilla.redhat.com/show_bug.cgi?id=895294

The symptom was that attempts to modify a network device using
virDomainUpdateDeviceFlags() would fail if the original device had a
<boot> element (e.g. "<boot order='1'/>"), even if the updated device
had the same <boot> element. Instead, the following error would be logged:

  cannot modify network device boot index setting

It's true that it's not possible to change boot order (internally
known as bootIndex) of a live device; qemuDomainChangeNet checks for
that, but the problem was that the information it was checking was
incorrect.

Explanation:

When a complete domain is parsed, a global (to the domain) "bootMap"
is passed down to the parse for each device; the bootMap is used to
make sure that devices don't have conflicting settings for their boot
orders.

When a single device is parsed by itself (as in the case of
virDomainUpdateDeviceFlags), there is no global bootMap that would be
appropriate to send, so NULL is sent instead. However, although the
lowest level function that parses just the boot order *does* simply
skip the sanity check in that case, the next higher level
"virDomainDeviceInfoParseXML" function refuses to call down to the
lower "virDomainDeviceBootParseXML" if bootMap is NULL. So, the boot
order is never set in the "new" device object, and when it is compared
to the original (which does have a boot order), they don't match.

The fix is to patch virDomainDeviceInfoParseXML to not care about
bootMap, and just always call virDomainDeviceInfoBootParseXML whenever
there is a <boot> element. When we are only parsing a single device,
we don't care whether or not any specified boot order is consistent
with the rest of the domain; we will always do this check later (in
the current case, we do it by verifying that the net bootIndex exactly
matches the old bootIndex).
2013-01-31 11:18:40 -05:00
Daniel P. Berrange
95ea6a38bd Support custom 'svirt_tcg_t' context for TCG based guests
The current SELinux policy only works for KVM guests, since
TCG requires the 'execmem' privilege. There is a 'virt_use_execmem'
boolean to turn this on globally, but that is unpleasant for users.
This changes libvirt to automatically use a new 'svirt_tcg_t'
context for TCG based guests. This obsoletes the previous
boolean tunable and makes things 'just work(tm)'

Since we can't assume we run with new enough policy, I also
make us log a warning message (once only) if we find the policy
lacks support. In this case we fallback to the normal label and
expect users to set the boolean tunable

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 77d3a80974)
2013-01-28 14:46:32 -05:00
Cole Robinson
8a57d799ca uml: Report error if inotify fails on driver startup
(cherry picked from commit 7b97030ad4)
2013-01-28 14:45:37 -05:00
Cole Robinson
741f004378 daemon: Preface polkit error output with 'polkit:'
There's been a few bugs about an expected error from polkit:

https://bugzilla.redhat.com/show_bug.cgi?id=873799
https://bugzilla.redhat.com/show_bug.cgi?id=872166

The error is:

Authorization requires authentication but no agent is available.

The error means that polkit needs a password, but there is no polkit
agent registered in your session. Polkit agents are the bit of UI that
pop up and actually ask for your password.

Preface the error with the string 'polkit:' so folks can hopefully
make more sense of it.
(cherry picked from commit 96a108c993)
2013-01-28 14:45:26 -05:00
Cole Robinson
66fa059e1a spec: Fix script warning when uninstalling libvirt-client
https://bugzilla.redhat.com/show_bug.cgi?id=888071
(cherry picked from commit d60c7f75c2)
2013-01-28 14:45:03 -05:00
Cole Robinson
b03cf6d6bc Prep for release 0.10.2.3 2013-01-28 14:17:18 -05:00
Richard W.M. Jones
f63b9694dc selinux: Only create the selabel_handle once.
According to Eric Paris this is slightly more efficient because it
only loads the regular expressions in libselinux once.
(cherry picked from commit 6159710ca1)

Conflicts:
	src/security/security_selinux.c
2013-01-28 14:13:12 -05:00
Daniel P. Berrange
460e481647 Skip bulk relabelling of resources in SELinux driver when used with LXC
The virSecurityManager{Set,Restore}AllLabel methods are invoked
at domain startup/shutdown to relabel resources associated with
a domain. This works fine with QEMU, but with LXC they are in
fact both currently no-ops since LXC does not support disks,
hostdevs, or kernel/initrd files. Worse, when LXC gains support
for disks/hostdevs, they will do the wrong thing, since they
run in host context, not container context. Thus this patch
turns then into a formal no-op when used with LXC. The LXC
controller will call out to specific security manager labelling
APIs as required during startup.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 89c5a9d0e8)
2013-01-28 14:13:12 -05:00
John Ferlan
8cdeb0f85e selinux: Resolve resource leak using the default disk label
Commit id a994ef2d1 changed the mechanism to store/update the default
security label from using disk->seclabels[0] to allocating one on the
fly. That change allocated the label, but never saved it.  This patch
will save the label. The new virDomainDiskDefAddSecurityLabelDef() is
a copy of the virDomainDefAddSecurityLabelDef().
(cherry picked from commit 05cc035189)

Conflicts:
	src/conf/domain_conf.h
2013-01-28 14:13:12 -05:00
Peter Krempa
f104a2a6b3 rpc: Fix crash on error paths of message dispatching
This patch resolves CVE-2013-0170:
https://bugzilla.redhat.com/show_bug.cgi?id=893450

When reading and dispatching of a message failed the message was freed
but wasn't removed from the message queue.

After that when the connection was about to be closed the pointer for
the message was still present in the queue and it was passed to
virNetMessageFree which tried to call the callback function from an
uninitialized pointer.

This patch removes the message from the queue before it's freed.

* rpc/virnetserverclient.c: virNetServerClientDispatchRead:
    - avoid use after free of RPC messages
(cherry picked from commit 46532e3e8e)
2013-01-28 20:04:18 +01:00
John Ferlan
c52667e241 nwfilter: Remove unprivileged code path to set base
https://bugzilla.redhat.com/show_bug.cgi?id=903184

Commit id f8ab364c removed ability to run this driver unprivileged. Coverity
detected the check and flagged it.
(cherry picked from commit aafe41971c)

Conflicts:
	src/nwfilter/nwfilter_driver.c - whitespace changes in 1c04f99 not present
2013-01-23 09:12:10 -07:00
Daniel P. Berrange
ba4e7b6344 Fix nwfilter driver reload/shutdown handling when unprivileged
https://bugzilla.redhat.com/show_bug.cgi?id=903184

Although the nwfilter driver skips startup when running in a
session libvirtd, it did not skip reload or shutdown. This
caused errors to be reported when sending SIGHUP to libvirtd,
and caused an abort() in libdbus on shutdown due to trying
to remove a dbus filter that was never added
(cherry picked from commit abbec81bd0)

Conflicts:
	src/nwfilter/nwfilter_driver.c - earlier changes f4ea67f and
79b8a56 related to using bool and auto-shutdown of drivers are not backported
2013-01-23 08:38:50 -07:00
Hu Tao
5e10d0ef1e call virstateCleanup to do the cleanup before libvirtd exits
https://bugzilla.redhat.com/show_bug.cgi?id=903184

(cherry picked from commit 47e1767725)
2013-01-23 08:38:15 -07:00
Daniel P. Berrange
2d6eaba201 Fix race condition when destroying guests
When running virDomainDestroy, we need to make sure that no other
background thread cleans up the domain while we're doing our work.
This can happen if we release the domain object while in the
middle of work, because the monitor might detect EOF in this window.
For this reason we have a 'beingDestroyed' flag to stop the monitor
from doing its normal cleanup. Unfortunately this flag was only
being used to protect qemuDomainBeginJob, and not qemuProcessKill

This left open a race condition where either libvirtd could crash,
or alternatively report bogus error messages about the domain already
having been destroyed to the caller

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 81621f3e6e)

Conflicts:

  src/qemu/qemu_driver.c - virReportError had been removed from
      upstream in cases where qemuProcessKill failed, creating
      different context.
2013-01-18 13:23:23 -05:00
Yufang Zhang
a054aa94e8 build: move file deleting action from %files list to %install
When building libvirt rpms on rhel5, I got the following error:

    File must begin with "/": rm
    File must begin with "/": -f
    File must begin with "/": $RPM_BUILD_ROOT/etc/sysctl.d/libvirtd
    Installed (but unpackaged) file(s) found:
   /etc/sysctl.d/libvirtd

It is triggerd by the %files list of libvirt daemon:

    %if 0%{?fedora} >= 14 || 0%{?rhel} >= 6
    %config(noreplace) %{_prefix}/lib/sysctl.d/libvirtd.conf
    %else
    rm -f $RPM_BUILD_ROOT%{_prefix}/lib/sysctl.d/libvirtd.conf
    %endif

After checking document of rpm spec file, I think it would be better
to move the file deleting line from %files list to %install script.

Bug introduced in commit a1fd56c.
(cherry picked from commit daef7c9e9c)
2013-01-09 17:18:36 -07:00
Viktor Mihajlovski
5c31525082 build: libvirt-guests files misplaced in specfile
In a non-systemd environment the post and preun scripts of libvirt-client
fail, since the required files are in libvirt-daemon. Moved them to client.
Doing that I noticed %{_unitdir}/libvirt-guests.service was contained in
both libvirt-client and libvirt-daemon, which I don't think was intended.
Removed the extra copy from daemon.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
(cherry picked from commit b7159dca8b)

Conflicts:
	libvirt.spec.in - no virtlockd service
2013-01-08 11:02:09 -07:00
Michal Privoznik
48baba6a02 qemu: Relax hard RSS limit
Currently, if there's no hard memory limit defined for a domain,
libvirt tries to calculate one, based on domain definition and magic
equation and set it upon the domain startup. The rationale behind was,
if there's a memory leak or exploit in qemu, we should prevent the
host system trashing. However, the equation was too tightening, as it
didn't reflect what the kernel counts into the memory used by a
process. Since many hosts do have a swap, nobody hasn't noticed
anything, because if hard memory limit is reached, process can
continue allocating memory on a swap. However, if there is no swap on
the host, the process gets killed by OOM killer. In our case, the qemu
process it is.

To prevent this, we need to relax the hard RSS limit. Moreover, we
should reflect more precisely the kernel way of accounting the memory
for process. That is, even the kernel caches are counted within the
memory used by a process (within cgroups at least). Hence the magic
equation has to be changed:

  limit = 1.5 * (domain memory + total video memory) + (32MB for cache
          per each disk) + 200MB
(cherry picked from commit 3c83df679e)
2013-01-08 18:34:56 +01:00
Laine Stump
61511ae63a util: fix botched check for new netlink request filters
This is an adjustment to the fix for

  https://bugzilla.redhat.com/show_bug.cgi?id=889319

to account for two bonehead mistakes I made.

commit ac2797cf2a attempted to fix a
problem with netlink in newer kernels requiring an extra attribute
with a filter flag set in order to receive an IFLA_VFINFO_LIST from
netlink. Unfortunately, the #ifdef that protected against compiling it
in on systems without the new flag went a bit too far, assuring that
the new code would *never* be compiled, and even if it had, the code
was incorrect.

The first problem was that, while some IFLA_* enum values are also
their existence at compile time, IFLA_EXT_MASK *isn't* #defined, so
checking to see if it's #defined is not a valid method of determining
whether or not to add the attribute. Fortunately, the flag that is
being set (RTEXT_FILTER_VF) *is* #defined, and it is never present if
IFLA_EXT_MASK isn't, so it's sufficient to just check for that flag.

And to top it off, due to the code not actually compiling when I
thought it did, I didn't realize that I'd been given the wrong arglist
to nla_put() - you can't just send a const value to nla_put, you have
to send it a pointer to memory containing what you want to add to the
message, along with the length of that memory.

This time I've actually sent the patch over to the other machine
that's experiencing the problem, applied it to the branch being used
(0.10.2) and verified that it works properly, i.e. it does fix the
problem it's supposed to fix. :-/
(cherry picked from commit 7c36650699)
2013-01-08 12:22:16 -05:00
Laine Stump
6b789ea3e0 util: add missing error log messages when failing to get netlink VFINFO
This patch fixes the lack of error messages when libvirt fails to find
VFINFO in a returned netlinke response message.

https://bugzilla.redhat.com/show_bug.cgi?id=827519#c10 is an example
of the error message that was previously logged when the
IFLA_VFINFO_LIST object was missing from the netlink response. The
reason for this failure is detailed in

   https://bugzilla.redhat.com/show_bug.cgi?id=889319

Even though that root problem has been fixed, the experience of
finding the root cause shows us how important it is to properly log an
error message in these cases. This patch *seems* to replace the entire
function, but really most of the changes are due to moving code that
was previously inside an if() statement out to the top level of the
function (the original if() was reversed and made to log an error and
return).
(cherry picked from commit 846770e5ff)
2013-01-08 12:22:16 -05:00
Laine Stump
52fca883d2 util: fix functions that retrieve SRIOV VF info
This patch resolves:

  https://bugzilla.redhat.com/show_bug.cgi?id=889319

When assigning an SRIOV virtual function to a guest using "intelligent
PCI passthrough" (<interface type='hostdev'>, which sets the MAC
address and vlan tag of the VF before passing its info to qemu),
libvirt first learns the current MAC address and vlan tag by sending
an NLM_F_REQUEST message for the VF's PF (physical function) to the
kernel via a NETLINK_ROUTE socket (see virNetDevLinkDump()); the
response message's IFLA_VFINFO_LIST section is examined to extract the
info for the particular VF being assigned.

This worked fine with kernels up until kernel commit
115c9b81928360d769a76c632bae62d15206a94a (first appearing in upstream
kernel 3.3) which changed the ABI to not return IFLA_VFINFO_LIST in
the response until a newly introduced IFLA_EXT_MASK field was included
in the request, with the (newly introduced, of course) RTEXT_FILTER_VF
flag set.

The justification for this ABI change was that new fields had been
added to the VFINFO, causing NLM_F_REQUEST messages to fail on systems
with large numbers of VFs if the requesting application didn't have a
large enough buffer for all the info. The idea is that most
applications doing an NLM_F_REQUEST don't care about VFINFO anyway, so
eliminating it from the response would lower the requirements on
buffer size. Apparently, the people who pushed this patch made the
mistaken assumption that iproute2 (the "ip" command) was the only
package that used IFLA_VFINFO_LIST, so it wouldn't break anything else
(and they made sure that iproute2 was fixed.

The logic of this "fix" is debatable at best (one could claim that the
proper fix would be for the applications in question to be fixed so
that they properly sized the buffer, which is what libvirt does
(purely by virtue of using libnl), but it is what it is and we have to
deal with it.

In order for <interface type='hostdev'> to work properly on systems
with a kernel 3.3 or later, libvirt needs to add the afore-mentioned
IFLA_EXT_MASK field with RTEXT_FILTER_VF set.

Of course we also need to continue working on systems with older
kernels, so that one bit of code is compiled conditionally. The one
time this could cause problems is if the libvirt binary was built on a
system without IFLA_EXT_MASK which was subsequently updated to a
kernel that *did* have it. That could be solved by manually providing
the values of IFLA_EXT_MASK and RTEXT_FILTER_VF and adding it to the
message anyway, but I'm uncertain what that might actually do on a
system that didn't support the message, so for the time being we'll
just fail in that case (which will very likely never happen anyway).
(cherry picked from commit ac2797cf2a)
2013-01-08 12:22:16 -05:00