Commit Graph

18851 Commits

Author SHA1 Message Date
Cole Robinson
d7addd6871 build: predictably generate systemtap tapsets (bz 1173641)
The generated output is dependent on perl hashtable ordering, which
gives different results for i686 and x86_64. Fix this by sorting
the hash keys before iterating over them

https://bugzilla.redhat.com/show_bug.cgi?id=1173641
(cherry picked from commit a1edb05c60)
2016-01-20 19:09:03 -05:00
Ján Tomko
931888e63a leaseshelper: fix crash when no mac is specified
If dnsmasq specified DNSMASQ_IAID (so we're dealing with an IPv6
lease) but no DNSMASQ_MAC, we skip creation of the new lease object.

Also skip adding it to the leases array.

https://bugzilla.redhat.com/show_bug.cgi?id=1202350
(cherry picked from commit df9fe124d6)
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2016-01-20 10:04:48 +01:00
Ján Tomko
4be842d5f5 schema: interleave domain name and uuid with other elements
Allow <name> and <uuid> anywhere under <domain>, not just at the top:

error:XML document failed to validate against schema: Unable to validate
doc against /usr/share/libvirt/schemas/domain.rng
Expecting an element name, got nothing
Invalid sequence in interleave
Element domain failed to validate content

Introduced with the first RelaxNG schema in commit c642103.

https://bugzilla.redhat.com/show_bug.cgi?id=1292131
(cherry picked from commit b4e0549feb)
Signed-off-by: Ján Tomko <jtomko@redhat.com>
2016-01-05 13:53:47 +01:00
Cole Robinson
b1d7d94de6 Prep for release 1.2.13.2 2015-12-23 18:34:35 -05:00
Cole Robinson
5617105ff9 spec: Delete .git after applying patches
I'm hitting this little annoyance in fedora's package repo:

$ fedpkg prep
Downloading libvirt-1.2.20.tar.gz
...
+ /usr/bin/gzip -dc /home/crobinso/src/fedora/libvirt/libvirt-1.2.20.tar.gz
$ git clean -xdf
Removing libvirt-1.2.20.tar.gz
Skipping repository libvirt-1.2.20/

We git-ify the libvirt directory as part of applying patches in the spec
file, but 'git clean' will ignore subfolders that appear to be standalone
git repos.

Let's just delete the .git directory after we're done with it.

(cherry picked from commit 62ff210e5d)
2015-12-23 18:19:29 -05:00
Peter Krempa
b0fc553334 qemu: block-commit: Mark disk in block jobs only on successful command
Patch 51f9f03a4c introduces a regression
where if a blockCommit operation fails the disk is still marked as being
part of a block job but can't be unmarked later.

(cherry picked from commit ee744b5b38)
2015-12-23 18:06:59 -05:00
Peter Krempa
c7e01d5e1e qemu: Disallow concurrent block jobs on a single disk
While qemu may be prepared to do this libvirt is not. Forbid the block
ops until we fix our code.

(cherry picked from commit 51f9f03a4c)
2015-12-23 18:06:59 -05:00
Peter Krempa
6eb78379c2 qemu: event: Don't fiddle with disk backing trees without a job
Surprisingly we did not grab a VM job when a block job finished and we'd
happily rewrite the backing chain data. This made it possible to crash
libvirt when queueing two backing chains tightly and other badness.

To fix it, add yet another handler to the helper thread that handles
monitor events that require a job.

(cherry picked from commit 1a92c71910)
2015-12-23 18:06:59 -05:00
Peter Krempa
ce431e52e5 qemu: process: Export qemuProcessFindDomainDiskByAlias
(cherry picked from commit 5c634730b9)
2015-12-23 18:06:59 -05:00
Cole Robinson
448958859e spec: Fix polkit dep on F23
As of fedora polkit-0.113-2, polkit-devel only pulls in polkit-libs, not
full polkit, but we need the latter for pkcheck otherwise our configure
test fails.

(cherry picked from commit 6600f4f3d8)
2015-12-23 18:03:34 -05:00
Jiri Denemark
390d1b3f9a domain: Fix migratable XML with graphics/@listen
As of commit 6992994, we set graphics/@listen attribute according to the
first listen child element even if that element is of type='network'.
This was done for backward compatibility with applications which only
support the original listen attribute. However, by doing so we broke
migration to older libvirt which tried to check that the listen
attribute matches one of the listen child elements but which did not
take type='network' elements into account.

We are not concerned about compatibility with old applications when
formatting domain XML for migration for two reasons. The XML is consumed
only by libvirtd and the IP address associated with type='network'
listen address on the source host is just useless on the destination
host. Thus, we can safely avoid propagating the type='network' IP
address to graphics/@listen attribute when creating migratable XML.

https://bugzilla.redhat.com/show_bug.cgi?id=1265111

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
(cherry picked from commit c0806dc30b)
2015-12-23 18:01:59 -05:00
Peter Krempa
70d819ae27 qemu: hotplug: Properly clean up drive backend if frontend hotplug fails
Commit 8125113c added code that should remove the disk backend if the
fronted hotplug failed for any reason. The code had a bug though as it
used the disk string for unplug rather than the backend alias. Fix the
code by pre-creating an alias string and using it instead of the disk
string. In cases where qemu does not support QEMU_CAPS_DEVICE, we ignore
the unplug of the backend since we can't really create an alias in that
case.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1262399
(cherry picked from commit 64c6695f1a)
2015-12-23 18:01:10 -05:00
Stefan Berger
98e4d87799 tpm: adapt sysfs cancel path for new TPM driver
This patch addresses BZ 1244895.

Adapt the sysfs TPM command cancel path for the TPM driver that
does not use a miscdevice anymore since Linux 4.0. Support old
and new paths and check their availability.

Add a mockup for the test cases to avoid the testing for
availability of the cancel path.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
(cherry picked from commit 5ed7afa9de)
2015-12-23 18:00:04 -05:00
Guido Günther
01d7e0a721 libvirt-guests: Disable shutdown timeout
Since we can't know at service start how many VMs will be running we
can't calculate an apropriate shutdown timeout. So instead of killing
off the service just let it use it's own internal timeout mechanism.

References:
    http://bugs.debian.org/803714
    https://bugzilla.redhat.com/show_bug.cgi?id=1195544

(cherry picked from commit ba08d16d6c)
2015-12-23 17:59:46 -05:00
Martin Kletzander
87b6ec6c86 systemd: Escape only needed characters for machined
Machine name escaping follows the same rules as serice name escape,
except that '.' and '-' must not be escaped in machine names, due
to a bug in systemd-machined.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
(cherry picked from commit 0e0149ce91)
2015-12-23 17:58:58 -05:00
Martin Kletzander
b92cd48c57 systemd: Escape machine name for machined
According to the documentation, CreateMachine accepts only 7bit ASCII
characters in the machinename parameter, so let's make sure we can start
machines with unicode names with systemd.  We already have a function
for that, we just forgot to use it.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1062943
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
(cherry picked from commit e24eda48cf)
2015-12-23 17:58:53 -05:00
Peter Krempa
17a8713577 cgroup: Drop resource partition from virSystemdMakeScopeName
The scope name, even according to our docs is
"machine-$DRIVER\x2d$VMNAME.scope" virSystemdMakeScopeName would use the
resource partition name instead of "machine-" if it was specified thus
creating invalid scope paths.

This makes libvirt drop cgroups for a VM that uses custom resource
partition upon reconnecting since the detected scope name would not
match the expected name generated by virSystemdMakeScopeName.

The error is exposed by the following log entry:

debug : virCgroupValidateMachineGroup:302 : Name 'machine-qemu\x2dtestvm.scope' for controller 'cpu' does not match 'testvm', 'testvm.libvirt-qemu' or 'machine-test-qemu\x2dtestvm.scope'

for a "/machine/test" resource and "testvm" vm.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1238570
(cherry picked from commit 88f6c007c3)
2015-12-23 17:58:44 -05:00
Eric Blake
b553ec764f CVE-2015-5313: storage: don't allow '/' in filesystem volume names
The libvirt file system storage driver determines what file to
act on by concatenating the pool location with the volume name.
If a user is able to pick names like "../../../etc/passwd", then
they can escape the bounds of the pool.  For that matter,
virStoragePoolListVolumes() doesn't descend into subdirectories,
so a user really shouldn't use a name with a slash.

Normally, only privileged users can coerce libvirt into creating
or opening existing files using the virStorageVol APIs; and such
users already have full privilege to create any domain XML (so it
is not an escalation of privilege).  But in the case of
fine-grained ACLs, it is feasible that a user can be granted
storage_vol:create but not domain:write, and it violates
assumptions if such a user can abuse libvirt to access files
outside of the storage pool.

Therefore, prevent all use of volume names that contain "/",
whether or not such a name is actually attempting to escape the
pool.

This changes things from:

$ virsh vol-create-as default ../../../../../../etc/haha --capacity 128
Vol ../../../../../../etc/haha created
$ rm /etc/haha

to:

$ virsh vol-create-as default ../../../../../../etc/haha --capacity 128
error: Failed to create vol ../../../../../../etc/haha
error: Requested operation is not valid: volume name '../../../../../../etc/haha' cannot contain '/'

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 034e47c338)
2015-12-12 22:19:01 -07:00
Michal Privoznik
175f5022dc remoteClientCloseFunc: Don't mangle connection object refcount
Well, in 8ad126e6 we tried to fix a memory corruption problem.
However, the fix was not as good as it could be. I mean, the
commit has one line more than it should. I've noticed this output
just recently:

  # ./run valgrind --leak-check=full --show-reachable=yes ./tools/virsh domblklist gentoo
  ==17019== Memcheck, a memory error detector
  ==17019== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
  ==17019== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
  ==17019== Command: /home/zippy/work/libvirt/libvirt.git/tools/.libs/virsh domblklist gentoo
  ==17019==
  Target     Source
  ------------------------------------------------
  fda        /var/lib/libvirt/images/fd.img
  vda        /var/lib/libvirt/images/gentoo.qcow2
  hdc        /home/zippy/tmp/install-amd64-minimal-20150402.iso

  ==17019== Thread 2:
  ==17019== Invalid read of size 4
  ==17019==    at 0x4EFF5B4: virObjectUnref (virobject.c:258)
  ==17019==    by 0x5038CFF: remoteClientCloseFunc (remote_driver.c:552)
  ==17019==    by 0x5069D57: virNetClientCloseLocked (virnetclient.c:685)
  ==17019==    by 0x506C848: virNetClientIncomingEvent (virnetclient.c:1852)
  ==17019==    by 0x5082136: virNetSocketEventHandle (virnetsocket.c:1913)
  ==17019==    by 0x4ECD64E: virEventPollDispatchHandles (vireventpoll.c:509)
  ==17019==    by 0x4ECDE02: virEventPollRunOnce (vireventpoll.c:658)
  ==17019==    by 0x4ECBF00: virEventRunDefaultImpl (virevent.c:308)
  ==17019==    by 0x130386: vshEventLoop (vsh.c:1864)
  ==17019==    by 0x4F1EB07: virThreadHelper (virthread.c:206)
  ==17019==    by 0xA8462D3: start_thread (in /lib64/libpthread-2.20.so)
  ==17019==    by 0xAB441FC: clone (in /lib64/libc-2.20.so)
  ==17019==  Address 0x139023f4 is 4 bytes inside a block of size 240 free'd
  ==17019==    at 0x4C2B1F0: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
  ==17019==    by 0x4EA8949: virFree (viralloc.c:582)
  ==17019==    by 0x4EFF6D0: virObjectUnref (virobject.c:273)
  ==17019==    by 0x4FE74D6: virConnectClose (libvirt.c:1390)
  ==17019==    by 0x13342A: virshDeinit (virsh.c:406)
  ==17019==    by 0x134A37: main (virsh.c:950)

The problem is, when registering remoteClientCloseFunc(), it's
conn->closeCallback which is ref'd. But in the function itself
it's conn->closeCallback->conn what is unref'd. This is causing
imbalance in reference counting. Moreover, there's no need for
the remote driver to increase/decrease conn refcount since it's
not used anywhere. It's just merely passed to client registered
callback. And for that purpose it's correctly ref'd in
virConnectRegisterCloseCallback() and then unref'd in
virConnectUnregisterCloseCallback().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit e689300770)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-09-03 17:46:07 +02:00
Jim Fehlig
d2a115b16f Revert "LXC: show used memory as 0 when domain is not active"
This reverts commit 1ce7c1d20c,
which introduced a significant semantic change to the
virDomainGetInfo() API. Additionally, the change was only
made to 2 of the 15 virt drivers.

Conflicts:
	src/qemu/qemu_driver.c

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
(cherry picked from commit 60acb38abb)
2015-08-28 10:38:00 -06:00
Michal Privoznik
1f15a5d4b2 lxc: Don't pass a local variable address randomly
So, recently I was testing the LXC driver. You know, startup some
domains. But to my surprise, I was not able to start a single one:

  virsh # start --console test
  error: Reconnected to the hypervisor
  error: Failed to start domain test
  error: internal error: guest failed to start: unexpected exit status 125

So I've start digging. It turns out, that in virExec(), when I printed
out the @cmd, I got strange values: *(cmd->outfdptr) was certainly not
valid FD number: it has random value of several millions. This
obviously made prepareStdFd(childout, STDOUT_FILENO) fail (line 611).
But outfdptr is set in virCommandSetOutputFD(). The only place within
LXC driver where the function is called is in
virLXCProcessBuildControllerCmd(). If you take a closer look at the
function it looks like this:

static virCommandPtr
virLXCProcessBuildControllerCmd(virLXCDriverPtr driver,
                                ..
                                int logfd,
                                const char *pidfile)
{
    ...
    virCommandSetOutputFD(cmd, &logfd);
    virCommandSetErrorFD(cmd, &logfd);
    ...
}

Yes, you guessed it. @logfd is passed into the function by value.
However, in the function we try to get its address (an address of a
local variable) which is no longer valid once function is finished and
stack is cleaned. Therefore when cmd->outfdptr is evaluated at any
point after this function, we may get a random number, depending on
what's currently on the stack. Of course, this may work sometimes too
- it depends on the compiler how it arranges the code, when the stack
is wiped out.

In order to fix this, lets pass a pointer to @logfd instead of
figuring out (wrong) its value in a function.

The bug was introduced in e1de5521.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit 302146b16d)
2015-07-02 08:40:02 +02:00
Eric W. Biederman
030e966499 lxc: set nosuid+nodev+noexec flags on /proc/sys mount
Future kernels will mandate the use of nosuid+nodev+noexec
flags when mounting the /proc/sys filesystem. Unconditionally
add them now since they don't harm things regardless and could
mitigate future security attacks.

(cherry picked from commit 24710414d4)
2015-06-16 17:12:17 +01:00
Lubomir Rintel
d7395473d1 virnetdev: fix moving of 802.11 phys
There was a couple of problems with the style fixes applied to the original
patch:

1.) virFileReadAllQuiet comparison was incorrectly parenthesized when moved
into a condition, causing the len to be set to the result of comparison. This,
together with the removed underflow check would underflow the phy buffer.

2.) The logic was broken. Failure to call "ip" would abort the function, thus
the "iw" branch would never be reached.

This aims to fix the issues and work around possible style complains :)

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
(cherry picked from commit 81b19ce46a)
2015-06-05 19:34:40 -04:00
Lubomir Rintel
cc2add9681 interface: don't error out if a bond has no interfaces
It's not a problem at all and causes virt-manager to break down.

Note: netcf 0.2.8 and earlier generates invalid XML for a bond with no
interfaces anyway, so in that case this error in libvirt is never
reached since we fail earlier.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
(cherry picked from commit efc68de5cd)
2015-06-05 19:34:31 -04:00
Lubomir Rintel
361f5244ca lxc: don't up the veth interfaces unless explicitly asked to
Upping an interface for no reason and not configuring it is a cardinal sin.

With the default addrgenmode if eui64 it sticks a link-local address to the
interface. That is not good, as NetworkManager would see an address configured,
assume the interface is already configured and won't touch it iself and the
interface might stay unconfigured until the end of the days.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1124721

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit c3cf3c43a0)
2015-05-27 14:32:03 -04:00
Michal Privoznik
a7ee1d068e tests: Add virnetdevtestdata to EXTRA_DIST
In one of my previous commits (49ed6cff9) I've introduced a test
among with some files stored under virnetdevtestdata folder.
While this works perfectly within a git tree, the folder was not
getting into .tar.gz and therefore the dist-check would fail.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit 598f3fddbc)
2015-05-27 14:29:05 -04:00
Lubomir Rintel
e5982092d1 lxc: move wireless PHYs to a network namespace
The 802.11 interfaces can not be moved by themselves, their Phy has to move too.

If there are other interfaces, they have to move too -- hopefully it's not too
confusing. This is a less-invasive alternative to defining a new hostdev type
for PHYs.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit 3a495948b9)
2015-05-27 14:28:47 -04:00
Michal Privoznik
5040f00b49 Cleanup "/sys/class/net" usage
Throughout the code, we have several places need to construct a path
somewhere in /sys/class/net/... They are not consistent and nearly
each code piece invents its own way how to do it. So unify this by:

1) use virNetDevSysfsFile() wherever possible

2) At least use common macro SYSFS_NET_DIR declared in virnetdev.h at
   the rest of places which can't go with 1)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit 96a21e975f)
2015-05-27 14:28:41 -04:00
Michal Privoznik
f519ea6742 Introduce virnetdevtest
This is yet another test for check of basic functionality of our
NIC state handling code.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit 49ed6cff99)
2015-05-27 14:28:35 -04:00
Eric Blake
fc7c9544cb build: provide virNetDevSysfsFile on non-Linux
Commit 49ed6cff is broken on mingw and other non-linux platforms:

  CCLD     libvirt.la
  Cannot export virNetDevSysfsFile: symbol not defined
  collect2: error: ld returned 1 exit status

* src/util/virnetdev.c: Provide virNetDevSysfsFile fallback.

Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 58dfc53414)
2015-05-27 14:28:23 -04:00
Cole Robinson
0911d1c6a6 Prep for release 1.2.13.1 2015-04-28 11:47:33 -04:00
Jiri Denemark
40d774a4af Fix memory leak in virNetSocketNewConnectUNIX
==26726==    by 0x673CD67: __vasprintf_chk (vasprintf_chk.c:80)
==26726==    by 0x5673605: UnknownInlinedFun (stdio2.h:210)
==26726==    by 0x5673605: virVasprintfInternal (virstring.c:476)
==26726==    by 0x56736EE: virAsprintfInternal (virstring.c:497)
==26726==    by 0x5680C37: virGetUserRuntimeDirectory (virutil.c:866)
==26726==    by 0x5783A89: virNetSocketNewConnectUNIX (virnetsocket.c:572)
==26726==    by 0x57751AF: virNetClientNewUNIX (virnetclient.c:344)
==26726==    by 0x57689B3: doRemoteOpen (remote_driver.c:895)
==26726==    by 0x5769F8E: remoteConnectOpen (remote_driver.c:1195)
==26726==    by 0x57092DF: do_open (libvirt.c:1189)
==26726==    by 0x570A7BF: virConnectOpenAuth (libvirt.c:1341)

https://bugzilla.redhat.com/show_bug.cgi?id=1215042
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
(cherry picked from commit da4d7c3069)
2015-04-28 11:28:37 -04:00
Daniel P. Berrange
9117783bc0 rng: fix port number range validation
The PortNumber data type is declared to derive from 'short'.
Unfortunately this is an signed type, so validates the range
[-32,768, 32,767] which excludes valid port numbers between
32767 and 65535.

We can't use 'unsignedShort', since we need -1 to be a valid
port number too.

This change is to use 'int' and set an explicit max boundary
instead of relying on the data types' built-in max.

One of the existing tests is changed to use a high port number
to validate the schema.

https://bugzilla.redhat.com/show_bug.cgi?id=1214664

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 615bdfda07)
2015-04-28 11:27:24 -04:00
zhang bo
f708713efe qemu: Don't fail to reboot domains with unresponsive agent
just as what b8e25c35d7 did, we
fall back to the ACPI method when the guest agent is unresponsive
in qemuDomainReboot().

Signed-off-by: YueWenyuan <yuewenyuan@huawei.com>
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit eadf41fe31)
2015-04-28 11:27:10 -04:00
Roman Bogorodskiy
b5a210dce7 vircommand: fix polling in virCommandProcessIO
When running on FreeBSD, there's a bug in virCommandProcessIO
polling that is triggered by the commandtest.

A test that triggers EPIPE in commandtest (named "test20") hungs
forever on FreeBSD.

Apparently, this happens because FreeBSD sets POLLHUP flag on revents
when stdin in closed. And as the current implementation only checks for
POLLOUT and POLLERR, it ends up looping forever inside
virCommandProcessIO and not trying to do one more write() that would
trigger EPIPE.

To fix that check for the POLLHUP flag along with POLLOUT and POLLERR.

(cherry picked from commit e34cccf783)
2015-04-28 11:26:47 -04:00
Peter Krempa
e84e5aa3df util: storage: Fix possible crash when source path is NULL
Some storage protocols allow to have the @path field in struct
virStorageSource set to NULL. Add NULLSTR() wrappers to handle this
possibility until I finish the storage source error formatter.

(cherry picked from commit 62a61d583c)
2015-04-28 11:26:08 -04:00
Laine Stump
24e0cecb7c qemu: set macvtap physdevs online when macvtap is set online
A further fix for:

  https://bugzilla.redhat.com/show_bug.cgi?id=1113474

Since there is no possibility that any type of macvtap will work if
the parent physdev it's attached to is offline, we should bring the
physdev online at the same time as the macvtap. When taking the
macvtap offline, it's also necessary to take the physdev offline for
macvtap passthrough mode (because the physdev has the same MAC address
as the macvtap device, so could potentially cause problems with
misdirected packets during migration, as outlined in commits 829770
and 879c13). We can't set the physdev offline for other macvtap modes
1) because there may be other macvtap devices attached to the same
physdev (and/or the host itself may be using the device) in the other
modes whereas passthrough mode is exclusive to one macvtap at a time,
and 2) there's no practical reason to do so anyway.

(cherry picked from commit 38172ed894)
2015-04-28 11:26:08 -04:00
Laine Stump
d336362cea util: set MAC address for VF via netlink message to PF+VF# when possible
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1113474

When we set the MAC address of a network device as a part of setting
up macvtap "passthrough" mode (where the domain has an emulated netdev
connected to a host macvtap device that has exclusive use of the
physical device, and sets the device MAC address to match its own,
i.e. "<interface type='direct'> <source mode='passthrough' .../>"), we
use ioctl(SIOCSIFHWADDR) giving it the name of that device. This is
true even if it is an SRIOV Virtual Function (VF).

But, when we are setting the MAC address / vlan ID of a VF in
preparation for "hostdev network" passthrough (this is where we set
the MAC address and vlan id of the VF after detaching the host net
driver and before assigning the device to the domain with PCI
passthrough, i.e. "<interface type='hostdev'>", we do the setting via
a netlink RTM_SETLINK message for that VF's Physical Function (PF),
telling it the VF# we want to change. This sets an "administratively
changed MAC" flag for that VF in the PF's driver, and from that point
on (until the PF driver is reloaded, *not* merely the VF driver) that
VF's MAC address can't be changed using ioctl(SIOCSIFHWADDR) - the
only way to change it is via the PF with RTM_SETLINK.

This means that if a VF is used for hostdev passthrough, it will have
the admin flag set, and future attempts to use that VF for macvtap
passthrough will fail.

The solution to this problem is to check if the device being used for
macvtap passthrough is actually a VF; if so, we use the netlink
RTM_SETLINK message to the PF to set the VF's mac address instead of
ioctl(SIOCSIFHWADDR) directly to the VF; if not, behavior does not
change from previously.

There are three pieces to making this work:

1) virNetDevMacVLan(Create|Delete)WithVPortProfile() now call
   virNetDev(Replace|Restore)NetConfig() rather than
   virNetDev(Replace|Restore)MacAddress() (simply passing -1 for VF#
   and vlanid).

2) virNetDev(Replace|Restore)NetConfig() check to see if the device is
   a VF. If so, they find the PF's name and VF#, allowing them to call
   virNetDev(Replace|Restore)VfConfig().

3) To prevent mixups when detaching a macvtap passthrough device that
   had been attached while running an older version of libvirt,
   virNetDevRestoreVfConfig() is potentially given the preserved name
   of the VF, and if the proper statefile for a VF can't be found in
   the stateDir (${stateDir}/${pfname}_vf${vfid}),
   virNetDevRestoreMacAddress() is called instead (which will look in
   the file named ${stateDir}/${vfname}).

This problem has existed in every version of libvirt that has both
macvtap passthrough and interface type='hostdev'. Fortunately people
seem to use one or the other though, so it hasn't caused any real
world problem reports.

(cherry picked from commit cb3fe38c74)
2015-04-28 11:26:08 -04:00
Richard W.M. Jones
d1ef7b625e xend: Remove a couple of unused function prototypes.
Commit 70f446631f (from 2008) introduced
some functions for testing whether xend was returning correct sound
models.  Those functions have long gone, but the function prototypes
remain.  This commit removes the unused prototypes.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
(cherry picked from commit 093eea9589)
2015-04-28 11:26:08 -04:00
zhang bo
0636291d8b qemuDomainShutdownFlags: Set fakeReboot more frequently
When a qemu domain is to be rebooted, from outside, at libvirt
level it looks like regular shutdown. To really restart the
domain, libvirt needs to issue reset command on the monitor once
SHUTDOWN event appeared. So, in order to differentiate bare
shutdown and reboot libvirt uses a variable within domain private
data. It's called fakeReboot. When the reboot API is called, the
variable is set, but when the shutdown API is called it must be
cleared out. But it was not for every possible case. So if user
called virDomainReboot(), and there was no ACPI daemon running
inside the guest (so guest didn't initiated shutdown sequence)
and then virDomainShutdown(mode=agent) was called bad thing
happened. We remembered the fakeReboot and instead of shutting
the domain down, we just rebooted it.

Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit 8be502fd90)
2015-04-28 11:26:07 -04:00
Michal Privoznik
10736cb41c nwfilter: Partly initialize driver even for non-privileged users
https://bugzilla.redhat.com/show_bug.cgi?id=1211436

This reverts commit b7829f959b.

The previous fix was not correct. Like everywhere else, a driver is a
global variable allocated in stateInitialize function (or something
similar for stateless drivers). Later, when a driver API is called,
it's possible that the global variable is accessed and dereferenced.
Now, some drivers require root privileges because they undertake some
actions reserved only for the system admin (e.g. manipulating host
firewall). And here's the trouble, the NWFilter state initializer
exited too early when finding out it's running unprivileged, leaving
the global NWFilter driver variable uninitialized. Any subsequent
API call that tried to lock the driver resulted in dereferencing the
driver and thus crash.

On the other hand, in order to not resurrect the bug the original
commit was fixing, Let's forbid the nwfilter define in session mode.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>

Conflicts:
	src/nwfilter/nwfilter_driver.c: Context. Code changed a bit
        since 2013.

(cherry picked from commit 77d92e2e77)
2015-04-28 11:26:07 -04:00
Michal Privoznik
5328780b5f virNetSocketNewConnectUNIX: Don't unlink(NULL)
There is a possibility that we jump onto error label with @lockpath
still initialized to NULL. Here, the @lockpath should be unlink()-ed,
but passing there a NULL is not a good idea. Don't do that. In fact,
we should call unlink() only if we created the lock file successfully.

Reported-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit 1fdac3d99a)
2015-04-28 11:26:07 -04:00
Jiri Denemark
c6e7a9ea64 sanlock: Use VIR_ERR_RESOURCE_BUSY if sanlock_acquire fails
When acquiring resource via sanlock fails, we would report it as
VIR_ERR_INTERNAL_ERROR, which is not very friendly to applications using
libvirt. Moreover, the lockd driver would report the same failure as
VIR_ERR_RESOURCE_BUSY, which looks better.

Unfortunately, in sanlock driver we don't really know if acquiring the
resource failed because it was already locked or there was another
reason behind. But the end result is the same and I think using
VIR_ERR_RESOURCE_BUSY reason for all acquire failures is still better
than what we have now.

https://bugzilla.redhat.com/show_bug.cgi?id=1165119
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
(cherry picked from commit 4864e377c9)
2015-04-28 11:18:22 -04:00
Michal Privoznik
9d0c2053c2 qemuMigrationPrecreateStorage: Fix debug message
When pre-creating storage for domains, we need to find corresponding
disk in the XML on the destination (domain XML may differ there, e.g.
disk is accessible under different path). For better debugging, I'm
printing all info I received on a disk. But there was a typo when
printing the disk capacity: "%lluu" instead of "%llu".

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit 65a88572ad)
2015-04-28 11:14:29 -04:00
Xing Lin
8ab12f1a5b qemu_migration.c: sleep first before checking for migration status.
The problem with the previous implementation is,
even when qemuMigrationUpdateJobStatus() detects a migration job
has completed, it will do a sleep for 50 ms (which is unnecessary
and only adds up to the VM pause time).

Signed-off-by: Xing Lin <xinglin@cs.utah.edu>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(cherry picked from commit 522e81cbb5)
2015-04-28 11:14:10 -04:00
Michael Chapman
d17b2d9d5d qemu_driver: check caps after starting block job
Currently we check qemuCaps before starting the block job. But qemuCaps
isn't available on a stopped domain, which means we get a misleading
error message in this case:

  # virsh domstate example
  shut off

  # virsh blockjob example vda
  error: unsupported configuration: block jobs not supported with this QEMU binary

Move the qemuCaps check into the block job so that we are guaranteed the
domain is running.

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
(cherry picked from commit cfcdf5ff01)
2015-04-28 11:10:53 -04:00
Michael Chapman
0ff86a4792 qemu_migrate: use nested job when adding NBD to cookie
qemuMigrationCookieAddNBD is usually called from within an async
MIGRATION_OUT or MIGRATION_IN job, so it needs to start a nested job.

(The one exception is during the Begin phase when change protection
isn't enabled, but qemuDomainObjEnterMonitorAsync will behave the same
as qemuDomainObjEnterMonitor in this case.)

This bug was encountered with a libvirt client that repeatedly queries
the disk mirroring block job info during a migration. If one of these
queries occurs just as the Perform migration cookie is baked, libvirt
crashes.

Relevant logs are as follows:

    6701: warning : qemuDomainObjEnterMonitorInternal:1544 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
[1] 6701: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700 msg={"execute":"query-block","id":"libvirt-629"}
[2] 6699: info : qemuMonitorIOWrite:503 : QEMU_MONITOR_IO_WRITE: mon=0x7fefdc004700 buf={"execute":"query-block","id":"libvirt-629"}
[3] 6704: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700 msg={"execute":"query-block-jobs","id":"libvirt-630"}
[4] 6699: info : qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_REPLY: mon=0x7fefdc004700 reply={"return": [...], "id": "libvirt-629"}
    6699: error : qemuMonitorJSONIOProcessLine:211 : internal error: Unexpected JSON reply '{"return": [...], "id": "libvirt-629"}'

At [1] qemuMonitorBlockStatsUpdateCapacity sends its request, then waits
on mon->notify. At [2] the request is written out to the monitor socket.
At [3] qemuMonitorBlockJobInfo sends its request, and also waits on
mon->notify. The reply from the first request is received at [4].
However, qemuMonitorJSONIOProcessLine is not expecting this reply since
the second request hadn't completed sending. The reply is dropped and an
error is returned.

qemuMonitorIO signals mon->notify twice during its error handling,
waking up both of the threads waiting on it. One of them clears mon->msg
as it exits qemuMonitorSend; the other crashes:

  qemuMonitorSend (mon=0x7fefdc004700, msg=<value optimized out>) at qemu/qemu_monitor.c:975
  975         while (!mon->msg->finished) {
  (gdb) print mon->msg
  $1 = (qemuMonitorMessagePtr) 0x0

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
(cherry picked from commit 72df8314f0)
2015-04-28 11:10:35 -04:00
Michael Chapman
29b3402b8b util: fix removal of callbacks in virCloseCallbacksRun
The close callbacks hash are keyed by a UUID-string, but
virCloseCallbacksRun was attempting to remove them by raw UUID. This
patch ensures the callback entries are removed by UUID-string as well.

This bug caused problems when guest migrations were abnormally aborted:

  # timeout --signal KILL 1 \
      virsh migrate example qemu+tls://remote/system \
        --verbose --compressed --live --auto-converge \
        --abort-on-error --unsafe --persistent \
        --undefinesource --copy-storage-all --xml example.xml
  Killed

  # virsh migrate example qemu+tls://remote/system \
      --verbose --compressed --live --auto-converge \
      --abort-on-error --unsafe --persistent \
      --undefinesource --copy-storage-all --xml example.xml
  error: Requested operation is not valid: domain 'example' is not being migrated

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
(cherry picked from commit fa2607d577)
2015-04-28 11:10:17 -04:00
Michael Chapman
9a0e0d3f17 qemu: fix race between disk mirror fail and cancel
If a VM migration is aborted, a disk mirror may be failed by QEMU before
libvirt has a chance to cancel it. The disk->mirrorState remains at
_ABORT in this case, and this breaks subsequent mirrorings of that disk.

We should instead check the mirrorState directly and transition to _NONE
if it is already aborted. Do the check *after* aborting the block job in
QEMU to avoid a race.

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
(cherry picked from commit e5d729ba42)
2015-04-28 11:10:08 -04:00
Michael Chapman
188e536739 qemu: fix error propagation in qemuMigrationBegin
If virCloseCallbacksSet fails, qemuMigrationBegin must return NULL to
indicate an error occurred.

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
(cherry picked from commit 77ddd0bba2)
2015-04-28 11:09:59 -04:00