qemu-img resize will fail with "The new size must be a multiple of 512"
if libvirt doesn't round it first.
This fixes rhbz#951495
Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
(cherry picked from commit 9a8f39d097448b2b43c4a05d0edc213eacfc9ea6)
Typically when you get EOF on a stream, poll will return
POLLIN|POLLHUP at the same time. Thus when we deal with
stream reads, if we see EOF during the read, we can then
clear the VIR_STREAM_EVENT_HANGUP & VIR_STREAM_EVENT_ERROR
event bits.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit e7b0382945c9bf043f06ee4dd760c2051d63d0f0)
Reported by Anthony Messina in
https://bugzilla.redhat.com/show_bug.cgi?id=904692
Present since introduction of smartcard support in commit f5fd9baa
* src/qemu/qemu_command.c (qemuBuildCommandLine): Match qemu spelling.
* tests/qemuxml2argvdata/qemuxml2argv-smartcard-host-certificates.args:
Fix broken test.
(cherry picked from commit 6f7e4ea359323f9bc413dfb738a5c544d4f9c4f8)
The previous commit was an incomplete backport of commit 83e4c775,
and as a result made any attempt to start a domain when cgroups
are enabled go into an infinite loop. This fixes the botched
backport.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=965169 documents a
problem starting domains when cgroups are enabled; I was able
to reliably reproduce the race about 5% of the time when I added
hooks to domain startup by 3 seconds (as that seemed to be about
the length of time that qemu created and then closed a temporary
thread, probably related to aio handling of initially opening
a disk image). The problem has existed since we introduced
virCgroupMoveTask in commit 9102829 (v0.10.0).
There are some inherent TOCTTOU races when moving tasks between
kernel cgroups, precisely because threads can be created or
completed in the window between when we read a thread id from the
source and when we write to the destination. As the goal of
virCgroupMoveTask is merely to move ALL tasks into the new
cgroup, it is sufficient to iterate until no more threads are
being created in the old group, and ignoring any threads that
die before we can move them.
It would be nicer to start the threads in the right cgroup to
begin with, but by default, all child threads are created in
the same cgroup as their parent, and we don't want vcpu child
threads in the emulator cgroup, so I don't see any good way
of avoiding the move. It would also be nice if the kernel were
to implement something like rename() as a way to atomically move
a group of threads from one cgroup to another, instead of forcing
a window where we have to read and parse the source, then format
and write back into the destination.
* src/util/vircgroup.c (virCgroupAddTaskStrController): Ignore
ESRCH, because a thread ended between read and write attempts.
(virCgroupMoveTask): Loop until all threads have moved.
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 83e4c77547f5b721afad19a452f41c31daeee8c5)
Conflicts:
src/util/cgroup.c - refactoring in commit 56f27b3bb is too big
to take in entirety; but I did inline its changes to the cleanup label
The code for putting the emulator threads in a separate cgroup
would spam the logs with warnings
2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 3
2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 4
2013-02-27 16:08:26.732+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 6
This is because it has only created child cgroups for 3 of the
controllers, but was trying to move the processes from all the
controllers. The fix is to only try to move threads in the
controllers we actually created. Also remove the warning and
make it return a hard error to avoid such lazy callers in the
future.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 279336c5d8c44f28956b3427ab2bf207d06878e2)
The QEMU driver has a list of devices nodes that are whitelisted
for all guests. The kernel has recently started returning an
error if you try to whitelist a device which does not exist.
This causes a warning in libvirt logs and an audit error for
any missing devices. eg
2013-02-27 16:08:26.515+0000: 29625: warning : virDomainAuditCgroup:451 : success=no virt=kvm resrc=cgroup reason=allow vm="vm031714" uuid=9d8f1de0-44f4-a0b1-7d50-e41ee6cd897b cgroup="/sys/fs/cgroup/devices/libvirt/qemu/vm031714/" class=path path=/dev/kqemu rdev=? acl=rw
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 7f544a4c8f0353e4ff9ca08aafbb86ff8f60da0a)
When given a CA cert with basic constraints to set non-critical,
and key usage of 'key signing', this should be rejected. Version
of GNUTLS < 3 do not rejecte it though, so we never noticed the
test case was broken
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 0204d6d7a0519377b2e6bc296b00328cd748f55d)
CVE-2013-1962
remoteDispatchStoragePoolListAllVolumes wasn't freeing the pool.
The pool also held a reference to the connection, preventing it from
getting freed and closing the netcf interface driver, which held two
sockets open.
(cherry picked from commit ca697e90d5bd6a6dfb94bfb6d4438bdf9a44b739)
https://bugzilla.redhat.com/show_bug.cgi?id=924501 tracks a
problem that occurs if uid 107 is already in use at the time
libvirt is first installed. In response that problem, Fedora
packaging guidelines were recently updated. This fixes the
spec file to comply with the new guidelines:
https://fedoraproject.org/wiki/Packaging:UsersAndGroups
* libvirt.spec.in (daemon): Follow updated Fedora guidelines.
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit a2584d58f6f7d941b960f996c8e26df8294b79b9)
Conflicts:
libvirt.spec.in - no backport of c8f79c9b %if reindents
When a changelog entry references an RPM macro, % needs to be escaped so
that it does not appear expanded in package changelog.
Fri Mar 4 2009 is incorrect since Mar 4 was Wednesday. Since
libvirt-0.6.1 was released on Mar 4 2009, we should change Fri to Wed.
(cherry picked from commit 53657a0abe273000a8aa2bfc0417a92dcd9dd5b7)
The macro was made to help installing broken packages that did not use
DESTDIR correctly by overriding individual path variables (prefix,
sysconfdir, ...). Newer rpm provides fixed make_install macro that calls
make install with just the correct DESTDIR, however it is not available
everywhere (e.g., RHEL 5 does not have it). On the other hand the
make_install macro is simple and straightforward enough for us to use
its expansion directly.
(cherry picked from commit d45066a55f866a793f346bde1ac6d0f552aa9e52)
https://bugzilla.redhat.com/show_bug.cgi?id=922186
Commit d04916fa introduced a regression in audit quality - even
though the code was computing the proper escaped name for a
path, it wasn't feeding that escaped name on to the audit message.
As a result, /var/log/audit/audit.log would mention a pair of
fields class=path path=/dev/hpet instead of the intended
class=path path="/dev/hpet", which in turn caused ausearch to
format the audit log with path=(null).
* src/conf/domain_audit.c (virDomainAuditCgroupPath): Use
constructed encoding.
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 31c6bf35b9d9de04158318658f4fbf6a9e54ff28)
When virStorageBackendLogicalCreateVol() creates a snapshot for a
logical volume with backingStore element, it fails with the message
below:
2013-01-17 03:10:18.869+0000: 1967: error : virCommandWait:2345 :
internal error Child process (/sbin/lvcreate --name lvm-snapshot -L 51200K
-s=/dev/lvm-pool/lvm-volume) unexpected exit status 3: /sbin/lvcreate:
invalid option -- '=' Error during parsing of command line.
This is because virCommandAddArgPair() uses '=' to connect the two
parameters, it's unsuitable for -s option of the lvcreate.
Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
(cherry picked from commit ffee627a4a52bf13d787cec883054fe2e7d9465c)
Avoid requesting information such as identity or power state when it
is not necessary.
Lookup virtual machine list with the required fields (configStatus,
name, and config.uuid) to make esxVI_GetVirtualMachineIdentity work.
No need to call esxVI_GetNumberOfSnapshotTrees. rootSnapshotTreeList
can be tested for emptiness by checking it for NULL.
esxVI_LookupRootSnapshotTreeList already does the error reporting,
don't overwrite it.
Check if autostart is enabled at all before looking up the individual
autostart setting of a virtual machine.
Reorder VIR_EXPAND_N(doms, ndoms, 1) to avoid leaking the result of
the call to virGetDomain if VIR_EXPAND_N fails.
Replace VIR_EXPAND_N by VIR_RESIZE_N to avoid quadratic scaling, as in
the Hyper-V version of the function.
If virGetDomain fails it already reports an error, don't overwrite it
with an OOM error.
All items in doms up to the count-th one are valid, no need to double
check before freeing them.
Finally, don't leak autoStartDefaults and powerInfoList.
(cherry picked from commit 5fc663d8bedc082585941e1453229cdcf5fe2880)
Normally libvirtd should run with a SELinux label
system_u:system_r:virtd_t:s0-s0:c0.c1023
If a user manually runs libvirtd though, it is sometimes
possible to get into a situation where it is running
system_u:system_r:init_t:s0
The SELinux security driver isn't expecting this and can't
parse the security label since it lacks the ':c0.c1023' part
causing it to complain
internal error Cannot parse sensitivity level in s0
This updates the parser to cope with this, so if no category
is present, libvirtd will hardcode the equivalent of c0.c1023.
Now this won't work if SELinux is in Enforcing mode, but that's
not an issue, because the user can only get into this problem
if in Permissive mode. This means they can now start VMs in
Permissive mode without hitting that parsing error
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 1732c1c62997b9f5ce39e5eb4d1ef2f842af73e1)
Conflicts:
src/security/security_selinux.c
Pull the code which parses the current process MCS range
out of virSecuritySELinuxMCSFind and into a new method
virSecuritySELinuxMCSGetProcessRange.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 4a92fe4413d2a43cf1215e26be7551b653d9a992)
Conflicts:
src/security/security_selinux.c
The body of the loop in virSecuritySELinuxMCSFind would
directly 'return NULL' on OOM, instead of jumping to the
cleanup label. This caused a leak of several local vars.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit f2d8190cfb8e52b006a5cfd080b42d2a1755fd28)
Since we switched from direct host migration scheme to the one,
where we connect to the destination and then just pass a FD to a
qemu, we have uncovered a qemu bug. Qemu expects migration FD to
block. However, we are passing a nonblocking one which results in
cryptic error messages like:
qemu: warning: error while loading state section id 2
load of migration failed
The bug is already known to Qemu folks, but we should workaround
already released Qemus. Patch has been originally proposed by Stefan
Hajnoczi <stefanha@gmail.com>
(cherry picked from commit ceb31795af40f6127a541076b905935ff83e5b11)
Commit c308a9ae was incomplete; it resolved the configure failure,
but not a later build failure.
* src/util/virnetdevbridge.c: Include pre-req header.
* configure.ac (AC_CHECK_HEADERS): Prefer standard in.h over
non-standard ip6.h.
(cherry picked from commit 1bf661caf4e926efcad6e85151a587cea5fd29f4)
I got this scary warning during ./configure on rawhide:
checking linux/if_bridge.h usability... no
checking linux/if_bridge.h presence... yes
configure: WARNING: linux/if_bridge.h: present but cannot be compiled
configure: WARNING: linux/if_bridge.h: check for missing prerequisite headers?
configure: WARNING: linux/if_bridge.h: see the Autoconf documentation
configure: WARNING: linux/if_bridge.h: section "Present But Cannot Be Compiled"
configure: WARNING: linux/if_bridge.h: proceeding with the compiler's result
configure: WARNING: ## ------------------------------------- ##
configure: WARNING: ## Report this to libvir-list@redhat.com ##
configure: WARNING: ## ------------------------------------- ##
checking for linux/if_bridge.h... no
* configure.ac (AC_CHECK_HEADERS): Provide struct in6_addr, since
linux/if_bridge.h uses it without declaring it.
(cherry picked from commit c308a9ae153db619fc0366bad9fd8f6c49cfac58)
(cherry picked from commit 7ae53f15936277dfc777539ce13970293fdb03d0)
If securityselinuxtest was run on a system with newer SELinux
policy it would fail, due to using svirt_tcg_t instead of
svirt_t. Fixing the domain type to be KVM avoids this issue.
(cherry picked from commit 32df483f1d5916f00e3ab15158f099234909e9c2)
The libxl driver was setting the backend field of libxl_device_disk
structure to LIBXL_DISK_BACKEND_TAP when the driver element of disk
configuration was not specified. This needlessly forces the use of
blktap driver, which may not be loaded in dom0
https://bugzilla.redhat.com/show_bug.cgi?id=912488
Ian Campbell suggested that LIBXL_DISK_BACKEND_UNKNOWN is a better
default in this case
https://www.redhat.com/archives/libvir-list/2013-February/msg01126.html
(cherry picked from commit 567779e51a7727b021dee095c9d75cf0cde0bd43)
https://bugzilla.redhat.com/show_bug.cgi?id=905708
Only the first 12 bits should be set in the mask for this range. All
addresses between 172.16.0.0 and 172.31.255.255 are private.
(cherry picked from commit 6405713f2ab9243db7d856914aaefbd4f9747daa)
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=895294
The symptom was that attempts to modify a network device using
virDomainUpdateDeviceFlags() would fail if the original device had a
<boot> element (e.g. "<boot order='1'/>"), even if the updated device
had the same <boot> element. Instead, the following error would be logged:
cannot modify network device boot index setting
It's true that it's not possible to change boot order (internally
known as bootIndex) of a live device; qemuDomainChangeNet checks for
that, but the problem was that the information it was checking was
incorrect.
Explanation:
When a complete domain is parsed, a global (to the domain) "bootMap"
is passed down to the parse for each device; the bootMap is used to
make sure that devices don't have conflicting settings for their boot
orders.
When a single device is parsed by itself (as in the case of
virDomainUpdateDeviceFlags), there is no global bootMap that would be
appropriate to send, so NULL is sent instead. However, although the
lowest level function that parses just the boot order *does* simply
skip the sanity check in that case, the next higher level
"virDomainDeviceInfoParseXML" function refuses to call down to the
lower "virDomainDeviceBootParseXML" if bootMap is NULL. So, the boot
order is never set in the "new" device object, and when it is compared
to the original (which does have a boot order), they don't match.
The fix is to patch virDomainDeviceInfoParseXML to not care about
bootMap, and just always call virDomainDeviceInfoBootParseXML whenever
there is a <boot> element. When we are only parsing a single device,
we don't care whether or not any specified boot order is consistent
with the rest of the domain; we will always do this check later (in
the current case, we do it by verifying that the net bootIndex exactly
matches the old bootIndex).
The current SELinux policy only works for KVM guests, since
TCG requires the 'execmem' privilege. There is a 'virt_use_execmem'
boolean to turn this on globally, but that is unpleasant for users.
This changes libvirt to automatically use a new 'svirt_tcg_t'
context for TCG based guests. This obsoletes the previous
boolean tunable and makes things 'just work(tm)'
Since we can't assume we run with new enough policy, I also
make us log a warning message (once only) if we find the policy
lacks support. In this case we fallback to the normal label and
expect users to set the boolean tunable
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 77d3a8097480e388f1ce3129fe530f235b05f93b)
There's been a few bugs about an expected error from polkit:
https://bugzilla.redhat.com/show_bug.cgi?id=873799https://bugzilla.redhat.com/show_bug.cgi?id=872166
The error is:
Authorization requires authentication but no agent is available.
The error means that polkit needs a password, but there is no polkit
agent registered in your session. Polkit agents are the bit of UI that
pop up and actually ask for your password.
Preface the error with the string 'polkit:' so folks can hopefully
make more sense of it.
(cherry picked from commit 96a108c99398f56970a29c8bfb7da9df90d206ed)
According to Eric Paris this is slightly more efficient because it
only loads the regular expressions in libselinux once.
(cherry picked from commit 6159710ca1eecefa7c81335612c8141c88fc35a9)
Conflicts:
src/security/security_selinux.c
The virSecurityManager{Set,Restore}AllLabel methods are invoked
at domain startup/shutdown to relabel resources associated with
a domain. This works fine with QEMU, but with LXC they are in
fact both currently no-ops since LXC does not support disks,
hostdevs, or kernel/initrd files. Worse, when LXC gains support
for disks/hostdevs, they will do the wrong thing, since they
run in host context, not container context. Thus this patch
turns then into a formal no-op when used with LXC. The LXC
controller will call out to specific security manager labelling
APIs as required during startup.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 89c5a9d0e83306eef0d73af5cfb32cb49d533afc)
Commit id a994ef2d1 changed the mechanism to store/update the default
security label from using disk->seclabels[0] to allocating one on the
fly. That change allocated the label, but never saved it. This patch
will save the label. The new virDomainDiskDefAddSecurityLabelDef() is
a copy of the virDomainDefAddSecurityLabelDef().
(cherry picked from commit 05cc03518987fa0f8399930d14c1d635591ca49b)
Conflicts:
src/conf/domain_conf.h
This patch resolves CVE-2013-0170:
https://bugzilla.redhat.com/show_bug.cgi?id=893450
When reading and dispatching of a message failed the message was freed
but wasn't removed from the message queue.
After that when the connection was about to be closed the pointer for
the message was still present in the queue and it was passed to
virNetMessageFree which tried to call the callback function from an
uninitialized pointer.
This patch removes the message from the queue before it's freed.
* rpc/virnetserverclient.c: virNetServerClientDispatchRead:
- avoid use after free of RPC messages
(cherry picked from commit 46532e3e8ed5f5a736a02f67d6c805492f9ca720)
https://bugzilla.redhat.com/show_bug.cgi?id=903184
Commit id f8ab364c removed ability to run this driver unprivileged. Coverity
detected the check and flagged it.
(cherry picked from commit aafe41971cc3f4a189edf5b322f399aabd869d74)
Conflicts:
src/nwfilter/nwfilter_driver.c - whitespace changes in 1c04f99 not present
https://bugzilla.redhat.com/show_bug.cgi?id=903184
Although the nwfilter driver skips startup when running in a
session libvirtd, it did not skip reload or shutdown. This
caused errors to be reported when sending SIGHUP to libvirtd,
and caused an abort() in libdbus on shutdown due to trying
to remove a dbus filter that was never added
(cherry picked from commit abbec81bd0c9bf917f2c63045222734d7e4411fb)
Conflicts:
src/nwfilter/nwfilter_driver.c - earlier changes f4ea67f and
79b8a56 related to using bool and auto-shutdown of drivers are not backported
When running virDomainDestroy, we need to make sure that no other
background thread cleans up the domain while we're doing our work.
This can happen if we release the domain object while in the
middle of work, because the monitor might detect EOF in this window.
For this reason we have a 'beingDestroyed' flag to stop the monitor
from doing its normal cleanup. Unfortunately this flag was only
being used to protect qemuDomainBeginJob, and not qemuProcessKill
This left open a race condition where either libvirtd could crash,
or alternatively report bogus error messages about the domain already
having been destroyed to the caller
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 81621f3e6e45e8681cc18ae49404736a0e772a11)
Conflicts:
src/qemu/qemu_driver.c - virReportError had been removed from
upstream in cases where qemuProcessKill failed, creating
different context.
When building libvirt rpms on rhel5, I got the following error:
File must begin with "/": rm
File must begin with "/": -f
File must begin with "/": $RPM_BUILD_ROOT/etc/sysctl.d/libvirtd
Installed (but unpackaged) file(s) found:
/etc/sysctl.d/libvirtd
It is triggerd by the %files list of libvirt daemon:
%if 0%{?fedora} >= 14 || 0%{?rhel} >= 6
%config(noreplace) %{_prefix}/lib/sysctl.d/libvirtd.conf
%else
rm -f $RPM_BUILD_ROOT%{_prefix}/lib/sysctl.d/libvirtd.conf
%endif
After checking document of rpm spec file, I think it would be better
to move the file deleting line from %files list to %install script.
Bug introduced in commit a1fd56c.
(cherry picked from commit daef7c9e9c5abef65e77116a1cabad37c0c0a897)
In a non-systemd environment the post and preun scripts of libvirt-client
fail, since the required files are in libvirt-daemon. Moved them to client.
Doing that I noticed %{_unitdir}/libvirt-guests.service was contained in
both libvirt-client and libvirt-daemon, which I don't think was intended.
Removed the extra copy from daemon.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
(cherry picked from commit b7159dca8b1b653e342584fb33225190bb772fd8)
Conflicts:
libvirt.spec.in - no virtlockd service
Currently, if there's no hard memory limit defined for a domain,
libvirt tries to calculate one, based on domain definition and magic
equation and set it upon the domain startup. The rationale behind was,
if there's a memory leak or exploit in qemu, we should prevent the
host system trashing. However, the equation was too tightening, as it
didn't reflect what the kernel counts into the memory used by a
process. Since many hosts do have a swap, nobody hasn't noticed
anything, because if hard memory limit is reached, process can
continue allocating memory on a swap. However, if there is no swap on
the host, the process gets killed by OOM killer. In our case, the qemu
process it is.
To prevent this, we need to relax the hard RSS limit. Moreover, we
should reflect more precisely the kernel way of accounting the memory
for process. That is, even the kernel caches are counted within the
memory used by a process (within cgroups at least). Hence the magic
equation has to be changed:
limit = 1.5 * (domain memory + total video memory) + (32MB for cache
per each disk) + 200MB
(cherry picked from commit 3c83df679e8feab939c08b1f97c48f9290a5b8cd)
This is an adjustment to the fix for
https://bugzilla.redhat.com/show_bug.cgi?id=889319
to account for two bonehead mistakes I made.
commit ac2797cf2af2fd0e64c58a48409a8175d24d6f86 attempted to fix a
problem with netlink in newer kernels requiring an extra attribute
with a filter flag set in order to receive an IFLA_VFINFO_LIST from
netlink. Unfortunately, the #ifdef that protected against compiling it
in on systems without the new flag went a bit too far, assuring that
the new code would *never* be compiled, and even if it had, the code
was incorrect.
The first problem was that, while some IFLA_* enum values are also
their existence at compile time, IFLA_EXT_MASK *isn't* #defined, so
checking to see if it's #defined is not a valid method of determining
whether or not to add the attribute. Fortunately, the flag that is
being set (RTEXT_FILTER_VF) *is* #defined, and it is never present if
IFLA_EXT_MASK isn't, so it's sufficient to just check for that flag.
And to top it off, due to the code not actually compiling when I
thought it did, I didn't realize that I'd been given the wrong arglist
to nla_put() - you can't just send a const value to nla_put, you have
to send it a pointer to memory containing what you want to add to the
message, along with the length of that memory.
This time I've actually sent the patch over to the other machine
that's experiencing the problem, applied it to the branch being used
(0.10.2) and verified that it works properly, i.e. it does fix the
problem it's supposed to fix. :-/
(cherry picked from commit 7c36650699f33e54361720f824efdf164bc6e65d)
This patch fixes the lack of error messages when libvirt fails to find
VFINFO in a returned netlinke response message.
https://bugzilla.redhat.com/show_bug.cgi?id=827519#c10 is an example
of the error message that was previously logged when the
IFLA_VFINFO_LIST object was missing from the netlink response. The
reason for this failure is detailed in
https://bugzilla.redhat.com/show_bug.cgi?id=889319
Even though that root problem has been fixed, the experience of
finding the root cause shows us how important it is to properly log an
error message in these cases. This patch *seems* to replace the entire
function, but really most of the changes are due to moving code that
was previously inside an if() statement out to the top level of the
function (the original if() was reversed and made to log an error and
return).
(cherry picked from commit 846770e5ff959f7819e2c32857598cb88e2e2f0e)
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=889319
When assigning an SRIOV virtual function to a guest using "intelligent
PCI passthrough" (<interface type='hostdev'>, which sets the MAC
address and vlan tag of the VF before passing its info to qemu),
libvirt first learns the current MAC address and vlan tag by sending
an NLM_F_REQUEST message for the VF's PF (physical function) to the
kernel via a NETLINK_ROUTE socket (see virNetDevLinkDump()); the
response message's IFLA_VFINFO_LIST section is examined to extract the
info for the particular VF being assigned.
This worked fine with kernels up until kernel commit
115c9b81928360d769a76c632bae62d15206a94a (first appearing in upstream
kernel 3.3) which changed the ABI to not return IFLA_VFINFO_LIST in
the response until a newly introduced IFLA_EXT_MASK field was included
in the request, with the (newly introduced, of course) RTEXT_FILTER_VF
flag set.
The justification for this ABI change was that new fields had been
added to the VFINFO, causing NLM_F_REQUEST messages to fail on systems
with large numbers of VFs if the requesting application didn't have a
large enough buffer for all the info. The idea is that most
applications doing an NLM_F_REQUEST don't care about VFINFO anyway, so
eliminating it from the response would lower the requirements on
buffer size. Apparently, the people who pushed this patch made the
mistaken assumption that iproute2 (the "ip" command) was the only
package that used IFLA_VFINFO_LIST, so it wouldn't break anything else
(and they made sure that iproute2 was fixed.
The logic of this "fix" is debatable at best (one could claim that the
proper fix would be for the applications in question to be fixed so
that they properly sized the buffer, which is what libvirt does
(purely by virtue of using libnl), but it is what it is and we have to
deal with it.
In order for <interface type='hostdev'> to work properly on systems
with a kernel 3.3 or later, libvirt needs to add the afore-mentioned
IFLA_EXT_MASK field with RTEXT_FILTER_VF set.
Of course we also need to continue working on systems with older
kernels, so that one bit of code is compiled conditionally. The one
time this could cause problems is if the libvirt binary was built on a
system without IFLA_EXT_MASK which was subsequently updated to a
kernel that *did* have it. That could be solved by manually providing
the values of IFLA_EXT_MASK and RTEXT_FILTER_VF and adding it to the
message anyway, but I'm uncertain what that might actually do on a
system that didn't support the message, so for the time being we'll
just fail in that case (which will very likely never happen anyway).
(cherry picked from commit ac2797cf2af2fd0e64c58a48409a8175d24d6f86)
The first two hunks fix "Unterminated I<...> sequence" error and the
last one fixes "’=item’ outside of any ’=over’" error.
(cherry picked from commit 61299a1c983a64c7e0337b94232fdd2d42c1f4f2)
https://bugzilla.redhat.com/show_bug.cgi?id=887017 reports that
even though libvirt attempts to set fs.aio-max-nr via sysctl,
the file was installed with the wrong name and gets ignored by
sysctl. Furthermore, 'man systcl.d' recommends that packages
install into hard-coded /usr/lib/sysctl.d (even when libdir is
/usr/lib64), so that sysadmins can use /etc/sysctl.d for overrides.
* daemon/Makefile.am (install-sysctl, uninstall-sysctl): Use
correct location.
* libvirt.spec.in (network_files): Reflect this.
(cherry picked from commit a1fd56cb3057c45cffbf5d41eaf70a26d2116b20)
See also commit 66ff2dd, where we avoided installing these files
as executables.
* daemon/Makefile.am (libvirtd.service): Drop chmod.
* tools/Makefile.am (libvirt-guests.service): Likewise.
* src/Makefile.am (virtlockd.service, virtlockd.socket):
Likewise.
(cherry picked from commit 5ec4b22b777b4505d159c6e8d1631d4d774a7be7)
Conflicts:
src/Makefile.am - virtlockd.service not present in 0.10.2
We had several different styles of .in conversion in our Makefiles:
ALLCAPS, @ALLCAPS@, @lower@, ::lower::
Canonicalize on one form, to make it easier to copy and paste
between .in files.
Also, we were using some non-portable sed constructs: \@ is an
undefined escape sequence (it happens to be @ itself in GNU sed,
but POSIX allows it to mean something else), as well as risky
behavior (failure to consistently quote things means a space
in $(sysconfdir) could throw things off; also, Autoconf recommends
using | rather than , or ! in the s||| operator, because | has to
be quoted in shell and is therefore less likely to appear in file
names than , or !).
Fix all of these uses to follow the same syntax.
* daemon/libvirtd.8.in: Switch to @var@.
* tools/virt-xml-validate.in: Likewise.
* tools/virt-pki-validate.in: Likewise.
* src/locking/virtlockd.init.in: Likewise.
* daemon/Makefile.am: Prefer | over ! in sed.
(libvirtd.8): Prefer consistent substitution.
(libvirtd.init, libvirtd.service): Avoid non-portable sed.
* tools/Makefile.am (libvirt-guests.sh, libvirt-guests.init)
(libvirt-guests.service): Likewise.
(virt-xml-validate, virt-pki-validate, virt-sanlock-cleanup):
Prefer consistent capitalization.
* src/Makefile.am (virtlockd.init, virtlockd.service)
(virtlockd.socket): Prefer consistent substitution.
(cherry picked from commit 462a69621e232c83990dbe6a711326b671262d47)
Conflicts:
daemon/Makefile.am - drop files not present in 0.10.2
src/Makefile.am - likewise
src/locking/virtlockd.init.in - likewise