https://bugzilla.redhat.com/show_bug.cgi?id=951637
Newer gnutls uses nettle, rather than gcrypt, which is a lot nicer
regarding initialization. Yet we were unconditionally initializing
gcrypt even when gnutls wouldn't be using it, and having two crypto
libraries linked into libvirt.so is pointless, but mostly harmless
(it doesn't crash, but does interfere with certification efforts).
There are three distinct version ranges to worry about when
determining which crypto lib gnutls uses, per these gnutls mails:
2.12: http://lists.gnu.org/archive/html/gnutls-devel/2011-03/msg00034.html
3.0: http://lists.gnu.org/archive/html/gnutls-devel/2011-07/msg00035.html
If pkg-config can prove version numbers and/or list the crypto
library used for static linking, we have our proof; if not, it
is safer (even if pointless) to continue to use gcrypt ourselves.
* configure.ac (WITH_GNUTLS): Probe whether to add -lgcrypt, and
define a witness WITH_GNUTLS_GCRYPT.
* src/libvirt.c (virTLSMutexInit, virTLSMutexDestroy)
(virTLSMutexLock, virTLSMutexUnlock, virTLSThreadImpl)
(virGlobalInit): Honor the witness.
* libvirt.spec.in (BuildRequires): Make gcrypt usage conditional,
no longer needed in Fedora 19.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit d72ef888 introduced a bug in the libxl driver that will
segfault libvirtd if libxl reports an error message, e.g. when
attempting to initialize the driver on a non-Xen system. I
assumed it was valid to pass a NULL logger to libxl_ctx_alloc(),
but that is not the case since any errors associated with the ctx
that are emitted by libxl will dereference the logger and crash
libvirtd.
Errors associated with the libxl driver-wide ctx could be useful
for debugging anyway, so create a 'libxl-driver.log' to capture
these errors.
Recentish (2011) kernels introduced a new device called /dev/loop-control,
which causes libvirt's detection of loop devices to get confused
since it only checks for a prefix of 'loop'. Also check that the
next character is a digit
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This adds two new pages to the website, acl.html describing
the general access control framework and permissions models,
and aclpolkit.html describing the use of polkit as an
access control driver.
page.xsl is modified to support a new syntax
<div id="include" filename="somefile.htmlinc"/>
which will cause the XSL transform to replace that <div>
with the contents of 'somefile.htmlinc'. We use this in
the acl.html.in file, to pull the table of permissions
for each libvirt object. This table is autogenerated
from the enums in src/access/viraccessperms.h by the
genaclperms.pl script.
newapi.xsl is modified so that the list of permissions
checks shown against each API will link to the description
of the permissions in acl.html
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The gendispatch.pl script puts comments at the top of files
it creates, saying that it auto-generated them. Also include
the name of the source data file which it reads when doing
the auto-generation.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
introduced by cs 4b9eec50fe ("libxl: implement per
NUMA node free memory reporting"). What was wrong was that
libxl_get_numainfo() put in nr_nodes the actual number of
host NUMA nodes, not the highest node ID (like libnuma's
numa_max_node() does instead).
While at it, turn the failure of libxl_get_numainfo() from
a simple warning to a proper error, as requested during the
review of another patch of the original series.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
This is a second attempt at fixing the problem first attempted
in commit 2df8d99; basically undoing the fact that it was
reverted in commit 43cee32f, plus fixing two more issues: the
code in configure.ac has to EXACTLY match virnetdevbridge.c
with regards to declaring in6 types before using if_bridge.h,
and the fact that RHEL 5 has even more conflicts:
In file included from util/virnetdevbridge.c:49:
/usr/include/linux/in6.h:47: error: conflicting types for 'in6addr_any'
/usr/include/netinet/in.h:206: error: previous declaration of 'in6addr_any' was here
/usr/include/linux/in6.h:49: error: conflicting types for 'in6addr_loopback'
/usr/include/netinet/in.h:207: error: previous declaration of 'in6addr_loopback' was here
The rest of this commit message borrows from the original try
of 2df8d99:
A fresh checkout on a RHEL 6 machine with these packages:
kernel-headers-2.6.32-405.el6.x86_64
glibc-2.12-1.128.el6.x86_64
failed to configure with this message:
checking for linux/if_bridge.h... no
configure: error: You must install kernel-headers in order to compile libvirt with QEMU or LXC support
Digging in config.log, we see that the problem is identical to
what we fixed earlier in commit d12c2811:
configure:98831: checking for linux/if_bridge.h
configure:98853: gcc -std=gnu99 -c -g -O2 conftest.c >&5
In file included from /usr/include/linux/if_bridge.h:17,
from conftest.c:559:
/usr/include/linux/in6.h:31: error: redefinition of 'struct in6_addr'
/usr/include/linux/in6.h:48: error: redefinition of 'struct sockaddr_in6'
/usr/include/linux/in6.h:56: error: redefinition of 'struct ipv6_mreq'
configure:98860: $? = 1
I had not hit it earlier because I was using incremental builds,
where config.cache had shielded me from the kernel-headers breakage.
* configure.ac (if_bridge.h): Avoid conflicting type definitions.
* src/util/virnetdevbridge.c (includes): Also sanitize for RHEL 5.
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, only one log file is created by the libxl driver, with
all output from libxl for all domains going to this one file.
Create a per-domain log file based on domain name, making sifting
through the logs a bit easier. This required deferring libxl_ctx
allocation until starting the domain, which is fine since the
ctx is not used when the domain is inactive.
Tested-by: Dario Faggioli <dario.faggioli@citrix.com>
The virtlockd daemon supports an /etc/libvirt/virtlockd.conf
config file, but we never installed a default config, nor
created any augeas scripts. This change addresses that omission.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Coverity complained about the usage of the uninitialized cacerts in the
event(s) that "access(certFile, R_OK)" and/or "access(cacertFile, R_OK)"
fail the for loop used to fill in the certs will have indeterminate data
as well as the possibility that both failures would result in the
gnutls_x509_crt_deinit() call having a similar fate.
Initializing cacerts only would resolve the issue; however, it still
would leave the indeterminate action, so rather add a parameter to
the virNetTLSContextLoadCACertListFromFile() to pass the max size rather
then overloading the returned count parameter. If the the call is never
made, then we won't go through the for loops referencing the empty
cacerts
Valgrind defects memory error:
==16759== 1 errors in context 1 of 8:
==16759== Invalid free() / delete / delete[] / realloc()
==16759== at 0x4A074C4: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==16759== by 0x83CD329: xdr_string (in /usr/lib64/libc-2.17.so)
==16759== by 0x4D93E4D: xdr_remote_nonnull_string (remote_protocol.c:31)
==16759== by 0x4D94350: xdr_remote_nonnull_domain (remote_protocol.c:58)
==16759== by 0x4D976C8: xdr_remote_domain_create_with_flags_ret (remote_protocol.c:1762)
==16759== by 0x83CC734: xdr_free (in /usr/lib64/libc-2.17.so)
==16759== by 0x4D7F1E0: remoteDomainCreateWithFlags (remote_driver.c:2441)
==16759== by 0x4D4BF17: virDomainCreateWithFlags (libvirt.c:9499)
==16759== by 0x13127A: cmdStart (virsh-domain.c:3376)
==16759== by 0x12BF83: vshCommandRun (virsh.c:1751)
==16759== by 0x126FFB: main (virsh.c:3205)
==16759== Address 0xe1394a0 is not stack'd, malloc'd or (recently) free'd
==16759== 1 errors in context 2 of 8:
==16759== Conditional jump or move depends on uninitialised value(s)
==16759== at 0x4A07477: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==16759== by 0x83CD329: xdr_string (in /usr/lib64/libc-2.17.so)
==16759== by 0x4D93E4D: xdr_remote_nonnull_string (remote_protocol.c:31)
==16759== by 0x4D94350: xdr_remote_nonnull_domain (remote_protocol.c:58)
==16759== by 0x4D976C8: xdr_remote_domain_create_with_flags_ret (remote_protocol.c:1762)
==16759== by 0x83CC734: xdr_free (in /usr/lib64/libc-2.17.so)
==16759== by 0x4D7F1E0: remoteDomainCreateWithFlags (remote_driver.c:2441)
==16759== by 0x4D4BF17: virDomainCreateWithFlags (libvirt.c:9499)
==16759== by 0x13127A: cmdStart (virsh-domain.c:3376)
==16759== by 0x12BF83: vshCommandRun (virsh.c:1751)
==16759== by 0x126FFB: main (virsh.c:3205)
==16759== Uninitialised value was created by a stack allocation
==16759== at 0x4D7F120: remoteDomainCreateWithFlags (remote_driver.c:2423)
How to reproduce?
# virsh start <domain> --paused
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=994855
Signed-off-by: Alex Jia <ajia@redhat.com>
If securityfs is available on the host, we should ensure to
mount it read-only in the container. This will avoid systemd
trying to mount it during startup causing SELinux AVCs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Hotplugging a single SCSI device works, but adding additional ones
result in an error from QEMU:
[root@gpok197 ~]# virsh attach-device guest01 blah.xml
Device attached successfully
[root@gpok197 ~]# virsh attach-device guest01 blah2.xml
error: Failed to attach device from blah2.xml
error: internal error unable to execute QEMU command 'device_add': Duplicate ID 'hostdev0' for device
The hostdev ID that is created is always set to zero, regardless
of the contents of the XML. Changing the index in the hotplug case
to a negative one so the next available index is used.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
So that app developers / admins know what access control checks
are performed for each API, this patch extends the API docs
generator to include details of the ACLs for each.
The gendispatch.pl script is extended so that it generates
a simple XML describing ACL rules, eg.
<aclinfo>
...
<api name='virConnectNumOfDomains'>
<check object='connect' perm='search_domains'/>
<filter object='domain' perm='getattr'/>
</api>
<api name='virDomainAttachDeviceFlags'>
<check object='domain' perm='write'/>
<check object='domain' perm='save' flags='!VIR_DOMAIN_AFFECT_CONFIG|VIR_DOMAIN_AFFECT_LIVE'/>
<check object='domain' perm='save' flags='VIR_DOMAIN_AFFECT_CONFIG'/>
</api>
...
</aclinfo>
The newapi.xsl template loads the XML files containing the ACL
rules and generates a short block of HTML for each API describing
the parameter checks and return value filters (if any).
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The code added to validate CA certificates did not take into
account the possibility that the cacert.pem file can contain
multiple (concatenated) cert data blocks. Extend the code for
loading CA certs to use the gnutls APIs for loading cert lists.
Add test cases to check that multi-level trees of certs will
validate correctly.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 3d0e3c1 reintroduced a problem previously squelched in
commit 7e5aa78. Add a syntax check this time around.
util/virutil.c: In function 'virGetGroupList':
util/virutil.c:1015: error: 'for' loop initial declaration used outside C99 mode
* cfg.mk (sc_prohibit_loop_var_decl): New rule.
* src/util/virutil.c (virGetGroupList): Fix offender.
Signed-off-by: Eric Blake <eblake@redhat.com>
Before, missing attributes were only OK when adding entries;
modification and deletion required all of them.
Now, only deletion works with missing attributes, as long as
the host is uniquely identified.
Go through disks of guest, if one disk doesn't exist or its backing
chain is broken, with 'optional' startupPolicy, for CDROM and Floppy
we only discard its source path definition in xml, for disks we drop
it from disk list and free it.
Since iptables version 1.4.16 '-m state --state NEW' is converted to
'-m conntrack --ctstate NEW'. Therefore, when encountering this or later
versions of iptables use '-m conntrack --ctstate'.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The change from initgroups to virGetGroupList/setgroups in
cab36cfe71ba83b71e536ba5c98e596f02b697b0 dropped the primary group from
processes group list iff the passed in group to virGetGroupList differs
from the user's primary group.
So always include the primary group to bring back the old behaviour.
Debian has the kvm group as primary group but uses
libvirt-qemu:libvirt-qemu as user:group to run the kvm process so
without this change the /dev/kvm is inaccessible.
Since commit 95e18efd most public interfaces (xenUnified...) obtain
a virDomainDefPtr via xenGetDomainDefFor...() which take the unified
lock.
This is already taken before calling xenDomainUsedCpus(), so we get
a deadlock for active guests. Avoid this by splitting up
xenUnifiedDomainGetVcpusFlags() and xenUnifiedDomainGetVcpus() into
public and private function calls (which get the virDomainDefPtr passed)
and use those in xenDomainUsedCpus().
xenDomainUsedCpus
...
nb_vcpu = xenUnifiedDomainGetMaxVcpus(dom);
return xenUnifiedDomainGetVcpusFlags(...)
...
if (!(def = xenGetDomainDefForDom(dom)))
return xenGetDomainDefForUUID(dom->conn, dom->uuid);
...
ret = xenHypervisorLookupDomainByUUID(conn, uuid);
...
xenUnifiedLock(priv);
name = xenStoreDomainGetName(conn, id);
xenUnifiedUnlock(priv);
...
if ((ncpus = xenUnifiedDomainGetVcpus(dom, cpuinfo, nb_vcpu,
...
if (!(def = xenGetDomainDefForDom(dom)))
[again like above]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
This patch addresses two concerns with the error reporting when an
incompatible PCI address is specified for a device:
1) It wasn't always apparent which device had the problem. With this
patch applied, any error about an incompatible address will always
contain the full address as given in the config, so it will be easier
to determine which device's config aused the problem.
2) In some cases when the problem came from bad config, the error
message was erroneously classified as VIR_ERR_INTERNAL_ERROR. With
this patch applied, the same error message will be changed to indicate
either "internal" or "xml" error depending on whether the address came
from the config, or was automatically generated by libvirt.
Note that in the case of "internal" (due to bad auto-generation)
errors, the PCI address won't be of much use in finding the location
in config to change (because it was automatically generated). Of
course that makes perfect sense, but still the address could provide a
clue about a bug in libvirt attempting to use a type of pci bus that
doesn't have its flags set correctly (or something similar). In other
words, it's not perfect, but it is definitely better.
q35 machines have an implicit ahci (sata) controller at 00:1F.2 which
has no "id" associated with it. For this reason, we can't refer to it
as "ahci0". Instead, we don't give an id on the commandline, which
qemu interprets as "use the first ahci controller". We then need to
specify the unit with "unit=%d" rather than adding it onto the bus
arg.
https://bugzilla.redhat.com/show_bug.cgi?id=979477
Since 1.0.3 we are using the new way to copy non shared storage during
migration (the NBD way). However, whether the new or old way is used is
not controllable by user but unconditionally turned on if both sides of
migration support it. Moreover, the implementation is not complete: the
combination for VIR_MIGRATE_TUNNELLED flag is missing (as we need to
open new port on the destination) in which case we just error out. This
is a deadly combination: not letting users choose their destiny and
erroring out. We should not do that but VIR_WARN and turn the NBD off
instead.
We had been setting the device alias in the devinceinfo for pci
controllers to "pci%u", but then hardcoding "pci.%u" when creating the
device address for other devices using that pci bus. This all worked
just fine until we encountered the built-in "pcie.0" bus (the PCIe
root complex) in Q35 machines.
In order to create the correct commandline for this one case, this
patch:
1) sets the alias for PCI controllers correctly, to "pci.%u" (or
"pcie.%u" for the pcie-root controller)
2) eliminates the hardcoded "pci.%u" for pci controllers when
generatuing device address strings, and instead uses the controller's
alias.
3) plumbs a pointer to the virDomainDef all the way down to
qemuBuildDeviceAddressStr. This was necessary in order to make the
aliase of the controller *used by a device* available (previously
qemuBuildDeviceAddressStr only had the deviceinfo of the device
itself, *not* of the controller it was connecting to). This made for a
larger than desired diff, but at least in the future we won't have to
do it again, since all the information we could possibly ever need for
future enhancements is in the virDomainDef. (right?)
This should be done for *all* controllers, but for now we just do it
in the case of PCI controllers, to reduce the likelyhood of
regression.
This patch adds in special handling for a few devices that need to be
treated differently for q35 domains:
usb - there is no implicit/default usb controller for the q35
machinetype. This is done because normally the default usb controller
is added to a domain by just adding "-usb" to the qemu commandline,
and it's assumed that this will add a single piix3 usb1 controller at
slot 1 function 2. That's not what happens when the machinetype is
q35, though. Instead, adding -usb to the commandline adds 3 usb
(version 2) controllers to the domain at slot 0x1D.{1,2,7}. Rather
than having
<controller type='usb' index='0'/>
translate into 3 separate devices on the PCI bus, it's cleaner to not
automatically add a default usb device; one can always be added
explicitly if desired. Or we may decide that on q35 machines, 3 usb
controllers will be automatically added when none is given. But for
this initial commit, at least we aren't locking ourselves into
something we later won't want.
video - qemu always initializes the primary video device immediately
after any integrated devices for the machinetype. Unless instructed
otherwise (by using "-device vga..." instead of "-vga" which libvirt
uses in many cases to work around deficiencies and bugs in various
qemu versions) qemu will always pick the first unused slot. In the
case of the "pc" machinetype and its derivatives, this is always slot
2, but on q35 machinetypes, the first free slot is slot 1 (since the
q35's integrated peripheral devices are placed in other slots,
e.g. slot 0x1f). In order to make the PCI address of the video device
predictable, that slot (1 or 2, depending on machinetype) is reserved
even when no video device has been specified.
sata - a q35 machine always has a sata controller implicitly added at
slot 0x1F, function 2. There is no way to avoid this controller, so we
always add it. Note that the xml2xml tests for the pcie-root and q35
cases were changed to use DO_TEST_DIFFERENT() so that we can check for
the sata controller being automatically added. This is especially
important because we can't check for it in the xml2argv output (it has
no effect on that output since it's an implicit device).
ide - q35 has no ide controllers.
isa and smbus controllers - these two are always present in a q35 (at
slot 0x1F functions 0 and 3) but we have no way of modelling them in
our config. We do need to reserve those functions so that the user
doesn't attempt to put anything else there though. (note that the "pc"
machine type also has an ISA controller, which we also ignore).
This PCI controller, named "dmi-to-pci-bridge" in the libvirt config,
and implemented with qemu's "i82801b11-bridge" device, connects to a
PCI Express slot (e.g. one of the slots provided by the pcie-root
controller, aka "pcie.0" on the qemu commandline), and provides 31
*non-hot-pluggable* PCI (*not* PCIe) slots, numbered 1-31.
Any time a machine is defined which has a pcie-root controller
(i.e. any q35-based machinetype), libvirt will automatically add a
dmi-to-pci-bridge controller if one doesn't exist, and also add a
pci-bridge controller. The reasoning here is that any useful domain
will have either an immediate (startup time) or eventual (subsequent
hot-plug) need for a standard PCI slot; since the pcie-root controller
only provides PCIe slots, we need to connect a dmi-to-pci-bridge
controller to it in order to get a non-hot-plug PCI slot that we can
then use to connect a pci-bridge - the slots provided by the
pci-bridge will be both standard PCI and hot-pluggable.
Since pci-bridge devices themselves can not be hot-plugged into a
running system (although you can hot-plug other devices into a
pci-bridge's slots), any new pci-bridge controller that is added can
(and will) be plugged into the dmi-to-pci-bridge as long as it has
empty slots available.
This patch is also changing the qemuxml2xml-pcie test from a "DO_TEST"
to a "DO_DIFFERENT_TEST". This is so that the "before" xml can omit
the automatically added dmi-to-pci-bridge and pci-bridge devices, and
the "after" xml can include it - this way we are testing if libvirt is
properly adding these devices.
This controller is implicit on q35 machinetypes. It provides 31 PCIe
(*not* PCI) slots as controller 0.
Currently there are no devices that can connect to pcie-root, and no
implicit pci controller on a q35 machine, so q35 is still
unusable. For a usable q35 system, we need to add a
"dmi-to-pci-bridge" pci controller, which can connect to pcie-root,
and provides standard pci slots that can be used to connect other
devices.
Previous refactoring of the guest PCI address reservation/allocation
code allowed for slot types other than basic PCI (e.g. PCI express,
non-hotpluggable slots, etc) but would not auto-allocate a slot for a
device that required any type other than a basic hot-pluggable
PCI slot.
This patch refactors the code to be aware of different slot types
during auto-allocation of addresses as well - as long as there is an
empty slot of the required type, it will be found and used.
The piece that *wasn't* added is that we don't auto-create a new PCI
bus when needed for anything except basic PCI devices. This is because
there are multiple different types of controllers that can provide,
for example, a PCI express slot (in addition to the pcie-root
controller, these can also be found on a "root-port" or on a
"downstream-switch-port"). Since we currently don't support any PCIe
devices (except pending support for dmi-to-pci-bridge), we can defer
any decision on what to do about this.
Commit 632180d1 introduced memory corruption in xenDaemonListDefinedDomains
by starting to populate the names array at index -1, causing all sorts
of havoc in libvirtd such as aborts like the following
*** Error in `/usr/sbin/libvirtd': double free or corruption (out): 0x00007fffe00ccf20 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7abf6)[0x7ffff3fa0bf6]
/lib64/libc.so.6(+0x7b973)[0x7ffff3fa1973]
/lib64/libc.so.6(xdr_array+0xde)[0x7ffff403cbae]
/usr/sbin/libvirtd(+0x50251)[0x5555555a4251]
/lib64/libc.so.6(xdr_free+0x15)[0x7ffff403ccd5]
/usr/lib64/libvirt.so.0(+0x1fad34)[0x7ffff76b1d34]
/usr/lib64/libvirt.so.0(virNetServerProgramDispatch+0x1fc)[0x7ffff76b16f1]
/usr/lib64/libvirt.so.0(+0x1f214a)[0x7ffff76a914a]
/usr/lib64/libvirt.so.0(+0x1f222d)[0x7ffff76a922d]
/usr/lib64/libvirt.so.0(+0xbcc4f)[0x7ffff7573c4f]
/usr/lib64/libvirt.so.0(+0xbc5e5)[0x7ffff75735e5]
/lib64/libpthread.so.0(+0x7e0f)[0x7ffff48f7e0f]
/lib64/libc.so.6(clone+0x6d)[0x7ffff400e7dd]
Fix by initializing ret to 0 and only setting to error on failure path.
This configuration knob lets user to set the length of queue of
connection requests waiting to be accept()-ed by the daemon. IOW, it
just controls the @backlog passed to listen:
int listen(int sockfd, int backlog);
Currently, even if max_client limit is hit, we accept() incoming
connection request, but close it immediately. This has disadvantage of
not using listen() queue. We should accept() only those clients we
know we can serve and let all other wait in the (limited) queue.
* The functions qemuDomainPCIAddressReserveAddr and
qemuDomainPCIAddressReserveSlot were very similar (and should have
been more similar) and were about to get more code added to them which
would create even more duplicated code, so this patch gives
qemuDomainPCIAddressReserveAddr a "reserveEntireSlot" arg, then
replaces the body of qemuDomainPCIAddressReserveSlot with a call to
qemuDomainPCIAddressReserveAddr.
You will notice that addrs->lastaddr was previously set in
qemuDomainPCIAddressReserveAddr (but *not* set in
qemuDomainPCIAddressReserveSlot). For consistency and cleanliness of
code, that bit was removed and put into the one caller of
qemuDomainPCIAddressReserveAddr (there is a similar place where the
caller of qemuDomainPCIAddressReserveSlot sets lastaddr). This does
guarantee identical functionality to pre-patch code, but in practice
isn't really critical, because lastaddr is just keeping track of where
to start when looking for a free slot - if it isn't updated, we will
just start looking on a slot that's already occupied, then skip up to
one that isn't.
* qemuCollectPCIAddress was essentially doing the same thing as
qemuDomainPCIAddressReserveAddr, but with some extra special case
checking at the beginning. The duplicate code has been replaced with
a call to qemuDomainPCIAddressReserveAddr. This required adding a
"fromConfig" boolean, which is only used to change the log error
code from VIR_ERR_INTERNAL_ERROR (when the address was
auto-generated by libvirt) to VIR_ERR_XML_ERROR (when the address is
coming from the config); without this differentiation, it would be
difficult to tell if an error was caused by something wrong in
libvirt's auto-allocate code or just bad config.
* the bit of code in qemuDomainPCIAddressValidate that checks the
connect type flags is going to be used in a couple more places where
we don't need to also check the slot limits (because we're generating
the slot number ourselves), so that has been pulled out into a
separate qemuDomainPCIAddressFlagsCompatible function.
* qemuDomainPCIAddressSetNextAddr
The name of this function was confusing because 1) other functions in
the file that end in "Addr" are only operating on a single function of
one PCI slot, not the entire slot, while functions that do something
with the entire slot end in "Slot", and 2) it didn't contain a verb
describing what it is doing (the "Set" refers to the set that contains
all PCI buses in the system, used to keep track of which slots in
which buses are already reserved for use).
It is now renamed to qemuDomainPCIAddressReserveNextSlot, which more
clearly describes what it is doing. Arguably, it could have been
changed to qemuDomainPCIAddressSetReserveNextSlot, but 1) the word
"set" is confusing in this context because it could be intended as a
verb or as a noun, and 2) most other functions that operate on a
single slot or address within this set are also named
qemuDomainPCIAddress... rather than qemuDomainPCIAddressSet... Only
the Create, Free, and Grow functions for an address set (which modify the
entire set, not just one element) use "Set" in their name.
* qemuPCIAddressAsString, qemuPCIAddressValidate
All the other functions in this set are named
qemuDomainPCIAddressxxxxx, so I renamed these to be consistent.
The parser shouldn't be doing arch-specific things like adding in
implicit controllers to the config. This should instead be done in the
hypervisor's post-parse callback.
This patch removes the auto-add of a usb controller from the domain
parser, and puts it into the qemu driver's post-parse callback (just
as is already done with the auto-add of the pci-root controller). In
the future, any machine/arch that shouldn't have a default usb
controller added should just set addDefaultUSB = false in this
function.
We've recently seen that q35 and ARMV7L domains shouldn't get a default USB
controller, so I've set addDefaultUSB to false for both of those.
If upgrading from a libvirt that is older than 1.0.5, we can
not assume that vm->def->resource is non-NULL. This bogus
assumption caused libvirtd to crash
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The journald code would crash if a NULL was passed for the
filename / funcname in the logging code. This shouldn't
happen in general, but it is better to be safe, since there
have been bugs triggering this.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virLibConnError macros in libvirt-lxc.c and
libvirt-qemu.c were passing NULL for the filename.
This causes a crash if the logging code is configured
to use journald.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
* Move platform specific things (e.g. firewalling and route
collision checks) into bridge_driver_platform
* Create two platform specific implementations:
- bridge_driver_linux: Linux implementation using iptables,
it's actually the code moved from bridge_driver.c
- bridge_driver_nop: dumb implementation that does nothing
Signed-off-by: Eric Blake <eblake@redhat.com>
*src/util/virstoragefile.c: Add a helper function to get
the first name of missing backing files, if the name is NULL,
it means the diskchain is not broken.
*src/qemu/qemu_domain.c: qemuDiskChainCheckBroken(disk) to
check if its chain is broken
Refactor this function to make it focus on disk presence checking,
including diskchain checking, and not only for CDROM and Floppy.
This change is good for the following patches.
The virDomainDef is allocated by the caller and also used after
calling to xenDaemonCreateXML. So it must not get freed by the
callee.
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Make the virCgroupNewMachine method try to use systemd-machined
first. If that fails, then fallback to using the traditional
cgroup setup code path.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When systemd is involved in managing processes, it may start
killing off & tearing down croups associated with the process
while we're still doing virCgroupKillPainfully. We must
explicitly check for ENOENT and treat it as if we had finished
killing processes
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Systemd uses a named cgroup mount for tracking processes. Add
it as another type of controller, albeit one which we have to
special case in a number of places. In particular we must
never create/delete directories there, nor add tasks. Essentially
the systemd mount is to be considered read-only for libvirt.
With this change both the virCgroupDetectPlacement and
virCgroupCopyPlacement methods must be invoked. The copy
placement method will copy setup for resource controllers
only. The detect placement method will probe for any
named controllers, or resource controllers not already
setup.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are some interesting escaping rules to consider when dealing
with systemd slice/scope names. Thus it is helpful to have APIs
for formatting names
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Although this isn't apparently needed for the guest agent itself, the
test I will be adding later depends on the newline as a separator of
messages to process.
this patch introduce the console api in libxl driver for both pv and
hvm guest. and import and update the libxlMakeChrdevStr function
which was deleted in commit dfa1e1dd.
Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com>
This function is needed for virt-login-shell. Also modify virGirUserDirectory
to use the new function, to simplify the code.
Signed-off-by: Eric Blake <eblake@redhat.com>
The VIR_DOMAIN_PAUSED_GUEST_PANICKED constant is badly named,
leaking the QEMU event name. Elsewhere in the API we use
'CRASHED' rather than 'PANICKED', and the addition of 'GUEST'
is redundant since all events are guest related.
Thus rename it to VIR_DOMAIN_PAUSED_CRASHED, which matches
with VIR_DOMAIN_RUNNING_CRASHED and VIR_DOMAIN_EVENT_CRASHED.
It was added in commit 14e7e0ae8d
which post-dates v1.1.0, so is safe to rename before 1.1.1
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The VIR_DOMAIN_SHUTDOWN_CRASHED state constant does not appear
to be used in the QEMU code anyway. It also doesn't make much
(any) sense, since the 'shutdown' state is a transient state
between 'running' and 'shutoff' and when a guest crashes, it
does not end up in a 'shutdown' state, only 'shutoff'.
It was added in commit 14e7e0ae8d
which post-dates v1.1.0, so is safe to remove before 1.1.1
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The way we were casting small (<32bit) integers was broken
on big endian hosts, causing stack smashing. This was detected
in the test suite either by test failures due to incorrect
results, or by libc/gcc abort'ing with its stack canary
triggered.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Depending on the set of mingw packages installed, it is possible
that other .c files hit the mingw header pollution from the
virdbus.h file.
In file included from ../../src/rpc/virnetserver.c:39:0:
../../src/util/virdbus.h:41:35: error: expected ';', ',' or ')' before 'struct'
const char *interface,
^
* src/util/virdbus.h (virDBusCallMethod): Match .c file change.
Signed-off-by: Eric Blake <eblake@redhat.com>
On platforms without decent group support, the build failed:
Cannot export virGetGroupList: symbol not defined
./.libs/libvirt_security_manager.a(libvirt_security_manager_la-security_dac.o): In function `virSecurityDACPreFork':
/home/eblake/libvirt-tmp/build/src/../../src/security/security_dac.c:248: undefined reference to `virGetGroupList'
collect2: error: ld returned 1 exit status
* src/util/virutil.c (virGetGroupList): Provide dummy implementation.
Signed-off-by: Eric Blake <eblake@redhat.com>
Our recent conversion to make VIR_ALLOC report oom wasn't
tested on mingw:
In file included from ../../src/util/virthread.c:29:0:
../../src/util/virthreadwin32.c: In function 'virCondWait':
../../src/util/virthreadwin32.c:166:81: error: 'VIR_FROM_THIS' undeclared (first use in this function)
if (VIR_REALLOC_N(c->waiters, c->nwaiters + 1) < 0) {
^
* src/util/virthreadwin32.c (VIR_FROM_THIS): Define.
Signed-off-by: Eric Blake <eblake@redhat.com>
The previous patch was incomplete.
CC libvirt_util_la-vircgroup.lo
../../src/util/vircgroup.c:70:12: error: 'virCgroupPartitionEscape' declared 'static' but never defined [-Werror=unused-function]
static int virCgroupPartitionEscape(char **path);
^
* src/util/vircgroup.c (virCgroupPartitionEscape): Move forward
declaration inside conditional.
Signed-off-by: Eric Blake <eblake@redhat.com>
The virCgroupValidateMachineGroup method calls some functions
which are only conditionally compiled, thus it too must be
made conditional. This fixes the build on non-Linux hosts.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A VPATH build 'make check' was failing with:
GEN check-driverimpls
Can't open ../../src/../../src/lxc/lxc_monitor_protocol.h: No such file or directory at ../../src/check-driverimpls.pl line 29, <> line 27153.
Can't open ../../src/../../src/lxc/lxc_monitor_protocol.c: No such file or directory at ../../src/check-driverimpls.pl line 29, <> line 27153.
...
GEN check-aclrules
cannot read ../../src/../../src/remote/remote_protocol.x at ../../src/check-aclrules.pl line 128.
because $(srcdir) was being prepended to file names that already
included it.
* src/Makefile.am (check-driverimpls): Don't add srcdir twice.
Signed-off-by: Eric Blake <eblake@redhat.com>
When the legacy Xen driver probes with a NULL URI, and
finds itself running on Xen, it will set conn->uri. A
little bit later though it checks to see if libxl support
exists, and if so declines the driver. This leaves the
conn->uri set to 'xen:///', so if libxl also declines
it, it prevents probing of the QEMU driver.
Once a driver has set the conn->uri, it must *never*
decline an open request. So we must move the libxl
check earlier
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=981094
The commit 0ad9025ef introduce qemu flag QEMU_CAPS_DEVICE_VIDEO_PRIMARY
for using -device VGA, -device cirrus-vga, -device vmware-svga and
-device qxl-vga. In use, for -device qxl-vga, mouse doesn't display
in guest window like the desciption in above bug.
This patch try to use -device for primary video when qemu >=1.6 which
contains the bug fix patch
Otherwise, with new enough gcc compiling at -O2, the build fails with:
../../src/conf/domain_conf.c: In function ‘virDomainDeviceDefPostParse’:
../../src/conf/domain_conf.c:2821:29: error: ‘cnt’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
for (i = 0; i < *cnt; i++) {
^
../../src/conf/domain_conf.c:2795:20: note: ‘cnt’ was declared here
size_t i, *cnt;
^
../../src/conf/domain_conf.c:2794:30: error: ‘arrPtr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
virDomainChrDefPtr **arrPtr;
^
* src/conf/domain_conf.c (virDomainChrGetDomainPtrs): Always
assign into output parameters.
Signed-off-by: Eric Blake <eblake@redhat.com>
By setting the default partition in libvirt_lxc it is not
visible when querying the live XML. Move setting of the
default partition into libvirtd virLXCProcessStart
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Decrementing it when it was already 0 causes an invalid free
in virNetworkDefUpdateDNSHost if virNetworkDNSHostDefParseXML
fails and virNetworkDNSHostDefClear gets called twice.
virNetworkForwardDefClear left the number untouched even if it
freed all the elements.
If the app has provided a whitelist of controllers to be used,
we skip detecting its mount point. We still, however, fill in
the placement info which later confuses the machine name
validation code. Skip detecting placement if the controller
mount point is not set
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When a VM has an 'emulator' child cgroup present, we must
strip off that suffix when detecting the cgroup for a
machine
Rename the virCgroupIsValidMachineGroup method to
virCgroupValidateMachineGroup to make a bit clearer
that this isn't simply a boolean check, it will make
changes to the object.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupIsValidMachine does not need to be called from
outside the cgroups file now, so make it static.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of requiring drivers to use a combination of calls
to virCgroupNewDetect and virCgroupIsValidMachine, combine
the two into virCgroupNewDetectMachine
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Both virStoragePoolFree and virStorageVolFree reset the last error,
which might lead to the cryptic message:
An error occurred, but the cause is unknown
When the volume wasn't found, virStorageVolFree was called with NULL,
leading to an error:
invalid storage volume pointer in virStorageVolFree
This patch changes it to:
Storage volume not found: no storage vol with matching name 'tomato'
Add protection such that the virCgroupRemove and
virCgroupKill* do not do anything to the root cgroup.
Killing all PIDs in the root cgroup does not end well.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of requiring one API call to create a cgroup and
another to add a task to it, introduce a new API
virCgroupNewMachine which does both jobs at once. This
will facilitate the later code to talk to systemd to
achieve this job which is also atomic.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There's a race in lxc driver causing a deadlock. If a domain is
destroyed immediately after started, the deadlock can occur. When domain
is started, the even loop tries to connect to the monitor. If the
connecting succeeds, virLXCProcessMonitorInitNotify() is called with
@mon->client locked. The first thing that callee does, is
virObjectLock(vm). So the order of locking is: 1) @mon->client, 2) @vm.
However, if there's another thread executing virDomainDestroy on the
very same domain, the first thing done here is locking the @vm. Then,
the corresponding libvirt_lxc process is killed and monitor is closed
via calling virLXCMonitorClose(). This callee tries to lock @mon->client
too. So the order is reversed to the first case. This situation results
in deadlock and unresponsive libvirtd (since the eventloop is involved).
The proper solution is to unlock the @vm in virLXCMonitorClose prior
entering virNetClientClose(). See the backtrace as follows:
Thread 25 (Thread 0x7f1b7c9b8700 (LWP 16312)):
0 0x00007f1b80539714 in __lll_lock_wait () from /lib64/libpthread.so.0
1 0x00007f1b8053516c in _L_lock_516 () from /lib64/libpthread.so.0
2 0x00007f1b80534fbb in pthread_mutex_lock () from /lib64/libpthread.so.0
3 0x00007f1b82a637cf in virMutexLock (m=0x7f1b3c0038d0) at util/virthreadpthread.c:85
4 0x00007f1b82a4ccf2 in virObjectLock (anyobj=0x7f1b3c0038c0) at util/virobject.c:320
5 0x00007f1b82b861f6 in virNetClientCloseInternal (client=0x7f1b3c0038c0, reason=3) at rpc/virnetclient.c:696
6 0x00007f1b82b862f5 in virNetClientClose (client=0x7f1b3c0038c0) at rpc/virnetclient.c:721
7 0x00007f1b6ee12500 in virLXCMonitorClose (mon=0x7f1b3c007210) at lxc/lxc_monitor.c:216
8 0x00007f1b6ee129f0 in virLXCProcessCleanup (driver=0x7f1b68100240, vm=0x7f1b680ceb70, reason=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:174
9 0x00007f1b6ee14106 in virLXCProcessStop (driver=0x7f1b68100240, vm=0x7f1b680ceb70, reason=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:710
10 0x00007f1b6ee1aa36 in lxcDomainDestroyFlags (dom=0x7f1b5c002560, flags=0) at lxc/lxc_driver.c:1291
11 0x00007f1b6ee1ab1a in lxcDomainDestroy (dom=0x7f1b5c002560) at lxc/lxc_driver.c:1321
12 0x00007f1b82b05be5 in virDomainDestroy (domain=0x7f1b5c002560) at libvirt.c:2303
13 0x00007f1b835a7e85 in remoteDispatchDomainDestroy (server=0x7f1b857419d0, client=0x7f1b8574ae40, msg=0x7f1b8574acf0, rerr=0x7f1b7c9b7c30, args=0x7f1b5c004a50) at remote_dispatch.h:3143
14 0x00007f1b835a7d78 in remoteDispatchDomainDestroyHelper (server=0x7f1b857419d0, client=0x7f1b8574ae40, msg=0x7f1b8574acf0, rerr=0x7f1b7c9b7c30, args=0x7f1b5c004a50, ret=0x7f1b5c0029e0) at remote_dispatch.h:3121
15 0x00007f1b82b93704 in virNetServerProgramDispatchCall (prog=0x7f1b8573af90, server=0x7f1b857419d0, client=0x7f1b8574ae40, msg=0x7f1b8574acf0) at rpc/virnetserverprogram.c:435
16 0x00007f1b82b93263 in virNetServerProgramDispatch (prog=0x7f1b8573af90, server=0x7f1b857419d0, client=0x7f1b8574ae40, msg=0x7f1b8574acf0) at rpc/virnetserverprogram.c:305
17 0x00007f1b82b8c0f6 in virNetServerProcessMsg (srv=0x7f1b857419d0, client=0x7f1b8574ae40, prog=0x7f1b8573af90, msg=0x7f1b8574acf0) at rpc/virnetserver.c:163
18 0x00007f1b82b8c1da in virNetServerHandleJob (jobOpaque=0x7f1b8574dca0, opaque=0x7f1b857419d0) at rpc/virnetserver.c:184
19 0x00007f1b82a64158 in virThreadPoolWorker (opaque=0x7f1b8573cb10) at util/virthreadpool.c:144
20 0x00007f1b82a63ae5 in virThreadHelper (data=0x7f1b8574b9f0) at util/virthreadpthread.c:161
21 0x00007f1b80532f4a in start_thread () from /lib64/libpthread.so.0
22 0x00007f1b7fc4f20d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f1b83546740 (LWP 16297)):
0 0x00007f1b80539714 in __lll_lock_wait () from /lib64/libpthread.so.0
1 0x00007f1b8053516c in _L_lock_516 () from /lib64/libpthread.so.0
2 0x00007f1b80534fbb in pthread_mutex_lock () from /lib64/libpthread.so.0
3 0x00007f1b82a637cf in virMutexLock (m=0x7f1b680ceb80) at util/virthreadpthread.c:85
4 0x00007f1b82a4ccf2 in virObjectLock (anyobj=0x7f1b680ceb70) at util/virobject.c:320
5 0x00007f1b6ee13bd7 in virLXCProcessMonitorInitNotify (mon=0x7f1b3c007210, initpid=4832, vm=0x7f1b680ceb70) at lxc/lxc_process.c:601
6 0x00007f1b6ee11fd3 in virLXCMonitorHandleEventInit (prog=0x7f1b3c001f10, client=0x7f1b3c0038c0, evdata=0x7f1b8574a7d0, opaque=0x7f1b3c007210) at lxc/lxc_monitor.c:109
7 0x00007f1b82b8a196 in virNetClientProgramDispatch (prog=0x7f1b3c001f10, client=0x7f1b3c0038c0, msg=0x7f1b3c003928) at rpc/virnetclientprogram.c:259
8 0x00007f1b82b87030 in virNetClientCallDispatchMessage (client=0x7f1b3c0038c0) at rpc/virnetclient.c:1019
9 0x00007f1b82b876bb in virNetClientCallDispatch (client=0x7f1b3c0038c0) at rpc/virnetclient.c:1140
10 0x00007f1b82b87d41 in virNetClientIOHandleInput (client=0x7f1b3c0038c0) at rpc/virnetclient.c:1312
11 0x00007f1b82b88f51 in virNetClientIncomingEvent (sock=0x7f1b3c0044e0, events=1, opaque=0x7f1b3c0038c0) at rpc/virnetclient.c:1832
12 0x00007f1b82b9e1c8 in virNetSocketEventHandle (watch=3321, fd=54, events=1, opaque=0x7f1b3c0044e0) at rpc/virnetsocket.c:1695
13 0x00007f1b82a272cf in virEventPollDispatchHandles (nfds=21, fds=0x7f1b8574ded0) at util/vireventpoll.c:498
14 0x00007f1b82a27af2 in virEventPollRunOnce () at util/vireventpoll.c:645
15 0x00007f1b82a25a61 in virEventRunDefaultImpl () at util/virevent.c:273
16 0x00007f1b82b8e97e in virNetServerRun (srv=0x7f1b857419d0) at rpc/virnetserver.c:1097
17 0x00007f1b8359db6b in main (argc=2, argv=0x7ffff98dbaa8) at libvirtd.c:1512
The avail_vcpu bitmap has to be allocated before it can be used (using
the maximum allowed value for that). Then for each available VCPU the
bit in the mask has to be set (libxl_bitmap_set takes a bit position
as an argument, not the number of bits to set).
Without this, I would always only get one VCPU for guests created
through libvirt/libxl.
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
virCgroupAvailable() implementation calls getmntent_r
without checking if HAVE_GETMNTENT_R is defined, so it fails
to build on platforms without getmntent_r support.
Make virCgroupAvailable() just return false without
HAVE_GETMNTENT_R.
Function qemuOpenFile() haven't had any idea about seclabels applied
to VMs only, so in case the seclabel differed from the "user:group"
from configuration, there might have been issues with opening files.
Make qemuOpenFile() VM-aware, but only optionally, passing NULL
argument means skipping VM seclabel info completely.
However, all current qemuOpenFile() calls look like they should use VM
seclabel info in case there is any, so convert these calls as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=869053
Parsing 'user:group' is useful even outside the DAC security driver,
so expose the most abstract function which has no DAC security driver
bits in itself.
Since PCI bridges, PCIe bridges, PCIe switches, and PCIe root ports
all share the same namespace, they are all defined as controllers of
type='pci' in libvirt (but with a differing model attribute). Each of
these controllers has a certain connection type upstream, allows
certain connection types downstream, and each can either allow a
single downstream connection at slot 0, or connections from slot 1 -
31.
Right now, we only support the pci-root and pci-bridge devices, both
of which only allow PCI devices to connect, and both which have usable
slots 1 - 31. In preparation for adding other types of controllers
that have different capabilities, this patch 1) adds info to the
qemuDomainPCIAddressBus object to indicate the capabilities, 2) sets
those capabilities appropriately for pci-root and pci-bridge devices,
and 3) validates that the controller being connected to is the proper
type when allocating slots or validating that a user-selected slot is
appropriate for a device..
Having this infrastructure in place will make it much easier to add
support for the other PCI controller types.
While it would be possible to do all the necessary checking by just
storing the controller model in the qemyuDomainPCIAddressBus, it
greatly simplifies all the validation code to also keep a "flags",
"minSlot" and "maxSlot" for each - that way we can just check those
attributes rather than requiring a nearly identical switch statement
everywhere we need to validate compatibility.
You may notice many places where the flags are seemingly hard-coded to
QEMU_PCI_CONNECT_HOTPLUGGABLE | QEMU_PCI_CONNECT_TYPE_PCI
This is currently the correct value for all PCI devices, and in the
future will be the default, with small bits of code added to change to
the flags for the few devices which are the exceptions to this rule.
Finally, there are a few places with "FIXME" comments. Note that these
aren't indicating places that are broken according to the currently
supported devices, they are places that will need fixing when support
for new PCI controller models is added.
To assure that there was no regression in the auto-allocation of PCI
addresses or auto-creation of integrated pci-root, ide, and usb
controllers, a new test case (pci-bridge-many-disks) has been added to
both the qemuxml2argv and qemuxml2xml tests. This new test defines a
domain with several dozen virtio disks but no pci-root or
pci-bridges. The .args file of the new test case was created using
libvirt sources from before this patch, and the test still passes
after this patch has been applied.
Although these two enums are named ..._LAST, they really had the value
of ..._SIZE. This patch changes their values so that, e.g.,
QEMU_PCI_ADDRESS_SLOT_LAST really is the slot number of the last slot
on a PCI bus.
The implicit IDE, USB, and video controllers provided by the PIIX3
chipset in the pc-* machinetypes are not present on other
machinetypes, so we shouldn't be doing the special checking for
them. This patch places those validation checks into a separate
function that is only called for machine types that have a PIIX3 chip
(which happens to be the i440fx-based pc-* machine types).
One qemuxml2argv test data file had to be changed - the
pseries-usb-multi test had included a piix3-usb-uhci device, which was
being placed at a specific address, and also had slot 2 auto reserved
for a video device, but the pseries virtual machine doesn't actually
have a PIIX3 chip, so even if there was a piix3-usb-uhci driver for
it, the device wouldn't need to reside at slot 1 function 2. I just
changed the .argv file to have the generic slot info for the two
devices that results when the special PIIX3 code isn't executed.
qemuDomainPCIAddressBus was an array of QEMU_PCI_ADDRESS_SLOT_LAST
uint8_t's, which worked fine as long as every PCI bus was
identical. In the future, some PCI busses will allow connecting PCI
devices, and some will allow PCIe devices; also some will only allow
connection of a single device, while others will allow connecting 31
devices.
In order to keep track of that information for each bus, we need to
turn qemuDomainPCIAddressBus into a struct, for now with just one
member:
uint8_t slots[QEMU_PCI_ADDRESS_SLOT_LAST];
Additional members will come in later patches.
The item in qemuDomainPCIAddresSet that contains the array of
qemuDomainPCIAddressBus is now called "buses" to be more consistent
with the already existing "nbuses" (and with the new "slots" array).
Thanks to a lack of coordination between kernel and glibc folks,
it has been impossible to mix code using <linux/in.h> and
<net/in.h> for some time now (see for example commit c308a9a).
On at least RHEL 6, <linux/if_bridge.h> tries to use the kernel
side, and fails due to our desire to use the glibc side elsewhere:
In file included from /usr/include/linux/if_bridge.h:17,
from util/virnetdevbridge.c:42:
/usr/include/linux/in6.h:31: error: redefinition of ‘struct in6_addr’
/usr/include/linux/in6.h:48: error: redefinition of ‘struct sockaddr_in6’
/usr/include/linux/in6.h:56: error: redefinition of ‘struct ipv6_mreq’
Thankfully, the kernel layout of these structs is ABI-compatible,
they only differ in the type system presented to the C compiler.
While there are other versions of kernel headers that avoid the
problem, it is easier to just work around the issue than to expect
all developers to upgrade to working kernel headers.
* src/util/virnetdevbridge.c (includes): Coerce the kernel version
of in.h to not collide with the normal version.
Signed-off-by: Eric Blake <eblake@redhat.com>
dbus 1.2.24 (on RHEL 6) lacks DBUS_TYPE_UNIX_FD; but as we aren't
trying to pass one of those anyways, we can just drop support for
it in our wrapper. Solves this build error introduced in commit
834c9c94:
CC libvirt_util_la-virdbus.lo
util/virdbus.c:242: error: 'DBUS_TYPE_UNIX_FD' undeclared here (not in a function)
* src/util/virdbus.c (virDBusBasicTypes): Drop support for unix fds.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit id '4421e257' strdup'd devAlias, but didn't free
Running qemuhotplugtest under valgrind resulted in the following:
==7375== 9 bytes in 1 blocks are definitely lost in loss record 11 of 70
==7375== at 0x4A0887C: malloc (vg_replace_malloc.c:270)
==7375== by 0x37C1085D71: strdup (strdup.c:42)
==7375== by 0x4CBBD5F: virStrdup (virstring.c:554)
==7375== by 0x4CFF9CB: virDomainEventDeviceRemovedNew (domain_event.c:1174)
==7375== by 0x427791: qemuDomainRemoveChrDevice (qemu_hotplug.c:2508)
==7375== by 0x42C65D: qemuDomainDetachChrDevice (qemu_hotplug.c:3357)
==7375== by 0x41C94F: testQemuHotplug (qemuhotplugtest.c:115)
==7375== by 0x41D817: virtTestRun (testutils.c:168)
==7375== by 0x41C400: mymain (qemuhotplugtest.c:322)
==7375== by 0x41DF3A: virtTestMain (testutils.c:764)
==7375== by 0x37C1021A04: (below main) (libc-start.c:225)
Commit 'c8695053' resulted in the following:
Coverity error seen in the output:
ERROR: REVERSE_INULL
FUNCTION: lxcProcessAutoDestroy
Due to the 'dom' being checked before 'dom->persistent' since 'dom'
is already dereferenced prior to that.
Currently the LXC driver creates the VM's cgroup prior to
forking, and then libvirt_lxc moves the child process
into the cgroup. This won't work with systemd whose APIs
do the creation of cgroups + attachment of processes atomically.
Fortunately we simply move the entire cgroups setup into
the libvirt_lxc child process. We make it take place before
fork'ing into the background, so by the time virCommandRun
returns in the LXC driver, the cgroup is guaranteed to be
present.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the QEMU driver creates the VM's cgroup prior to
forking, and then uses a virCommand hook to move the child
into the cgroup. This won't work with systemd whose APIs
do the creation of cgroups + attachment of processes atomically.
Fortunately we have a handshake taking place between the
QEMU driver and the child process prior to QEMU being exec()d,
which was introduced to allow setup of disk locking. By good
fortune this synchronization point can be used to enable the
QEMU driver to do atomic setup of cgroups removing the use
of the hook script.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupNewDomainDriver and virCgroupNewDriver methods
are obsolete now that we can auto-detect existing cgroup
placement. Delete them to reduce code bloat.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Use the new virCgroupNewDetect function to determine cgroup
placement of existing running VMs. This will allow the legacy
cgroups creation APIs to be removed entirely
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add virCgroupIsValidMachine API to check whether an auto
detected cgroup is valid for a machine. This lets us
check if a VM has just been placed into some generic
shared cgroup, or worse, the root cgroup
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a virCgroupNewDetect API which is used to initialize a
cgroup object with the placement of an arbitrary process.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If systemd machine does not exist, return -2 instead of -1,
so that applications don't need to repeat the tedious error
checking code
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Current code for handling dbus errors only works for errors
received from the remote application itself. We must also
handle errors emitted by the bus itself, for example, when
it fails to spawn the target service.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a privileged field to storageDriverState
Use the privileged value in order to generate a connection which could
be passed to the various storage backend drivers.
In particular, the iSCSI driver will need a connect in order to perform
pool authentication using the 'chap' secrets and the RBD driver utilizes
the connection during pool refresh for pools using 'ceph' secrets.
For now that connection will be to be to qemu driver until a mechanism
is devised to get a connection to just the secret driver without qemu.
Update virStorageBackendRBDOpenRADOSConn() to use the internal API to the
secret driver in order to get the secret value instead of the external
virSecretGetValue() path. Without the flag VIR_SECRET_GET_VALUE_INTERNAL_CALL
there is no way to get the value of private secret.
This also requires ensuring there is a connection which wasn't true for
for the refreshPool() path calls from storageDriverAutostart() prior to
adding support for the connection to a qemu driver. It seems calls to
virSecretLookupByUUIDString() and virSecretLookupByUsage() from the
refreshPool() path would have failed with no way to find the secret - that is
theoretically speaking since the 'conn' was NULL the failure would have been
"failed to find the secret".
Although the XML for CHAP authentication with plain "password"
was introduced long ago, the function was never implemented. This
patch replaces the login/password mechanism by following the
'ceph' (or RBD) model of using a 'username' with a 'secret' which
has the authentication information.
This patch performs the authentication during startPool() processing
of pools with an authType of VIR_STORAGE_POOL_AUTH_CHAP specified
for iSCSI pools.
There are two types of CHAP configurations supported for iSCSI
authentication:
* Initiator Authentication
Forward, one-way; The initiator is authenticated by the target.
* Target Authentication
Reverse, Bi-directional, mutual, two-way; The target is authenticated
by the initiator; This method also requires Initiator Authentication
This only supports the "Initiator Authentication". (I don't have any
enterprise iSCSI env for testing, only have a iSCSI target setup with
tgtd, which doesn't support "Target Authentication").
"Discovery authentication" is not supported by tgt yet too. So this only
setup the session authentication by executing 3 iscsiadm commands, E.g:
% iscsiadm -m node --target "iqn.2013-05.test:iscsi.foo" --name \
"node.session.auth.authmethod" -v "CHAP" --op update
% iscsiadm -m node --target "iqn.2013-05.test:iscsi.foo" --name \
"node.session.auth.username" -v "Jim" --op update
% iscsiadm -m node --target "iqn.2013-05.test:iscsi.foo" --name \
"node.session.auth.password" -v "Jimsecret" --op update
During qemuTranslateDiskSourcePool() execution, if the srcpool has been
defined with authentication information, then for iSCSI pools copy the
authentication and host information to virDomainDiskDef.
Due to a goto statement missed when refactoring in 2771f8b74c
when acquiring of a domain job failed the error path was not taken. This
resulted into a crash afterwards as an extra reference was removed from a
domain object leading to it being freed. An attempt to list the domains
leaded to a crash of the daemon afterwards.
https://bugzilla.redhat.com/show_bug.cgi?id=928672
Commit 834c9c94 introduced virDBusMessageEncode and
virDBusMessageDecode functions, however corresponding stubs
were not added to !WITH_DBUS section, therefore 'make check'
started to fail when compiled w/out dbus support like that:
Expected symbol virDBusMessageDecode is not in ELF library
The translation must be done before both of cgroup and security
setting, otherwise since the disk source is not translated yet,
it might be skipped on cgroup and security setting.
virDomainDiskDefForeachPath is not only used by the security
setting helpers, also used by cgroup setting helpers, so this
is to ignore the volume type disk with mode="direct" for cgroup
setting.
The difference with already supported pool types (dir, fs, block)
is: there are two modes for iscsi pool (or network pools in future),
one can specify it either to use the volume target path (the path
showed up on host) with mode='host', or to use the remote URI qemu
supports (e.g. file=iscsi://example.org:6000/iqn.1992-01.com.example/1)
with mode='direct'.
For 'host' mode, it copies the volume target path into disk->src. For
'direct' mode, the corresponding info in the *one* pool source host def
is copied to disk->hosts[0].
There are two ways to use a iSCSI LUN as disk source for qemu.
* The LUN's path as it shows up on host, e.g.
/dev/disk/by-path/ip-$ip:3260-iscsi-$iqn-fc18:iscsi.iscsi0-lun-1
* The libiscsi URI from the storage pool source element host attribute, e.g.
iscsi://demo.org:6000/iqn.1992-01.com.example/1
For a "volume" type disk, if the specified "pool" is of iscsi
type, we should support to use the LUN in either of above 2 ways.
That's why to introduce a new XML tag "mode" for the disk source
(libvirt should support iscsi pool with libiscsi, but it's another
new feature, which should be done later).
The "mode" can be either of "host" or "direct". Use "host" to indicate
use of the LUN with the path as it shows up on host. Use "direct" to
indicate to use it with the source pool host URI (future patches may support
to use network type libvirt storage too, e.g. Ceph)
This is another cleanup before extracting platform-specific
parts from bridge_driver.
Rename struct network_driver to _virNetworkDriverState and
add appropriate typedefs: virNetworkDriverState and
virNetworkDriverStatePtr.
This will help us to avoid potential problems when moving
this struct to the .h file.
Convert the remaining methods in vircgroup.c to report errors
instead of returning errno values.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add virErrorSetErrnoFromLastError and virLastErrorIsSystemErrno
to simplify code which wants to handle system errors in a more
graceful fashion.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To register virtual machines and containers with systemd-machined,
and thus have cgroups auto-created, we need to talk over DBus.
This is somewhat tedious code, so introduce a dedicated function
to isolate the DBus call in one place.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Doing DBus method calls using libdbus.so is tedious in the
extreme. systemd developers came up with a nice high level
API for DBus method calls (sd_bus_call_method). While
systemd doesn't use libdbus.so, their API design can easily
be ported to libdbus.so.
This patch thus introduces methods virDBusCallMethod &
virDBusMessageRead, which are based on the code used for
sd_bus_call_method and sd_bus_message_read. This code in
systemd is under the LGPLv2+, so we're license compatible.
This code is probably pretty unintelligible unless you are
familiar with the DBus type system. So I added some API
docs trying to explain how to use them, as well as test
cases to validate that I didn't screw up the adaptation
from the original systemd code.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Until now CPU features inherited from a specified CPU model could only
be overridden with 'disable' policy. With this patch, any explicitly
specified feature always overrides the same feature inherited from a CPU
model regardless on the specified policy.
The CPU in x86-exact-force-Haswell.xml would previously be incompatible
with x86-host-SandyBridge.xml CPU even though x86-host-SandyBridge.xml
provides all features required by x86-exact-force-Haswell.xml.
If no explicit driver is set for an image backed filesystem,
set it to use the loop driver (if raw) or nbd driver (if
non-raw)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A couple of places in LXC setup for filesystems did not do
a "goto cleanup" after reporting errors. While fixing this,
also add in many more debug statements to aid troubleshooting
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The alias for hostdevs of type SCSI can be too long for QEMU if
larger LUNs are encountered. Here's a real life example:
<hostdev mode='subsystem' type='scsi' managed='no'>
<source>
<adapter name='scsi_host0'/>
<address bus='0' target='19' unit='1088634913'/>
</source>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</hostdev>
this results in a too long drive id, resulting in QEMU yelling
Property 'scsi-generic.drive' can't find value 'drive-hostdev-scsi_host0-0-19-1088634913'
This commit changes the alias back to the default hostdev$(index)
scheme.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
In case libvirtd is asked to unplug a device but the device is actually
unplugged later when libvirtd is not running, we need to detect that and
remove such device when libvirtd starts again and reconnects to running
domains.
Attempts to start a domain with both SELinux and DAC security
modules loaded will deadlock; latent problem introduced in commit
fdb3bde and exposed in commit 29fe5d7. Basically, when recursing
into the security manager for other driver's prefork, we have to
undo the asymmetric lock taken at the manager level.
Reported by Jiri Denemark, with diagnosis help from Dan Berrange.
* src/security/security_stack.c (virSecurityStackPreFork): Undo
extra lock grabbed during recursion.
Signed-off-by: Eric Blake <eblake@redhat.com>
Makefiles are another easy file to enforce line limits.
Mostly straightforward; interesting tricks worth noting:
src/Makefile.am: $(confdir) was already defined, use it in more places
tests/Makefile.am: path_add and VG required some interesting compression
* cfg.mk (sc_prohibit_long_lines): Add another test.
* Makefile.am: Fix offenders.
* daemon/Makefile.am: Likewise.
* docs/Makefile.am: Likewise.
* python/Makefile.am: Likewise.
* src/Makefile.am: Likewise.
* tests/Makefile.am: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 75c1256 states that virGetGroupList must not be called
between fork and exec, then commit ee777e99 promptly violated
that for lxc's use of virSecurityManagerSetProcessLabel. Hoist
the supplemental group detection to the time that the security
manager needs to fork. Qemu is safe, as it uses
virSecurityManagerSetChildProcessLabel which in turn uses
virCommand to determine supplemental groups.
This does not fix the fact that virSecurityManagerSetProcessLabel
calls virSecurityDACParseIds calls parseIds which eventually
calls getpwnam_r, which also violates fork/exec async-signal-safe
safety rules, but so far no one has complained of hitting
deadlock in that case.
* src/security/security_dac.c (_virSecurityDACData): Track groups
in private data.
(virSecurityDACPreFork): New function, to set them.
(virSecurityDACClose): Clean up new fields.
(virSecurityDACGetIds): Alter signature.
(virSecurityDACSetSecurityHostdevLabelHelper)
(virSecurityDACSetChardevLabel, virSecurityDACSetProcessLabel)
(virSecurityDACSetChildProcessLabel): Update callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
A future patch wants the DAC security manager to be able to safely
get the supplemental group list for a given uid, but at the time
of a fork rather than during initialization so as to pick up on
live changes to the system's group database. This patch adds the
framework, including the possibility of a pre-fork callback
failing.
For now, any driver that implements a prefork callback must be
robust against the possibility of being part of a security stack
where a later element in the chain fails prefork. This means
that drivers cannot do any action that requires a call to postfork
for proper cleanup (no grabbing a mutex, for example). If this
is too prohibitive in the future, we would have to switch to a
transactioning sequence, where each driver has (up to) 3 callbacks:
PreForkPrepare, PreForkCommit, and PreForkAbort, to either clean
up or commit changes made during prepare.
* src/security/security_driver.h (virSecurityDriverPreFork): New
callback.
* src/security/security_manager.h (virSecurityManagerPreFork):
Change signature.
* src/security/security_manager.c (virSecurityManagerPreFork):
Optionally call into driver, and allow returning failure.
* src/security/security_stack.c (virSecurityDriverStack):
Wrap the handler for the stack driver.
* src/qemu/qemu_process.c (qemuProcessStart): Adjust caller.
Signed-off-by: Eric Blake <eblake@redhat.com>
These helpers use the remembered host capabilities to retrieve the cpu
map rather than query the host again. The intended usage for this
helpers is to fix automatic NUMA placement with strict memory alloc. The
code doing the prepare needs to pin the emulator process only to cpus
belonging to a subset of NUMA nodes of the host.
When user does not specify any model for scsi controller, or worse, no
controller at all, but libvirt automatically adds scsi controller with
no model, we are not searching for virtio-scsi and thus this can fail
for example on qemu which doesn't support lsi logic adapter.
This means that when qemu on x86 doesn't support lsi53c895a and the
user adds the following to an XML without any scsi controller:
<disk ...>
...
<target dev='sda'>
</disk>
libvirt fails like this:
# virsh define asdf.xml
error: Failed to define domain from asdf.xml
error: internal error Unable to determine model for scsi controller
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=974943
With the majority of fields in the virLXCDriverPtr struct
now immutable or self-locking, there is no need for practically
any methods to be using the LXC driver lock. Only a handful
of helper APIs now need it.
The activeUsbHostdevs item in LXCDriver are lockable, but the lock has
to be called explicitly. Call the virObject(Un)Lock() in order to
achieve mutual exclusion once lxcDriverLock is removed.
The 'driver->caps' pointer can be changed on the fly. Accessing
it currently requires the global driver lock. Isolate this
access in a single helper, so a future patch can relax the
locking constraints.
Currently the virLXCDriverPtr struct contains an wide variety
of data with varying access needs. Move all the static config
data into a dedicated virLXCDriverConfigPtr object. The only
locking requirement is to hold the driver lock, while obtaining
an instance of virLXCDriverConfigPtr. Once a reference is held
on the config object, it can be used completely lockless since
it is immutable.
NB, not all APIs correctly hold the driver lock while getting
a reference to the config object in this patch. This is safe
for now since the config is never updated on the fly. Later
patches will address this fully.
When virAsprintf was changed from a function to a macro
reporting OOM error in dc6f2da, it was documented as returning
0 on success. This is incorrect, it returns the number of bytes
written as asprintf does.
Some of the functions were converted to use virAsprintf's return
value directly, changing the return value on success from 0 to >= 0.
For most of these, this is not a problem, but the change in
virPCIDriverDir breaks PCI passthrough.
The return value check in virhashtest pre-dates virAsprintf OOM
conversion.
vmwareMakePath seems to be unused.
Merge the virCommandPreserveFD / virCommandTransferFD methods
into a single virCommandPasFD method, and use a new
VIR_COMMAND_PASS_FD_CLOSE_PARENT to indicate their difference
in behaviour
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Wire up the new virDomainCreate{XML}WithFiles methods in the
LXC driver, so that FDs get passed down to the init process.
The lxc_container code needs to do a little dance in order
to renumber the file descriptors it receives into linear
order, starting from STDERR_FILENO + 1.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In the following commit:
commit 03d813bbcd
Author: Marek Marczykowski <marmarek@invisiblethingslab.com>
Date: Thu May 23 02:01:30 2013 +0200
remote: fix dom->id after virDomainCreateWithFlags
The virDomainCreateWithFlags remote client helper was made to
invoke REMOTE_PROC_DOMAIN_LOOKUP_BY_UUID to refresh the 'id'
of the domain, following the pattern used in the previous
virDomainCreate method impl.
The remote protocol for virDomainCreateWithFlags though did
actually fix the design flaw in virDomainCreate, by directly
returning the new domain info. For some reason, this data was
never used. So we can just use that data now instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since they make use of file descriptor passing, the remote protocol
methods for virDomainCreate{XML}WithFiles must be written by hand.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
With container based virt, it is useful to be able to pass
pre-opened file descriptors to the container init process.
This allows for containers to be auto-activated from incoming
socket connections, passing the active socket into the container.
To do this, introduce a pair of new APIs, virDomainCreateXMLWithFiles
and virDomainCreateWithFiles, which accept an array of file
descriptors. For the LXC driver, UNIX file descriptor passing
will be used to send them to libvirtd, which will them pass
them down to libvirt_lxc, which will then pass them to the container
init process.
This will only be implemented for LXC right now, but the design
is generic enough it could work with other hypervisors, hence
I suggest adding this to libvirt.so, rather than libvirt-lxc.so
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add support for creating disk-only (no memory) snapshots in esx, and
for quiescing the VM before taking the snapshot. The VMware API
supports these operations directly, so adding support to libvirt is
just a matter of setting the flags correctly when calling
VMware. VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY and
VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE are now valid flags for esx.
Signed-off-by: Eric Blake <eblake@redhat.com>
Although, having it depending on Xen >= 4.3 (by using the proper
libxl feature flag).
Xen currently implements a NUMA placement policy which is basically
the same as the 'interleaved' policy of `numactl', although it can
be applied on a subset of the available nodes. We therefore hardcode
"interleave" as 'numa_mode', and we use the newly introduced libxl
interface to figure out what nodes a domain spans ('numa_nodeset').
With this change, it is now possible to query the NUMA node
affinity of a running domain:
[raistlin@Zhaman ~]$ sudo virsh --connect xen:/// list
Id Name State
----------------------------------------------------
23 F18_x64 running
[raistlin@Zhaman ~]$ sudo virsh --connect xen:/// numatune 23
numa_mode : interleave
numa_nodeset : 1
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
domainGetNumaParameters has a string typed parameter, hence it
is necessary for the libxl driver to support this.
This change implements the connectSupportsFeature hook for the
libxl driver, advertising that VIR_DRV_FEATURE_TYPED_PARAM_STRING
is supported.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Eric Blake <eblake@redhat.com>
Xen 4.3 changes sysctl version to 10 and domctl version to 9. Update
the hypervisor driver to work with those.
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Commit 75c1256 states that virGetGroupList must not be called
between fork and exec, then commit ee777e99 promptly violated
that for lxc.
Patch originally posted by Eric Blake <eblake@redhat.com>.
Reuse the buffer for getline and track buffer allocation
separately from the string length to prevent unlikely
out-of-bounds memory access.
This fixes the following leak that happened when zero bytes were read:
==404== 120 bytes in 1 blocks are definitely lost in loss record 1,344 of 1,671
==404== at 0x4C2C71B: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==404== by 0x906F862: getdelim (iogetdelim.c:68)
==404== by 0x52A48FB: virCgroupPartitionNeedsEscaping (vircgroup.c:1136)
==404== by 0x52A0FB4: virCgroupPartitionEscape (vircgroup.c:1171)
==404== by 0x52A0EA4: virCgroupNewDomainPartition (vircgroup.c:1450)
While generating seclabels, we check the seclabel stack if required
driver is in the stack. If not, an error is returned. However, it is
possible for a seclabel to not have any model set (happens with LXC
domains that have just <seclabel type='none'>). If that's the case,
we should just skip the iteration instead of calling STREQ(NULL, ...)
and SIGSEGV-ing subsequently.
Currently, if the primary security driver is 'none', we skip
initializing caps->host.secModels. This means, later, when LXC domain
XML is parsed and <seclabel type='none'/> is found (see
virSecurityLabelDefsParseXML), the model name is not copied to the
seclabel. This leads to subsequent crash in virSecurityManagerGenLabel
where we call STREQ() over the model (note, that we are expecting model
to be !NULL).
Introduced in commit 24b08219; compilation on RHEL 6.4 complained:
qemu/qemu_hotplug.c: In function 'qemuDomainAttachChrDevice':
qemu/qemu_hotplug.c:1257: error: declaration of 'remove' shadows a global declaration [-Wshadow]
/usr/include/stdio.h:177: error: shadowed declaration is here [-Wshadow]
* src/qemu/qemu_hotplug.c (qemuDomainAttachChrDevice): Avoid the
name 'remove'.
Signed-off-by: Eric Blake <eblake@redhat.com>
Otherwise the container will fail to start if we
enable user namespace, since there is no rights to
do mknod in uninit user namespace.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
lxc driver will use this function to change the owner
of hot added devices.
Move virLXCControllerChown to lxc_container.c and Rename
it to lxcContainerChown.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
A part of the returned monitor response was freed twice and caused
crashes of the daemon when using guest agent cpu count retrieval.
# virsh vcpucount dom --guest
Introduced in v1.0.6-48-gc6afcb0
Not all RBD (Ceph) storage pools have cephx authentication turned on,
so "secret" might not be initialized.
It could also be that the secret couldn't be located.
Only call virSecretFree() if "secret" is initialized earlier.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Implement the new API that will handle setting the balloon driver statistics
collection period in order to enable or disable the collection dynamically.
Add new API in order to set the balloon memory driver statistics collection
period in order to allow dynamic period adjustment for the virsh dommemstats to
display balloon stats data
This patch will add the qemuMonitorJSONGetMemoryStats() to execute a
"guest-stats" on the balloonpath using "get-qom" replacing the former
mechanism which looked through the "query-ballon" returned data for
the fields. The "query-balloon" code only returns 'actual' memory.
Rather than duplicating the existing code, have the JSON API use the
GetBalloonInfo API.
A check in the qemuMonitorGetMemoryStats() will be made to ensure the
balloon driver path has been set. Since the underlying JSON code can
return data not associated with the balloon driver, we don't fail on
a failure to get the balloonpath. Of course since we've made the check,
we can then set the ballooninit flag. Getting the path here is primarily
due to the process reconnect path which doesn't attempt to set the
collection period.
At vm startup and attach attempt to set the balloon driver statistics
collection period based on the value found in the domain xml file. This
is not done at reconnect since it's possible that a collection period
was set on the live guest and making the set period call would reset to
whatever value is stored in the config file.
Setting the stats collection period has a side effect of searching through
the qom-list output for the virtio balloon driver and making sure that it
has the right properties in order to allow setting of a collection period
and eventually fetching of statistics.
The walk through the qom-list is expensive and thus the balloonpath will
be saved in the monitor private structure as well as a flag indicating
that the initialization has already been attempted (in the event that a
path is not found, no sense to keep checking).
This processing model conforms to the qom object model model which
requires setting object properties after device startup. That is, it's
not possible to pass the period along via the startup code as it won't
be recognized.
If users haven't configured guest agent then qemuAgentCommand() will
dereference a NULL 'mon' pointer, which causes crash of libvirtd when
using agent based cpu (un)plug.
With the patch, when the qemu-ga service isn't running in the guest,
a expected error "error: Guest agent is not responding: Guest agent
not available for now" will be raised, and the error "error: argument
unsupported: QEMU guest agent is not configured" is raised when the
guest hasn't configured guest agent.
GDB backtrace:
(gdb) bt
#0 virNetServerFatalSignal (sig=11, siginfo=<value optimized out>, context=<value optimized out>) at rpc/virnetserver.c:326
#1 <signal handler called>
#2 qemuAgentCommand (mon=0x0, cmd=0x7f39300017b0, reply=0x7f394b090910, seconds=-2) at qemu/qemu_agent.c:975
#3 0x00007f39429507f6 in qemuAgentGetVCPUs (mon=0x0, info=0x7f394b0909b8) at qemu/qemu_agent.c:1475
#4 0x00007f39429d9857 in qemuDomainGetVcpusFlags (dom=<value optimized out>, flags=9) at qemu/qemu_driver.c:4849
#5 0x00007f3957dffd8d in virDomainGetVcpusFlags (domain=0x7f39300009c0, flags=8) at libvirt.c:9843
How to reproduce?
# To start a guest without guest agent configuration
# then run the following cmdline
# virsh vcpucount foobar --guest
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=984821
Signed-off-by: Alex Jia <ajia@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
When using logical pools, we had to trust the target->path provided.
This parameter, however, can be completely ommited and we can use
'/dev/<source.name>' safely and populate it to target.path.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=952973
There are two levels on which a device may be hotplugged: config
and live. The config level requires just an insert or remove from
internal domain definition structure, which is exactly what this
patch does. There is currently no implementation for a chardev
update action, as there's not much to be updated. But more
importantly, the only thing that can be updated is path or socket
address by which chardevs are distinguished. So the update action
is currently not supported.
Now that we have callbacks, we should auto fill in omitted pieces of
information. It's important for chardev hotplug to fill in the correct
/{serial,parallel,console,channel}/target/@port if no value has been
provided by user.
https://bugzilla.redhat.com/show_bug.cgi?id=799354
Until now, the "host-model" cpu mode couldn't be influenced. This patch
allows to use the <feature> elements to either enable or disable
specific CPU flags. This can be used to force flags that can be emulated
even if the host CPU doesn't support them.
This new function updates or adds a feature to a existing cpu model
definition. This function will be helpful to allow tuning of
"host-model" features in later patches.
Merge virStoragePoolDefParseAuthChap and virStoragePoolDefParseAuthCephx
into a common virStoragePoolDefParseAuthSecret. Change the output to be
common for both by putting 'type' first followed by 'username'.
The existing 'chap' XML logic was never used - just defined. Rather than
try to insert a square peg into a round hole, blow it up and rewrite the
logic to follow the 'ceph' format.
Remove the former "chap.login" and "chap.passwd" fields and replace
with "chap.username" and "chap.secret" in _virStoragePoolAuthChap.
Adjust the virStoragePoolDefParseAuthChap() to process.
Change the rng file to describe the new layout
Update the formatstorage.html to describe the usage of the secret element
to mention that the secret type "iscsi" and "ceph" can be used
to storage pool too.
Update the formatsecret.html to include a reference to the storage pool
Update tests to handle the changes from 'login' and 'passwd' to 'username'
and '<secret>' format
Add a new qemuMonitorJSONSetObjectProperty() method to support invocation
of the 'qom-set' JSON monitor command with a provided path, property, and
expected data type to set.
NOTE: The set API was added only for the purpose of the qemumonitorjsontest
The test code uses the same "/machine/i440fx" property as the get test and
attempts to set the "realized" property to "true" (which it should be set
at anyway).
Add a new qemuMonitorJSONGetObjectProperty() method to support invocation
of the 'qom-get' JSON monitor command with a provided path, property, and
expected data type return. The qemuMonitorJSONObjectProperty is similar to
virTypedParameter; however, a future patch will extend it a bit to include
a void pointer to balloon driver statistic data.
NOTE: The ObjectProperty structures and API are added only for the
purpose of the qemumonitorjsontest
The provided test will execute a qom-get on "/machine/i440fx" which will
return a property "realized".
Add a new qemuMonitorJSONGetObjectListPaths() method to support invocation
of the 'qom-list' JSON monitor command with a provided path.
NOTE: The ListPath structures and API's are added only for the
purpose of the qemumonitorjsontest
The returned list of paired data fields of "name" and "type" that can
be used to peruse QOM configuration data and eventually utilize for the
balloon statistics.
The test does a "{"execute":"qom-list", "arguments": { "path": "/"}}" which
returns "{"return": [{"name": "machine", "type": "child<container>"},
{"name": "type", "type": "string"}]}" resulting in a return of an array
of 2 elements with [0].name="machine", [0].type="child<container>". The [1]
entry appears to be a header that could be used some day via a command such
as "virsh qemuobject --list" to format output.
If an error occurs during qemuDomainAttachNetDevice after the macvtap
was created in qemuPhysIfaceConnect, the macvtap device gets left behind.
This patch adds code to the cleanup routine to delete the macvtap.
Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Reviewed-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
I recently patches the callers to virPCIDeviceReset() to not call it
if the current driver for a device was vfio-pci (since that driver
will always reset the device itself when appropriate. At the time, Dan
Berrange suggested that I could instead modify virPCIDeviceReset
to check the currently bound driver for the device, and decide
for itself whether or not to go ahead with the reset.
This patch removes the previously added checks, and replaces them with
a check down in virPCIDeviceReset(), as suggested.
The functional difference here is that previously we were deciding
based on either the hostdev configuration or the value of
stubDriverName in the virPCIDevice object, but now we are actually
comparing to the "driver" link in the device's sysfs entry
directly. In practice, both should be the same.
virPCIDeviceGetDriverPathAndName is a static function that will need
to be called by another function that occurs above it in the
file. This patch reorders the static functions so that a forward
declaration isn't needed.
When failing to start a container due to inaccessible root
filesystem path, we did not log any meaningful error. Add a
few debug statements to assist diagnosis
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The function being introduced is responsible for creating command
line argument for '-device' for given character device. Based on
the chardev type, it calls appropriate qemuBuild.*ChrDeviceStr(),
e.g. qemuBuildSerialChrDeviceStr() for serial chardev and so on.
The chardev alias assignment is going to be needed in a separate
places, so it should be moved into a separate function rather
than copying code randomly around.
The function being introduced is responsible for preparing and
executing 'chardev-add' qemu monitor command. Moreover, in case
of PTY chardev, the corresponding pty path is updated.
Currently, we are building InetSocketAddress qemu json type
within the qemuMonitorJSONNBDServerStart function. However, other
future functions may profit from the code as well. So it should
be moved into a static function.
For now, only these three helpers are needed:
virDomainChrFind - to find a duplicate chardev within VM def
virDomainChrInsert - wrapper for inserting a new chardev into VM def
virDomainChrRemove - wrapper for removing chardev from VM def
There is, however, one internal helper as well:
virDomainChrGetDomainPtrs which sets given pointers to one of
vmdef->{parallels,serials,consoles,channels} based on passed
chardev type.
This patch enables the password authentication in the libssh2 connection
driver. There are a few benefits to this step:
1) Hosts with challenge response authentication will now be supported
with the libssh2 connection driver.
2) Credential for hosts can now be stored in the authentication
credential config file
The password authentication method wasn't used as there wasn't a
pleasant way to pass the password. This patch adds the option to use
virAuth util functions to request the password either from a config file
or uses the conf callback to request it from the user.
Previously a connection object was required to retrieve the auth
credentials. This patch adds the option to call the retrieval functions
only using the connection URI or path to the configuration file. This
will allow to use this toolkit to request passwords for ssh
authentication in the libssh2 connection driver.
Changes:
*virAuthGetConfigFilePathURI(): use URI to retrieve the config file path
*virAuthGetCredential(): Remove the need to propagate conn object
virAuthGetPasswordPath():
*virAuthGetUsernamePath(): New functions, that use config file path
instead of conn object
nodeGetFreeMemory and nodeGetCellsFreeMemory assumed that the NUMA nodes
are contiguous and starting from 0. Unfortunately there are machines
that don't match this assumption:
available: 1 nodes (1)
node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
node 1 size: 16340 MB
node 1 free: 11065 MB
Before this patch:
error: internal error Failed to query NUMA free memory
error: internal error Failed to query NUMA free memory for node: 0
After this patch:
Total: 15772580 KiB
0: 0 KiB
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=964358
POSIX states that multi-threaded apps should not use functions
that are not async-signal-safe between fork and exec, yet we
were using getpwuid_r and initgroups. Although rare, it is
possible to hit deadlock in the child, when it tries to grab
a mutex that was already held by another thread in the parent.
I actually hit this deadlock when testing multiple domains
being started in parallel with a command hook, with the following
backtrace in the child:
Thread 1 (Thread 0x7fd56bbf2700 (LWP 3212)):
#0 __lll_lock_wait ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1 0x00007fd5761e7388 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00007fd5761e7257 in __pthread_mutex_lock (mutex=0x7fd56be00360)
at pthread_mutex_lock.c:61
#3 0x00007fd56bbf9fc5 in _nss_files_getpwuid_r (uid=0, result=0x7fd56bbf0c70,
buffer=0x7fd55c2a65f0 "", buflen=1024, errnop=0x7fd56bbf25b8)
at nss_files/files-pwd.c:40
#4 0x00007fd575aeff1d in __getpwuid_r (uid=0, resbuf=0x7fd56bbf0c70,
buffer=0x7fd55c2a65f0 "", buflen=1024, result=0x7fd56bbf0cb0)
at ../nss/getXXbyYY_r.c:253
#5 0x00007fd578aebafc in virSetUIDGID (uid=0, gid=0) at util/virutil.c:1031
#6 0x00007fd578aebf43 in virSetUIDGIDWithCaps (uid=0, gid=0, capBits=0,
clearExistingCaps=true) at util/virutil.c:1388
#7 0x00007fd578a9a20b in virExec (cmd=0x7fd55c231f10) at util/vircommand.c:654
#8 0x00007fd578a9dfa2 in virCommandRunAsync (cmd=0x7fd55c231f10, pid=0x0)
at util/vircommand.c:2247
#9 0x00007fd578a9d74e in virCommandRun (cmd=0x7fd55c231f10, exitstatus=0x0)
at util/vircommand.c:2100
#10 0x00007fd56326fde5 in qemuProcessStart (conn=0x7fd53c000df0,
driver=0x7fd55c0dc4f0, vm=0x7fd54800b100, migrateFrom=0x0, stdin_fd=-1,
stdin_path=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE,
flags=1) at qemu/qemu_process.c:3694
...
The solution is to split the work of getpwuid_r/initgroups into the
unsafe portions (getgrouplist, called pre-fork) and safe portions
(setgroups, called post-fork).
* src/util/virutil.h (virSetUIDGID, virSetUIDGIDWithCaps): Adjust
signature.
* src/util/virutil.c (virSetUIDGID): Add parameters.
(virSetUIDGIDWithCaps): Adjust clients.
* src/util/vircommand.c (virExec): Likewise.
* src/util/virfile.c (virFileAccessibleAs, virFileOpenForked)
(virDirCreate): Likewise.
* src/security/security_dac.c (virSecurityDACSetProcessLabel):
Likewise.
* src/lxc/lxc_container.c (lxcContainerSetID): Likewise.
* configure.ac (AC_CHECK_FUNCS_ONCE): Check for setgroups, not
initgroups.
Signed-off-by: Eric Blake <eblake@redhat.com>
Since neither getpwuid_r() nor initgroups() are safe to call in
between fork and exec (they obtain a mutex, but if some other
thread in the parent also held the mutex at the time of the fork,
the child will deadlock), we have to split out the functionality
that is unsafe. At least glibc's initgroups() uses getgrouplist
under the hood, so the ideal split is to expose getgrouplist for
use before a fork. Gnulib already gives us a nice wrapper via
mgetgroups; we wrap it once more to look up by uid instead of name.
* bootstrap.conf (gnulib_modules): Add mgetgroups.
* src/util/virutil.h (virGetGroupList): New declaration.
* src/util/virutil.c (virGetGroupList): New function.
* src/libvirt_private.syms (virutil.h): Export it.
Signed-off-by: Eric Blake <eblake@redhat.com>
A future patch needs to look up pw_gid; but it is wasteful
to crawl through getpwuid_r twice for two separate pieces
of information, and annoying to copy that much boilerplate
code for doing the crawl. The current internal-only
virGetUserEnt is also a rather awkward interface; it's easier
to just design it to let callers request multiple pieces of
data as needed from one traversal.
And while at it, I noticed that virGetXDGDirectory could deref
NULL if the getpwuid_r lookup fails.
* src/util/virutil.c (virGetUserEnt): Alter signature.
(virGetUserDirectory, virGetXDGDirectory, virGetUserName): Adjust
callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
Recent changes uncovered a possibility that 'last_processed_hostdev_vf'
was set to -1 in 'qemuPrepareHostdevPCIDevices' and would cause problems
in for loop end condition in the 'resetvfnetconfig' label if the
variable was never set to 'i' due to 'qemuDomainHostdevNetConfigReplace'
failure.
The switch statement in 'virStorageBackendCreateQemuImgOpts' used the
for loop end condition 'VIR_STORAGE_FILE_FEATURE_LAST' as a possible value,
but since that cannot happen Coverity spits out a DEADCODE message. Adding
the Coverity tag just removes the Coverity message
Recent changes uncovered a NEGATIVE_RETURNS in the return from sysconf()
when processing a for loop in virtTestCaptureProgramExecChild() in
testutils.c
Code review uncovered 3 other code paths with the same condition that
weren't found by Covirity, so fixed those as well.
With current code, error reporting for unsupported devices for hot plug,
unplug and update is total mess. The VIR_ERR_CONFIG_UNSUPPORTED error
code is reported instead of VIR_ERR_OPERATION_UNSUPPORTED. Moreover, the
error messages are not helping to find the root cause (lack of
implementation).
When adding a new domain device, it is fairly easy to forget to add
corresponding piece into virDomainDeviceDefParse. However, if the
internal structure is changed to one bit switch() the compiler will warn
about not handled enum item.
Not all device types are currently parsed in virDomainDeviceDefParse,
Since all needed functions do exist, nothing holds us back to make the
implementation complete. Similarly, the virDomainDeviceDefFree needs to
be updated as well.
Don't reuse the return value of virStorageBackendFileSystemIsMounted.
If it's 0, we'd return it even if the mount command failed.
Also, don't report another error if it's -1, since one has already
been reported.
Introduced by 258e06c.
https://bugzilla.redhat.com/show_bug.cgi?id=981251
For low-memory domains (roughly under 400MB) our automatic memory limit
computation comes up with a limit that's too low. This is because the
0.5 multiplication does not add enough for such small values. Let's
increase the constant part of the computation to fix this.
I had made the change locally, so make check and make syntax-check
were successful, but forgot to add/commit. Unfortunately, git allows a
push when the local directory is dirty, so it didn't catch my mistake.
Eliminate memmove() by using VIR_*_ELEMENT API instead.
In both pci and usb cases, the count that held the size of the list
was unsigned int so it had to be changed to size_t.
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Actually, I'm turning this function into a macro as filename,
function name and line number needs to be passed. The new
function virAsprintfInternal is introduced with the extended set
of arguments.
Create parent directroy for hostdev automatically when we
start a lxc domain or attach a hostdev to a lxc domain.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
This helper function is used to create parent directory for
the hostdev which will be added to the container. If the
parent directory of this hostdev doesn't exist, the mknod of
the hostdev will fail. eg with /dev/net/tun
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Many applications use /dev/tty to read from stdin.
e.g. zypper on openSUSE.
Let's create this device node to unbreak those applications.
As /dev/tty is a synonym for the current controlling terminal
it cannot harm the host or any other containers.
Signed-off-by: Richard Weinberger <richard@nod.at>
The device bus value was used instead of the device target when
building the sysfs device path. Trivial.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
The imagelabel SELinux label was only generated when relabeling was
enabled. This prohibited labeling of files created by libvirt that need
to be labeled even if relabeling is turned off.
The only codepath this change has direct impact on is labeling of FDs
passed to qemu which is always safe in current state.
Whenever virPortAllocatorRelease is called with port == 0, it complains
that the port is not in an allowed range, which is expectable as the
port was never allocated. Let's make virPortAllocatorRelease ignore 0
ports in a similar way free() ignores NULL pointers.
https://bugzilla.redhat.com/show_bug.cgi?id=981139
If a domain is paused before migration starts, we need to tell that to
the destination libvirtd to prevent it from resuming the domain at the
end of migration. This regression was introduced by commit 5379bb0.
<hyperv>
<spinlocks state='off'/>
</hyperv>
results in:
error: XML error: missing HyperV spinlock retry count
Don't require retries when state is off and use virXPathUInt
instead of virXPathString to simplify parsing.
https://bugzilla.redhat.com/show_bug.cgi?id=784836#c19
Use virDomainObjListRemoveLocked instead of virDomainObjListRemove, as
driver->domains is already taken by virDomainObjListForEach.
Above deadlock can be triggered when libvirtd is started after some
domain have been started by hand (in which case driver will not find
libvirt-xml domain config).
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
While iterating with virDomainObjListForEach it is safe to remove
current element. But while iterating, 'doms' lock is already taken, so
can't use standard virDomainObjListRemove. So introduce
virDomainObjListRemoveLocked for this purpose.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
The 'check-aclrules' test case validates that there are ACL
checks in each method. This extends it so that it can also
validate that methods which return info about lists of objects,
will filter their returned info throw an ACL check.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that all APIs which list interface objects filter
them against the access control system.
This makes the APIs for listing names and counting devices
slightly less efficient, since we can't use the direct
netcf APIs for these tasks. Instead we have to ask netcf
for the full list of objects & iterate over the list
filtering them out.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that all APIs which list nwfilter objects filter
them against the access control system.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that all APIs which list node device objects filter
them against the access control system.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Change the ACL filter functions to use a 'bool' return
type instead of a tri-state 'int' return type. The callers
of these functions don't want to distinguish 'auth failed'
from other errors.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since commit 23e8b5d8, the code is refactored in a way that supports
domains with multiple graphics elements and commit 37b415200 allows
starting such domains. However none of those commits take migration
into account. Even though qemu doesn't support relocation for
anything else than SPICE and for no more than one graphics, there is no
reason to hardcode one graphics into this part of the code as well.
libivrt lxc can only set generic weight for container,
This patch allows user to setup per device blkio
weigh for container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
When removing a TAP device, the associated bandwidth settings are
removed. Currently, the /sbin/tc is used for that. It is spawned
several times. Moreover, we use the same @cmd variable to
construct the command and its arguments. That means we need to
virCommandFree(cmd); prior to each virCommandNew(TC); which
wasn't done.
Add monitor callback API domainGuestPanic, that implements
'destroy', 'restart' and 'preserve' events of the 'on_crash'
in the XML when domain crashed.
On a mingw VPATH build (such as done by ./autobuild.sh), the tarball
created by 'make dist' was including generated files. The VPATH
rules were then seeing that the tarball files were up-to-date, and
not regenerating files locally, leading to this failure:
GEN libvirt.syms
cat: libvirt_access.syms: No such file or directory
cat: libvirt_access_qemu.syms: No such file or directory
cat: libvirt_access_lxc.syms: No such file or directory
make: *** [libvirt.syms] Error 1
We already have a category for generated sym files, which are
intentionally not part of the tarball; stick the access sym
files in that category. The rearrange the declarations a bit
to make it harder to repeat the problem, dropping things that
are now redundant (for example, BUILT_FILES already includes
GENERATED_SYM_FILES, so it does not also need to call out
ACCESS_DRIVER_SYM_FILES).
* src/Makefile.am (USED_SYM_FILES): Don't include generated files.
(GENERATED_SYM_FILES): Access syms files are generated.
(libvirt.syms): Include access syms files here.
(ACCESS_DRIVER_SYMFILES): Rename...
(ACCESS_DRIVER_SYM_FILES): ...for consistency.
Signed-off-by: Eric Blake <eblake@redhat.com>
User namespaces will deny the ability to mount the SELinux
filesystem. This is harmless for libvirt's LXC needs, so the
error can be ignored.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
On Fedora 18, when cross-compiling to mingw with the mingw*-dbus
packages installed, compilation fails with:
CC libvirt_net_rpc_server_la-virnetserver.lo
In file included from /usr/i686-w64-mingw32/sys-root/mingw/include/dbus-1.0/dbus/dbus-connection.h:32:0,
from /usr/i686-w64-mingw32/sys-root/mingw/include/dbus-1.0/dbus/dbus-bus.h:30,
from /usr/i686-w64-mingw32/sys-root/mingw/include/dbus-1.0/dbus/dbus.h:31,
from ../../src/util/virdbus.h:26,
from ../../src/rpc/virnetserver.c:39:
/usr/i686-w64-mingw32/sys-root/mingw/include/dbus-1.0/dbus/dbus-message.h:74:58: error: expected ';', ',' or ')' before 'struct'
I have reported this as a bug against two packages:
- mingw-headers, for polluting the namespace
https://bugzilla.redhat.com/show_bug.cgi?id=980270
- dbus, for not dealing with the pollution
https://bugzilla.redhat.com/show_bug.cgi?id=980278
At least dbus has agreed that a future version of dbus headers will
do s/interface/iface/, regardless of what happens in mingw. But it
is also easy to workaround in libvirt in the meantime, without having
to wait for either mingw or dbus to upgrade.
* src/util/virdbus.h (includes): Undo mingw's pollution so that
dbus doesn't fail.
Signed-off-by: Eric Blake <eblake@redhat.com>
After abf75aea24 the compiler screams:
qemu/qemu_driver.c: In function 'qemuNodeDeviceDetachFlags':
qemu/qemu_driver.c:10693:9: error: 'domain' may be used uninitialized in this function [-Werror=maybe-uninitialized]
pci = virPCIDeviceNew(domain, bus, slot, function);
^
qemu/qemu_driver.c:10693:9: error: 'bus' may be used uninitialized in this function [-Werror=maybe-uninitialized]
qemu/qemu_driver.c:10693:9: error: 'slot' may be used uninitialized in this function [-Werror=maybe-uninitialized]
qemu/qemu_driver.c:10693:9: error: 'function' may be used uninitialized in this function [-Werror=maybe-uninitialized]
Since the other functions qemuNodeDeviceReAttach and qemuNodeDeviceReset
looks exactly the same, I've initialized the variables there as well.
However, I am still wondering why those functions don't matter to gcc
while the first one does.
Since these devices are created for the container.
the owner should be the root user of the container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
container will create /dev/pts directory in /dev.
the owner of /dev should be the root user of container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Since these tty devices will be used by container,
the owner of them should be the root user of container.
This patch also adds a new function virLXCControllerChown,
we can use this general function to change the owner of
files.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
user namespace doesn't allow to create devices in
uninit userns. We should create devices on host side.
We first mount tmpfs on dev directroy under state dir
of container. then create devices under this dev dir.
Finally in container, mount the dev directroy created
on host to the /dev/ directroy of container.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
This patch introduces new helper function
virLXCControllerSetupUserns, in this function,
we set the files uid_map and gid_map of the init
task of container.
lxcContainerSetID is used for creating cred for
tasks running in container. Since after setuid/setgid,
we may be a new user. This patch calls lxcContainerSetUserns
at first to make sure the new created files belong to
right user.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Make sure the mapping line contains the root user of container
is the first element of idmap array. So we can get the real
user id on host for the container easily.
This patch also check the map information, User must map
the root user of container to any user of host.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
User namespace will be enabled only when the idmap exist
in configuration.
If you want disable user namespace,just remove these
elements from XML.
If kernel doesn't support user namespace and idmap exist
in configuration file, libvirt lxc will start failed and
return "Kernel doesn't support user namespace" message.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
This patch introduces new element <idmap> for
user namespace. for example
<idmap>
<uid start='0' target='1000' count='10'/>
<gid start='0' target='1000' count='10'/>
</idmap>
this new element is used for setting proc files
/proc/<pid>/{uid_map,gid_map}.
This patch also supports multiple uid/gid elements
setting in XML configuration.
We don't support the semi configuation, user has to
configure uid and gid both.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
By providing the implementation of nodeGetCellsFreeMemory for
the driver. This is all just a matter of properly formatting, in
a way that libvirt like, what Xen provides via libxl_get_numainfo().
[raistlin@Zhaman ~]$ sudo virsh --connect xen:/// freecell --all
0: 25004 KiB
1: 105848 KiB
--------------------
Total: 130852 KiB
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
On mingw, configure sets the name of the lxc symfile to
libvirt_lxc.defs rather than libvirt_lxc.syms. But tarballs
must be arch-independent, regardless of the configure options
used for the tree where we ran 'make dist'. This led to the
following failure in autobuild.sh:
CCLD libvirt-lxc.la
CCLD libvirt-qemu.la
/usr/lib64/gcc/i686-w64-mingw32/4.7.2/../../../../i686-w64-mingw32/bin/ld: cannot find libvirt_lxc.def: No such file or directory
collect2: error: ld returned 1 exit status
make[3]: *** [libvirt-lxc.la] Error 1
make[3]: *** Waiting for unfinished jobs....
We were already doing the right thing with libvirt_qemu.syms.
* src/Makefile.am (EXTRA_DIST): Don't ship a built file which
depends on configure for its final name.
Signed-off-by: Eric Blake <eblake@redhat.com>
Found while trying to cross-compile to mingw:
CC libvirt_driver_remote_la-remote_driver.lo
../../src/remote/remote_driver.c: In function 'doRemoteOpen':
../../src/remote/remote_driver.c:487:23: error: variable 'verify' set but not used [-Werror=unused-but-set-variable]
* src/remote/remote_driver.c (doRemoteOpen): Also ignore 'verify'.
Signed-off-by: Eric Blake <eblake@redhat.com>
iptablesContext holds only 4 pairs of iptables
(table, chain) and there's no need to pass
it around.
This is a first step towards separating bridge_driver.c
in platform-specific parts.
Implement check whether (maximum) vCPUs doesn't exceed machine
type's cpu-max settings.
On older versions of QEMU the check is disabled.
Signed-off-by: Michal Novotny <minovotn@redhat.com>
On Thu, Jun 27, 2013 at 03:56:42PM +0100, Daniel P. Berrange wrote:
> Hi Security Team,
>
> I've discovered a way for an unprivileged user with a readonly connection
> to libvirtd, to crash the daemon.
Ok, the final patch for this is issue will be the simpler variant that
Eric suggested
The embargo can be considered to be lifted on Monday July 1st, at
0900 UTC
The following is the GIT change that DV or myself will apply to libvirt
GIT master immediately before the 1.1.0 release:
>From 177b4165c531a4b3ba7f6ab6aa41dca9ceb0b8cf Mon Sep 17 00:00:00 2001
From: "Daniel P. Berrange" <berrange@redhat.com>
Date: Fri, 28 Jun 2013 10:48:37 +0100
Subject: [PATCH] CVE-2013-2218: Fix crash listing network interfaces with
filters
The virConnectListAllInterfaces method has a double-free of the
'struct netcf_if' object when any of the filtering flags cause
an interface to be skipped over. For example when running the
command 'virsh iface-list --inactive'
This is a regression introduced in release 1.0.6 by
commit 7ac2c4fe62
Author: Guannan Ren <gren@redhat.com>
Date: Tue May 21 21:29:38 2013 +0800
interface: list all interfaces with flags == 0
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=971325
The problem was that if virPCIGetVirtualFunctions was given the name
of a non-existent interface, it would return to its caller without
initializing the pointer to the array of virtual functions to NULL,
and the caller (virNetDevGetVirtualFunctions) would try to VIR_FREE()
the invalid pointer.
The final error message before the crash would be:
virPCIGetVirtualFunctions:2088 :
Failed to open dir '/sys/class/net/eth2/device':
No such file or directory
In this patch I move the initialization in virPCIGetVirtualFunctions()
to the begining of the function, and also do an explicit
initialization in virNetDevGetVirtualFunctions, just in case someone
in the future adds code into that function prior to the call to
virPCIGetVirtualFunctions.
This fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=979290https://bugzilla.redhat.com/show_bug.cgi?id=979330
The node device driver was written with the assumption that udev would
use a "change" event to notify libvirt of any change to device status
(including the name of the driver it was bound to). It turns out this
is not the case (see Comment 4 of BZ 979290). That means that a
dumpxml for a device would always show whatever driver happened to be
bound at the time libvirt was started (when the node device cache was
built).
There was already code in the driver (for the benefit of the HAL
backend) that updated the driver name from sysfs each time a device's
info was retrieved from the cache. This patch just enables that manual
update for the udev backend as well.
There were two errors, one as a direct result of commit id '8807b285'
and the other from cut-n-paste
TEST: nodedevxml2xmltest
.............. 14 OK
==25735== 3 bytes in 1 blocks are definitely lost in loss record 1 of 24
==25735== at 0x4A0887C: malloc (vg_replace_malloc.c:270)
==25735== by 0x344D2AF275: xmlStrndup (in /usr/lib64/libxml2.so.2.9.1)
==25735== by 0x4D0C767: virNodeDeviceDefParseNode (node_device_conf.c:997)
==25735== by 0x4D0D3D2: virNodeDeviceDefParse (node_device_conf.c:1337)
==25735== by 0x401CA4: testCompareXMLToXMLHelper (nodedevxml2xmltest.c:28)
==25735== by 0x402B2F: virtTestRun (testutils.c:158)
==25735== by 0x401B27: mymain (nodedevxml2xmltest.c:81)
==25735== by 0x40316A: virtTestMain (testutils.c:722)
==25735== by 0x37C1021A04: (below main) (libc-start.c:225)
==25735==
==25735== 16 bytes in 1 blocks are definitely lost in loss record 10 of 24
==25735== at 0x4A08A6E: realloc (vg_replace_malloc.c:662)
==25735== by 0x4C7385E: virReallocN (viralloc.c:184)
==25735== by 0x4C73906: virExpandN (viralloc.c:214)
==25735== by 0x4C73B4A: virInsertElementsN (viralloc.c:324)
==25735== by 0x4D0C84C: virNodeDeviceDefParseNode (node_device_conf.c:1026)
==25735== by 0x4D0D3D2: virNodeDeviceDefParse (node_device_conf.c:1337)
==25735== by 0x401CA4: testCompareXMLToXMLHelper (nodedevxml2xmltest.c:28)
==25735== by 0x402B2F: virtTestRun (testutils.c:158)
==25735== by 0x401B27: mymain (nodedevxml2xmltest.c:81)
==25735== by 0x40316A: virtTestMain (testutils.c:722)
==25735== by 0x37C1021A04: (below main) (libc-start.c:225)
==25735==
PASS: nodedevxml2xmltest
The first error was resolved by adding a missing VIR_FREE(numberStr); in
the new function virNodeDevCapPciDevIommuGroupParseXML().
The second error was a bit more opaque as the error was a result of copying
the free methodolgy of the existing code in virNodeDevCapsDefFree(). The code
would free each of the entries in the array, but not the memory for the
array itself. Added the necessary VIR_FREE(data->pci_dev.iommuGroupDevices)
and while at it added the missing VIR_FREE(data->pci_dev.virtual_functions)
although there wasn't a test that tripped across it (thus it's been lurking
since commit id 'a010165d').
Commit id '53d5967c' introduced the following:
TEST: storagevolxml2argvtest
.............. 14 OK
==25636== 358 (264 direct, 94 indirect) bytes in 1 blocks are definitely lost in loss record 67 of 75
==25636== at 0x4A06B6F: calloc (vg_replace_malloc.c:593)
==25636== by 0x4C95791: virAlloc (viralloc.c:124)
==25636== by 0x4CA0BB4: virCommandNewArgs (vircommand.c:805)
==25636== by 0x4CA0C88: virCommandNew (vircommand.c:789)
==25636== by 0x408602: virStorageBackendCreateQemuImgCmd (storage_backend.c:849)
==25636== by 0x405427: testCompareXMLToArgvHelper (storagevolxml2argvtest.c:61)
==25636== by 0x4064DF: virtTestRun (testutils.c:158)
==25636== by 0x40516F: mymain (storagevolxml2argvtest.c:195)
==25636== by 0x406B1A: virtTestMain (testutils.c:722)
==25636== by 0x37C1021A04: (below main) (libc-start.c:225)
==25636==
PASS: storagevolxml2argvtest
Commit '861d4056' introduced the following:
TEST: networkxml2xmltest
.................. 18 OK
==25504== 7 bytes in 1 blocks are definitely lost in loss record 5 of 23
==25504== at 0x4A0887C: malloc (vg_replace_malloc.c:270)
==25504== by 0x37C1085D71: strdup (strdup.c:42)
==25504== by 0x4CB835F: virStrdup (virstring.c:546)
==25504== by 0x4CC5179: virXPathString (virxml.c:90)
==25504== by 0x4CC75C2: virNetDevVlanParse (netdev_vlan_conf.c:78)
==25504== by 0x4CF928A: virNetworkPortGroupParseXML (network_conf.c:1555)
==25504== by 0x4CFE385: virNetworkDefParseXML (network_conf.c:2049)
==25504== by 0x4D0113B: virNetworkDefParseNode (network_conf.c:2273)
==25504== by 0x4D01254: virNetworkDefParse (network_conf.c:2234)
==25504== by 0x401E80: testCompareXMLToXMLHelper (networkxml2xmltest.c:32)
==25504== by 0x402D4F: virtTestRun (testutils.c:158)
==25504== by 0x401CE9: mymain (networkxml2xmltest.c:110)
==25504==
PASS: networkxml2xmltest
Also changed the label from error to cleanup and adjusted code since it's
all one exit path
The IF_MAXUNIT macro is not present on all BSDs, so
make its use conditional, to avoid breaking OS-X.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The 'in_addr_t' typedef is not present in Mingw64 headers.
Instead we can use the more portable 'struct in_addr' and
then access its 's_addr' field.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The udev based interface backend did not allow querying data over a
read-only connection which is different than how the netcf backend
operates. This brings the behavior inline with the default, netcf
backend.
When creating a virtual FC HBA with virsh/libvirt API, an error message
will be returned: "error: Node device not found",
also the 'nodedev-dumpxml' shows wrong information of wwpn & wwnn
for the new created device.
Signed-off-by: xschen@tnsoft.com.cn
This reverts f90af69 which switched wwpn & wwwn in the wrong place.
https://www.kernel.org/doc/Documentation/scsi/scsi_fc_transport.txt
Building on FreeBSD had this linker error:
/work/a/ports/devel/libvirt/work/libvirt-1.1.0/src/.libs/libvirt.so:
undefined reference to `virPCIDeviceAddressParse'
This was caused by the new use of virPCIDeviceAddressParse in a
portion of virpci.c that wasn't linux-only (in commit 72c029d8). The
problem was that virPCIDeviceAddressParse had originally been defined
inside #ifdef _linux (because it was only used by another function
that was inside the same ifdef).
The solution is to move it out to the part of virpci.c that is
compiled on all platforms.
(Because the portion that was "moved" was 40-50 lines, but only moved
up by 15 lines, the diff for the patch is less than non-informative -
rather than showing that part that I moved, it shows the bit that was
previously before the moved part, and now sits *after* it.)
Implicit controllers may be dependent on device definitions altered
in a post-parse callback. Specifically, if a console device is
defined without the target type, the type will be set in QEMU's
callback. In the case of s390, this is virtio, which requires
an implicit virtio-serial controller.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
If networkUnplugBandwidth is called on a network which has
no bandwidth defined, print a warning instead of crashing.
This can happen when destroying a domain with bandwidth if
bandwidth was removed from the network after the domain was
started.
https://bugzilla.redhat.com/show_bug.cgi?id=975359
This includes adding it to the nodedev parser and formatter, docs, and
test.
An example of the new iommuGroup element that is a part of the output
from "virsh nodedev-dumpxml" (virNodeDeviceGetXMLDesc()):
<device>
<name>pci_0000_02_00_1</name>
<capability type='pci'>
...
<iommuGroup number='12'>
<address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
<address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>
</iommuGroup>
</capability>
</device>
Any device which belongs to an "IOMMU group" (used by vfio) will
have links to all devices of its group listed in
/sys/bus/pci/$device/iommu_group/devices;
/sys/bus/pci/$device/iommu_group is actually a link to
/sys/kernel/iommu_groups/$n, where $n is the group number (there
will be a corresponding device node at /dev/vfio/$n once the
devices are bound to the vfio-pci driver)
The following functions are added:
virPCIDeviceGetIOMMUGroupList
Gets a virPCIDeviceList with one virPCIDeviceList for each device
in the same IOMMU group as the provided virPCIDevice (a copy of the
original device object is included in the list.
virPCIDeviceAddressIOMMUGroupIterate
Calls the function @actor once for each device in the group that
contains the given virPCIDeviceAddress.
virPCIDeviceAddressGetIOMMUGroupAddresses
Fills in a virPCIDeviceAddressPtr * with an array of
virPCIDeviceAddress, one for each device in the iommu group of the
provided virPCIDeviceAddress (including a copy of the original).
virPCIDeviceAddressGetIOMMUGroupNum
Returns the group number as an int (a valid group number will always
be 0 or greater). If there is no iommu_group link in the device's
directory (usually indicating that vfio isn't loaded), -2 will be
returned. On any real error, -1 will be returned.
We only break out of the while loop if *content is an empty string.
However the buffer has been allocated to BUFSIZ + 1 (8193 in my case),
but it gets overwritten in the next for iteration.
Move VIR_FREE right before we overwrite it to avoid the leak.
==5777== 16,386 bytes in 2 blocks are definitely lost in loss record 1,022 of 1,027
==5777== by 0x5296E28: virReallocN (viralloc.c:184)
==5777== by 0x52B0C66: virFileReadLimFD (virfile.c:1137)
==5777== by 0x52B0E1A: virFileReadAll (virfile.c:1199)
==5777== by 0x529B092: virCgroupGetValueStr (vircgroup.c:534)
==5777== by 0x529AF64: virCgroupMoveTask (vircgroup.c:1079)
Introduced by 83e4c77.
https://bugzilla.redhat.com/show_bug.cgi?id=978352
Don't check for '\n' at the end of file if zero bytes were read.
Found by valgrind:
==404== Invalid read of size 1
==404== at 0x529B09F: virCgroupGetValueStr (vircgroup.c:540)
==404== by 0x529AF64: virCgroupMoveTask (vircgroup.c:1079)
==404== by 0x1EB475: qemuSetupCgroupForEmulator (qemu_cgroup.c:1061)
==404== by 0x1D9489: qemuProcessStart (qemu_process.c:3801)
==404== by 0x18557E: qemuDomainObjStart (qemu_driver.c:5787)
==404== by 0x190FA4: qemuDomainCreateWithFlags (qemu_driver.c:5839)
Introduced by 0d0b409.
https://bugzilla.redhat.com/show_bug.cgi?id=978356
Although SRIOV network cards support setting a vlan tag on their
virtual functions, and although setting this vlan tag via a <vlan>
element in a domain's <interface> works, setting a vlan tag for these
devices in a <network> definition, or in a network <portgroup>
definition is also supposed to work (and the comment that validates
<vlan> usage even says that!). However, the check to allow it only
checked for an openvswitch network, so attempts to add <vlan> to a
network of type='hostdev' would fail.
A loop in qemuPrepareHostdevPCIDevices() intended to cycle through all
the objects on the list pcidevs was doing "while (listcount > 0)", but
nothing in the body of the loop was reducing the size of the list - it
was instead removing items from a *different* list. It has now been
safely changed to a for() loop.
(This isn't as bad as it sounds - it's only a problem in case of an
OOM error.)
qemuGetActivePciHostDeviceList() had been creating a list that
contained pointers to objects that were also on the activePciHostdevs
list. In case of an OOM error, this newly created list would be
virObjectUnref'ed, which would cause everything on the list to be
freed. But all of those objects would still be on the
activePciHostdevs list, which could have very bad consequences if that
list was ever again accessed.
The solution used here is to populate the new list with *copies* of
the objects from the original list. It turns out that on return from
qemuGetActivePciHostDeviceList(), the caller would almost immediately
go through all the device objects and "steal" them (i.e. remove the
pointer from the list but not delete it) all from either one list or
the other; we now instead just *delete* (remove from the list and
free) each device from one list or the other, so in the end we have
the same state.
The "fix" I pushed a few commits ago would still leak a virPCIDevice
in case of an OOM error. Although it's inconsequential in practice,
this patch satisfies my OCD.
The same strings were being re-created multiple times just to save
declaring a new variable. In the meantime, the use of the generic
variable names led to confusion when trying to follow the code. This
patch creates strings for:
stubDriverName (was called "driver" in original args)
stubDriverPath ("/sys/bus/pci/drivers/${stubDriverName}")
driverLink ("${device}/driver")
oldDriverName (the final component of path linked to by
"${device}/driver")
oldDriverPath ("/sys/bus/pci/drivers/${oldDriverName}")
then re-uses them as necessary.
I realized after the fact that it's probably better in the long run to
give this function a name that matches the name of the link used in
sysfs to hold the group (iommu_group).
I'm changing it now because I'm about to add several more functions
that deal with iommu groups.
The driver arg to virPCIDeviceDetach is no longer used (the name of the stub driver is now set in the virPCIDevice object, and virPCIDeviceDetach retrieves it from there). Remove it.
Commit 861d40565 added code (my personal change to "clean up" the
submitter's code, *not* the fault of the submitter) that dereferenced
virtVlan without first checking for NULL. This patch fixes that and,
as part of the fix, cleans up some unnecessary obtuseness.
virNetDevBridgeSetSTPDelay accepts delay in milliseconds,
but BSD implementation was expecting seconds. Therefore,
it was working correctly only with delay == 0.
This patch adds functionality to allow libvirt to configure the
'native-tagged' and 'native-untagged' modes on openvswitch networks.
Signed-off-by: Laine Stump <laine@redhat.com>
I just learned that VFIO resets PCI devices when they are assigned to
guests / returned to the host, so it is redundant for libvirt to reset
the devices. This patch inhibits calling virPCIDeviceReset to devices
that will be/were assigned using VFIO.
This patch introduces two new APIs virDomainMigrate3 and
virDomainMigrateToURI3 that may be used in place of their older
variants. These new APIs take optional migration parameters (such as
bandwidth, domain XML, ...) in an array of virTypedParameters, which
makes adding new parameters easier as there's no need to introduce new
APIs whenever a new migration parameter needs to be added. Both APIs are
backward compatible and will automatically use older migration calls in
case the new calls are not supported as long as the typed parameters
array does not contain any parameter which was not supported by the
older calls.
All APIs that take typed parameters are only using params address in
their entry point debug messages. With the new VIR_TYPED_PARAMS_DEBUG
macro, all functions can easily log all individual typed parameters
passed to them.
When unsupported parameter is passed to virTypedParamsValidate,
VIR_ERR_ARGUMENT_UNSUPPORTED should be returned rather than
VIR_ERR_INVALID_ARG, which is more appropriate for supported parameters
used incorrectly.
virPCIDeviceDetach would previously sometimes consume the input device
object (to put it on the inactive list) and sometimes not. Avoiding
memory leaks required checking beforehand to see if the device was
already on the list, and freeing the device object in the caller only
if there wasn't already an identical object on the inactive list.
This patch makes it consistent - virPCIDeviceDetach will *never*
consume the input virPCIDevice object; if it needs to put one on the
inactive list, it will create a copy and put *that* on the list. This
way the caller knows that it is always their responsibility to free
the device object they created.
virPCIDeviceReattach was making the assumption that the dev object
given to it was one and the same with the dev object on the
inactiveDevs list. If that had been the case, it would not need to
free the dev object it removed from the inactive list, because the
caller of virPCIDeviceReattach always frees the dev object that it
passes in. Since the dev object passed in is *never* the same object
that's on the list (it is a different object with the same name and
attributes, created just for the purpose of searching for the actual
object), simply doing a "ListSteal" to remove the object from the list
results in one leaked object; we need to actually free the object
after removing it from the list.
* virPCIDeviceFindByIDs - find a device on a list w/o creating an object
This makes searching for an existing device on a list lighter weight.
* virPCIDeviceCopy - make a copy of an existing virPCIDevice object.
* virPCIDeviceGetDriverPathAndName - construct new strings containing
1) the name of the driver bound to this device.
2) the full path to the sysfs config for that driver.
(This code was lifted from virPCIDeviceUnbindFromStub, and replaced
there with a call to this new function).
Previously stubDriver was always set from a string literal, so it was
okay to use a const char * that wasn't freed when the virPCIDevice was
freed. This will not be the case in the near future, so it is now a
char* that is allocated in virPCIDeviceSetStubDriver() and freed
during virPCIDeviceFree().
libxl supports the LIBXL_DISK_BACKEND_QDISK disk backend, where qemu
is used to provide the disk backend. This patch simply maps the
existing <driver name='qemu'/> to LIBXL_DISK_BACKEND_QDISK.
Specifying an unsupported disk format with the tap driver resulted in
a less than helpful error message
error: Failed to start domain test-hvm
error: internal error libxenlight does not support disk driver qed
Change the message to state that the qed format is not supported by
the tap driver, e.g.
error: Failed to start domain test-hvm
error: internal error libxenlight does not support disk format qed
with disk driver tap
While at it, check for unsupported formats in the other driver
backends.
Add a script which parses the driver API code and validates
that every API registered in a virNNNDriverPtr table contains
an ACL check matching the API name.
NB this currently whitelists a few xen driver functions
which are temporarily lacking in access control checks.
The xen driver is considered insecure until these are
fixed.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>