- Introduce VIR_CGROUP_SUPPORTED conditional
- Convert virCgroupKill* to use it
- Convert virCgroupIsolateMount() to use it
- Convert virCgroupRemoveRecursively to VIR_CGROUP_SUPPORTED
Signed-off-by: Eric Blake <eblake@redhat.com>
Make future patches smaller by matching a sane header listing in
the first place. No semantic change.
* src/util/vircgroup.h: Move free next to new, and controller
functions next to each other.
* src/util/vircgroup.c (virCgroupFree, virCgroupHasController)
(virCgroupPathOfController, virCgroupRemoveRecursively)
(virCgroupRemove): Sort implementation to be closer to header.
Signed-off-by: Eric Blake <eblake@redhat.com>
Avoid a forward declaration of a static function.
* src/util/vircgroup.c (virCgroupPartitionNeedsEscaping)
(virCgroupParticionEscape): Move up.
Signed-off-by: Eric Blake <eblake@redhat.com>
Format all functions with two blank lines between, and return type
on separate line from function name. Also break some lines longer
than 80 columns. This makes the subsequent macro refactoring
less noisy.
* src/util/vircgroup.c: Match prevailing style.
Signed-off-by: Eric Blake <eblake@redhat.com>
otherwise having a strict --no-copy-dt-needed-entries fails in several
places like:
CCLD virdbustest
/usr/bin/ld: virdbustest-virdbustest.o: undefined reference to symbol 'dbus_message_unref'
/lib/x86_64-linux-gnu/libdbus-1.so.3: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
https://bugzilla.redhat.com/show_bug.cgi?id=951637
Newer gnutls uses nettle, rather than gcrypt, which is a lot nicer
regarding initialization. Yet we were unconditionally initializing
gcrypt even when gnutls wouldn't be using it, and having two crypto
libraries linked into libvirt.so is pointless, but mostly harmless
(it doesn't crash, but does interfere with certification efforts).
There are three distinct version ranges to worry about when
determining which crypto lib gnutls uses, per these gnutls mails:
2.12: http://lists.gnu.org/archive/html/gnutls-devel/2011-03/msg00034.html
3.0: http://lists.gnu.org/archive/html/gnutls-devel/2011-07/msg00035.html
If pkg-config can prove version numbers and/or list the crypto
library used for static linking, we have our proof; if not, it
is safer (even if pointless) to continue to use gcrypt ourselves.
* configure.ac (WITH_GNUTLS): Probe whether to add -lgcrypt, and
define a witness WITH_GNUTLS_GCRYPT.
* src/libvirt.c (virTLSMutexInit, virTLSMutexDestroy)
(virTLSMutexLock, virTLSMutexUnlock, virTLSThreadImpl)
(virGlobalInit): Honor the witness.
* libvirt.spec.in (BuildRequires): Make gcrypt usage conditional,
no longer needed in Fedora 19.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit d72ef888 introduced a bug in the libxl driver that will
segfault libvirtd if libxl reports an error message, e.g. when
attempting to initialize the driver on a non-Xen system. I
assumed it was valid to pass a NULL logger to libxl_ctx_alloc(),
but that is not the case since any errors associated with the ctx
that are emitted by libxl will dereference the logger and crash
libvirtd.
Errors associated with the libxl driver-wide ctx could be useful
for debugging anyway, so create a 'libxl-driver.log' to capture
these errors.
Recentish (2011) kernels introduced a new device called /dev/loop-control,
which causes libvirt's detection of loop devices to get confused
since it only checks for a prefix of 'loop'. Also check that the
next character is a digit
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This adds two new pages to the website, acl.html describing
the general access control framework and permissions models,
and aclpolkit.html describing the use of polkit as an
access control driver.
page.xsl is modified to support a new syntax
<div id="include" filename="somefile.htmlinc"/>
which will cause the XSL transform to replace that <div>
with the contents of 'somefile.htmlinc'. We use this in
the acl.html.in file, to pull the table of permissions
for each libvirt object. This table is autogenerated
from the enums in src/access/viraccessperms.h by the
genaclperms.pl script.
newapi.xsl is modified so that the list of permissions
checks shown against each API will link to the description
of the permissions in acl.html
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The gendispatch.pl script puts comments at the top of files
it creates, saying that it auto-generated them. Also include
the name of the source data file which it reads when doing
the auto-generation.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
introduced by cs 4b9eec50fe ("libxl: implement per
NUMA node free memory reporting"). What was wrong was that
libxl_get_numainfo() put in nr_nodes the actual number of
host NUMA nodes, not the highest node ID (like libnuma's
numa_max_node() does instead).
While at it, turn the failure of libxl_get_numainfo() from
a simple warning to a proper error, as requested during the
review of another patch of the original series.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
This is a second attempt at fixing the problem first attempted
in commit 2df8d99; basically undoing the fact that it was
reverted in commit 43cee32f, plus fixing two more issues: the
code in configure.ac has to EXACTLY match virnetdevbridge.c
with regards to declaring in6 types before using if_bridge.h,
and the fact that RHEL 5 has even more conflicts:
In file included from util/virnetdevbridge.c:49:
/usr/include/linux/in6.h:47: error: conflicting types for 'in6addr_any'
/usr/include/netinet/in.h:206: error: previous declaration of 'in6addr_any' was here
/usr/include/linux/in6.h:49: error: conflicting types for 'in6addr_loopback'
/usr/include/netinet/in.h:207: error: previous declaration of 'in6addr_loopback' was here
The rest of this commit message borrows from the original try
of 2df8d99:
A fresh checkout on a RHEL 6 machine with these packages:
kernel-headers-2.6.32-405.el6.x86_64
glibc-2.12-1.128.el6.x86_64
failed to configure with this message:
checking for linux/if_bridge.h... no
configure: error: You must install kernel-headers in order to compile libvirt with QEMU or LXC support
Digging in config.log, we see that the problem is identical to
what we fixed earlier in commit d12c2811:
configure:98831: checking for linux/if_bridge.h
configure:98853: gcc -std=gnu99 -c -g -O2 conftest.c >&5
In file included from /usr/include/linux/if_bridge.h:17,
from conftest.c:559:
/usr/include/linux/in6.h:31: error: redefinition of 'struct in6_addr'
/usr/include/linux/in6.h:48: error: redefinition of 'struct sockaddr_in6'
/usr/include/linux/in6.h:56: error: redefinition of 'struct ipv6_mreq'
configure:98860: $? = 1
I had not hit it earlier because I was using incremental builds,
where config.cache had shielded me from the kernel-headers breakage.
* configure.ac (if_bridge.h): Avoid conflicting type definitions.
* src/util/virnetdevbridge.c (includes): Also sanitize for RHEL 5.
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, only one log file is created by the libxl driver, with
all output from libxl for all domains going to this one file.
Create a per-domain log file based on domain name, making sifting
through the logs a bit easier. This required deferring libxl_ctx
allocation until starting the domain, which is fine since the
ctx is not used when the domain is inactive.
Tested-by: Dario Faggioli <dario.faggioli@citrix.com>
The virtlockd daemon supports an /etc/libvirt/virtlockd.conf
config file, but we never installed a default config, nor
created any augeas scripts. This change addresses that omission.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Coverity complained about the usage of the uninitialized cacerts in the
event(s) that "access(certFile, R_OK)" and/or "access(cacertFile, R_OK)"
fail the for loop used to fill in the certs will have indeterminate data
as well as the possibility that both failures would result in the
gnutls_x509_crt_deinit() call having a similar fate.
Initializing cacerts only would resolve the issue; however, it still
would leave the indeterminate action, so rather add a parameter to
the virNetTLSContextLoadCACertListFromFile() to pass the max size rather
then overloading the returned count parameter. If the the call is never
made, then we won't go through the for loops referencing the empty
cacerts
Valgrind defects memory error:
==16759== 1 errors in context 1 of 8:
==16759== Invalid free() / delete / delete[] / realloc()
==16759== at 0x4A074C4: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==16759== by 0x83CD329: xdr_string (in /usr/lib64/libc-2.17.so)
==16759== by 0x4D93E4D: xdr_remote_nonnull_string (remote_protocol.c:31)
==16759== by 0x4D94350: xdr_remote_nonnull_domain (remote_protocol.c:58)
==16759== by 0x4D976C8: xdr_remote_domain_create_with_flags_ret (remote_protocol.c:1762)
==16759== by 0x83CC734: xdr_free (in /usr/lib64/libc-2.17.so)
==16759== by 0x4D7F1E0: remoteDomainCreateWithFlags (remote_driver.c:2441)
==16759== by 0x4D4BF17: virDomainCreateWithFlags (libvirt.c:9499)
==16759== by 0x13127A: cmdStart (virsh-domain.c:3376)
==16759== by 0x12BF83: vshCommandRun (virsh.c:1751)
==16759== by 0x126FFB: main (virsh.c:3205)
==16759== Address 0xe1394a0 is not stack'd, malloc'd or (recently) free'd
==16759== 1 errors in context 2 of 8:
==16759== Conditional jump or move depends on uninitialised value(s)
==16759== at 0x4A07477: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==16759== by 0x83CD329: xdr_string (in /usr/lib64/libc-2.17.so)
==16759== by 0x4D93E4D: xdr_remote_nonnull_string (remote_protocol.c:31)
==16759== by 0x4D94350: xdr_remote_nonnull_domain (remote_protocol.c:58)
==16759== by 0x4D976C8: xdr_remote_domain_create_with_flags_ret (remote_protocol.c:1762)
==16759== by 0x83CC734: xdr_free (in /usr/lib64/libc-2.17.so)
==16759== by 0x4D7F1E0: remoteDomainCreateWithFlags (remote_driver.c:2441)
==16759== by 0x4D4BF17: virDomainCreateWithFlags (libvirt.c:9499)
==16759== by 0x13127A: cmdStart (virsh-domain.c:3376)
==16759== by 0x12BF83: vshCommandRun (virsh.c:1751)
==16759== by 0x126FFB: main (virsh.c:3205)
==16759== Uninitialised value was created by a stack allocation
==16759== at 0x4D7F120: remoteDomainCreateWithFlags (remote_driver.c:2423)
How to reproduce?
# virsh start <domain> --paused
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=994855
Signed-off-by: Alex Jia <ajia@redhat.com>
If securityfs is available on the host, we should ensure to
mount it read-only in the container. This will avoid systemd
trying to mount it during startup causing SELinux AVCs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Hotplugging a single SCSI device works, but adding additional ones
result in an error from QEMU:
[root@gpok197 ~]# virsh attach-device guest01 blah.xml
Device attached successfully
[root@gpok197 ~]# virsh attach-device guest01 blah2.xml
error: Failed to attach device from blah2.xml
error: internal error unable to execute QEMU command 'device_add': Duplicate ID 'hostdev0' for device
The hostdev ID that is created is always set to zero, regardless
of the contents of the XML. Changing the index in the hotplug case
to a negative one so the next available index is used.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
So that app developers / admins know what access control checks
are performed for each API, this patch extends the API docs
generator to include details of the ACLs for each.
The gendispatch.pl script is extended so that it generates
a simple XML describing ACL rules, eg.
<aclinfo>
...
<api name='virConnectNumOfDomains'>
<check object='connect' perm='search_domains'/>
<filter object='domain' perm='getattr'/>
</api>
<api name='virDomainAttachDeviceFlags'>
<check object='domain' perm='write'/>
<check object='domain' perm='save' flags='!VIR_DOMAIN_AFFECT_CONFIG|VIR_DOMAIN_AFFECT_LIVE'/>
<check object='domain' perm='save' flags='VIR_DOMAIN_AFFECT_CONFIG'/>
</api>
...
</aclinfo>
The newapi.xsl template loads the XML files containing the ACL
rules and generates a short block of HTML for each API describing
the parameter checks and return value filters (if any).
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The code added to validate CA certificates did not take into
account the possibility that the cacert.pem file can contain
multiple (concatenated) cert data blocks. Extend the code for
loading CA certs to use the gnutls APIs for loading cert lists.
Add test cases to check that multi-level trees of certs will
validate correctly.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 3d0e3c1 reintroduced a problem previously squelched in
commit 7e5aa78. Add a syntax check this time around.
util/virutil.c: In function 'virGetGroupList':
util/virutil.c:1015: error: 'for' loop initial declaration used outside C99 mode
* cfg.mk (sc_prohibit_loop_var_decl): New rule.
* src/util/virutil.c (virGetGroupList): Fix offender.
Signed-off-by: Eric Blake <eblake@redhat.com>
Before, missing attributes were only OK when adding entries;
modification and deletion required all of them.
Now, only deletion works with missing attributes, as long as
the host is uniquely identified.
Go through disks of guest, if one disk doesn't exist or its backing
chain is broken, with 'optional' startupPolicy, for CDROM and Floppy
we only discard its source path definition in xml, for disks we drop
it from disk list and free it.
Since iptables version 1.4.16 '-m state --state NEW' is converted to
'-m conntrack --ctstate NEW'. Therefore, when encountering this or later
versions of iptables use '-m conntrack --ctstate'.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The change from initgroups to virGetGroupList/setgroups in
cab36cfe71ba83b71e536ba5c98e596f02b697b0 dropped the primary group from
processes group list iff the passed in group to virGetGroupList differs
from the user's primary group.
So always include the primary group to bring back the old behaviour.
Debian has the kvm group as primary group but uses
libvirt-qemu:libvirt-qemu as user:group to run the kvm process so
without this change the /dev/kvm is inaccessible.
Since commit 95e18efd most public interfaces (xenUnified...) obtain
a virDomainDefPtr via xenGetDomainDefFor...() which take the unified
lock.
This is already taken before calling xenDomainUsedCpus(), so we get
a deadlock for active guests. Avoid this by splitting up
xenUnifiedDomainGetVcpusFlags() and xenUnifiedDomainGetVcpus() into
public and private function calls (which get the virDomainDefPtr passed)
and use those in xenDomainUsedCpus().
xenDomainUsedCpus
...
nb_vcpu = xenUnifiedDomainGetMaxVcpus(dom);
return xenUnifiedDomainGetVcpusFlags(...)
...
if (!(def = xenGetDomainDefForDom(dom)))
return xenGetDomainDefForUUID(dom->conn, dom->uuid);
...
ret = xenHypervisorLookupDomainByUUID(conn, uuid);
...
xenUnifiedLock(priv);
name = xenStoreDomainGetName(conn, id);
xenUnifiedUnlock(priv);
...
if ((ncpus = xenUnifiedDomainGetVcpus(dom, cpuinfo, nb_vcpu,
...
if (!(def = xenGetDomainDefForDom(dom)))
[again like above]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
This patch addresses two concerns with the error reporting when an
incompatible PCI address is specified for a device:
1) It wasn't always apparent which device had the problem. With this
patch applied, any error about an incompatible address will always
contain the full address as given in the config, so it will be easier
to determine which device's config aused the problem.
2) In some cases when the problem came from bad config, the error
message was erroneously classified as VIR_ERR_INTERNAL_ERROR. With
this patch applied, the same error message will be changed to indicate
either "internal" or "xml" error depending on whether the address came
from the config, or was automatically generated by libvirt.
Note that in the case of "internal" (due to bad auto-generation)
errors, the PCI address won't be of much use in finding the location
in config to change (because it was automatically generated). Of
course that makes perfect sense, but still the address could provide a
clue about a bug in libvirt attempting to use a type of pci bus that
doesn't have its flags set correctly (or something similar). In other
words, it's not perfect, but it is definitely better.
q35 machines have an implicit ahci (sata) controller at 00:1F.2 which
has no "id" associated with it. For this reason, we can't refer to it
as "ahci0". Instead, we don't give an id on the commandline, which
qemu interprets as "use the first ahci controller". We then need to
specify the unit with "unit=%d" rather than adding it onto the bus
arg.
https://bugzilla.redhat.com/show_bug.cgi?id=979477
Since 1.0.3 we are using the new way to copy non shared storage during
migration (the NBD way). However, whether the new or old way is used is
not controllable by user but unconditionally turned on if both sides of
migration support it. Moreover, the implementation is not complete: the
combination for VIR_MIGRATE_TUNNELLED flag is missing (as we need to
open new port on the destination) in which case we just error out. This
is a deadly combination: not letting users choose their destiny and
erroring out. We should not do that but VIR_WARN and turn the NBD off
instead.
We had been setting the device alias in the devinceinfo for pci
controllers to "pci%u", but then hardcoding "pci.%u" when creating the
device address for other devices using that pci bus. This all worked
just fine until we encountered the built-in "pcie.0" bus (the PCIe
root complex) in Q35 machines.
In order to create the correct commandline for this one case, this
patch:
1) sets the alias for PCI controllers correctly, to "pci.%u" (or
"pcie.%u" for the pcie-root controller)
2) eliminates the hardcoded "pci.%u" for pci controllers when
generatuing device address strings, and instead uses the controller's
alias.
3) plumbs a pointer to the virDomainDef all the way down to
qemuBuildDeviceAddressStr. This was necessary in order to make the
aliase of the controller *used by a device* available (previously
qemuBuildDeviceAddressStr only had the deviceinfo of the device
itself, *not* of the controller it was connecting to). This made for a
larger than desired diff, but at least in the future we won't have to
do it again, since all the information we could possibly ever need for
future enhancements is in the virDomainDef. (right?)
This should be done for *all* controllers, but for now we just do it
in the case of PCI controllers, to reduce the likelyhood of
regression.
This patch adds in special handling for a few devices that need to be
treated differently for q35 domains:
usb - there is no implicit/default usb controller for the q35
machinetype. This is done because normally the default usb controller
is added to a domain by just adding "-usb" to the qemu commandline,
and it's assumed that this will add a single piix3 usb1 controller at
slot 1 function 2. That's not what happens when the machinetype is
q35, though. Instead, adding -usb to the commandline adds 3 usb
(version 2) controllers to the domain at slot 0x1D.{1,2,7}. Rather
than having
<controller type='usb' index='0'/>
translate into 3 separate devices on the PCI bus, it's cleaner to not
automatically add a default usb device; one can always be added
explicitly if desired. Or we may decide that on q35 machines, 3 usb
controllers will be automatically added when none is given. But for
this initial commit, at least we aren't locking ourselves into
something we later won't want.
video - qemu always initializes the primary video device immediately
after any integrated devices for the machinetype. Unless instructed
otherwise (by using "-device vga..." instead of "-vga" which libvirt
uses in many cases to work around deficiencies and bugs in various
qemu versions) qemu will always pick the first unused slot. In the
case of the "pc" machinetype and its derivatives, this is always slot
2, but on q35 machinetypes, the first free slot is slot 1 (since the
q35's integrated peripheral devices are placed in other slots,
e.g. slot 0x1f). In order to make the PCI address of the video device
predictable, that slot (1 or 2, depending on machinetype) is reserved
even when no video device has been specified.
sata - a q35 machine always has a sata controller implicitly added at
slot 0x1F, function 2. There is no way to avoid this controller, so we
always add it. Note that the xml2xml tests for the pcie-root and q35
cases were changed to use DO_TEST_DIFFERENT() so that we can check for
the sata controller being automatically added. This is especially
important because we can't check for it in the xml2argv output (it has
no effect on that output since it's an implicit device).
ide - q35 has no ide controllers.
isa and smbus controllers - these two are always present in a q35 (at
slot 0x1F functions 0 and 3) but we have no way of modelling them in
our config. We do need to reserve those functions so that the user
doesn't attempt to put anything else there though. (note that the "pc"
machine type also has an ISA controller, which we also ignore).
This PCI controller, named "dmi-to-pci-bridge" in the libvirt config,
and implemented with qemu's "i82801b11-bridge" device, connects to a
PCI Express slot (e.g. one of the slots provided by the pcie-root
controller, aka "pcie.0" on the qemu commandline), and provides 31
*non-hot-pluggable* PCI (*not* PCIe) slots, numbered 1-31.
Any time a machine is defined which has a pcie-root controller
(i.e. any q35-based machinetype), libvirt will automatically add a
dmi-to-pci-bridge controller if one doesn't exist, and also add a
pci-bridge controller. The reasoning here is that any useful domain
will have either an immediate (startup time) or eventual (subsequent
hot-plug) need for a standard PCI slot; since the pcie-root controller
only provides PCIe slots, we need to connect a dmi-to-pci-bridge
controller to it in order to get a non-hot-plug PCI slot that we can
then use to connect a pci-bridge - the slots provided by the
pci-bridge will be both standard PCI and hot-pluggable.
Since pci-bridge devices themselves can not be hot-plugged into a
running system (although you can hot-plug other devices into a
pci-bridge's slots), any new pci-bridge controller that is added can
(and will) be plugged into the dmi-to-pci-bridge as long as it has
empty slots available.
This patch is also changing the qemuxml2xml-pcie test from a "DO_TEST"
to a "DO_DIFFERENT_TEST". This is so that the "before" xml can omit
the automatically added dmi-to-pci-bridge and pci-bridge devices, and
the "after" xml can include it - this way we are testing if libvirt is
properly adding these devices.
This controller is implicit on q35 machinetypes. It provides 31 PCIe
(*not* PCI) slots as controller 0.
Currently there are no devices that can connect to pcie-root, and no
implicit pci controller on a q35 machine, so q35 is still
unusable. For a usable q35 system, we need to add a
"dmi-to-pci-bridge" pci controller, which can connect to pcie-root,
and provides standard pci slots that can be used to connect other
devices.
Previous refactoring of the guest PCI address reservation/allocation
code allowed for slot types other than basic PCI (e.g. PCI express,
non-hotpluggable slots, etc) but would not auto-allocate a slot for a
device that required any type other than a basic hot-pluggable
PCI slot.
This patch refactors the code to be aware of different slot types
during auto-allocation of addresses as well - as long as there is an
empty slot of the required type, it will be found and used.
The piece that *wasn't* added is that we don't auto-create a new PCI
bus when needed for anything except basic PCI devices. This is because
there are multiple different types of controllers that can provide,
for example, a PCI express slot (in addition to the pcie-root
controller, these can also be found on a "root-port" or on a
"downstream-switch-port"). Since we currently don't support any PCIe
devices (except pending support for dmi-to-pci-bridge), we can defer
any decision on what to do about this.
Commit 632180d1 introduced memory corruption in xenDaemonListDefinedDomains
by starting to populate the names array at index -1, causing all sorts
of havoc in libvirtd such as aborts like the following
*** Error in `/usr/sbin/libvirtd': double free or corruption (out): 0x00007fffe00ccf20 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7abf6)[0x7ffff3fa0bf6]
/lib64/libc.so.6(+0x7b973)[0x7ffff3fa1973]
/lib64/libc.so.6(xdr_array+0xde)[0x7ffff403cbae]
/usr/sbin/libvirtd(+0x50251)[0x5555555a4251]
/lib64/libc.so.6(xdr_free+0x15)[0x7ffff403ccd5]
/usr/lib64/libvirt.so.0(+0x1fad34)[0x7ffff76b1d34]
/usr/lib64/libvirt.so.0(virNetServerProgramDispatch+0x1fc)[0x7ffff76b16f1]
/usr/lib64/libvirt.so.0(+0x1f214a)[0x7ffff76a914a]
/usr/lib64/libvirt.so.0(+0x1f222d)[0x7ffff76a922d]
/usr/lib64/libvirt.so.0(+0xbcc4f)[0x7ffff7573c4f]
/usr/lib64/libvirt.so.0(+0xbc5e5)[0x7ffff75735e5]
/lib64/libpthread.so.0(+0x7e0f)[0x7ffff48f7e0f]
/lib64/libc.so.6(clone+0x6d)[0x7ffff400e7dd]
Fix by initializing ret to 0 and only setting to error on failure path.
This configuration knob lets user to set the length of queue of
connection requests waiting to be accept()-ed by the daemon. IOW, it
just controls the @backlog passed to listen:
int listen(int sockfd, int backlog);
Currently, even if max_client limit is hit, we accept() incoming
connection request, but close it immediately. This has disadvantage of
not using listen() queue. We should accept() only those clients we
know we can serve and let all other wait in the (limited) queue.
* The functions qemuDomainPCIAddressReserveAddr and
qemuDomainPCIAddressReserveSlot were very similar (and should have
been more similar) and were about to get more code added to them which
would create even more duplicated code, so this patch gives
qemuDomainPCIAddressReserveAddr a "reserveEntireSlot" arg, then
replaces the body of qemuDomainPCIAddressReserveSlot with a call to
qemuDomainPCIAddressReserveAddr.
You will notice that addrs->lastaddr was previously set in
qemuDomainPCIAddressReserveAddr (but *not* set in
qemuDomainPCIAddressReserveSlot). For consistency and cleanliness of
code, that bit was removed and put into the one caller of
qemuDomainPCIAddressReserveAddr (there is a similar place where the
caller of qemuDomainPCIAddressReserveSlot sets lastaddr). This does
guarantee identical functionality to pre-patch code, but in practice
isn't really critical, because lastaddr is just keeping track of where
to start when looking for a free slot - if it isn't updated, we will
just start looking on a slot that's already occupied, then skip up to
one that isn't.
* qemuCollectPCIAddress was essentially doing the same thing as
qemuDomainPCIAddressReserveAddr, but with some extra special case
checking at the beginning. The duplicate code has been replaced with
a call to qemuDomainPCIAddressReserveAddr. This required adding a
"fromConfig" boolean, which is only used to change the log error
code from VIR_ERR_INTERNAL_ERROR (when the address was
auto-generated by libvirt) to VIR_ERR_XML_ERROR (when the address is
coming from the config); without this differentiation, it would be
difficult to tell if an error was caused by something wrong in
libvirt's auto-allocate code or just bad config.
* the bit of code in qemuDomainPCIAddressValidate that checks the
connect type flags is going to be used in a couple more places where
we don't need to also check the slot limits (because we're generating
the slot number ourselves), so that has been pulled out into a
separate qemuDomainPCIAddressFlagsCompatible function.
* qemuDomainPCIAddressSetNextAddr
The name of this function was confusing because 1) other functions in
the file that end in "Addr" are only operating on a single function of
one PCI slot, not the entire slot, while functions that do something
with the entire slot end in "Slot", and 2) it didn't contain a verb
describing what it is doing (the "Set" refers to the set that contains
all PCI buses in the system, used to keep track of which slots in
which buses are already reserved for use).
It is now renamed to qemuDomainPCIAddressReserveNextSlot, which more
clearly describes what it is doing. Arguably, it could have been
changed to qemuDomainPCIAddressSetReserveNextSlot, but 1) the word
"set" is confusing in this context because it could be intended as a
verb or as a noun, and 2) most other functions that operate on a
single slot or address within this set are also named
qemuDomainPCIAddress... rather than qemuDomainPCIAddressSet... Only
the Create, Free, and Grow functions for an address set (which modify the
entire set, not just one element) use "Set" in their name.
* qemuPCIAddressAsString, qemuPCIAddressValidate
All the other functions in this set are named
qemuDomainPCIAddressxxxxx, so I renamed these to be consistent.
The parser shouldn't be doing arch-specific things like adding in
implicit controllers to the config. This should instead be done in the
hypervisor's post-parse callback.
This patch removes the auto-add of a usb controller from the domain
parser, and puts it into the qemu driver's post-parse callback (just
as is already done with the auto-add of the pci-root controller). In
the future, any machine/arch that shouldn't have a default usb
controller added should just set addDefaultUSB = false in this
function.
We've recently seen that q35 and ARMV7L domains shouldn't get a default USB
controller, so I've set addDefaultUSB to false for both of those.
If upgrading from a libvirt that is older than 1.0.5, we can
not assume that vm->def->resource is non-NULL. This bogus
assumption caused libvirtd to crash
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The journald code would crash if a NULL was passed for the
filename / funcname in the logging code. This shouldn't
happen in general, but it is better to be safe, since there
have been bugs triggering this.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virLibConnError macros in libvirt-lxc.c and
libvirt-qemu.c were passing NULL for the filename.
This causes a crash if the logging code is configured
to use journald.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
* Move platform specific things (e.g. firewalling and route
collision checks) into bridge_driver_platform
* Create two platform specific implementations:
- bridge_driver_linux: Linux implementation using iptables,
it's actually the code moved from bridge_driver.c
- bridge_driver_nop: dumb implementation that does nothing
Signed-off-by: Eric Blake <eblake@redhat.com>
*src/util/virstoragefile.c: Add a helper function to get
the first name of missing backing files, if the name is NULL,
it means the diskchain is not broken.
*src/qemu/qemu_domain.c: qemuDiskChainCheckBroken(disk) to
check if its chain is broken
Refactor this function to make it focus on disk presence checking,
including diskchain checking, and not only for CDROM and Floppy.
This change is good for the following patches.
The virDomainDef is allocated by the caller and also used after
calling to xenDaemonCreateXML. So it must not get freed by the
callee.
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Make the virCgroupNewMachine method try to use systemd-machined
first. If that fails, then fallback to using the traditional
cgroup setup code path.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When systemd is involved in managing processes, it may start
killing off & tearing down croups associated with the process
while we're still doing virCgroupKillPainfully. We must
explicitly check for ENOENT and treat it as if we had finished
killing processes
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Systemd uses a named cgroup mount for tracking processes. Add
it as another type of controller, albeit one which we have to
special case in a number of places. In particular we must
never create/delete directories there, nor add tasks. Essentially
the systemd mount is to be considered read-only for libvirt.
With this change both the virCgroupDetectPlacement and
virCgroupCopyPlacement methods must be invoked. The copy
placement method will copy setup for resource controllers
only. The detect placement method will probe for any
named controllers, or resource controllers not already
setup.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are some interesting escaping rules to consider when dealing
with systemd slice/scope names. Thus it is helpful to have APIs
for formatting names
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Although this isn't apparently needed for the guest agent itself, the
test I will be adding later depends on the newline as a separator of
messages to process.
this patch introduce the console api in libxl driver for both pv and
hvm guest. and import and update the libxlMakeChrdevStr function
which was deleted in commit dfa1e1dd.
Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com>
This function is needed for virt-login-shell. Also modify virGirUserDirectory
to use the new function, to simplify the code.
Signed-off-by: Eric Blake <eblake@redhat.com>
The VIR_DOMAIN_PAUSED_GUEST_PANICKED constant is badly named,
leaking the QEMU event name. Elsewhere in the API we use
'CRASHED' rather than 'PANICKED', and the addition of 'GUEST'
is redundant since all events are guest related.
Thus rename it to VIR_DOMAIN_PAUSED_CRASHED, which matches
with VIR_DOMAIN_RUNNING_CRASHED and VIR_DOMAIN_EVENT_CRASHED.
It was added in commit 14e7e0ae8d
which post-dates v1.1.0, so is safe to rename before 1.1.1
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The VIR_DOMAIN_SHUTDOWN_CRASHED state constant does not appear
to be used in the QEMU code anyway. It also doesn't make much
(any) sense, since the 'shutdown' state is a transient state
between 'running' and 'shutoff' and when a guest crashes, it
does not end up in a 'shutdown' state, only 'shutoff'.
It was added in commit 14e7e0ae8d
which post-dates v1.1.0, so is safe to remove before 1.1.1
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The way we were casting small (<32bit) integers was broken
on big endian hosts, causing stack smashing. This was detected
in the test suite either by test failures due to incorrect
results, or by libc/gcc abort'ing with its stack canary
triggered.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Depending on the set of mingw packages installed, it is possible
that other .c files hit the mingw header pollution from the
virdbus.h file.
In file included from ../../src/rpc/virnetserver.c:39:0:
../../src/util/virdbus.h:41:35: error: expected ';', ',' or ')' before 'struct'
const char *interface,
^
* src/util/virdbus.h (virDBusCallMethod): Match .c file change.
Signed-off-by: Eric Blake <eblake@redhat.com>
On platforms without decent group support, the build failed:
Cannot export virGetGroupList: symbol not defined
./.libs/libvirt_security_manager.a(libvirt_security_manager_la-security_dac.o): In function `virSecurityDACPreFork':
/home/eblake/libvirt-tmp/build/src/../../src/security/security_dac.c:248: undefined reference to `virGetGroupList'
collect2: error: ld returned 1 exit status
* src/util/virutil.c (virGetGroupList): Provide dummy implementation.
Signed-off-by: Eric Blake <eblake@redhat.com>
Our recent conversion to make VIR_ALLOC report oom wasn't
tested on mingw:
In file included from ../../src/util/virthread.c:29:0:
../../src/util/virthreadwin32.c: In function 'virCondWait':
../../src/util/virthreadwin32.c:166:81: error: 'VIR_FROM_THIS' undeclared (first use in this function)
if (VIR_REALLOC_N(c->waiters, c->nwaiters + 1) < 0) {
^
* src/util/virthreadwin32.c (VIR_FROM_THIS): Define.
Signed-off-by: Eric Blake <eblake@redhat.com>
The previous patch was incomplete.
CC libvirt_util_la-vircgroup.lo
../../src/util/vircgroup.c:70:12: error: 'virCgroupPartitionEscape' declared 'static' but never defined [-Werror=unused-function]
static int virCgroupPartitionEscape(char **path);
^
* src/util/vircgroup.c (virCgroupPartitionEscape): Move forward
declaration inside conditional.
Signed-off-by: Eric Blake <eblake@redhat.com>
The virCgroupValidateMachineGroup method calls some functions
which are only conditionally compiled, thus it too must be
made conditional. This fixes the build on non-Linux hosts.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A VPATH build 'make check' was failing with:
GEN check-driverimpls
Can't open ../../src/../../src/lxc/lxc_monitor_protocol.h: No such file or directory at ../../src/check-driverimpls.pl line 29, <> line 27153.
Can't open ../../src/../../src/lxc/lxc_monitor_protocol.c: No such file or directory at ../../src/check-driverimpls.pl line 29, <> line 27153.
...
GEN check-aclrules
cannot read ../../src/../../src/remote/remote_protocol.x at ../../src/check-aclrules.pl line 128.
because $(srcdir) was being prepended to file names that already
included it.
* src/Makefile.am (check-driverimpls): Don't add srcdir twice.
Signed-off-by: Eric Blake <eblake@redhat.com>
When the legacy Xen driver probes with a NULL URI, and
finds itself running on Xen, it will set conn->uri. A
little bit later though it checks to see if libxl support
exists, and if so declines the driver. This leaves the
conn->uri set to 'xen:///', so if libxl also declines
it, it prevents probing of the QEMU driver.
Once a driver has set the conn->uri, it must *never*
decline an open request. So we must move the libxl
check earlier
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=981094
The commit 0ad9025ef introduce qemu flag QEMU_CAPS_DEVICE_VIDEO_PRIMARY
for using -device VGA, -device cirrus-vga, -device vmware-svga and
-device qxl-vga. In use, for -device qxl-vga, mouse doesn't display
in guest window like the desciption in above bug.
This patch try to use -device for primary video when qemu >=1.6 which
contains the bug fix patch
Otherwise, with new enough gcc compiling at -O2, the build fails with:
../../src/conf/domain_conf.c: In function ‘virDomainDeviceDefPostParse’:
../../src/conf/domain_conf.c:2821:29: error: ‘cnt’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
for (i = 0; i < *cnt; i++) {
^
../../src/conf/domain_conf.c:2795:20: note: ‘cnt’ was declared here
size_t i, *cnt;
^
../../src/conf/domain_conf.c:2794:30: error: ‘arrPtr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
virDomainChrDefPtr **arrPtr;
^
* src/conf/domain_conf.c (virDomainChrGetDomainPtrs): Always
assign into output parameters.
Signed-off-by: Eric Blake <eblake@redhat.com>
By setting the default partition in libvirt_lxc it is not
visible when querying the live XML. Move setting of the
default partition into libvirtd virLXCProcessStart
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Decrementing it when it was already 0 causes an invalid free
in virNetworkDefUpdateDNSHost if virNetworkDNSHostDefParseXML
fails and virNetworkDNSHostDefClear gets called twice.
virNetworkForwardDefClear left the number untouched even if it
freed all the elements.
If the app has provided a whitelist of controllers to be used,
we skip detecting its mount point. We still, however, fill in
the placement info which later confuses the machine name
validation code. Skip detecting placement if the controller
mount point is not set
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When a VM has an 'emulator' child cgroup present, we must
strip off that suffix when detecting the cgroup for a
machine
Rename the virCgroupIsValidMachineGroup method to
virCgroupValidateMachineGroup to make a bit clearer
that this isn't simply a boolean check, it will make
changes to the object.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupIsValidMachine does not need to be called from
outside the cgroups file now, so make it static.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of requiring drivers to use a combination of calls
to virCgroupNewDetect and virCgroupIsValidMachine, combine
the two into virCgroupNewDetectMachine
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Both virStoragePoolFree and virStorageVolFree reset the last error,
which might lead to the cryptic message:
An error occurred, but the cause is unknown
When the volume wasn't found, virStorageVolFree was called with NULL,
leading to an error:
invalid storage volume pointer in virStorageVolFree
This patch changes it to:
Storage volume not found: no storage vol with matching name 'tomato'
Add protection such that the virCgroupRemove and
virCgroupKill* do not do anything to the root cgroup.
Killing all PIDs in the root cgroup does not end well.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of requiring one API call to create a cgroup and
another to add a task to it, introduce a new API
virCgroupNewMachine which does both jobs at once. This
will facilitate the later code to talk to systemd to
achieve this job which is also atomic.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There's a race in lxc driver causing a deadlock. If a domain is
destroyed immediately after started, the deadlock can occur. When domain
is started, the even loop tries to connect to the monitor. If the
connecting succeeds, virLXCProcessMonitorInitNotify() is called with
@mon->client locked. The first thing that callee does, is
virObjectLock(vm). So the order of locking is: 1) @mon->client, 2) @vm.
However, if there's another thread executing virDomainDestroy on the
very same domain, the first thing done here is locking the @vm. Then,
the corresponding libvirt_lxc process is killed and monitor is closed
via calling virLXCMonitorClose(). This callee tries to lock @mon->client
too. So the order is reversed to the first case. This situation results
in deadlock and unresponsive libvirtd (since the eventloop is involved).
The proper solution is to unlock the @vm in virLXCMonitorClose prior
entering virNetClientClose(). See the backtrace as follows:
Thread 25 (Thread 0x7f1b7c9b8700 (LWP 16312)):
0 0x00007f1b80539714 in __lll_lock_wait () from /lib64/libpthread.so.0
1 0x00007f1b8053516c in _L_lock_516 () from /lib64/libpthread.so.0
2 0x00007f1b80534fbb in pthread_mutex_lock () from /lib64/libpthread.so.0
3 0x00007f1b82a637cf in virMutexLock (m=0x7f1b3c0038d0) at util/virthreadpthread.c:85
4 0x00007f1b82a4ccf2 in virObjectLock (anyobj=0x7f1b3c0038c0) at util/virobject.c:320
5 0x00007f1b82b861f6 in virNetClientCloseInternal (client=0x7f1b3c0038c0, reason=3) at rpc/virnetclient.c:696
6 0x00007f1b82b862f5 in virNetClientClose (client=0x7f1b3c0038c0) at rpc/virnetclient.c:721
7 0x00007f1b6ee12500 in virLXCMonitorClose (mon=0x7f1b3c007210) at lxc/lxc_monitor.c:216
8 0x00007f1b6ee129f0 in virLXCProcessCleanup (driver=0x7f1b68100240, vm=0x7f1b680ceb70, reason=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:174
9 0x00007f1b6ee14106 in virLXCProcessStop (driver=0x7f1b68100240, vm=0x7f1b680ceb70, reason=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:710
10 0x00007f1b6ee1aa36 in lxcDomainDestroyFlags (dom=0x7f1b5c002560, flags=0) at lxc/lxc_driver.c:1291
11 0x00007f1b6ee1ab1a in lxcDomainDestroy (dom=0x7f1b5c002560) at lxc/lxc_driver.c:1321
12 0x00007f1b82b05be5 in virDomainDestroy (domain=0x7f1b5c002560) at libvirt.c:2303
13 0x00007f1b835a7e85 in remoteDispatchDomainDestroy (server=0x7f1b857419d0, client=0x7f1b8574ae40, msg=0x7f1b8574acf0, rerr=0x7f1b7c9b7c30, args=0x7f1b5c004a50) at remote_dispatch.h:3143
14 0x00007f1b835a7d78 in remoteDispatchDomainDestroyHelper (server=0x7f1b857419d0, client=0x7f1b8574ae40, msg=0x7f1b8574acf0, rerr=0x7f1b7c9b7c30, args=0x7f1b5c004a50, ret=0x7f1b5c0029e0) at remote_dispatch.h:3121
15 0x00007f1b82b93704 in virNetServerProgramDispatchCall (prog=0x7f1b8573af90, server=0x7f1b857419d0, client=0x7f1b8574ae40, msg=0x7f1b8574acf0) at rpc/virnetserverprogram.c:435
16 0x00007f1b82b93263 in virNetServerProgramDispatch (prog=0x7f1b8573af90, server=0x7f1b857419d0, client=0x7f1b8574ae40, msg=0x7f1b8574acf0) at rpc/virnetserverprogram.c:305
17 0x00007f1b82b8c0f6 in virNetServerProcessMsg (srv=0x7f1b857419d0, client=0x7f1b8574ae40, prog=0x7f1b8573af90, msg=0x7f1b8574acf0) at rpc/virnetserver.c:163
18 0x00007f1b82b8c1da in virNetServerHandleJob (jobOpaque=0x7f1b8574dca0, opaque=0x7f1b857419d0) at rpc/virnetserver.c:184
19 0x00007f1b82a64158 in virThreadPoolWorker (opaque=0x7f1b8573cb10) at util/virthreadpool.c:144
20 0x00007f1b82a63ae5 in virThreadHelper (data=0x7f1b8574b9f0) at util/virthreadpthread.c:161
21 0x00007f1b80532f4a in start_thread () from /lib64/libpthread.so.0
22 0x00007f1b7fc4f20d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f1b83546740 (LWP 16297)):
0 0x00007f1b80539714 in __lll_lock_wait () from /lib64/libpthread.so.0
1 0x00007f1b8053516c in _L_lock_516 () from /lib64/libpthread.so.0
2 0x00007f1b80534fbb in pthread_mutex_lock () from /lib64/libpthread.so.0
3 0x00007f1b82a637cf in virMutexLock (m=0x7f1b680ceb80) at util/virthreadpthread.c:85
4 0x00007f1b82a4ccf2 in virObjectLock (anyobj=0x7f1b680ceb70) at util/virobject.c:320
5 0x00007f1b6ee13bd7 in virLXCProcessMonitorInitNotify (mon=0x7f1b3c007210, initpid=4832, vm=0x7f1b680ceb70) at lxc/lxc_process.c:601
6 0x00007f1b6ee11fd3 in virLXCMonitorHandleEventInit (prog=0x7f1b3c001f10, client=0x7f1b3c0038c0, evdata=0x7f1b8574a7d0, opaque=0x7f1b3c007210) at lxc/lxc_monitor.c:109
7 0x00007f1b82b8a196 in virNetClientProgramDispatch (prog=0x7f1b3c001f10, client=0x7f1b3c0038c0, msg=0x7f1b3c003928) at rpc/virnetclientprogram.c:259
8 0x00007f1b82b87030 in virNetClientCallDispatchMessage (client=0x7f1b3c0038c0) at rpc/virnetclient.c:1019
9 0x00007f1b82b876bb in virNetClientCallDispatch (client=0x7f1b3c0038c0) at rpc/virnetclient.c:1140
10 0x00007f1b82b87d41 in virNetClientIOHandleInput (client=0x7f1b3c0038c0) at rpc/virnetclient.c:1312
11 0x00007f1b82b88f51 in virNetClientIncomingEvent (sock=0x7f1b3c0044e0, events=1, opaque=0x7f1b3c0038c0) at rpc/virnetclient.c:1832
12 0x00007f1b82b9e1c8 in virNetSocketEventHandle (watch=3321, fd=54, events=1, opaque=0x7f1b3c0044e0) at rpc/virnetsocket.c:1695
13 0x00007f1b82a272cf in virEventPollDispatchHandles (nfds=21, fds=0x7f1b8574ded0) at util/vireventpoll.c:498
14 0x00007f1b82a27af2 in virEventPollRunOnce () at util/vireventpoll.c:645
15 0x00007f1b82a25a61 in virEventRunDefaultImpl () at util/virevent.c:273
16 0x00007f1b82b8e97e in virNetServerRun (srv=0x7f1b857419d0) at rpc/virnetserver.c:1097
17 0x00007f1b8359db6b in main (argc=2, argv=0x7ffff98dbaa8) at libvirtd.c:1512
The avail_vcpu bitmap has to be allocated before it can be used (using
the maximum allowed value for that). Then for each available VCPU the
bit in the mask has to be set (libxl_bitmap_set takes a bit position
as an argument, not the number of bits to set).
Without this, I would always only get one VCPU for guests created
through libvirt/libxl.
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>