The virCgroupIsValidMachine does not need to be called from
outside the cgroups file now, so make it static.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of requiring drivers to use a combination of calls
to virCgroupNewDetect and virCgroupIsValidMachine, combine
the two into virCgroupNewDetectMachine
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add protection such that the virCgroupRemove and
virCgroupKill* do not do anything to the root cgroup.
Killing all PIDs in the root cgroup does not end well.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of requiring one API call to create a cgroup and
another to add a task to it, introduce a new API
virCgroupNewMachine which does both jobs at once. This
will facilitate the later code to talk to systemd to
achieve this job which is also atomic.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
virCgroupAvailable() implementation calls getmntent_r
without checking if HAVE_GETMNTENT_R is defined, so it fails
to build on platforms without getmntent_r support.
Make virCgroupAvailable() just return false without
HAVE_GETMNTENT_R.
Parsing 'user:group' is useful even outside the DAC security driver,
so expose the most abstract function which has no DAC security driver
bits in itself.
Thanks to a lack of coordination between kernel and glibc folks,
it has been impossible to mix code using <linux/in.h> and
<net/in.h> for some time now (see for example commit c308a9a).
On at least RHEL 6, <linux/if_bridge.h> tries to use the kernel
side, and fails due to our desire to use the glibc side elsewhere:
In file included from /usr/include/linux/if_bridge.h:17,
from util/virnetdevbridge.c:42:
/usr/include/linux/in6.h:31: error: redefinition of ‘struct in6_addr’
/usr/include/linux/in6.h:48: error: redefinition of ‘struct sockaddr_in6’
/usr/include/linux/in6.h:56: error: redefinition of ‘struct ipv6_mreq’
Thankfully, the kernel layout of these structs is ABI-compatible,
they only differ in the type system presented to the C compiler.
While there are other versions of kernel headers that avoid the
problem, it is easier to just work around the issue than to expect
all developers to upgrade to working kernel headers.
* src/util/virnetdevbridge.c (includes): Coerce the kernel version
of in.h to not collide with the normal version.
Signed-off-by: Eric Blake <eblake@redhat.com>
dbus 1.2.24 (on RHEL 6) lacks DBUS_TYPE_UNIX_FD; but as we aren't
trying to pass one of those anyways, we can just drop support for
it in our wrapper. Solves this build error introduced in commit
834c9c94:
CC libvirt_util_la-virdbus.lo
util/virdbus.c:242: error: 'DBUS_TYPE_UNIX_FD' undeclared here (not in a function)
* src/util/virdbus.c (virDBusBasicTypes): Drop support for unix fds.
Signed-off-by: Eric Blake <eblake@redhat.com>
The virCgroupNewDomainDriver and virCgroupNewDriver methods
are obsolete now that we can auto-detect existing cgroup
placement. Delete them to reduce code bloat.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add virCgroupIsValidMachine API to check whether an auto
detected cgroup is valid for a machine. This lets us
check if a VM has just been placed into some generic
shared cgroup, or worse, the root cgroup
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a virCgroupNewDetect API which is used to initialize a
cgroup object with the placement of an arbitrary process.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If systemd machine does not exist, return -2 instead of -1,
so that applications don't need to repeat the tedious error
checking code
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Current code for handling dbus errors only works for errors
received from the remote application itself. We must also
handle errors emitted by the bus itself, for example, when
it fails to spawn the target service.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 834c9c94 introduced virDBusMessageEncode and
virDBusMessageDecode functions, however corresponding stubs
were not added to !WITH_DBUS section, therefore 'make check'
started to fail when compiled w/out dbus support like that:
Expected symbol virDBusMessageDecode is not in ELF library
Convert the remaining methods in vircgroup.c to report errors
instead of returning errno values.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add virErrorSetErrnoFromLastError and virLastErrorIsSystemErrno
to simplify code which wants to handle system errors in a more
graceful fashion.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To register virtual machines and containers with systemd-machined,
and thus have cgroups auto-created, we need to talk over DBus.
This is somewhat tedious code, so introduce a dedicated function
to isolate the DBus call in one place.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Doing DBus method calls using libdbus.so is tedious in the
extreme. systemd developers came up with a nice high level
API for DBus method calls (sd_bus_call_method). While
systemd doesn't use libdbus.so, their API design can easily
be ported to libdbus.so.
This patch thus introduces methods virDBusCallMethod &
virDBusMessageRead, which are based on the code used for
sd_bus_call_method and sd_bus_message_read. This code in
systemd is under the LGPLv2+, so we're license compatible.
This code is probably pretty unintelligible unless you are
familiar with the DBus type system. So I added some API
docs trying to explain how to use them, as well as test
cases to validate that I didn't screw up the adaptation
from the original systemd code.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When virAsprintf was changed from a function to a macro
reporting OOM error in dc6f2da, it was documented as returning
0 on success. This is incorrect, it returns the number of bytes
written as asprintf does.
Some of the functions were converted to use virAsprintf's return
value directly, changing the return value on success from 0 to >= 0.
For most of these, this is not a problem, but the change in
virPCIDriverDir breaks PCI passthrough.
The return value check in virhashtest pre-dates virAsprintf OOM
conversion.
vmwareMakePath seems to be unused.
Merge the virCommandPreserveFD / virCommandTransferFD methods
into a single virCommandPasFD method, and use a new
VIR_COMMAND_PASS_FD_CLOSE_PARENT to indicate their difference
in behaviour
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reuse the buffer for getline and track buffer allocation
separately from the string length to prevent unlikely
out-of-bounds memory access.
This fixes the following leak that happened when zero bytes were read:
==404== 120 bytes in 1 blocks are definitely lost in loss record 1,344 of 1,671
==404== at 0x4C2C71B: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==404== by 0x906F862: getdelim (iogetdelim.c:68)
==404== by 0x52A48FB: virCgroupPartitionNeedsEscaping (vircgroup.c:1136)
==404== by 0x52A0FB4: virCgroupPartitionEscape (vircgroup.c:1171)
==404== by 0x52A0EA4: virCgroupNewDomainPartition (vircgroup.c:1450)
I recently patches the callers to virPCIDeviceReset() to not call it
if the current driver for a device was vfio-pci (since that driver
will always reset the device itself when appropriate. At the time, Dan
Berrange suggested that I could instead modify virPCIDeviceReset
to check the currently bound driver for the device, and decide
for itself whether or not to go ahead with the reset.
This patch removes the previously added checks, and replaces them with
a check down in virPCIDeviceReset(), as suggested.
The functional difference here is that previously we were deciding
based on either the hostdev configuration or the value of
stubDriverName in the virPCIDevice object, but now we are actually
comparing to the "driver" link in the device's sysfs entry
directly. In practice, both should be the same.
virPCIDeviceGetDriverPathAndName is a static function that will need
to be called by another function that occurs above it in the
file. This patch reorders the static functions so that a forward
declaration isn't needed.
Previously a connection object was required to retrieve the auth
credentials. This patch adds the option to call the retrieval functions
only using the connection URI or path to the configuration file. This
will allow to use this toolkit to request passwords for ssh
authentication in the libssh2 connection driver.
Changes:
*virAuthGetConfigFilePathURI(): use URI to retrieve the config file path
*virAuthGetCredential(): Remove the need to propagate conn object
virAuthGetPasswordPath():
*virAuthGetUsernamePath(): New functions, that use config file path
instead of conn object
https://bugzilla.redhat.com/show_bug.cgi?id=964358
POSIX states that multi-threaded apps should not use functions
that are not async-signal-safe between fork and exec, yet we
were using getpwuid_r and initgroups. Although rare, it is
possible to hit deadlock in the child, when it tries to grab
a mutex that was already held by another thread in the parent.
I actually hit this deadlock when testing multiple domains
being started in parallel with a command hook, with the following
backtrace in the child:
Thread 1 (Thread 0x7fd56bbf2700 (LWP 3212)):
#0 __lll_lock_wait ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1 0x00007fd5761e7388 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00007fd5761e7257 in __pthread_mutex_lock (mutex=0x7fd56be00360)
at pthread_mutex_lock.c:61
#3 0x00007fd56bbf9fc5 in _nss_files_getpwuid_r (uid=0, result=0x7fd56bbf0c70,
buffer=0x7fd55c2a65f0 "", buflen=1024, errnop=0x7fd56bbf25b8)
at nss_files/files-pwd.c:40
#4 0x00007fd575aeff1d in __getpwuid_r (uid=0, resbuf=0x7fd56bbf0c70,
buffer=0x7fd55c2a65f0 "", buflen=1024, result=0x7fd56bbf0cb0)
at ../nss/getXXbyYY_r.c:253
#5 0x00007fd578aebafc in virSetUIDGID (uid=0, gid=0) at util/virutil.c:1031
#6 0x00007fd578aebf43 in virSetUIDGIDWithCaps (uid=0, gid=0, capBits=0,
clearExistingCaps=true) at util/virutil.c:1388
#7 0x00007fd578a9a20b in virExec (cmd=0x7fd55c231f10) at util/vircommand.c:654
#8 0x00007fd578a9dfa2 in virCommandRunAsync (cmd=0x7fd55c231f10, pid=0x0)
at util/vircommand.c:2247
#9 0x00007fd578a9d74e in virCommandRun (cmd=0x7fd55c231f10, exitstatus=0x0)
at util/vircommand.c:2100
#10 0x00007fd56326fde5 in qemuProcessStart (conn=0x7fd53c000df0,
driver=0x7fd55c0dc4f0, vm=0x7fd54800b100, migrateFrom=0x0, stdin_fd=-1,
stdin_path=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE,
flags=1) at qemu/qemu_process.c:3694
...
The solution is to split the work of getpwuid_r/initgroups into the
unsafe portions (getgrouplist, called pre-fork) and safe portions
(setgroups, called post-fork).
* src/util/virutil.h (virSetUIDGID, virSetUIDGIDWithCaps): Adjust
signature.
* src/util/virutil.c (virSetUIDGID): Add parameters.
(virSetUIDGIDWithCaps): Adjust clients.
* src/util/vircommand.c (virExec): Likewise.
* src/util/virfile.c (virFileAccessibleAs, virFileOpenForked)
(virDirCreate): Likewise.
* src/security/security_dac.c (virSecurityDACSetProcessLabel):
Likewise.
* src/lxc/lxc_container.c (lxcContainerSetID): Likewise.
* configure.ac (AC_CHECK_FUNCS_ONCE): Check for setgroups, not
initgroups.
Signed-off-by: Eric Blake <eblake@redhat.com>
Since neither getpwuid_r() nor initgroups() are safe to call in
between fork and exec (they obtain a mutex, but if some other
thread in the parent also held the mutex at the time of the fork,
the child will deadlock), we have to split out the functionality
that is unsafe. At least glibc's initgroups() uses getgrouplist
under the hood, so the ideal split is to expose getgrouplist for
use before a fork. Gnulib already gives us a nice wrapper via
mgetgroups; we wrap it once more to look up by uid instead of name.
* bootstrap.conf (gnulib_modules): Add mgetgroups.
* src/util/virutil.h (virGetGroupList): New declaration.
* src/util/virutil.c (virGetGroupList): New function.
* src/libvirt_private.syms (virutil.h): Export it.
Signed-off-by: Eric Blake <eblake@redhat.com>
A future patch needs to look up pw_gid; but it is wasteful
to crawl through getpwuid_r twice for two separate pieces
of information, and annoying to copy that much boilerplate
code for doing the crawl. The current internal-only
virGetUserEnt is also a rather awkward interface; it's easier
to just design it to let callers request multiple pieces of
data as needed from one traversal.
And while at it, I noticed that virGetXDGDirectory could deref
NULL if the getpwuid_r lookup fails.
* src/util/virutil.c (virGetUserEnt): Alter signature.
(virGetUserDirectory, virGetXDGDirectory, virGetUserName): Adjust
callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
Recent changes uncovered a NEGATIVE_RETURNS in the return from sysconf()
when processing a for loop in virtTestCaptureProgramExecChild() in
testutils.c
Code review uncovered 3 other code paths with the same condition that
weren't found by Covirity, so fixed those as well.
I had made the change locally, so make check and make syntax-check
were successful, but forgot to add/commit. Unfortunately, git allows a
push when the local directory is dirty, so it didn't catch my mistake.
Eliminate memmove() by using VIR_*_ELEMENT API instead.
In both pci and usb cases, the count that held the size of the list
was unsigned int so it had to be changed to size_t.
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Actually, I'm turning this function into a macro as filename,
function name and line number needs to be passed. The new
function virAsprintfInternal is introduced with the extended set
of arguments.
The device bus value was used instead of the device target when
building the sysfs device path. Trivial.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Whenever virPortAllocatorRelease is called with port == 0, it complains
that the port is not in an allowed range, which is expectable as the
port was never allocated. Let's make virPortAllocatorRelease ignore 0
ports in a similar way free() ignores NULL pointers.
When removing a TAP device, the associated bandwidth settings are
removed. Currently, the /sbin/tc is used for that. It is spawned
several times. Moreover, we use the same @cmd variable to
construct the command and its arguments. That means we need to
virCommandFree(cmd); prior to each virCommandNew(TC); which
wasn't done.
On Fedora 18, when cross-compiling to mingw with the mingw*-dbus
packages installed, compilation fails with:
CC libvirt_net_rpc_server_la-virnetserver.lo
In file included from /usr/i686-w64-mingw32/sys-root/mingw/include/dbus-1.0/dbus/dbus-connection.h:32:0,
from /usr/i686-w64-mingw32/sys-root/mingw/include/dbus-1.0/dbus/dbus-bus.h:30,
from /usr/i686-w64-mingw32/sys-root/mingw/include/dbus-1.0/dbus/dbus.h:31,
from ../../src/util/virdbus.h:26,
from ../../src/rpc/virnetserver.c:39:
/usr/i686-w64-mingw32/sys-root/mingw/include/dbus-1.0/dbus/dbus-message.h:74:58: error: expected ';', ',' or ')' before 'struct'
I have reported this as a bug against two packages:
- mingw-headers, for polluting the namespace
https://bugzilla.redhat.com/show_bug.cgi?id=980270
- dbus, for not dealing with the pollution
https://bugzilla.redhat.com/show_bug.cgi?id=980278
At least dbus has agreed that a future version of dbus headers will
do s/interface/iface/, regardless of what happens in mingw. But it
is also easy to workaround in libvirt in the meantime, without having
to wait for either mingw or dbus to upgrade.
* src/util/virdbus.h (includes): Undo mingw's pollution so that
dbus doesn't fail.
Signed-off-by: Eric Blake <eblake@redhat.com>
iptablesContext holds only 4 pairs of iptables
(table, chain) and there's no need to pass
it around.
This is a first step towards separating bridge_driver.c
in platform-specific parts.
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=971325
The problem was that if virPCIGetVirtualFunctions was given the name
of a non-existent interface, it would return to its caller without
initializing the pointer to the array of virtual functions to NULL,
and the caller (virNetDevGetVirtualFunctions) would try to VIR_FREE()
the invalid pointer.
The final error message before the crash would be:
virPCIGetVirtualFunctions:2088 :
Failed to open dir '/sys/class/net/eth2/device':
No such file or directory
In this patch I move the initialization in virPCIGetVirtualFunctions()
to the begining of the function, and also do an explicit
initialization in virNetDevGetVirtualFunctions, just in case someone
in the future adds code into that function prior to the call to
virPCIGetVirtualFunctions.
The IF_MAXUNIT macro is not present on all BSDs, so
make its use conditional, to avoid breaking OS-X.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The 'in_addr_t' typedef is not present in Mingw64 headers.
Instead we can use the more portable 'struct in_addr' and
then access its 's_addr' field.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When creating a virtual FC HBA with virsh/libvirt API, an error message
will be returned: "error: Node device not found",
also the 'nodedev-dumpxml' shows wrong information of wwpn & wwnn
for the new created device.
Signed-off-by: xschen@tnsoft.com.cn
This reverts f90af69 which switched wwpn & wwwn in the wrong place.
https://www.kernel.org/doc/Documentation/scsi/scsi_fc_transport.txt
Building on FreeBSD had this linker error:
/work/a/ports/devel/libvirt/work/libvirt-1.1.0/src/.libs/libvirt.so:
undefined reference to `virPCIDeviceAddressParse'
This was caused by the new use of virPCIDeviceAddressParse in a
portion of virpci.c that wasn't linux-only (in commit 72c029d8). The
problem was that virPCIDeviceAddressParse had originally been defined
inside #ifdef _linux (because it was only used by another function
that was inside the same ifdef).
The solution is to move it out to the part of virpci.c that is
compiled on all platforms.
(Because the portion that was "moved" was 40-50 lines, but only moved
up by 15 lines, the diff for the patch is less than non-informative -
rather than showing that part that I moved, it shows the bit that was
previously before the moved part, and now sits *after* it.)
Any device which belongs to an "IOMMU group" (used by vfio) will
have links to all devices of its group listed in
/sys/bus/pci/$device/iommu_group/devices;
/sys/bus/pci/$device/iommu_group is actually a link to
/sys/kernel/iommu_groups/$n, where $n is the group number (there
will be a corresponding device node at /dev/vfio/$n once the
devices are bound to the vfio-pci driver)
The following functions are added:
virPCIDeviceGetIOMMUGroupList
Gets a virPCIDeviceList with one virPCIDeviceList for each device
in the same IOMMU group as the provided virPCIDevice (a copy of the
original device object is included in the list.
virPCIDeviceAddressIOMMUGroupIterate
Calls the function @actor once for each device in the group that
contains the given virPCIDeviceAddress.
virPCIDeviceAddressGetIOMMUGroupAddresses
Fills in a virPCIDeviceAddressPtr * with an array of
virPCIDeviceAddress, one for each device in the iommu group of the
provided virPCIDeviceAddress (including a copy of the original).
virPCIDeviceAddressGetIOMMUGroupNum
Returns the group number as an int (a valid group number will always
be 0 or greater). If there is no iommu_group link in the device's
directory (usually indicating that vfio isn't loaded), -2 will be
returned. On any real error, -1 will be returned.
We only break out of the while loop if *content is an empty string.
However the buffer has been allocated to BUFSIZ + 1 (8193 in my case),
but it gets overwritten in the next for iteration.
Move VIR_FREE right before we overwrite it to avoid the leak.
==5777== 16,386 bytes in 2 blocks are definitely lost in loss record 1,022 of 1,027
==5777== by 0x5296E28: virReallocN (viralloc.c:184)
==5777== by 0x52B0C66: virFileReadLimFD (virfile.c:1137)
==5777== by 0x52B0E1A: virFileReadAll (virfile.c:1199)
==5777== by 0x529B092: virCgroupGetValueStr (vircgroup.c:534)
==5777== by 0x529AF64: virCgroupMoveTask (vircgroup.c:1079)
Introduced by 83e4c77.
https://bugzilla.redhat.com/show_bug.cgi?id=978352
Don't check for '\n' at the end of file if zero bytes were read.
Found by valgrind:
==404== Invalid read of size 1
==404== at 0x529B09F: virCgroupGetValueStr (vircgroup.c:540)
==404== by 0x529AF64: virCgroupMoveTask (vircgroup.c:1079)
==404== by 0x1EB475: qemuSetupCgroupForEmulator (qemu_cgroup.c:1061)
==404== by 0x1D9489: qemuProcessStart (qemu_process.c:3801)
==404== by 0x18557E: qemuDomainObjStart (qemu_driver.c:5787)
==404== by 0x190FA4: qemuDomainCreateWithFlags (qemu_driver.c:5839)
Introduced by 0d0b409.
https://bugzilla.redhat.com/show_bug.cgi?id=978356
The "fix" I pushed a few commits ago would still leak a virPCIDevice
in case of an OOM error. Although it's inconsequential in practice,
this patch satisfies my OCD.
The same strings were being re-created multiple times just to save
declaring a new variable. In the meantime, the use of the generic
variable names led to confusion when trying to follow the code. This
patch creates strings for:
stubDriverName (was called "driver" in original args)
stubDriverPath ("/sys/bus/pci/drivers/${stubDriverName}")
driverLink ("${device}/driver")
oldDriverName (the final component of path linked to by
"${device}/driver")
oldDriverPath ("/sys/bus/pci/drivers/${oldDriverName}")
then re-uses them as necessary.
I realized after the fact that it's probably better in the long run to
give this function a name that matches the name of the link used in
sysfs to hold the group (iommu_group).
I'm changing it now because I'm about to add several more functions
that deal with iommu groups.
The driver arg to virPCIDeviceDetach is no longer used (the name of the stub driver is now set in the virPCIDevice object, and virPCIDeviceDetach retrieves it from there). Remove it.
Commit 861d40565 added code (my personal change to "clean up" the
submitter's code, *not* the fault of the submitter) that dereferenced
virtVlan without first checking for NULL. This patch fixes that and,
as part of the fix, cleans up some unnecessary obtuseness.
virNetDevBridgeSetSTPDelay accepts delay in milliseconds,
but BSD implementation was expecting seconds. Therefore,
it was working correctly only with delay == 0.
This patch adds functionality to allow libvirt to configure the
'native-tagged' and 'native-untagged' modes on openvswitch networks.
Signed-off-by: Laine Stump <laine@redhat.com>
All APIs that take typed parameters are only using params address in
their entry point debug messages. With the new VIR_TYPED_PARAMS_DEBUG
macro, all functions can easily log all individual typed parameters
passed to them.
When unsupported parameter is passed to virTypedParamsValidate,
VIR_ERR_ARGUMENT_UNSUPPORTED should be returned rather than
VIR_ERR_INVALID_ARG, which is more appropriate for supported parameters
used incorrectly.
virPCIDeviceDetach would previously sometimes consume the input device
object (to put it on the inactive list) and sometimes not. Avoiding
memory leaks required checking beforehand to see if the device was
already on the list, and freeing the device object in the caller only
if there wasn't already an identical object on the inactive list.
This patch makes it consistent - virPCIDeviceDetach will *never*
consume the input virPCIDevice object; if it needs to put one on the
inactive list, it will create a copy and put *that* on the list. This
way the caller knows that it is always their responsibility to free
the device object they created.
virPCIDeviceReattach was making the assumption that the dev object
given to it was one and the same with the dev object on the
inactiveDevs list. If that had been the case, it would not need to
free the dev object it removed from the inactive list, because the
caller of virPCIDeviceReattach always frees the dev object that it
passes in. Since the dev object passed in is *never* the same object
that's on the list (it is a different object with the same name and
attributes, created just for the purpose of searching for the actual
object), simply doing a "ListSteal" to remove the object from the list
results in one leaked object; we need to actually free the object
after removing it from the list.
* virPCIDeviceFindByIDs - find a device on a list w/o creating an object
This makes searching for an existing device on a list lighter weight.
* virPCIDeviceCopy - make a copy of an existing virPCIDevice object.
* virPCIDeviceGetDriverPathAndName - construct new strings containing
1) the name of the driver bound to this device.
2) the full path to the sysfs config for that driver.
(This code was lifted from virPCIDeviceUnbindFromStub, and replaced
there with a call to this new function).
Previously stubDriver was always set from a string literal, so it was
okay to use a const char * that wasn't freed when the virPCIDevice was
freed. This will not be the case in the near future, so it is now a
char* that is allocated in virPCIDeviceSetStubDriver() and freed
during virPCIDeviceFree().
This patch introduces the virAccessManagerPtr class as the
interface between virtualization drivers and the access
control drivers. The viraccessperm.h file defines the
various permissions that will be used for each type of object
libvirt manages
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
It's not used anywhere except for the switch in
virStorageBackendCreateQemuImgOpts, where leaving it in causes
a dead code coverity warning and omitting it breaks compilation
because of unhandled enum value.
Introduced by 6298f74.
Detect qcow2 images with version 3 in the image header as
VIR_STORAGE_FILE_QCOW2.
These images have a feature bitfield, with just one feature supported
so far: lazy_refcounts.
The header length changed too, moving the location of the backing
format name.
Call virLogVMessage instead of virLogMessage, since libudev
called us with a va_list object, not a list of arguments.
Honor message priority and strip the trailing newline.
https://bugzilla.redhat.com/show_bug.cgi?id=969152
virProcessGetNamespaces() opens files in /proc/XXX/ns/ which will
later be passed to setns(). We have to make sure that the file
descriptors in the array are in the correct order. In particular
the 'user' namespace must be first otherwise setns() may fail
for other namespaces.
The order has been taken from util-linux's sys-utils/nsenter.c
Also we must ignore EINVAL in setns() which occurs if the
namespace associated with the fd, matches the calling process'
current namespace.
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Such as on FreeBSD. Broken in commit aa2a4cff7.
* src/util/virstoragefile.c (virStorageFileResize): Add missing ';',
mark conditionally unused variables.
Signed-off-by: Eric Blake <eblake@redhat.com>
The document for "vol-resize" says the new capacity will be sparse
unless "--allocate" is specified, however, the "--allocate" flag
is never implemented. This implements the "--allocate" flag for
fs backend's raw type volume, based on posix_fallocate and the
syscall SYS_fallocate.
We can't use GNULIB's fprintf-posix due to licensing
incompatibilities. We do already have a portable
formatting via virAsprintf() which we got from GNULIB
though. We can use to create a virFilePrintf() function.
But really gnulib could just provide a 'fprintf'
module, that depended on just its 'asprintf' module.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
I noticed several unusual spacings in for loops, and decided to
fix them up. See the next commit for the syntax check that found
all of these.
* examples/domsuspend/suspend.c (main): Fix spacing.
* python/libvirt-override.c: Likewise.
* src/conf/interface_conf.c: Likewise.
* src/security/virt-aa-helper.c: Likewise.
* src/util/virconf.c: Likewise.
* src/util/virhook.c: Likewise.
* src/util/virlog.c: Likewise.
* src/util/virsocketaddr.c: Likewise.
* src/util/virsysinfo.c: Likewise.
* src/util/viruuid.c: Likewise.
* src/vbox/vbox_tmpl.c: Likewise.
* src/xen/xen_hypervisor.c: Likewise.
* tools/virsh-domain-monitor.c (vshDomainStateToString): Drop
default case, to let compiler check us.
* tools/virsh-domain.c (vshDomainVcpuStateToString): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
When src is NULL, VIR_STRDUP will return 0 directly.
This patch will set dest to NULL before VIR_STRDUP return.
Example:
[root@yds-pc libvirt]# virsh
Welcome to virsh, the virtualization interactive terminal.
Type: 'help' for help with commands
'quit' to quit
virsh # connect
error: Failed to connect to the hypervisor
error: internal error Unable to parse URI �N�*
Signed-off-by: yangdongsheng <yangds.fnst@cn.fujitsu.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
With previous patch, we accept negative value as length of string to
duplicate. So there is no need to pass strlen(src) in case we want to do
duplicate the whole string.
It may shorten the code a bit as the following pattern:
VIR_STRNDUP(dst, src, cond ? n : strlen(src))
is used on several places among our code. However, we can
move the strlen into virStrndup and thus write just:
VIR_STRNDUP(dst, src, cond ? n : -1)
Currently, the controllers argument to virCgroupDetect acts both as
a result filter and a required controller specification, which is
a bit overloaded. If both functionalities are needed, it would be
better to have them seperated into a filter and a requirement mask.
The only situation where it is used today is to ensure that only
CPU related controllers are used for the VCPU directories. But here
we clearly do not want to enforce the existence of cpu, cpuacct and
specifically not cpuset at the same time.
This commit changes the semantics of controllers to "filter only".
Should a required mask ever be needed, more work will have to be done.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Within whole vircgroup.c we 'return -errno', e.g. 'return -ENOMEM'.
However, in this specific function virCgroupAddTaskStrController
we weren't returning -ENOMEM but -1 despite fact that later in
the function we are returning one of errno values indeed.
In my previous patches I enabled the IFF_MULTI_QUEUE flag every
time the user requested multiqueue TAP device. However, this
works only at runtime. During build time the flag may be
undeclared.
In order to learn libvirt multiqueue several things must be done:
1) The '/dev/net/tun' device needs to be opened multiple times with
IFF_MULTI_QUEUE flag passed to ioctl(fd, TUNSETIFF, &ifr);
2) Similarly, '/dev/vhost-net' must be opened as many times as in 1)
in order to keep 1:1 ratio recommended by qemu and kernel folks.
3) The command line construction code needs to switch from 'fd=X' to
'fds=X:Y:...:Z' and from 'vhostfd=X' to 'vhostfds=X:Y:...:Z'.
4) The monitor handling code needs to learn to pass multiple FDs.
https://bugzilla.redhat.com/show_bug.cgi?id=965169 documents a
problem starting domains when cgroups are enabled; I was able
to reliably reproduce the race about 5% of the time when I added
hooks to domain startup by 3 seconds (as that seemed to be about
the length of time that qemu created and then closed a temporary
thread, probably related to aio handling of initially opening
a disk image). The problem has existed since we introduced
virCgroupMoveTask in commit 9102829 (v0.10.0).
There are some inherent TOCTTOU races when moving tasks between
kernel cgroups, precisely because threads can be created or
completed in the window between when we read a thread id from the
source and when we write to the destination. As the goal of
virCgroupMoveTask is merely to move ALL tasks into the new
cgroup, it is sufficient to iterate until no more threads are
being created in the old group, and ignoring any threads that
die before we can move them.
It would be nicer to start the threads in the right cgroup to
begin with, but by default, all child threads are created in
the same cgroup as their parent, and we don't want vcpu child
threads in the emulator cgroup, so I don't see any good way
of avoiding the move. It would also be nice if the kernel were
to implement something like rename() as a way to atomically move
a group of threads from one cgroup to another, instead of forcing
a window where we have to read and parse the source, then format
and write back into the destination.
* src/util/vircgroup.c (virCgroupAddTaskStrController): Ignore
ESRCH, because a thread ended between read and write attempts.
(virCgroupMoveTask): Loop until all threads have moved.
Signed-off-by: Eric Blake <eblake@redhat.com>
Resolves:https://bugzilla.redhat.com/show_bug.cgi?id=927620
#kill -STOP `pidof qemu-kvm`
#virsh destroy $guest --graceful
error: Failed to destroy domain testVM
error: An error occurred, but the cause is unknown
With --graceful, SIGTERM always is emitted to kill driver
process, but it won't success till burning out waiting time
in case of process being stopped.
But domain destroy without --graceful can work, SIGKILL will
be emitted to the stopped process after 10 secs which always
kills a process even one that is currently stopped.
So report an error after burning out waiting time in this case.
Change bbe97ae968 caused the
QEMU driver to ignore ENOENT errors from cgroups, in order
to cope with missing /proc/cgroups. This is not good though
because many other things can cause ENOENT and should not
be ignored. The callers expect to see ENXIO when cgroups
are not present, so adjust the code to report that errno
when /proc/cgroups is missing
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In an upcoming patch, I need the way to safely transfer a nested
virJSON object out of its parent container for independent use,
even after the parent is freed.
* src/util/virjson.h (virJSONValueObjectRemoveKey): New function.
(_virJSONObject, _virJSONArray): Use correct type.
* src/util/virjson.c (virJSONValueObjectRemoveKey): Implement it.
* src/libvirt_private.syms (virjson.h): Export it.
* tests/jsontest.c (mymain): Test it.
Signed-off-by: Eric Blake <eblake@redhat.com>
network: static route support for <network>
This patch adds the <route> subelement of <network> to define a static
route. the address and prefix (or netmask) attribute identify the
destination network, and the gateway attribute specifies the next hop
address (which must be directly reachable from the containing
<network>) which is to receive the packets destined for
"address/(prefix|netmask)".
These attributes are translated into an "ip route add" command that is
executed when the network is started. The command used is of the
following form:
ip route add <address>/<prefix> via <gateway> \
dev <virbr-bridge> proto static metric <metric>
Tests are done to validate that the input data are correct. For
example, for a static route ip definition, the address must be a
network address and not a host address. Additional checks are added
to ensure that the specified gateway is directly reachable via this
network (i.e. that the gateway IP address is in the same subnet as one
of the IP's defined for the network).
prefix='0' is supported for both family='ipv4' address='0.0.0.0'
netmask='0.0.0.0' or prefix='0', and for family='ipv6' address='::',
prefix=0', although care should be taken to not override a desired
system default route.
Anytime an attempt is made to define a static route which *exactly*
duplicates an existing static route (for example, address=::,
prefix=0, metric=1), the following error message will be sent to
syslog:
RTNETLINK answers: File exists
This can be overridden by decreasing the metric value for the route
that should be preferred, or increasing the metric for the route that
shouldn't be preferred (and is thus in place only in anticipation that
the preferred route may be removed in the future). Caution should be
used when manipulating route metrics, especially for a default route.
Note: The use of the command-line interface should be replaced by
direct use of libnl so that error conditions can be handled better. But,
that is being left as an exercise for another day.
Signed-off-by: Gene Czarcinski <gene@czarc.net>
Signed-off-by: Laine Stump <laine@laine.org>
Currently we report a bogus error message when macvlan
creation fails:
error: Failed to start domain migtest
error: operation failed: Unable to create macvlan device
With this removed, we see the real error:
error: Failed to start domain migtest
error: Unable to get index for interface p31p1: No such device
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Use of the select() system call is inherantly dangerous since
applications will hit a buffer overrun if any FD number exceeds
the size of the select set size (typically 1024). Replace the
two uses of select() with poll() and use cfg.mk to ban any
future use of select().
NB: This changes the phyp driver so that it uses an infinite
timeout, instead of busy-waiting for 1ms at a time.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Found that I was unable to start existing domains after updating
to a kernel with no cgroups support
# zgrep CGROUP /proc/config.gz
# CONFIG_CGROUPS is not set
# virsh start test
error: Failed to start domain test
error: Unable to initialize /machine cgroup: Cannot allocate memory
virCgroupPartitionNeedsEscaping() correctly returns errno (ENOENT) when
attempting to open /proc/cgroups on such a system, but it was being
dropped in virCgroupSetPartitionSuffix().
Change virCgroupSetPartitionSuffix() to propagate errors returned by
its callees. Also check for ENOENT in qemuInitCgroup() when determining
if cgroups support is available.
Add a virFileNBDDeviceAssociate method, which given a filename
will setup a NBD device, using qemu-nbd as the server.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To correctly handle errors from readdir() you must set 'errno'
to zero before invoking it & check its value afterwards to
distinguish error from EOF.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The helper works for default sysfs_prefix, but for user specified
prefix, it doesn't work. (Detected when writing test cases. A later
patch will add the test cases for fc_host).
In case of the caller can pass a "prefix" (or "sysfs_prefix")
without the trailing slash, and Unix-Like system always eats
up the redundant "slash" in the filepath, let's add it explicitly.
Introduced by commit 244ce462e2, which refactored the helper for wwn
reading, however, it forgot to change the old "strndup" and "sizeof(buf)",
"sizeof(buf)" operates on the fixed length array ("buf") in the old code,
but now "buf" is a pointer.
Before the fix:
% virsh nodedev-dumpxml scsi_host5
<device>
<name>scsi_host5</name>
<parent>pci_0000_04_00_1</parent>
<capability type='scsi_host'>
<host>5</host>
<capability type='fc_host'>
<wwnn>2001001b</wwnn>
<wwpn>2101001b</wwpn>
<fabric_wwn>2001000d</fabric_wwn>
</capability>
</capability>
</device>
With the fix:
% virsh nodedev-dumpxml scsi_host5
<device>
<name>scsi_host5</name>
<parent>pci_0000_04_00_1</parent>
<capability type='scsi_host'>
<host>5</host>
<capability type='fc_host'>
<wwnn>0x2001001b32a9da4e</wwnn>
<wwpn>0x2101001b32a9da4e</wwpn>
<fabric_wwn>0x2001000dec9877c1</fabric_wwn>
</capability>
</capability>
</device>
Commit bfe7721d introduced a regression, but only on platforms
like FreeBSD that lack posix_fallocate and where mmap serves as
a nice fallback for safezero.
util/virfile.c: In function 'safezero':
util/virfile.c:837: error: 'PROT_READ' undeclared (first use in this function)
* src/util/virutil.c (includes): Move use of <sys/mman.h>...
* src/util/virfile.c (includes): ...to the file that uses mmap.
Signed-off-by: Eric Blake <eblake@redhat.com>
Apps using libvirt will often have code like
if (virXXXX() < 0) {
virErrorPtr err = virGetLastError();
fprintf(stderr, "Something failed: %s\n",
err && err->message ? err->message :
"unknown error");
return -1;
}
Checking for a NULL error object or message leads to very
verbose code. A virGetLastErrorMessage() helper from libvirt
can simplify this to
if (virXXXX() < 0) {
fprintf(stderr, "Something failed: %s\n",
virGetLastErrorMessage());
return -1;
}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
- provide virNetDevSetMAC() implementation based on SIOCSIFLLADDR
ioctl.
- adjust virNetDevExists() to check for ENXIO error because
FreeBSD throws it when device doesn't exist
Signed-off-by: Eric Blake <eblake@redhat.com>
These all existed before virfile.c was created, and for some reason
weren't moved.
This is mostly straightfoward, although the syntax rule prohibiting
write() had to be changed to have an exception for virfile.c instead
of virutil.c.
This movement pointed out that there is a function called
virBuildPath(), and another almost identical function called
virFileBuildPath(). They really should be a single function, which
I'll take care of as soon as I figure out what the arglist should look
like.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=851411https://bugzilla.redhat.com/show_bug.cgi?id=955500
The first problem was that virFileOpenAs was returning fd (-1) in one
of the error cases rather than ret (-errno), so the caller thought
that the error was EPERM rather than ENOENT.
The second problem was that some log messages in the general purpose
qemuOpenFile() function would always say "Failed to create" even if
the caller hadn't included O_CREAT (i.e. they were trying to open an
existing file).
This fixes virFileOpenAs to jump down to the error return (which
returns ret instead of fd) in the previously mentioned incorrect
failure case of virFileOpenAs(), removes all error logging from
virFileOpenAs() (since the callers report it), and modifies
qemuOpenFile to appropriately use "open" or "create" in its log
messages.
NB: I seriously considered removing logging from all callers of
virFileOpenAs(), but there is at least one case where the caller
doesn't want virFileOpenAs() to log any errors, because it's just
going to try again (qemuOpenFile()). We can't simply make a silent
variation of virFileOpenAs() though, because qemuOpenFile() can't make
the decision about whether or not it wants to retry until after
virFileOpenAs() has already returned an error code.
Likewise, I also considered changing virFileOpenAs() to return -1 with
errno set on return, and may still do that, but only as a separate
patch, as it obscures the intent of this patch too much.
The individual hypervisor drivers were directly referencing
APIs in virnodesuspend.c in their virDriverPtr struct. Separate
these methods, so there is always a wrapper in the hypervisor
driver. This allows the unused virConnectPtr args to be removed
from the virnodesuspend.c file. Again this will ensure that
ACL checks will only be performed on invocations that are
directly associated with public API usage.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the virGetHostname() API has a bogus virConnectPtr
parameter. This is because virtualization drivers directly
reference this API in their virDriverPtr tables, tieing its
API design to the public virConnectGetHostname API design.
This also causes problems for access control checks since
these must only be done for invocations from the public
API, not internal invocation.
Remove the bogus virConnectPtr parameter, and make each
hypervisor driver provide a dedicated function for the
driver API impl. This will allow access control checks
to be easily inserted later.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since PIDs can be reused, polkit prefers to be given
a (PID,start time) pair. If given a PID on its own,
it will attempt to lookup the start time in /proc/pid/stat,
though this is subject to races.
It is safer if the client app resolves the PID start
time itself, because as long as the app has the client
socket open, the client PID won't be reused.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are various methods named "virXXXXSecurityContext",
which are specific to SELinux. Rename them all to
"virXXXXSELinuxContext". They will still raise errors at
runtime if SELinux is not compiled in
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
While reviewing proposed VIR_STRDUP conversions, I've already noticed
several places that do:
if (str && VIR_STRDUP(dest, str) < 0)
which can be simplified by allowing str to be NULL (something that
strdup() doesn't allow). Meanwhile, code that wants to ensure a
non-NULL dest regardless of the source can check for <= 0.
Also, make it part of the VIR_STRDUP contract that macro arguments
are evaluated exactly once.
* src/util/virstring.h (VIR_STRDUP, VIR_STRDUP_QUIET, VIR_STRNDUP)
(VIR_STRNDUP_QUIET): Improve contract.
* src/util/virstring.c (virStrdup, virStrndup): Change return
conventions.
* docs/hacking.html.in: Document this.
* HACKING: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
VIR_APPEND_ELEMENT(array, size, elem) was not safe if the expression
for 'size' had side effects. While no one in the current code base
was trying to pass side effects, we might as well be robust and
explicitly document our intentions.
* src/util/viralloc.c (virInsertElementsN): Add special case.
* src/util/viralloc.h (VIR_APPEND_ELEMENT): Use it.
(VIR_ALLOC, VIR_ALLOC_N, VIR_REALLOC_N, VIR_EXPAND_N)
(VIR_RESIZE_N, VIR_SHRINK_N, VIR_INSERT_ELEMENT)
(VIR_DELETE_ELEMENT, VIR_ALLOC_VAR, VIR_FREE): Document
which macros are safe in the presence of side effects.
* docs/hacking.html.in: Document this.
* HACKING: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
The code adaptation is not done right now, but in subsequent patches.
Hence I am not implementing syntax-check rule as it would break
compilation. Developers are strongly advised to use these new macros.
They are similar to VIR_ALLOC() logic: VIR_STRDUP(dst, src) returns zero
on success, -1 otherwise. In case you don't want to report OOM error,
use the _QUIET variant of a macro.
POSIX says pthread_t is opaque. We can't guarantee if it is scaler
or a pointer, nor what size it is; and BSD differs from Linux.
We've also had reports of gcc complaining on attempts to cast it,
if we use a cast to the wrong type (for example, pointers have to be
cast to void* or intptr_t before being narrowed; while casting a
function return of scalar pthread_t to void* triggers a different
warning).
Give up on casts, and use unions to get at decent bits instead. And
rather than futz around with figuring which 32 bits of a potentially
64-bit pointer are most likely to be unique, convert the rest of
the code base to use 64-bit values when using a debug id.
Based on a report by Guido Günther against kFreeBSD, but with a
fix that doesn't regress commit 4d970fd29 for FreeBSD.
* src/util/virthreadpthread.c (virThreadSelfID, virThreadID): Use
union to get at a decent bit representation of thread_t bits.
* src/util/virthread.h (virThreadSelfID, virThreadID): Alter
signature.
* src/util/virthreadwin32.c (virThreadSelfID, virThreadID):
Likewise.
* src/qemu/qemu_domain.h (qemuDomainJobObj): Alter type of owner.
* src/qemu/qemu_domain.c (qemuDomainObjTransferJob)
(qemuDomainObjSetJobPhase, qemuDomainObjReleaseAsyncJob)
(qemuDomainObjBeginNestedJob, qemuDomainObjBeginJobInternal): Fix
clients.
* src/util/virlog.c (virLogFormatString): Likewise.
* src/util/vireventpoll.c (virEventPollInterruptLocked):
Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 776d49f4 added a static function that is only called
conditionally; leading to this compile error on mingw:
CC libvirt_util_la-virprocess.lo
../../src/util/virprocess.c:624:26: error: 'struct rlimit' declared inside parameter list [-Werror]
../../src/util/virprocess.c:624:26: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
../../src/util/virprocess.c:622:1: error: 'virProcessPrLimit' defined but not used [-Werror=unused-function]
* src/util/virprocess.c (virProcessPrLimit): Only declare
virProcessPrLimit when used.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 7c9a2d88 cleaned up too many headers; FreeBSD builds
failed due to:
util/virutil.c:556: warning: implicit declaration of function 'canonicalize_file_name'
(Not sure which Linux header leaked this declaration, but gnulib
only guarantees it in stdlib.h)
libvirt.c:956: warning: implicit declaration of function 'virGetUserConfigDirectory'
(Here, a build on Linux was picking up virutil.h indirectly via
one of the conditional driver headers, where that driver was not
being built on my FreeBSD setup)
* src/util/virutil.c (includes): Need <stdlib.h> for
canonicalize_file_name.
* src/libvirt.c (includes): Use "virutil.h" unconditionally,
rather than relying on conditional indirect inclusion.
Signed-off-by: Eric Blake <eblake@redhat.com>
The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
virPCIDeviceReattach and virPCIDeviceUnbindFromStub (called by
virPCIDeviceReattach) had previously required the name of the stub
driver as input. This is unnecessary, because the name of the driver
the device is currently bound to can be found by looking at the link:
/sys/bus/pci/dddd:bb:ss.ff/driver
Instead of requiring that the name of the expected stub driver name
and only unbinding if that one name is matched, we no longer take a
driver name in the arglist for either of these
functions. virPCIDeviceUnbindFromStub just compares the name of the
currently bound driver to a list of "well known" stubs (right now
contains "pci-stub" and "vfio-pci" for qemu, and "pciback" for xen),
and only performs the unbind if it's one of those devices.
This allows virsh nodedevice-reattach to work properly across a
libvirtd restart, and fixes a couple of cases where we were
erroneously still hard-coding "pci-stub" as the drive name.
For some unknown reason, virPCIDeviceReattach had been calling
modprobe on the stub driver prior to unbinding the device. This was
problematic because we no longer know the name of the stub driver in
that function. However, it is pointless to probe for the stub driver
at that time anyway - because the device is bound to the stub driver,
we are guaranteed that it is already loaded, and so that call to
modprobe has been removed.
On cygwin, compilation failed because SIOCSIFHWADDR is undefined.
* src/util/virnetdev.c (virNetDevSetMAC): Cygwin can query but not
set mac address.
Signed-off-by: Eric Blake <eblake@redhat.com>
FreeBSD (and maybe other BSDs) have different member
names in struct ifreq when compared to Linux, such as:
- uses ifr_data instead of ifr_newname for setting
interface names
- uses ifr_index instead of ifr_ifindex for interface
index
Also, add a check for SIOCGIFHWADDR for virNetDevValidateConfig().
Use AF_LOCAL if AF_PACKET is not available.
Signed-off-by: Eric Blake <eblake@redhat.com>
These fixes solve a compilation failure on FreeBSD:
util/virnetdevtap.c: In function 'virNetDevTapGetName':
util/virnetdevtap.c:56: warning: unused parameter 'tapfd' [-Wunused-parameter]
util/virnetdevtap.c:56: warning: unused parameter 'ifname' [-Wunused-parameter]
* src/util/virnetdevtap.c (virNetDevTapGetName): Add attributes
when TUNGETIFF is not present.
Signed-off-by: Eric Blake <eblake@redhat.com>
This will be used on a tap file descriptor returned by the bridge helper
to populate the <target> element, because the helper does not provide
the interface name.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds two sets of functions:
1) lower level virProcessSet*() functions that will immediately set
the RLIMIT_MEMLOCK. RLIMIT_NPROC, or RLIMIT_NOFILE of either the
current process (using setrlimit()) or any other process (using
prlimit()). "current process" is indicated by passing a 0 for pid.
2) functions for virCommand* that will setup a virCommand object to
set those limits at a later time just after it has forked a new
process, but before it execs the new program.
configure.ac has prlimit and setrlimit added to the list of functions
to check for, and the low level functions log an "unsupported" error)
on platforms that don't support those functions.
If a user cgroup name begins with "cgroup.", "_" or with any of
the controllers from /proc/cgroups followed by a dot, then they
need to be prefixed with a single underscore. eg if there is
an object "cpu.service", then this would end up as "_cpu.service"
in the cgroup filesystem tree, however, "waldo.service" would
stay "waldo.service", at least as long as nobody comes up with
a cgroup controller called "waldo".
Since we require a '.XXXX' suffix on all partitions, there is
no scope for clashing with the kernel 'tasks' and 'release_agent'
files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the partition named passed in the XML does not already have
a suffix, ensure it gets a '.partition' added to each component.
The exceptions are /machine, /user and /system which do not need
to have a suffix, since they are fixed partitions at the top
level.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Recently we changed to create VM cgroups with the naming pattern
$VMNAME.$DRIVER.libvirt. Following discussions with the systemd
community it was decided that only having a single '.' in the
names is preferrable. So this changes the naming scheme to be
$VMNAME.libvirt-$DRIVER. eg for LXC 'mycontainer.libvirt-lxc' or
for KVM 'myvm.libvirt-qemu'.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This can be set when the virPCIDevice is created and placed on a list,
then used later when traversing the list to determine which stub
driver to bind/unbind for managed devices.
The existing Detach and Attach functions' signatures haven't been
changed (they still accept a stub driver name in the arg list), but if
the arg list has NULL for stub driver and one is available in the
device's object, that will be used. (we may later deprecate and remove
the arg from those functions).
POSIX says that both basename() and dirname() may return static
storage (aka they need not be thread-safe); and that they may but
not must modify their input argument. Furthermore, <libgen.h>
is not available on all platforms. For these reasons, you should
never use these functions in a multi-threaded library.
Gnulib instead recommends a way to avoid the portability nightmare:
gnulib's "dirname.h" provides useful thread-safe counterparts. The
obvious dir_name() and base_name() are GPL (because they malloc(),
but call exit() on failure) so we can't use them; but the LGPL
variants mdir_name() (malloc's or returns NULL) and last_component
(always points into the incoming string without modifying it,
differing from basename semantics only on corner cases like the
empty string that we shouldn't be hitting in the first place) are
already in use in libvirt. This finishes the swap over to the safe
functions.
* cfg.mk (sc_prohibit_libgen): New rule.
* src/util/vircgroup.c: Fix offenders.
* src/parallels/parallels_storage.c (parallelsPoolAddByDomain):
Likewise.
* src/parallels/parallels_network.c (parallelsGetBridgedNetInfo):
Likewise.
* src/node_device/node_device_udev.c (udevProcessSCSIHost)
(udevProcessSCSIDevice): Likewise.
* src/storage/storage_backend_disk.c
(virStorageBackendDiskDeleteVol): Likewise.
* src/util/virpci.c (virPCIGetDeviceAddressFromSysfsLink):
Likewise.
* src/util/virstoragefile.h (_virStorageFileMetadata): Avoid false
positive.
Signed-off-by: Eric Blake <eblake@redhat.com>
Refactoring done in 19c6ad9ac7 didn't
correctly take into account the order cgroup limit modification needs to
be done in. This resulted into errors when decreasing the limits.
The operations need to take place in this order:
decrease hard limit
change swap hard limit
or
change swap hard limit
increase hard limit
This patch also fixes the check if the hard_limit is less than
swap_hard_limit to print better error messages. For this purpose I
introduced a helper function virCompareLimitUlong to compare limit
values where value of 0 is equal to unlimited. Additionally the check is
now applied also when the user does not provide all of the tunables
through the API and in that case the currently set values are used.
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=950478
Create the utility function virSocketAddrGetIpPrefix() to
determine the prefix for this network. The code in this
function was adapted from virNetworkIpDefPrefix().
Update virNetworkIpDefPrefix() in src/conf/network_conf.c
to use the new utility function.
Signed-off-by: Gene Czarcinski <gene@czarc.net>
http://www.uhv.edu/ac/newsletters/writing/grammartip2009.07.01.htm
(and several other sites) give hints that 'onto' is best used if
you can also add 'up' just before it and still make sense. In many
cases in the code base, we really want the two-word form, or even
a simplification to just 'on' or 'to'.
* docs/hacking.html.in: Use correct 'on to'.
* python/libvirt-override.c: Likewise.
* src/lxc/lxc_controller.c: Likewise.
* src/util/virpci.c: Likewise.
* daemon/THREADS.txt: Use simpler 'on'.
* docs/formatdomain.html.in: Better usage.
* docs/internals/rpc.html.in: Likewise.
* src/conf/domain_event.c: Likewise.
* src/rpc/virnetclient.c: Likewise.
* tests/qemumonitortestutils.c: Likewise.
* HACKING: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
When running unprivileged, virSetUIDGIDWithCaps will fail because it
tries to add the requested capabilities to the permitted and effective
sets.
Detect this case, and invoke the child with cleared permitted and
effective sets. If it is a setuid program, it will get them.
Some care is needed also because you cannot drop capabilities from the
bounding set without CAP_SETPCAP. Because of that, ignore errors from
setting the bounding set.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The need_prctl variable is not really needed. If it is false,
capng_apply will be called twice with the same set, causing
a little extra work but no problem. This keeps the code a bit
simpler.
It is also clearer to invoke capng_apply(CAPNG_SELECT_BOUNDS)
separately, to make sure it is done while we have CAP_SETPCAP.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The recent qemu requires "0x" prefix for the disk wwn, this patch
changes virValidateWWN to allow the prefix, and prepend "0x" if
it's not specified. E.g.
qemu-kvm: -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,\
drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,wwn=6000c60016ea71ad:
Property 'scsi-hd.wwn' doesn't take value '6000c60016ea71ad'
Though it's a qemu regression, but it's nice to allow the prefix,
and doesn't hurt for us to always output "0x".
Detected by a simple Shell script:
for i in $(git ls-files -- '*.[ch]'); do
awk 'BEGIN {
fail=0
}
/# *include.*\.h/{
match($0, /["<][^">]*[">]/)
arr[substr($0, RSTART+1, RLENGTH-2)]++
}
END {
for (key in arr) {
if (arr[key] > 1) {
fail=1
printf("%d %s\n", arr[key], key)
}
}
if (fail == 1)
exit 1
}' $i
if test $? != 0; then
echo "Duplicate header(s) in $i"
fi
done;
A later patch will add the syntax-check to avoid duplicate
headers.
Fix the error
util/vircgroup.c: In function 'virCgroupNewDomainPartition':
util/vircgroup.c:1299:11: error: declaration of 'dirname' shadows a global declaration [-Werror=shadow]
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Add a virCgroupIsolateMount method which looks at where the
current process is place in the cgroups (eg /system/demo.lxc.libvirt)
and then remounts the cgroups such that this sub-directory
becomes the root directory from the current process' POV.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If a cgroup controller is co-mounted with another, eg
/sys/fs/cgroup/cpu,cpuacct
Then it is a requirement that there exist symlinks at
/sys/fs/cgroup/cpu
/sys/fs/cgroup/cpuacct
pointing to the real mount point. Add support to virCgroupPtr
to detect and track these symlinks
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupNewDriver method had a 'bool privileged' param.
If a false value was ever passed in, it would simply not
work, since non-root users don't have any privileges to create
new cgroups. Just delete this broken code entirely and make
the QEMU driver skip cgroup setup in non-privileged mode
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A resource partition is an absolute cgroup path, ignoring the
current process placement. Expose a virCgroupNewPartition API
for constructing such cgroups
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently if virCgroupMakeGroup fails, we can get in a situation
where some controllers have been setup, but others not. Ensure
we call virCgroupRemove to remove what we've done upon failure
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the virCgroupPtr struct contains 3 pieces of
information
- path - path of the cgroup, relative to current process'
cgroup placement
- placement - current process' placement in each controller
- mounts - mount point of each controller
When reading/writing cgroup settings, the path & placement
strings are combined to form the file path. This approach
only works if we assume all cgroups will be relative to
the current process' cgroup placement.
To allow support for managing cgroups at any place in the
heirarchy a change is needed. The 'placement' data should
reflect the absolute path to the cgroup, and the 'path'
value should no longer be used to form the paths to the
cgroup attribute files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Rename all the virCgroupForXXX methods to use the form
virCgroupNewXXX since they are all constructors. Also
make sure the output parameter is the last one in the
list, and annotate all pointers as non-null. Fix up
all callers, and make sure they use true/false not 0/1
for the boolean parameters
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The definition of structs for cgroups are kept in vircgroup.c since
they are intended to be private from users of the API. To enable
effective testing, however, they need to be accessible. To address
the latter issue, without compronmising the former, this introduces
a new vircgrouppriv.h file to hold the struct definitions.
To prevent other files including this private header, it requires
that __VIR_CGROUP_ALLOW_INCLUDE_PRIV_H__ be defined before inclusion
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupForDriver method recently gained an 'int controllers'
parameter, but the stub impl did not
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Though they are the same thing, mixed use of them is uncomfortable.
"unsigned" is used a lot in old codes, this just tries to change the
ones in utils.
Commit 9a3ff01d7f (which was ACKed at
the end of January, but for some reason didn't get pushed until during
the 1.0.4 freeze) fixed the logic in virPCIGetVirtualFunctions().
Unfortunately, a typo in the fix (replacing VIR_REALLOC_N with
VIR_ALLOC_N during code movement) caused not only a memory leak, but
also resulted in most of the elements of the result array being
replaced with NULL. virNetDevGetVirtualFunctions() assumed (and I think
rightly so) that virPCIGetVirtualFunctions() wouldn't return any NULL
elements in the array, so it ended up segfaulting.
This was found when attempting to use a virtual network with an
auto-created pool of SRIOV VFs, e.g.:
<forward mode='hostdev' managed='yes'>
<pf dev='eth4'/>
</forward>
(the pool of PCI addresses is discovered by calling
virNetDevGetVirtualFunctions() on the PF dev).
Even though http://libvirt.org/formatdomain.html#elementsMetadata
states that it requires RFC4122 compliance UUIDs that are generated
by virUUIDGenerate() are not. Following patch modifies generated
UUIDs to conform to rules described in RFC.
Signed-off-by: Milos Vyletel <milos.vyletel@sde.cz>
The virCgroupMounted method is badly named, since a controller can be
mounted, but disabled in the current object. Rename the method to be
virCgroupHasController. Also make it tolerant to a NULL virCgroupPtr
and out-of-range controller index, to avoid duplication of these
checks in all callers
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This finds the parent for vHBA by iterating over all the HBA
which supports vport_ops capability on the host, and return
the first one which is online, not saturated (vports in use
is less than max_vports).
The helper iterates over sysfs, to find out the matched scsi host
name by comparing the wwnn,wwpn pair. It will be used by checkPool
and refreshPool of storage scsi backend. New helper getAdapterName
is introduced in storage_backend_scsi.c, which uses the new util
helper virGetFCHostNameByWWN to get the fc_host adapter name.
The current way virObject instances are allocated using
VIR_ALLOC_N causes alignment warnings
util/virobject.c: In function 'virObjectNew':
util/virobject.c:195:11: error: cast increases required alignment of target type [-Werror=cast-align]
Changing to use VIR_ALLOC_VAR will avoid the need todo
the casts entirely.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virNetlinkCommand() method takes an 'unsigned char **'
parameter to be filled with the received netlink message.
The callers then immediately cast this to 'struct nlmsghdr',
triggering (bogus) warnings about increasing alignment
requirements
util/virnetdev.c: In function 'virNetDevLinkDump':
util/virnetdev.c:1300:12: warning: cast increases required alignment of target type [-Wcast-align]
resp = (struct nlmsghdr *)*recvbuf;
^
util/virnetdev.c: In function 'virNetDevSetVfConfig':
util/virnetdev.c:1429:12: warning: cast increases required alignment of target type [-Wcast-align]
resp = (struct nlmsghdr *)recvbuf;
Since all callers cast to 'struct nlmsghdr' we can avoid
the warning problem entirely by simply changing the
signature of virNetlinkCommand to return a 'struct nlmsghdr **'
instead of 'unsigned char **'. The way we do the cast inside
virNetlinkCommand does not have any alignment issues.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Playing games with field offsets in a struct causes all sorts
of alignment warnings on ARM platforms
util/virkeycode.c: In function '__virKeycodeValueFromString':
util/virkeycode.c:26:7: warning: cast increases required alignment of target type [-Wcast-align]
(*(typeof(field_type) *)((char *)(object) + field_offset))
^
util/virkeycode.c:91:28: note: in expansion of macro 'getfield'
const char *name = getfield(virKeycodes + i, const char *, name_offset);
^
util/virkeycode.c:26:7: warning: cast increases required alignment of target type [-Wcast-align]
(*(typeof(field_type) *)((char *)(object) + field_offset))
^
util/virkeycode.c:94:20: note: in expansion of macro 'getfield'
return getfield(virKeycodes + i, unsigned short, code_offset);
^
util/virkeycode.c: In function '__virKeycodeValueTranslate':
util/virkeycode.c:26:7: warning: cast increases required alignment of target type [-Wcast-align]
(*(typeof(field_type) *)((char *)(object) + field_offset))
^
util/virkeycode.c:127:13: note: in expansion of macro 'getfield'
if (getfield(virKeycodes + i, unsigned short, from_offset) == key_value)
^
util/virkeycode.c:26:7: warning: cast increases required alignment of target type [-Wcast-align]
(*(typeof(field_type) *)((char *)(object) + field_offset))
^
util/virkeycode.c:128:20: note: in expansion of macro 'getfield'
return getfield(virKeycodes + i, unsigned short, to_offset);
There is no compelling reason to use a struct for the keycode
tables. It can easily just use an array of arrays instead,
avoiding all alignment problems
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently when getting an instance of virCgroupPtr we will
create the path in all cgroup controllers. Only at the virt
driver layer are we attempting to filter controllers. This
is bad because the mere act of creating the dirs in the
controllers can have a functional impact on the kernel,
particularly for performance.
Update the virCgroupForDriver() method to accept a bitmask
of controllers to use. Only create dirs in the controllers
that are requested. When creating cgroups for domains,
respect the active controller list from the parent cgroup
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupGetAppRoot is not clear in its meaning. Change
to virCgroupForSelf to highlight that this returns the
cgroup config for the caller's process
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The Raspberry Pi runs the armv6l architecture and apparently
people are trying to run libvirt LXC on it. So we should allow
that as a valid arch
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Implement the bare minimal sysinfo for ARM platforms by
reading the CPU models from /proc/cpuinfo
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Format the address using the helper instead of having similar code in
multiple places.
This patch also fixes leak of the MAC address string in
ebtablesRemoveForwardAllowIn() and ebtablesAddForwardAllowIn() in
src/util/virebtables.c
The domain XML generator creates the mac addres strings with lowercase
strings with a separate piece of code. This patch changes the formating
helper to do the same stuff to allow using it to normalize a string
provided by the user. After this change some of the tests that are
outputing the mac address will need to be changed.
iptables-1.4.18 removed the long deprecated "state" match.
Use "conntrack" instead in forwarding rules.
Fixes openSUSE bug https://bugzilla.novell.com/811251#811251.
When we write a log message into a log, we separate thread ID from
timestamp using ": ". However, when storing the message into the ring
buffer, we omitted the separator, e.g.:
2013-02-27 11:49:11.852+00003745: ...
virPCIGetVirtualFunctions returns 0 even if there is no "virtfn"
entry under the device sysfs path.
And virPCIGetVirtualFunctions returns -1 when it fails to get
the PCI config space of one VF, however, with keeping the
the VFs already detected.
That's why udevProcessPCI and gather_pci_cap use logic like:
if (!virPCIGetVirtualFunctions(syspath,
&data->pci_dev.virtual_functions,
&data->pci_dev.num_virtual_functions) ||
data->pci_dev.num_virtual_functions > 0)
data->pci_dev.flags |= VIR_NODE_DEV_CAP_FLAG_PCI_VIRTUAL_FUNCTION;
to tag the PCI device with "virtual_function" cap.
However, this results in a VF will aslo get "virtual_function" cap.
This patch fixes it by:
* Ignoring the VF which has failure of getting PCI config space
(given that the successfully detected VFs are kept , it makes
sense to not give up on the failure of one VF too) with a warning,
so virPCIGetVirtualFunctions will not return -1 except out of memory.
* Free the allocated *virtual_functions when out of memory
And thus the logic can be changed to:
/* Out of memory */
int ret = virPCIGetVirtualFunctions(syspath,
&data->pci_dev.virtual_functions,
&data->pci_dev.num_virtual_functions);
if (ret < 0 )
goto out;
if (data->pci_dev.num_virtual_functions > 0)
data->pci_dev.flags |= VIR_NODE_DEV_CAP_FLAG_PCI_VIRTUAL_FUNCTION;
This abstracts nodeDeviceVportCreateDelete as an util function
virManageVport, which can be further used by later storage patches
(to support persistent vHBA, I don't want to create the vHBA
using the public API, which is not good).
This adds two util functions (virIsCapableFCHost and virIsCapableVport),
and rename helper check_fc_host_linux as detect_scsi_host_caps,
check_capable_vport_linux is removed, as it's abstracted to the util
function virIsCapableVport. detect_scsi_host_caps nows detect both
the fc_host and vport_ops capabilities. "stat(2)" is replaced with
"access(2)" for saving.
* src/util/virutil.h:
- Declare virIsCapableFCHost and virIsCapableVport
* src/util/virutil.c:
- Implement virIsCapableFCHost and virIsCapableVport
* src/node_device/node_device_linux_sysfs.c:
- Remove check_capable_vport_linux
- Rename check_fc_host_linux as detect_scsi_host_caps, and refactor
it a bit to detect both fc_host and vport_os capabilities
* src/node_device/node_device_driver.h:
- Change/remove the related declarations
* src/node_device/node_device_udev.c: (Use detect_scsi_host_caps)
* src/node_device/node_device_hal.c: (Likewise)
* src/node_device/node_device_driver.c (Likewise)
"open_wwn_file" in node_device_linux_sysfs.c is redundant, on one
hand it duplicates work of virFileReadAll, on the other hand, it's
waste to use a function for it, as there is no other users of it.
So I don't see why the file opening work cannot be done in
"read_wwn_linux".
"read_wwn_linux" can be abstracted as an util function. As what all
it does is to read the sysfs entry.
So this patch removes "open_wwn_file", and abstract "read_wwn_linux"
as an util function "virReadFCHost" (a more general name, because
after changes, it can read each of the fc_host entry now).
* src/util/virutil.h: (Declare virReadFCHost)
* src/util/virutil.c: (Implement virReadFCHost)
* src/node_device/node_device_linux_sysfs.c: (Remove open_wwn_file,
and read_wwn_linux)
src/node_device/node_device_driver.h: (Remove the declaration of
read_wwn_linux, and the related macros)
src/libvirt_private.syms: (Export virReadFCHost)
Some code mistakenly called virIdentityOnceInit directly
instead of virIdentityInitialize(). This meant that one-time
initializer was run many times with predictably bad results.
The virNetSocket & virIdentity classes accidentally got some
conditionals using HAVE_SELINUX instead of WITH_SELINUX.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Intend to reduce the redundant code,use virNumaSetupMemoryPolicy
to replace virLXCControllerSetupNUMAPolicy and
qemuProcessInitNumaMemoryPolicy.
This patch also moves the numa related codes to the
file virnuma.c and virnuma.h
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
qemuGetNumadAdvice will be used by LXC driver, rename
it to virNumaGetAutoPlacementAdvice and move it to virnuma.c
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
If no user identity is available, some operations may wish to
use the system identity. ie the identity of the current process
itself. Add an API to get such an identity.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To allow any internal API to get the current identity, add APIs
to associate a virIdentityPtr with the current thread, via a
thread local
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce a local object virIdentity for managing security
attributes used to form a client application's identity.
Instances of this object are intended to be used as if they
were immutable, once created & populated with attributes
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We've already scrubbed for comparisons of 'uid_t == -1' (which fail
on platforms where uid_t is a u16), but another one snuck in.
* src/util/virutil.c (virSetUIDGIDWithCaps): Correct uid comparison.
* cfg.mk (sc_prohibit_risky_id_promotion): New rule.
My commit 7a2e845a86 (and its
prerequisites) managed to effectively ignore the
clear_emulator_capabilities setting in qemu.conf (visible in the code
as the VIR_EXEC_CLEAR_CAPS flag when qemu is being exec'ed), with the
result that the capabilities are always cleared regardless of the
qemu.conf setting. This patch fixes it by passing the flag through to
virSetUIDGIDWithCaps(), which uses it to decide whether or not to
clear existing capabilities before adding in those that were
requested.
Note that the existing capabilities are *always* cleared if the new
process is going to run as non-root, since the whole point of running
non-root is to have the capabilities removed (it's still possible to
maintain individual capabilities as needed using the capBits argument
though).
In debug mode, the bug failed to start vm
error: Failed to start domain rhel5u9
error: internal error Out of space while reading console log output:
...
Add a virThreadCancel function. This functional is inherently
dangerous and not something we want to use in general, but
integration with SELinux requires that we provide this stub.
We leave out any Win32 impl to discourage further use and
because obviously SELinux isn't enabled on Win32
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When reading log output from QEMU/LXC we need to skip over any
libvirt log messages. Currently the QEMU driver checks for a
fixed string, but this is better done with a regex. Add a method
virLogProbablyLogMessage to do a regex check
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virNetDevSetupControlFull function was protected by a
conditional on SIOCBRADDBR, which is bogus since it does
not use that symbol. Update the conditionals around all
callers to do stricter checks to ensure we always build
succesfully
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The RHEL4 vintage header files do not define GET_VLAN_VID_CMD.
Conditionally define it in our source, since the kernel can
raise a runtime error if it isn't supported
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The loop.h on RHEL4 is broken and cannot be imported. We already
detect this in configure as a side-effect of looking for whether
LO_FLAGS_AUTOCLEAR is available. We protected the impl with
HAVE_DECL_LO_FLAGS_AUTOCLEAR, but not the header import
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 0df3e89 only touched the header, but the .c file had the
same shadowing potential.
* src/util/viralloc.c (virDeleteElementsN): s/remove/toremove/ to
match the header.
A value which is equal to a integer maximum such as LLONG_MAX is
a valid integer value.
The patch fix the following error:
1, virsh memtune vm --swap-hard-limit -1
2, virsh start vm
In debug mode, it shows error like:
virScaleInteger:1813 : numerical overflow:\
value too large: 9007199254740991KiB
BZ:https://bugzilla.redhat.com/show_bug.cgi?id=912021
Without error handler set, virDefaultErrorFunc will be called, the
error message is prefixed with "libvir:". It become a little better
by using prefix "libvirt:" when working with upper application.
For example:
1, stop libvirtd daemon
2, run virt-top.
libvir: XML-RPC error : Failed to connect \
socket to '/var/run/libvirt/libvirt-sock-ro': \
No such file or directory
libvirt: VIR_ERR_SYSTEM_ERROR: VIR_FROM_RPC: \
Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': \
No such file or directory
Currently, after we removed the qemu driver lock, it may happen
that two or more threads will start up a machine with macvlan and
race over virNetDevMacVLanCreateWithVPortProfile(). However,
there's a racy section in which we are generating a sequence of
possible device names and detecting if they exits. If we found
one which doesn't we try to create a device with that name.
However, the other thread is doing just the same. Assume it will
succeed and we must therefore fail. If this happens more than 5
times (which in massive parallel startup surely will) we return
-1 without any error reported. This patch is a simple hack to
both of these problems. It introduces a mutex, so only one thread
will enter the section, and if it runs out of possibilities,
error is reported. Moreover, the number of retries is raised to 20.
The code for putting the emulator threads in a separate cgroup
would spam the logs with warnings
2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 3
2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 4
2013-02-27 16:08:26.732+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 6
This is because it has only created child cgroups for 3 of the
controllers, but was trying to move the processes from all the
controllers. The fix is to only try to move threads in the
controllers we actually created. Also remove the warning and
make it return a hard error to avoid such lazy callers in the
future.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
uid_t and gid_t are opaque types, ranging from s32 to u32 to u64.
Explicitly cast the magic -1 to the appropriate type.
Signed-off-by: Philipp Hahn <hahn@univention.de>
The uid_t|gid_t values are explicitly casted to "unsigned long", but the
printf() still used "%d", which is for signed values.
Change the format to "%u".
Signed-off-by: Philipp Hahn <hahn@univention.de>
Originally, only a host name was used to associate a
DHCPv6 request with a specific IPv6 address. Further testing
demonstrates that this is an unreliable method and, instead,
a client-id or DUID needs to be used. According to DHCPv6
standards, this id can be a duid-LLT, duid-LL, or duid-UUID
even though dnsmasq will accept almost any text string.
Although validity checking of a specified string makes sure it is
hexadecimal notation with bytes separated by colons, there is no
rigorous check to make sure it meets the standard.
Documentation and schemas have been updated.
Signed-off-by: Gene Czarcinski <gene@czarc.net>
Signed-off-by: Laine Stump <laine@laine.org>
We pass over the address/port start/end values many times so we put
them in structs.
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Laine Stump <laine@laine.org>
Let users set the port range to be used for forward mode NAT:
...
<forward mode='nat'>
<nat>
<port start='1024' end='65535'/>
</nat>
</forward>
...
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Laine Stump <laine@laine.org>
Support setting which public ip to use for NAT via attribute
address in subelement <nat> in <forward>:
...
<forward mode='nat'>
<address start='1.2.3.4' end='1.2.3.10'/>
</forward>
...
This will construct an iptables line using:
'-j SNAT --to-source <start>-<end>'
instead of:
'-j MASQUERADE'
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Laine Stump <laine@laine.org>
The function does not report any errors so there should be no need too
reset an existing error first. Moreover, virTypedParamsFree is mostly
called in cleanup phase where it has the potential to reset any useful
reported earlier.
If you have a qcow2 file /path1/to/file pointed to by symlink
/path2/symlink, and pass qemu /path2/symlink, then qemu treats
a relative backing file in the qcow2 metadata as being relative
to /path2, not /path1/to. Yes, this means that it is possible
to create a qcow2 file where the choice of WHICH directory and
symlink you access its contents from will then determine WHICH
backing file (if any) you actually find; the results can be
rather screwy, but we have to match what qemu does.
Libvirt and qemu default to creating absolute backing file
names, so most users don't hit this. But at least VDSM uses
symlinks and relative backing names alongside the
--reuse-external flags to libvirt snapshot operations, with the
result that libvirt was failing to follow the intended chain of
backing files, and then backing files were not granted the
necessary sVirt permissions to be opened by qemu.
See https://bugzilla.redhat.com/show_bug.cgi?id=903248 for
more gory details. This fixes a regression introduced in
commit 8250783.
I tested this patch by creating the following chain:
ls /home/eblake/Downloads/Fedora.iso # raw file for base
cd /var/lib/libvirt/images
qemu-img create -f qcow2 \
-obacking_file=/home/eblake/Downloads/Fedora.iso,backing_fmt=raw one
mkdir sub
cd sub
ln -s ../one onelink
qemu-img create -f qcow2 \
-obacking_file=../sub/onelink,backing_fmt=qcow2 two
mv two ..
ln -s ../two twolink
qemu-img create -f qcow2 \
-obacking_file=../sub/twolink,backing_fmt=qcow2 three
mv three ..
ln -s ../three threelink
then pointing my domain at /var/lib/libvirt/images/sub/threelink.
Prior to this patch, I got complaints about missing backing
files; afterwards, I was able to verify that the backing chain
(and hence DAC and SELinux relabels) of the entire chain worked.
* src/util/virstoragefile.h (_virStorageFileMetadata): Add
directory member.
* src/util/virstoragefile.c (absolutePathFromBaseFile): Drop,
replaced by...
(virFindBackingFile): ...better function.
(virStorageFileGetMetadataInternal): Add an argument.
(virStorageFileGetMetadataFromFD, virStorageFileChainLookup)
(virStorageFileGetMetadata): Update callers.
Prior to this patch, we had the callchains:
external users
\-> virStorageFileGetMetadataFromFD
\-> virStorageFileGetMetadataFromBuf
virStorageFileGetMetadataRecurse
\-> virStorageFileGetMetadataFromFD
\-> virStorageFileGetMetadataFromBuf
However, a future patch wants to add an additional parameter to
the bottom of the chain, for use by virStorageFileGetMetadataRecurse,
without affecting existing external callers. Since there is only a
single caller of the internal function, we can repurpose it to fit
our needs, with this patch giving us:
external users
\-> virStorageFileGetMetadataFromFD
\-> virStorageFileGetMetadataInternal
virStorageFileGetMetadataRecurse /
\-> virStorageFileGetMetadataInternal
* src/util/virstoragefile.c (virStorageFileGetMetadataFromFD):
Move most of the guts...
(virStorageFileGetMetadataFromBuf): ...here, and rename...
(virStorageFileGetMetadataInternal): ...to this.
(virStorageFileGetMetadataRecurse): Use internal helper.
virStorageFileGetMetadataFromFD is the only caller of
virStorageFileGetMetadataFromBuf; and it doesn't care about the
difference between a return of 0 (total success) or 1
(metadata was inconsistent, but pointer was populated as best
as possible); only about a return of -1 (could not read metadata
or out of memory). Changing the return type, and normalizing
the variable names used, will make merging the functions easier
in the next commit.
* src/util/virstoragefile.c (virStorageFileGetMetadataFromBuf):
Change return value, and rename some variables.
(virStorageFileGetMetadataFromFD): Rename some variables.
No semantic change; done so the next patch doesn't need a forward
declaration of a static function.
* src/util/virstoragefile.c (virStorageFileProbeFormatFromBuf):
Hoist earlier.
CC libvirt_util_la-vircommand.lo
../../src/util/vircommand.c:2358:1: error: 'virCommandHandshakeChild' defined but not used [-Werror=unused-function]
The function is only implemented inside #ifndef WIN32.
* src/util/vircommand.c (virCommandHandshakeChild): Hoist earlier,
so that win32 build doesn't hit an unused forward declaration.
virCommand was previously calling virSetUIDGID() to change the uid and
gid of the child process, then separately calling
virSetCapabilities(). This did not work if the desired uid was != 0,
since a setuid to anything other than 0 normally clears all
capabilities bits.
The solution is to use the new virSetUIDGIDWithCaps(), sending it the
uid, gid, and capabilities bits. This will get the new process setup
properly.
Since the static functions virSetCapabilities() and
virClearCapabilities are no longer called, they have been removed.
NOTE: When combined with "filecap $path-to-qemu sys_rawio", this patch
will make CAP_SYS_RAWIO (which is required for passthrough of generic
scsi commands to a guest - see commits e8daeeb, 177db08, 397e6a7, and
74e0349) be retained by qemu when necessary. Apparently that
capability has been broken for non-root qemu ever since it was
originally added.
Normally when a process' uid is changed to non-0, all the capabilities
bits are cleared, even those explicitly set with calls to
capng_update()/capng_apply() made immediately before setuid. And
*after* the process' uid has been changed, it no longer has the
necessary privileges to add capabilities back to the process.
In order to set a non-0 uid while still maintaining any capabilities
bits, it is necessary to either call capng_change_id() (which
unfortunately doesn't currently call initgroups to setup auxiliary
group membership), or to perform the small amount of calisthenics
contained in the new utility function virSetUIDGIDWithCaps().
Another very important difference between the capabilities
setting/clearing in virSetUIDGIDWithCaps() and virCommand's
virSetCapabilities() (which it will replace in the next patch) is that
the new function properly clears the capabilities bounding set, so it
will not be possible for a child process to set any new
capabilities.
A short description of what is done by virSetUIDGIDWithCaps():
1) clear all capabilities then set all those desired by the caller (in
capBits) plus CAP_SETGID, CAP_SETUID, and CAP_SETPCAP (which is needed
to change the capabilities bounding set).
2) call prctl(), telling it that we want to maintain current
capabilities across an upcoming setuid().
3) switch to the new uid/gid
4) again call prctl(), telling it we will no longer want capabilities
maintained if this process does another setuid().
5) clear the capabilities that we added to allow us to
setuid/setgid/change the bounding set (unless they were also requested
by the caller via the virCommand API).
Because the modification/maintaining of capabilities is intermingled
with setting the uid, this is necessarily done in a single function,
rather than having two independent functions.
Note that, due to the way that effective capabilities are computed (at
time of execve) for a process that has uid != 0, the *file*
capabilities of the binary being executed must also have the desired
capabilities bit(s) set (see "man 7 capabilities"). This can be done
with the "filecap" command. (e.g. "filecap /usr/bin/qemu-kvm sys_rawio").
This is an interim measure to make sure everything still works in this
order. The next step will be to perform capabilities drop and
setuid/gid as a single operation (which is the only way to keep any
capabilities when switching to a non-root uid).
virCommand gets two new APIs: virCommandSetSELinuxLabel() and
virCommandSetAppArmorProfile(), which both save a copy of a
null-terminated string in the virCommand. During virCommandRun, if the
string is non-NULL and we've been compiled with AppArmor and/or
SELinux security driver support, the appropriate security library
function is called for the child process, using the string that was
previously set. In the case of SELinux, setexeccon_raw() is called,
and for AppArmor, aa_change_profile() is called.
This functionality has been added so that users of virCommand can use
the upcoming virSecurityManagerSetChildProcessLabel() prior to running
a child process, rather than needing to setup a hook function to be
called (and in turn call virSecurityManagerSetProcessLabel()) *during*
the setup of the child process.
Rather than treating uid:gid of 0:0 as a NOP, we blindly pass that
through to the lower layers. However, we *do* check for a requested
value of "-1" to mean "don't change this setting". setregid() and
setreuid() already interpret -1 as a NOP, so this is just an
optimization, but we are also calling getpwuid_r and initgroups, and
it's unclear what the former would do with a uid of -1.
If a uid and/or gid is specified for a command, it will be set just
after the user-supplied post-fork "hook" function is called.
The intent is that this can replace user hook functions that set
uid/gid. This moves the setting of uid/gid and dropping of
capabilities closer to each other, which is important since the two
should really be done at the same time (libcapng provides a single
function that does both, which we will be unable to use, but want to
mimic as closely as possible).
All args except "cmd" in the call to virExec are now redundant, since
they can all be found in cmd, so remove the args and reference the
data directly in cmd. One exception to this is that "infd" was being
modified within virExec, and modifying the original in cmd caused make
check failures, so cmd->infd is copied to a local, and the local is
used during virExec().
virExecWithHook is only called from one place, so it always has the
same "hook" function (virHookCommand), and the data sent to that
function is always a virCommandPtr, so eliminate the function and
generic data from the arglist, and replace it with "virCommandPtr
cmd". The call to (hook)(data) is replaced with
"virHookCommand(cmd)". Finally, virExecWithHook is renamed to virExec.
Indentation has been updated only for code that will remain after the
next patch, which will remove all other args to virExec (since they
are now redundant, as they're all members of virCommandPtr).
Currently, if a command wants to do asynchronous IO, a callback
is registered in the libvirtd eventloop to handle writes and
reads. However, there's a race in virCommandWait. The eventloop
may already be executing the callback, while virCommandWait is
mangling internal state of virCommand. To deal with it, we need
to either introduce locking or spawn a separate thread where we
poll() on stdio from child. The former, however, requires to
unlock all mutexes held, as the event loop may execute other
callbacks which tries to lock one of the mutexes, deadlock and
thus never wake us up. So it's safer to spawn a separate thread.
This makes code easier to read, by avoiding lines longer than
80 columns and removing the repetition from the callers.
* src/util/virstoragefile.c (qedGetHeaderUL, qedGetHeaderULL):
Delete in favor of more generic macros.
(qcow2GetBackingStoreFormat, qcowXGetBackingStore)
(qedGetBackingStore, virStorageFileMatchesVersion)
(virStorageFileGetMetadataInternal): Use new macros.
* src/cpu/cpu_x86.c (x86VendorLoad): Likewise.
We have several cases where we need to read endian-dependent
data regardless of host endianness; rather than open-coding
these call sites, it will be nicer to funnel things through
a macro.
The virendian.h file can be expanded to add writer functions,
and/or 16-bit access patterns, if needed. Also, if we need
to turn things into a function to avoid multiple evaluations
of buf, that can be done later. But for now, a macro worked.
* src/util/virendian.h: New file.
* src/Makefile.am (UTIL_SOURCES): Ship it.
* tests/virendiantest.c: New test.
* tests/Makefile.am (test_programs, virendiantest_SOURCES): Run
the test.
* .gitignore: Ignore built file.
Instead of creating an iptables command in one shot, do it in steps
so we can add conditional options like physdev and protocol.
This removes code duplication while keeping existing behaviour.
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
We are requesting for stderr catching for all cases in
virFileWrapperFdNew(). There is no need to have a separate
function just to report an error, esp. when we can do it in
virFileWrapperFdClose().
We had an easy way to iterate set bits, but not for iterating
cleared bits.
* src/util/virbitmap.h (virBitmapNextClearBit): New prototype.
* src/util/virbitmap.c (virBitmapNextClearBit): Implement it.
* src/libvirt_private.syms (bitmap.h): Export it.
* tests/virbitmaptest.c (test4): Test it.
To allow modifications to the lists to be synchronized, convert
virPCIDeviceList and virUSBDeviceList into virObjectLockable
classes. The locking, however, will not be self-contained. The
users of these classes will have to call virObjectLock/Unlock
in the critical regions.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 34e8f63a32 introduced support for catching errors from
libvirt iohelper. However, at those times there wasn't such fancy
API as virCommandDoAsyncIO(), so everything has to be implemented
on our own. But since we do have the API now, we can use it and
drop our implementation then.
Currently, if we want to feed stdin, or catch stdout or stderr of a
virCommand we have to use virCommandRun(). When using virCommandRunAsync()
we have to register FD handles by hand. This may lead to code duplication.
Hence, introduce an internal API, which does this automatically within
virCommandRunAsync(). The intended usage looks like this:
virCommandPtr cmd = virCommandNew*(...);
char *buf = NULL;
...
virCommandSetOutputBuffer(cmd, &buf);
virCommandDoAsyncIO(cmd);
if (virCommandRunAsync(cmd, NULL) < 0)
goto cleanup;
...
if (virCommandWait(cmd, NULL) < 0)
goto cleanup;
/* @buf now contains @cmd's stdout */
VIR_DEBUG("STDOUT: %s", NULLSTR(buf));
...
cleanup:
VIR_FREE(buf);
virCommandFree(cmd);
Note, that both stdout and stderr buffers may change until virCommandWait()
returns.
QEMU is fully capable of handling VDI images and we just refuse to
work with them. As qemu-img knows and supports this, there should be
no problem with this addition.
This is of course, just basic functionality, without searching for any
backing files, etc.
Some files have the magic shifted to some offset other than 0, so we
have to support that. I also cleaned up some lines to be more
readable and added missing magic for iso file format.
Commit 6094ad7b (0.9.3 release) promoted several functions from
internal to public, but forgot to fix the documentation generator
to provide details about those functions.
For an example of what this fixes, look at:
file:///path/to/libvirt/docs/html/libvirt-libvirt.html#virEventAddHandle
before and after the patch.
* docs/apibuild.py (ignored_functions): Don't ignore functions
that were turned into official API.
* src/util/virevent.c: Fix comments to pass through parser.
Way back when I started making changes for Coverity messages my first set
were to a bunch of CHECKED_RETURN errors. In particular virAsprintf() had
a few callers that Coverity noted didn't check their return (although some
did check if the buffer being printed to was NULL or not).
It was suggested at the time as a further patch an ATTRIBUTE_RETURN_CHECK
should be added to virAsprintf(), see:
https://www.redhat.com/archives/libvir-list/2013-January/msg00120.html
This patch does that and fixes a few more instances not found by Coverity
that failed the check.
Setting the log output prefix to 0 is not supported and in fact results
in the following message:
warning : virLogParseOutputs:1021 : Ignoring invalid log output setting.
This patch changes the name of the @sep argument to @terminator and
clarifies it's usage. This patch also explicitly documents that
whitespace can't be used as @terminator as it is skipped multiple times
in the implementation.
When building with static analysis enabled, we turn on attribute
nonnull checking. However, this caused the build to fail with:
../../src/util/virobject.c: In function 'virObjectOnceInit':
../../src/util/virobject.c:55:40: error: null argument where non-null required (argument 1) [-Werror=nonnull]
Creation of the virObject class is the one instance where the
parent class is allowed to be NULL. Making things conditional
will let us keep static analysis checking for all other .c file
callers, without breaking the build on this one exception.
* src/util/virobject.c: Define witness.
* src/util/virobject.h (virClassNew): Use it to force most callers
to pass non-null parameter.
The Coverity static analyzer was generating many false positives for the
unary operation inside the VIR_FREE() definition as it was trying to evaluate
the else portion of the "?:" even though the if portion was (1).
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, whenever somebody calls saferead() on nonblocking FD
(safewrite() is totally interchangeable for purpose of this message)
he might get wrong return value. For instance, in the first iteration
some data is read. The number of bytes read is stored into local
variable 'nread'. However, in next iterations we can get -1 from
read() with errno == EAGAIN, in which case the -1 is returned despite
fact some data has already been read. So the caller gets confused.
Bare read() should be used for nonblocking FD.
Working with virTypedParameters in clients written in C is ugly and
requires all clients to duplicate the same code. This set of APIs makes
this code for manipulating with virTypedParameters integral part of
libvirt so that all clients may benefit from it.
A build on FreeBSD failed with:
util/virportallocator.c:108: error: storage size of 'addr' isn't known
util/virportallocator.c:123: error: 'INADDR_ANY' undeclared (first use in this function)
It turns out that while POSIX allows sockaddr_in to leak in through
<arpa/inet.h> (the way Linux does it), it is not mandatory, and
conforming applications are required to get it through <netinet/in.h>.
* src/util/virportallocator.c: Include header for struct
sockaddr_in.
* tests/virportallocatortest.c: Likewise.
The QEMU driver default max port is 65535, but it then increments
this by 1 to 65536. This maps to 0 in an unsigned short :-( This
was apparently done so that for() loops could use "< max" instead
of "<= max". Remove this insanity and just make the loop do the
right thing.
A great many virObject instances require a mutex, so introduce
a convenient class for this which provides a mutex. This avoids
repeating the tedious init/destroy code
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently all classes must directly inherit from virObject.
This allows for arbitrarily deep hierarchy. There's not much
to this aside from chaining up the 'dispose' handlers from
each class & providing APIs to check types.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Make cpuset local to the while loop and free it once done with it each
time through the loop. Add a sa_assert() to virBitmapParse() to keep Coverity
from believing there could be a negative return and possible resource leak.
Commit c308a9ae was incomplete; it resolved the configure failure,
but not a later build failure.
* src/util/virnetdevbridge.c: Include pre-req header.
* configure.ac (AC_CHECK_HEADERS): Prefer standard in.h over
non-standard ip6.h.
There's no need to do lots of readlink() calls to canonicalize
a name if we're only going to use stat() on it, since stat()
already chases symlinks.
* src/util/virutil.c (virGetDeviceID): Let stat() do the symlink
chasing.
Pass stub driver name directly to pciDettachDevice and pciReAttachDevice to fit
for different libvirt drivers. For example, qemu driver prefers pci-stub, but
Xen prefers pciback.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
"virGetDeviceID" could be used across the sources, but it doesn't
relate with this series, and could be done later.
* src/util/virutil.h: (Declare virGetDeviceID, and
vir{Get,Set}DeviceUnprivSGIO)
* src/util/virutil.c: (Implement virGetDeviceID and
vir{Get,Set}DeviceUnprivSGIO)
* src/libvirt_private.syms: Export private symbols of upper helpers
This is an adjustment to the fix for
https://bugzilla.redhat.com/show_bug.cgi?id=889319
to account for two bonehead mistakes I made.
commit ac2797cf2a attempted to fix a
problem with netlink in newer kernels requiring an extra attribute
with a filter flag set in order to receive an IFLA_VFINFO_LIST from
netlink. Unfortunately, the #ifdef that protected against compiling it
in on systems without the new flag went a bit too far, assuring that
the new code would *never* be compiled, and even if it had, the code
was incorrect.
The first problem was that, while some IFLA_* enum values are also
their existence at compile time, IFLA_EXT_MASK *isn't* #defined, so
checking to see if it's #defined is not a valid method of determining
whether or not to add the attribute. Fortunately, the flag that is
being set (RTEXT_FILTER_VF) *is* #defined, and it is never present if
IFLA_EXT_MASK isn't, so it's sufficient to just check for that flag.
And to top it off, due to the code not actually compiling when I
thought it did, I didn't realize that I'd been given the wrong arglist
to nla_put() - you can't just send a const value to nla_put, you have
to send it a pointer to memory containing what you want to add to the
message, along with the length of that memory.
This time I've actually sent the patch over to the other machine
that's experiencing the problem, applied it to the branch being used
(0.10.2) and verified that it works properly, i.e. it does fix the
problem it's supposed to fix. :-/
To bring in line with new naming practice, rename the=
src/util/cgroup.{h,c} files to vircgroup.{h,c}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=889319
When assigning an SRIOV virtual function to a guest using "intelligent
PCI passthrough" (<interface type='hostdev'>, which sets the MAC
address and vlan tag of the VF before passing its info to qemu),
libvirt first learns the current MAC address and vlan tag by sending
an NLM_F_REQUEST message for the VF's PF (physical function) to the
kernel via a NETLINK_ROUTE socket (see virNetDevLinkDump()); the
response message's IFLA_VFINFO_LIST section is examined to extract the
info for the particular VF being assigned.
This worked fine with kernels up until kernel commit
115c9b81928360d769a76c632bae62d15206a94a (first appearing in upstream
kernel 3.3) which changed the ABI to not return IFLA_VFINFO_LIST in
the response until a newly introduced IFLA_EXT_MASK field was included
in the request, with the (newly introduced, of course) RTEXT_FILTER_VF
flag set.
The justification for this ABI change was that new fields had been
added to the VFINFO, causing NLM_F_REQUEST messages to fail on systems
with large numbers of VFs if the requesting application didn't have a
large enough buffer for all the info. The idea is that most
applications doing an NLM_F_REQUEST don't care about VFINFO anyway, so
eliminating it from the response would lower the requirements on
buffer size. Apparently, the people who pushed this patch made the
mistaken assumption that iproute2 (the "ip" command) was the only
package that used IFLA_VFINFO_LIST, so it wouldn't break anything else
(and they made sure that iproute2 was fixed.
The logic of this "fix" is debatable at best (one could claim that the
proper fix would be for the applications in question to be fixed so
that they properly sized the buffer, which is what libvirt does
(purely by virtue of using libnl), but it is what it is and we have to
deal with it.
In order for <interface type='hostdev'> to work properly on systems
with a kernel 3.3 or later, libvirt needs to add the afore-mentioned
IFLA_EXT_MASK field with RTEXT_FILTER_VF set.
Of course we also need to continue working on systems with older
kernels, so that one bit of code is compiled conditionally. The one
time this could cause problems is if the libvirt binary was built on a
system without IFLA_EXT_MASK which was subsequently updated to a
kernel that *did* have it. That could be solved by manually providing
the values of IFLA_EXT_MASK and RTEXT_FILTER_VF and adding it to the
message anyway, but I'm uncertain what that might actually do on a
system that didn't support the message, so for the time being we'll
just fail in that case (which will very likely never happen anyway).
This patch fixes the lack of error messages when libvirt fails to find
VFINFO in a returned netlinke response message.
https://bugzilla.redhat.com/show_bug.cgi?id=827519#c10 is an example
of the error message that was previously logged when the
IFLA_VFINFO_LIST object was missing from the netlink response. The
reason for this failure is detailed in
https://bugzilla.redhat.com/show_bug.cgi?id=889319
Even though that root problem has been fixed, the experience of
finding the root cause shows us how important it is to properly log an
error message in these cases. This patch *seems* to replace the entire
function, but really most of the changes are due to moving code that
was previously inside an if() statement out to the top level of the
function (the original if() was reversed and made to log an error and
return).
Revert the complex workaround of commit 39d91e9, now that we have
a nicer framework for shutting up broken gcc.
* src/util/buf.c (virBufferEscape): Simplify.
Historically there was an inconsistency in handling of the
itanium arch. The xen driver & CPU model code treated it
as 'ia64' but the QEMU capabilities code used 'itanium'. On
the grounds that no one has ever seriously used itanium
with QEMU, while RHEL shipped itanium with Xen, we should
favour 'ia64' as the canonical format
Introduce a 'virArch' enum for CPU architectures. Include
data type providing wordsize and endianness, and APIs to
query this info and convert to/from enum and string form.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This is yet another refinement to the fix for CVE-2012-3411:
https://bugzilla.redhat.com/show_bug.cgi?id=833033
It turns out that it would be very intrusive to correctly backport the
entire --bind-dynamic option to older dnsmasq versions
(e.g. dnsmasq-2.48 that is used on RHEL6.x and CentOS 6.x), but very
simple to patch those versions to just use SO_BINDTODEVICE on all
their listening sockets (SO_BINDTODEVICE also has the desired effect
of permitting only traffic that was received on the interface(s) where
dnsmasq was set to listen.)
This patch modifies the dnsmasq capabilities detection to detect the
string:
--bind-interfaces with SO_BINDTODEVICE
in the output of "dnsmasq --version", and in that case realize that
using the old --bind-interfaces option is just as safe as
--bind-dynamic (and therefore *not* forbid creation of networks that
use public IP address ranges).
If -bind-dynamic is available, it is still preferred over
--bind-interfaces.
Note that this patch does no harm in upstream, or in any distro's
downstream if it happens to end up there, but builds for distros that
have a new enough dnsmasq to support --bind-dynamic do *NOT* need to
specifically backport this patch; it's only required for distro
releases that have dnsmasq too old to have --bind-dynamic (and those
distros will need to add the SO_BINDTODEVICE patch to dnsmasq,
*including the extra string in the --version output*, as well.
When LXC labels USB devices during hotplug, it is running in
host context, so it needs to pass in a vroot path to the
container root.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There was a double free issue caused by virSysinfoRead on s390,
as the same manufacturer string instance was assigned to more
than one processor record.
Cleaned up other potential memory issues and restructured the sysinfo
parsing code by moving repeating patterns into a helper function.
The restructuring made it necessary to conditionally disable
-Wlogical-op for some older GCC versions, using pragma GCC diagnostic.
This is a GCC specific pragma, which is acceptable, since we're
using it to work around a GCC specific bug.
Finally, added a function virSysinfoSetup to configure the sysinfo
data source files/script during run time, to facilitate writing test
programs. This function is not published in sysinfo.h and only
there for testing.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
The current virStorageFileGet{LVM,SCSI}Key methods return
the key as the return value. Unfortunately it is desirable
for "NULL" to be a valid return value, as well as an error
indicator. Thus the returned key must instead be provided
as an out-parameter.
When we invoke lvs or scsi_id to extract ID for block devices,
we don't want virCommandWait logging errors messages. Thus we
must explicitly check 'status != 0', rather than letting
virCommandWait do it.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The QED file format is non-versioned, so although the magic
value matched, libvirt rejected it due to lack of a version
number to compare against. We need to distinguish this case
by allowing a value of '-2' to indicate a non-versioned file
where only the magic is required to match
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To help us detect when new storage file versions come into
existance log a warning if the storage file magic matches,
but the version does not
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Fully stub out the virCgroupGetAppRoot method as done with other
methods in the file, rather than just the body. This lets us
annotate the unused parameter to avoid a warning
* Autotools changes:
- Don't assume Qemu is Linux-only
- Check Linux headers only on Linux
- Disable firewalld on FreeBSD
* Initctl:
Initctl seem to present only on Linux, so stub it on other platforms
* Raw I/O: Linux-only as well
* Headers cleanup
This patch gets rid of the undeterministic error reporting code done on
return values of get(pw|gr)nam_r. With this patch, if the group record
is not returned by the corresponding function this error is not
considered fatal even if errno != 0. The error is logged in such case.
virStorageFileGetLVMKey and virStorageFileGetSCSIKey
both return heap allocated strings, so the return value
should not be marked const.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This will be used whenever a NIC with guaranteed throughput is to
be plugged into a bridge. It will adjust the average throughput of
non guaranteed NICs (classid 1:2) to meet new requirements.
These set bridge part of QoS when bringing domain's interface up.
Long story short, if there's a 'floor' set, a new QoS class is created.
ClassID MUST be unique within the bridge and should be kept for
unplug phase.
These classes can borrow unused bandwidth. Basically,
only egress qdsics can have classes, therefore we can
do this kind of traffic shaping only on host's outgoing,
that is domain's incoming traffic.
This is however supported only on domain interfaces with
type='network'. Moreover, target network needs to have at least
inbound QoS set. This is required by hierarchical traffic shaping.
From now on, the required attribute for <inbound/> is either 'average'
(old) or 'floor' (new). This new attribute can be used just for
interfaces type of network (<interface type='network'/>) currently.
Stochastic Fairness Queuing (SFQ) is queuing discipline
(qdisc) which doesn't really shape any traffic but 'just'
re-arrange packets in sending buffer so no stream starve.
The goal is to ensure fairness. There is basically only one
configuration parameter (perturb) which is set to advised
value of 10.
The DHCPv6 support includes IPV6 dhcp-range and dhcp-host for one
IPv6 subnetwork on one interface. This support will only work
if dnsmasq version >= 2.64; otherwise an error occurs if
dhcp-range or dhcp-host is specified for an IPv6 address.
Essentially, this change provides the same DHCP support for IPv6
that has been available for IPv4.
With dnsmasq >= 2.64, support for the RA service is also now provided
by dnsmasq (radvd is no longer used/started). (Although at least one
version of dnsmasq prior to 2.64 "supported" IPv6 Router
Advertisement, there were bugs (fixed in 2.64) that rendered it
unusable.)
Documentation and the network schema has been updated
to reflect the new support.
I noticed when writing the backend functions for virNetworkUpdate that
I was repeating the same sequence of memmove, VIR_REALLOC, nXXX-- (and
messed up the args to memmove at least once), and had seen the same
sequence in a lot of other places, so I decided to write a few
utility functions/macros - see the .h file for full documentation.
The intent is to reduce the number of lines of code, but more
importantly to eliminate the need to check the element size and
element count arithmetic every time we need to do this (I *always*
make at least one mistake.)
VIR_INSERT_ELEMENT: insert one element at an arbitrary index within an
array of objects. The size of each object is determined
automatically by the macro using sizeof(*array). The new element's
contents are copied into the inserted space, then the original copy
of contents are 0'ed out (if everything else was
successful). Compile-time assignment and size compatibility between
the array and the new element is guaranteed (see explanation below
[*])
VIR_INSERT_ELEMENT_COPY: identical to VIR_INSERT_ELEMENT, except that
the original contents of newelem are not cleared to 0 (i.e. a copy
is made).
VIR_APPEND_ELEMENT: This is just a special case of VIR_INSERT_ELEMENT
that "inserts" one past the current last element.
VIR_APPEND_ELEMENT_COPY: identical to VIR_APPEND_ELEMENT, except that
the original contents of newelem are not cleared to 0 (i.e. a copy
is made).
VIR_DELETE_ELEMENT: delete one element at an arbitrary index within an
array of objects. It's assumed that the element being deleted is
already saved elsewhere (or cleared, if that's what is appropriate).
All five of these macros have an _INPLACE variant, which skips the
memory re-allocation of the array, assuming that the caller has
already done it (when inserting) or will do it later (when deleting).
Note that VIR_DELETE_ELEMENT* can return a failure, but only if an
invalid index is given (index + amount to delete is > current array
size), so in most cases you can safely ignore the return (that's why
the helper function virDeleteElementsN isn't declared with
ATTRIBUTE_RETURN_CHECK). A warning is logged if this ever happens,
since it is surely a coding error.
[*] One initial problem with the INSERT and APPEND macros was that,
due to both the array pointer and newelem pointer being cast to void*
when passing to virInsertElementsN(), any chance of type-checking was
lost. If we were going to move in newelem with a memmove anyway, we
would be no worse off for this. However, most current open-coded
insert/append operations use direct struct assignment to move the new
element into place (or just populate the new element directly) - thus
use of the new macros would open a possibility for new usage errors
that didn't exist before (e.g. accidentally sending &newelemptr rather
than newelemptr - I actually did this quite a lot in my test
conversions of existing code).
But thanks to Eric Blake's clever thinking, I was able to modify the
INSERT and APPEND macros so that they *do* check for both assignment
and size compatibility of *ptr (an element in the array) and newelem
(the element being copied into the new position of the array). This is
done via clever use of the C89-guaranteed fact that the sizeof()
operator must have *no* side effects (so an assignment inside sizeof()
is checked for validity, but not actually evaluated), and the fact
that virInsertElementsN has a "# of new elements" argument that we
want to always be 1.
QEMU supports setting vendor and product strings for disk since
1.2.0 (only scsi-disk, scsi-hd, scsi-cd support it), this patch
exposes it with new XML elements <vendor> and <product> of disk
device.
virGetGroupIDByName is documented as returning 1 if the groupname
cannot be found. getgrnam_r is documented as returning:
« 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name
or gid was not found. »
and that:
« The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
It does not call "not found" an error, hence does not specify what
value errno might have in this situation. But that makes it impossible to
recognize errors. One might argue that according to POSIX errno should be
left unchanged if an entry is not found. Experiments on various UNIX-like
systems shows that lots of different values occur in this situation: 0,
ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetGroupIDByName returns an error when the return value of getgrnam_r
is non-0. However on my RHEL system, getgrnam_r returns ENOENT when the
requested user cannot be found, which then causes virGetGroupID not
to behave as documented (it returns an error instead of falling back
to parsing the passed-in value as an gid).
This commit makes virGetGroupIDByName only report an error when errno
is set to one of the values in the posix description of getgrnam_r
(which are the same as the ones described in the manpage on my system).
virGetUserIDByName is documented as returning 1 if the username
cannot be found. getpwnam_r is documented as returning:
« 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name
or uid was not found. »
and that:
« The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
It does not call "not found" an error, hence does not specify what
value errno might have in this situation. But that makes it impossible to
recognize errors. One might argue that according to POSIX errno should be
left unchanged if an entry is not found. Experiments on various UNIX-like
systems shows that lots of different values occur in this situation: 0,
ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetUserIDByName returns an error when the return value of getpwnam_r
is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the
requested user cannot be found, which then causes virGetUserID not
to behave as documented (it returns an error instead of falling back
to parsing the passed-in value as an uid).
This commit makes virGetUserIDByName only report an error when errno
is set to one of the values in the posix description of getpwnam_r
(which are the same as the ones described in the manpage on my system).
If debugging is enabled, the debug messages are sent to stderr.
Moreover, if a command has catching of stderr set, the messages
gets mixed with stdout output (assuming both outputs are stored
in the same variable). The resulting string then doesn't
necessarily have to start with desired prefix then. This bug
exposes itself when parsing dnsmasq output:
2012-12-06 11:18:11.445+0000: 18491: error :
dnsmasqCapsSetFromBuffer:664 : internal error cannot parse
/usr/sbin/dnsmasq version number in '2012-12-06
11:11:02.232+0000: 18492: debug : virFileClose:72 : Closed fd 22'
We can clearly see that the output of dnsmasq --version doesn't
start with expected "Dnsmasq version " string but a libvirt debug
output.
If the debugging is enabled, the virCommand subsystem catches debug
messages in the command output as well. In that case, we can't assume
the string corresponding to command's stdout will start with specific
prefix. But the prefix can be moved deeper in the string. This bug
shows itself when parsing dnsmasq output:
2012-12-06 11:18:11.445+0000: 18491: error :
dnsmasqCapsSetFromBuffer:664 : internal error cannot parse
/usr/sbin/dnsmasq version number in '2012-12-06 11:11:02.232+0000:
18492: debug : virFileClose:72 : Closed fd 22'
We can clearly see that the output of dnsmasq --version
doesn't start with expected "Dnsmasq version " string but a libvirt
debug output.
The pciWrite32 function assembled the array of data to be written to the
fd with a bad offset on the last byte. This issue was probably caused by
a typo (14, 24).
This introduces a few new APIs for dealing with strings.
One to split a char * into a char **, another to join a
char ** into a char *, and finally one to free a char **
There is a simple test suite to validate the edge cases
too. No more need to use the horrible strtok_r() API,
or hand-written code for splitting strings.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To be able todo controlled shutdown/reboot of containers an
API to talk to init via /dev/initctl is required. Fortunately
this is quite straightforward to implement, and is supported
by both sysvinit and systemd. Upstart support for /dev/initctl
is unclear.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This new function returns true if the given address is in the range of
any "private" or "local" networks as defined in RFC1918 (IPv4) or
RFC3484/RFC4193 (IPv6), otherwise they return false.
These ranges are:
192.168.0.0/16
172.16.0.0/16
10.0.0.0/24
FC00::/7
FEC0::/10
In order to optionally take advantage of new features in dnsmasq when
the host's version of dnsmasq supports them, but still be able to run
on hosts that don't support the new features, we need to be able to
detect the version of dnsmasq running on the host, and possibly
determine from the help output what options are in this dnsmasq.
This patch implements a greatly simplified version of the capabilities
code we already have for qemu. A dnsmasqCaps device can be created and
populated either from running a program on disk, reading a file with
the concatenated output of "dnsmasq --version; dnsmasq --help", or
examining a buffer in memory that contains the concatenated output of
those two commands. Simple functions to retrieve capabilities flags,
the version number, and the path of the binary are also included.
bridge_driver.c creates a single dnsmasqCaps object at driver startup,
and disposes of it at driver shutdown. Any time it must be used, the
dnsmasqCapsRefresh method is called - it checks the mtime of the
binary, and re-runs the checks if the binary has changed.
networkxml2argvtest.c creates 2 "artificial" dnsmasqCaps objects at
startup - one "restricted" (doesn't support --bind-dynamic) and one
"full" (does support --bind-dynamic). Some of the test cases use one
and some the other, to make sure both code pathes are tested.
Found by coverity:
Error: REVERSE_INULL (CWE-476):
libvirt-0.10.2/src/util/processinfo.c:141: deref_ptr: Directly
dereferencing pointer "map".
libvirt-0.10.2/src/util/processinfo.c:142: check_after_deref:
Null-checking "map" suggests that it may be null, but it has already
been dereferenced on all paths leading to the check.
The virStateInitialize method and several cgroups methods were
using an 'int privileged' parameter or similar for dual-state
values. These are better represented with the bool type.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
because libvirt_lxc's cgroup mountpoint is what it shown
in /proc/self/cgroup.
we can get container's cgroup through virCgroupNew("/", &group),
add interface virCgroupGetAppRoot to help container to
get it's cgroup.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
virCgroupGetMemSwapUsage is used to get container's swap usage,
with this interface,we can get swap usage in fuse filesystem.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
This bug leads to getting incorrect vcpupin information via
qemudDomainGetVcpuPinInfo() API when the number of maximum
cpu on a host falls into a range such as 31 < ncpus < 64.
gcc warning:
left shift count >= width of type
The following bug is such the case
https://bugzilla.redhat.com/show_bug.cgi?id=876415
In virNetDevVethDelete the virRun method will properly report
errors, but when checking the exit status for non-zero exit
code no error is reported
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When failing to create a macvlan interface, make sure the
error message contains the name of the host interface
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
It is possible for there to be deleted timers when we
calculate the next timeout, and they must be skipped.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The event code is a no-op if requested to update a non-existent
timer/handle watch. This makes it hard to detect bugs in the
caller who have passed bogus data. Add a VIR_WARN output in
such cases, since the API does not allow for return errors.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The docs for virDiskNameToIndex claim it ignores partition
numbers. In actual fact though, a code ordering bug means
that a partition number will cause the code to accidentally
multiply the result by 26.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit e0c469e58b that fixes the detection
of image chain wasn't complete. Iteration through the backing image
chain has to stop at the last existing image if some of the images are
missing otherwise the backing chain that is cached contains entries with
paths being set to NULL resulting to:
error: Unable to allow access for disk path (null): Bad address
Fortunately stat() is kind enough not to crash when it's presented with
a NULL argument. At least on Linux.
Fixes this error when building with -Werror on Alpine Linux:
util/processinfo.c: In function 'virProcessInfoSetAffinity':
util/processinfo.c:52:5: error: implicit declaration of function 'malloc' [-Werror=implicit-function-declaration]
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
This simplifies the top-level code, at the cost of using a little more
stack space. The primary benefit is being able to send more fields
without knowing in advance how many of them, and of which types, these
fields will be, and without having to individually add buffer variables.
The code imposes an upper limit on the total number of iovs/buffers
used, and fields that wouldn't fit are silently dropped. This is not
significant in this patch, but will affect the following one.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... and update all users. No change in functionality, the parameter
will be used later.
The metadata representation is as minimal as possible, but requires
the caller to allocate an array on stack explicitly.
The alternative of using varargs in the virLogMessage() callers:
* Would not allow the caller to optionally omit some metadata elements,
except by having two calls to virLogMessage.
* Would not be as type-safe (e.g. using int vs. size_t), and the compiler
wouldn't be able to do type checking
* Depending on parameter order:
a) virLogMessage(..., message format, message params...,
metadata..., NULL)
can not be portably implemented (parse_printf_format() is a glibc
function)
b) virLogMessage(..., metadata..., NULL,
message format, message params...)
would prevent usage of ATTRIBUTE_FMT_PRINTF and the associated
compiler checking.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The "restart" function for locks allocates a new array according to
and pre-sets its length, then reads the owner pids from a JSON
document in a loop. Rather than adding each owner at a different
index, though, it repeatedly overwrites the last element of the array
with all the owners.
82507838 refactored the code to keep both the raw and canonicalized form
of the backingStore, which breaks badly when the storage pool contains a
storage volume, which is missing its backing store file:
# ./daemon/libvirtd -l
2012-11-07 12:43:33.279+0000: 22175: info : libvirt version: 1.0.0
2012-11-07 12:43:33.279+0000: 22175: error : absolutePathFromBaseFile:542 : Can't canonicalize path '/var/lib/libvirt/images/base.qcow2': No such file or directory
2012-11-07 12:43:33.280+0000: 22175: error : storageDriverAutostart:115 : Failed to autostart storage pool 'default': Can't canonicalize path '/var/lib/libvirt/images/base.qcow2': No such file or directory
This is because virStorageFileGetMetadataFromBuf() aborts with -1 if the
filename of the backingStore can not be canonicalized:
#0 absolutePathFromBaseFile () at util/storage_file.c:541
#1 virStorageFileGetMetadataFromBuf () at util/storage_file.c:728
#2 virStorageFileGetMetadataFromFD () at util/storage_file.c:932
#3 virStorageBackendProbeTarget () at storage/storage_backend_fs.c:94
#4 virStorageBackendFileSystemRefresh () at storage/storage_backend_fs.c:849
#5 storagePoolStart () at storage/storage_driver.c:700
#6 virStoragePoolCreate () at libvirt.c:12471
...
Treat files which miss their backing file as standalone files.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Some FDs may not implement fdatasync() functionality,
e.g. pipes. In that case EINVAL or EROFS is returned.
We don't want to fail then nor report any error.
Reported-by: Christophe Fergeau <cfergeau@redhat.com>
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
With our fix of mkostemp (pushed as 2b435c15) we define a macro
to compile with uclibc. However, this definition is conditional
and thus needs to be properly indented. Moreover, with this definition
sc_prohibit_mkstemp syntax-check rule keeps yelling:
src/util/logging.c:63:# define mkostemp(x,y) mkstemp(x)
maint.mk: use mkostemp with O_CLOEXEC instead of mkstemp
Therefore we should ignore this file for this rule.
* configure.ac docs/news.html.in libvirt.spec.in: update for the new release
* po/*.po*: update from transifex, a lot of added support e.g. Indian
languages, and regenerate
Currently, when we are doing (managed) save, we insert the
iohelper between the qemu and OS. The pipe is created, the
writing end is passed to qemu and the reading end to the
iohelper. It reads data and write them into given file. However,
with write() being asynchronous data may still be in OS
caches and hence in some (corner) cases, all migration data
may have been read and written (not physically though). So
qemu will report success, as well as iohelper. However, with
some non local filesystems, where ENOSPACE is polled every X
time units, we may get into situation where all operations
succeeded but data hasn't reached the disk. And in fact will
never do. Therefore we ought sync caches to make sure data
has reached the block device on remote host.
virPidFileReadPathIfAlive passed in an 'int *' where a 'pid_t *'
was expected, which breaks on Mingw64 targets. Also a few places
were using '%d' for formatting pid_t, change them to '%lld' and
force a cast to the longer type as done elsewhere in the same
file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are multiple reasons canonicalize_file_name() used in
absolutePathFromBaseFile helper can fail. This patch enhances error
reporting from that helper.
This patch resolves: https://bugzilla.redhat.com/show_bug.cgi?id=871201
If libvirt is restarted after updating the dnsmasq or radvd packages,
a subsequent "virsh net-destroy" will fail to kill the dnsmasq/radvd
process.
The problem is that when libvirtd restarts, it re-reads the dnsmasq
and radvd pidfiles, then does a sanity check on each pid it finds,
including checking that the symbolic link in /proc/$pid/exe actually
points to the same file as the path used by libvirt to execute the
binary in the first place. If this fails, libvirt assumes that the
process is no longer alive.
But if the original binary has been replaced, the link in /proc is set
to "$binarypath (deleted)" (it literally has the string " (deleted)"
appended to the link text stored in the filesystem), so even if a new
binary exists in the same location, attempts to resolve the link will
fail.
In the end, not only is the old dnsmasq/radvd not terminated when the
network is stopped, but a new dnsmasq can't be started when the
network is later restarted (because the original process is still
listening on the ports that the new process wants).
The solution is, when the initial "use stat to check for identical
inodes" check for identity between /proc/$pid/exe and $binpath fails,
to check /proc/$pid/exe for a link ending with " (deleted)" and if so,
truncate that part of the link and compare what's left with the
original binarypath.
A twist to this problem is that on systems with "merged" /sbin and
/usr/sbin (i.e. /sbin is really just a symlink to /usr/sbin; Fedora
17+ is an example of this), libvirt may have started the process using
one path, but /proc/$pid/exe lists a different path (indeed, on F17
this is the case - libvirtd uses /sbin/dnsmasq, but /proc/$pid/exe
shows "/usr/sbin/dnsmasq"). The further bit of code to resolve this is
to call virFileResolveAllLinks() on both the original binarypath and
on the truncated link we read from /proc/$pid/exe, and compare the
results.
The resulting code still succeeds in all the same cases it did before,
but also succeeds if the binary was deleted or replaced after it was
started.
Currently, we use iohelper when saving/restoring a domain.
However, if there's some kind of error (like I/O) it is not
propagated to libvirt. Since it is not qemu who is doing
the actual write() it will not get error. The iohelper does.
Therefore we should check for iohelper errors as it makes
libvirt more user friendly.
In the XML warning, we print a virsh command line that can be used to
edit that XML. This patch prints UUIDs if the entity name contains
special characters (like shell metacharacters, or "--" that would break
parsing of the XML comment). If the entity doesn't have a UUID, just
print the virsh command that can be used to edit it.
This commit changes the behavior of LIBVIRT_DEBUG=1 libvirtd:
$ git show 7022b09111
commit 7022b09111
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Thu Sep 27 13:13:09 2012 +0100
Automatically enable systemd journal logging
Probe to see if the systemd journal is accessible, and if
so enable logging to the journal by default, rather than
stderr (current default under systemd).
Previously 'LIBVIRT_DEBUG=1 /usr/sbin/libvirtd' would show all debug
output to stderr, now it send debug output to the journal.
Only use the journal by default if running in daemon mode, or
if stdin is _not_ a tty. This should make libvirtd launched from
systemd use the journal, but preserve the old behavior in most
situations.
Sometimes it's handy to know how many bits are set.
* src/util/bitmap.h (virBitmapCountBits): New prototype.
(virBitmapNextSetBit): Use correct type.
* src/util/bitmap.c (virBitmapNextSetBit): Likewise.
(virBitmapSetAll): Maintain invariant of clear tail bits.
(virBitmapCountBits): New function.
* src/libvirt_private.syms (bitmap.h): Export it.
* tests/virbitmaptest.c (test2): Test it.
Add utility functions for Open vSwitch to both save
per-port data before a live migration, and restore the
per-port data after a live migration.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
We put a comment containing "virsh edit <domain_name>" at the start of
the XML. W3C recommendation forbids the use of "--" in comments [1] and
libvirt can't parse it either. This patch omits the domain name if it
contains a double hyphen.
[1] http://www.w3.org/TR/REC-xml/#sec-comments
Yet another instance of where using plain open() mishandles files
that live on root-squash NFS, and where improving the API can
improve the chance of a successful probe.
* src/util/storage_file.h (virStorageFileProbeFormat): Alter
signature.
* src/util/storage_file.c (virStorageFileProbeFormat): Use better
method for opening file.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Update caller.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise.
This fixes the problem reported in:
https://bugzilla.redhat.com/show_bug.cgi?id=868389
Previously, the dnsmasq hosts file (used for static dhcp entries, and
addnhosts file (used for additional dns host entries) were only
created/referenced on the dnsmasq commandline if there was something
to put in them at the time the network was started. Once we can update
a network definition while it's active (which is now possible with
virNetworkUpdate), this is no longer a valid strategy - if there were
0 dhcp static hosts (resulting in no reference to the hosts file on the
commandline), then one was later added, the commandline wouldn't have
linked dnsmasq up to the file, so even though we create it, dnsmasq
doesn't pay any attention.
The solution is to just always create these files and reference them
on the dnsmasq commandline (almost always, anyway). That way dnsmasq
can notice when a new entry is added at runtime (a SIGHUP is sent to
dnsmasq by virNetworkUdpate whenever a host entry is added or removed)
The exception to this is that the dhcp static hosts file isn't created
if there are no lease ranges *and* no static hosts. This is because in
this case dnsmasq won't be setup to listen for dhcp requests anyway -
in that case, if the count of dhcp hosts goes from 0 to 1, dnsmasq
will need to be restarted anyway (to get it listening on the dhcp
port). Likewise, if the dhcp hosts count goes from 1 to 0 (and there
are no dhcp ranges) we need to restart dnsmasq so that it will stop
listening on port 67. These special situations are handled in the
bridge driver's networkUpdate() by checking for ((bool)
nranges||nhosts) both before and after the update, and triggering a
dnsmasq restart if the before and after don't match.
In order to temporarily label files read/write during a commit
operation, we need to crawl the backing chain and find the absolute
file name that needs labeling in the first place, as well as the
name of the file that owns the backing file.
* src/util/storage_file.c (virStorageFileChainLookup): New
function.
* src/util/storage_file.h: Declare it.
* src/libvirt_private.syms (storage_file.h): Export it.
In order to search for a backing file name as literally present
in a chain, we need to remember if the chain had relative names.
Also, searching for absolute names is easier if we only have
to canonicalize once, rather than on every iteration.
* src/util/storage_file.h (_virStorageFileMetadata): Add field.
* src/util/storage_file.c (virStorageFileGetMetadataFromBuf):
(virStorageFileFreeMetadata): Manage it
(absolutePathFromBaseFile): Store absolute names in canonical form.
Requiring pre-allocation was an unusual idiom. It allowed iteration
over the backing chain to use fewer mallocs, but made one-shot
clients harder to read. Also, this makes it easier for a future
patch to move away from opening fds on every iteration over the chain.
* src/util/storage_file.h (virStorageFileGetMetadataFromFD): Alter
signature.
* src/util/storage_file.c (virStorageFileGetMetadataFromFD): Allocate
return value.
(virStorageFileGetMetadata): Update clients.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Likewise.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Likewise.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise.
Previously, no one was using virStorageFileGetMetadata, and for good
reason - it couldn't support root-squash NFS. Change the signature
and make it useful to future patches, including enhancing the metadata
to recursively track the entire chain.
* src/util/storage_file.h (_virStorageFileMetadata): Add field.
(virStorageFileGetMetadata): Alter signature.
* src/util/storage_file.c (virStorageFileGetMetadata): Rewrite.
(virStorageFileGetMetadataRecurse): New function.
(virStorageFileFreeMetadata): Handle recursion.
When an image has no backing file, using VIR_STORAGE_FILE_AUTO
for its type is a bit confusing. Additionally, a future patch
would like to reserve a default value for the case of no file
type specified in the XML, but different from the current use
of -1 to imply probing, since probing is not always safe.
Also, a couple of file types were missing compared to supported
code: libxl supports 'vhd', and qemu supports 'fat' for directories
passed through as a file system.
* src/util/storage_file.h (virStorageFileFormat): Add
VIR_STORAGE_FILE_NONE, VIR_STORAGE_FILE_FAT, VIR_STORAGE_FILE_VHD.
* src/util/storage_file.c (virStorageFileMatchesVersion): Match
documentation when version probing not supported.
(cowGetBackingStore, qcowXGetBackingStore, qcow1GetBackingStore)
(qcow2GetBackingStoreFormat, qedGetBackingStore)
(virStorageFileGetMetadataFromBuf)
(virStorageFileGetMetadataFromFD): Take NONE into account.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Likewise.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Likewise.
* src/conf/storage_conf.c (virStorageVolumeFormatFromString): New
function.
(poolTypeInfo): Use it.
Win32 platforms don't have SIGKILL defined, but they do have
SIGABRT. Since our virProcess wrapper treats anything which
isn't SIGTERM/SIGINT as equivalent to SIGKILL, just use
SIGABRT on Win32.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add two new APIs virLockSpaceNewPostExecRestart and
virLockSpacePreExecRestart which allow a virLockSpacePtr
object to be created from a JSON object and saved to a
JSON object, for the purposes of re-exec'ing a process.
As well as saving the state in JSON format, the second
method will disable the O_CLOEXEC flag so that the open
file descriptors are preserved across the process re-exec()
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The previously introduced virFile{Lock,Unlock} APIs provide a
way to acquire/release fcntl() locks on individual files. For
unknown reason though, the POSIX spec says that fcntl() locks
are released when *any* file handle referring to the same path
is closed. In the following sequence
threadA: fd1 = open("foo")
threadB: fd2 = open("foo")
threadA: virFileLock(fd1)
threadB: virFileLock(fd2)
threadB: close(fd2)
you'd expect threadA to come out holding a lock on 'foo', and
indeed it does hold a lock for a very short time. Unfortunately
when threadB does close(fd2) this releases the lock associated
with fd1. For the current libvirt use case for virFileLock -
pidfiles - this doesn't matter since the lock is acquired
at startup while single threaded an never released until
exit.
To provide a more generally useful API though, it is necessary
to introduce a slightly higher level abstraction, which is to
be referred to as a "lockspace". This is to be provided by
a virLockSpacePtr object in src/util/virlockspace.{c,h}. The
core idea is that the lockspace keeps track of what files are
already open+locked. This means that when a 2nd thread comes
along and tries to acquire a lock, it doesn't end up opening
and closing a new FD. The lockspace just checks the current
list of held locks and immediately returns VIR_ERR_RESOURCE_BUSY.
NB, the API as it stands is designed on the basis that the
files being locked are not being otherwise opened and used
by the application code. One approach to using this API is to
acquire locks based on a hash of the filepath.
eg to lock /var/lib/libvirt/images/foo.img the application
might do
virLockSpacePtr lockspace = virLockSpaceNew("/var/lib/libvirt/imagelocks");
lockname = md5sum("/var/lib/libvirt/images/foo.img");
virLockSpaceAcquireLock(lockspace, lockname);
NB, in this example, the caller should ensure that the path
is canonicalized before calculating the checksum.
It is also possible to do locks directly on resources by
using a NULL lockspace directory and then using the file
path as the lock name eg
virLockSpacePtr lockspace = virLockSpaceNew(NULL);
virLockSpaceAcquireLock(lockspace, "/var/lib/libvirt/images/foo.img");
This is only safe to do though if no other part of the process
will be opening the files. This will be the case when this
code is used inside the soon-to-be-reposted virlockd daemon
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit e8fd8757c8 changed 'const char *'
category to virLogSource enum. This changes it in virLogEatParams as
well, thus fixing the build with --disable-debug.
--
Hopefully moving the enum declarations is less ugly than using int.
All USB device lookup functions emit an error when they cannot find the
requested device. With this patch, their caller can choose if a missing
device is an error or normal condition.
Currently virNetSocketNew fails because virSetCloseExec fails as there
is no proper implementation for it on Windows at the moment. Workaround
this by pretending that setting close-on-exec on the fd works. This can
be done because libvirt currently lacks the ability to create child
processes on Windows anyway. So there is no point in failing to set a
flag that isn't useful at the moment anyway.
The code was reporting raw exit status without decoding it into
normal vs. signal exit. virCommandRun already does this, but
with a different error type, so all we have to do is recast
the error to the correct type.
Reported by li guang.
* src/util/hooks.c (virHookCall): Simplify.
This patch updates virGetUserID and virGetGroupID to be able to parse a
user or group name in a similar way to coreutils' chown. This means that
a numeric value with a leading plus sign is always parsed as an ID,
otherwise the functions try to parse the input first as a user or group
name and if this fails they try to parse it as an ID.
This patch includes Peter Krempa's changes to correctly handle errors
returned by getpwnam_r and getgrnam_r.
The output buffer for virFileReadAll was too small for systems with
more than 30 CPUs which leads to a log entry and incorrect behavior.
The new size will be sufficient for the current
architectural limits.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Correct the check for the return value of virStrcpyStatic()
when copying port-profile names. Fixes Open vSwitch ports
which utilize port-profiles from network definitions.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Commit f6430390 broke builds on RHEL 5, where glibc (2.5) is too
old to support mkostemp (2.7) or htole64 (2.9). While gnulib
has mkostemp, it still lacks htole64; and it's not worth dragging
in replacements on systems where journald is unlikely to exist
in the first place, so we just use an extra configure-time check
as our witness of whether to attempt compiling the code.
* src/util/logging.c (virLogParseOutputs): Don't attempt to
compile journald on older glibc.
* configure.ac (AC_CHECK_DECLS): Check for htole64.
Add support for logging to the systemd journal, using its
simple client library. The benefit over syslog is that it
accepts structured log data, so the journald can store
individual items like code file/line/func separately from
the string message. Tools which require structured log
data can then query the journal to extract exactly what
they desire without resorting to string parsing
While systemd provides a simple client library for logging,
it is more convenient for libvirt to directly write its
own client code. This lets us build up the iovec's on
the stack, avoiding the need to alloc memory when writing
log messages.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The 'const char *category' parameter only has a few possible
values now that the filename has been separated. Turn this
parameter into an enum instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the logging APIs have a 'const char *category' parameter
which indicates where the log message comes from. This is typically
a combination of the __FILE__ string and other prefix. Split the
__FILE__ off into a dedicated parameter so it can passed to the
log outputs
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
General whitespace cleanup in the logging files
- Move '{' to a new line after funtion declaration
- Put each parameter on a new line to avoid long lines
- Put return type on new line
- Leave 2 blank lines between functions
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The log destinations are an enum, but most of the code was
just using a plain 'int' for function params / variables.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The __LINE__ macro value is specified to fit in the size_t
type, so use that instead of 'long long' in the logging code
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The log priority levels are an enum, but most of the code was
just using a plain 'int' for function params / variables.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
I hit this problem recently when trying to create a bridge with an IPv6
address on a 3.2 kernel: dnsmasq (and, further, radvd) would not bind to
the given address, waiting 20s and then giving up with -EADDRNOTAVAIL
(resp. exiting immediately with "error parsing or activating the config
file", without libvirt noticing it, BTW). This can be reproduced with (I
think) any kernel >= 2.6.39 and the following XML (to be used with
"virsh net-create"):
<network>
<name>test-bridge</name>
<bridge name='testbr0' />
<ip family='ipv6' address='fd00::1' prefix='64'>
</ip>
</network>
(it happens even when you have an IPv4, too)
The problem is that since commit [1] (which, ironically, was made to
“help IPv6 autoconfiguration”) the linux bridge code makes bridges
behave like “real” devices regarding carrier detection. This makes the
bridges created by libvirt, which are started without any up devices,
stay with the NO-CARRIER flag set, and thus prevents DAD (Duplicate
address detection) from happening, thus letting the IPv6 address flagged
as “tentative”. Such addresses cannot be bound to (see RFC 2462), so
dnsmasq fails binding to it (for radvd, it detects that "interface XXX
is not RUNNING", thus that "interface XXX does not exist, ignoring the
interface" (sic)). It seems that this behavior was enhanced somehow with
commit [2] by avoiding setting NO-CARRIER on empty bridges, but I
couldn't reproduce this behavior on my kernel. Anyway, with the “dummy
tap to set MAC address” trick, this wouldn't work.
To fix this, the idea is to get the bridge's attached device to be up so
that DAD can happen (deactivating DAD altogether is not a good idea, I
think). Currently, libvirt creates a dummy TAP device to set the MAC
address of the bridge, keeping it down. But even if we set this device
up, it is not RUNNING as soon as the tap file descriptor attached to it
is closed, thus still preventing DAD. So, we must modify the API a bit,
so that we can get the fd, keep the tap device persistent, run the
daemons, and close it after DAD has taken place. After that, the bridge
will be flagged NO-CARRIER again, but the daemons will be running, even
if not happy about the device's state (but we don't really care about
the bridge's daemons doing anything when no up interface is connected to
it).
Other solutions that I envisioned were:
* Keeping the *-nic interface up: this would waste an fd for each
bridge during all its life. May be acceptable, I don't really
know.
* Stop using the dummy tap trick, and set the MAC address directly
on the bridge: it is possible since quite some time it seems,
even if then there is the problem of the bridge not being
RUNNING when empty, contrary to what [2] says, so this will need
fixing (and this fix only happened in 3.1, so it wouldn't work
for 2.6.39)
* Using the --interface option of dnsmasq, but I saw somewhere
that it's not used by libvirt for backward compatibility. I am
not sure this would solve this problem, though, as I don't know
how dnsmasq binds itself to it with this option.
This is why this patch does what's described earlier.
This patch also makes radvd start even if the interface is
“missing” (i.e. it is not RUNNING), as it daemonizes before binding to
it, and thus sometimes does it after the interface has been brought down
by us (by closing the tap fd), and then originally stops. This also
makes it stop yelling about it in the logs when the interface is down at
a later time.
[1]
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=1faa4356a3bd89ea11fb92752d897cff3a20ec0e
[2]
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=b64b73d7d0c480f75684519c6134e79d50c1b341
In addition to the preformatted text line, pass the raw message as well,
to allow the output functions to use a different output format.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Allow for the code converting from libvirt log levels to syslog
log levels to be reused.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In the cgroups APIs we have a virCgroupKillPainfully function
which does the loop sending SIGTERM, then SIGKILL and waiting
for the process to exit. There is similar functionality for
simple processes in qemuProcessKill, but it is tangled with
the QEMU code. Untangle it to provide a virProcessKillPainfuly
function
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Continue consolidation of process functions by moving some
helpers out of command.{c,h} into virprocess.{c,h}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are a number of process related functions spread
across multiple files. Start to consolidate them by
creating a virprocess.{c,h} file
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCommand prefix was inappropriate because the API
does not use any virCommandPtr object instance. This
API closely related to waitpid/exit, so use virProcess
as the prefix
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Nothing uses the return value, and creating it requries otherwise
unnecessary strlen () calls.
This cleanup is conceptually independent from the rest of the series
(although the later patches won't apply without it). This just seems
a good opportunity to clean this up, instead of entrenching the unnecessary
return value in the virLogOutputFunc instance that will be added in this
series.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
When auto-probing hypervisor drivers, the conn->uri field will
initially be NULL. Care must be taken not to access members
when doing auth lookups in the config file
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.
* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/; If/. If/
Commit ee3d3893 missed the fact that (unsigned char)<<(int)
is truncated to int, and therefore failed for any bitmap data
longer than four bytes.
Also, I failed to run 'make syntax-check' on my commit 4bba6579;
for whatever odd reason, ffs lives in a different header than ffsl.
* src/util/bitmap.c (virBitmapNewData): Use correct shift type.
(includes): Glibc (and therefore gnulib) decided ffs is in
<strings.h>, but ffsl is in <string.h>.
* tests/virbitmaptest.c (test5): Test it.
Commit 0fc89098 used functions only available on glibc, completely
botched 32-bit environments, and risked SIGBUS due to unaligned
memory access on platforms that aren't as forgiving as x86_64.
* bootstrap.conf (gnulib_modules): Import ffsl.
* src/util/bitmap.c (includes): Use <strings.h> for ffsl.
(virBitmapNewData, virBitmapToData): Avoid 64-bit assumptions and
non-portable functions.
Two changes are introduced in this patch:
- The first change removes ATTRIBUTE_RETURN_CHECK from
virNetDevBandwidthClear, because it was called with ignore_value
always, anyway. The function is used even when it's not necessary
to call it, just for cleanup purposes.
- The second change is added ignoring of the command's exit status,
since it may report an error even when run just as "to be sure we
clean up" function. No libvirt errors are suppresed by this.
Validates the wwn while parsing, error out if it's malformed.
* src/util/util.h: Declare virValidateWWN
* src/util/util.c: Implement virValidateWWN
* src/libvirt_private.syms: Export virValidateWWN.
* src/conf/domain_conf.h: New member 'wwn' for disk def.
* src/conf/domain_conf.c: Parse and format disk <wwn>
In many places we store bitmap info in a chunk of data
(pointed to by a char *), and have redundant codes to
set/unset bits. This patch extends virBitmap, and convert
those codes to use virBitmap in subsequent patches.
The introduction of /sys/fs/cgroup came in fairly recent kernels.
Prior to that time distros would pick a custom directory like
/cgroup or /dev/cgroup. We need to auto-detect where this is,
rather than hardcoding it
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch adds a helper to deal with assigning values to
virTypedParameter structures from strings. The helper parses the value
from the string and assigns it to the corresponding union value.
FreeBSD and OpenBSD have a <net/if.h> that is not self-contained;
and mingw lacks the header altogether. But gnulib has just taken
care of that for us, so we might as well simplify our code. In
the process, I got a syntax-check failure if we don't also take
the gnulib execinfo module.
* .gnulib: Update to latest, for execinfo and net_if.
* bootstrap.conf (gnulib_modules): Add execinfo and net_if modules.
* configure.ac: Let gnulib check for headers. Simplify check for
'struct ifreq', while also including enough prereq headers.
* src/internal.h (IF_NAMESIZE): Drop, now that gnulib guarantees it.
* src/nwfilter/nwfilter_learnipaddr.h: Use correct header for
IF_NAMESIZE.
* src/util/virnetdev.c (includes): Assume <net/if.h> exists.
* src/util/virnetdevbridge.c (includes): Likewise.
* src/util/virnetdevtap.c (includes): Likewise.
* src/util/logging.c (includes): Assume <execinfo.h> exists.
(virLogStackTraceToFd): Handle gnulib's fallback implementation.
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=852984
If a network or interface is configured to use Open vSwitch, but
ovs-vswitchd (the Open vSwitch database service) isn't running, the
ovs-vsctl add-port/del-port commands will hang indefinitely rather
than returning an error. There is a --nowait option, but that appears
to have no effect on add-port and del-port commands, so instead we add
a --timeout=5 to the commands - they will retry for up to 5 seconds,
then fail if there is no response.
On OpenBSD, clock_gettime() exists in libc rather than librt, and
blindly linking with -lrt made the build fail. Gnulib already
did the work for determining which libraries to use, so we should
reuse that work rather than doing it ourselves.
* bootstrap.conf (gnulib_modules): Pull in clock-time.
* configure.ac (RT_LIBS): Drop.
* src/Makefile.am (libvirt_util_la_LIBADD): Use gnulib variable
instead.
* src/util/virtime.c (includes): Simplify.
Without this patch, logged command executions can be ambiguous if
the command contained any shell metacharacters. This has caused
more than one person to attempt to patch clients to add unnecessary
quoting, without realizing that the command itself was run with
correct args, and only the logged output was ambiguous.
* src/util/command.c (virCommandToString): Add shell escapes.
* tests/commandtest.c (test16): Test new behavior.
* tests/commanddata/test16.log: Update expected output.
* tests/qemuxml2argvdata/qemuxml2argv-*.args: Likewise.
* tests/networkxml2argvdata/*.argv: Likewise.
The codes were updated to allow to reset the device as long as
there is no devices/functions behind the same bus. However, the
comments were kept without touched.
On NUMA machine, the length of string got from file
cpuacct.usage_percpu is quite large, so expand the
limit of 1024 bytes.
errors like:
Failed to read file \
'/cgroup/cpuacct/libvirt/qemu/rhel6q/cpuacct.usage_percpu': \
Value too large for defined data type
The introduction of the new VLAN code, along with the fix
from 5e465df6be, caused the
addition of OVS ports to fail with the following message:
ovs-vsctl: 00002|vsctl|ERR|: missing column name
This fix takes into account the VLAN arguments are optional,
and correctly sets up the command line to run the "ovs-vsctl"
command to add ports to the OVS bridge.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
CC: Eric Blake <eblake@redhat.com>
If a 8021.Qbh network device supports SRIOV and its VF is being used
in pci passthrough mode, when the guest is shutdown or destroyed, the
PF inteface is also brought down. qemuDomainHostdevNetConfigRestore()
finds out the PF for provided hostdev (which is VF) and passes it to
virNetDevPortProfileDisassociate() as linkdev. Later, linkdev gets passed
to virNetDevSetOnline() where the interface is brought down by clearing
IFF_UP flag.
Bringing down a PF, when only VF is being brought down is not expected
behavior. This patch adds a check so that virNetDevSetOnline() is called
only for PF and not if device is a VF.
Signed-off-by: Nishank Trivedi <nistrive@cisco.com>
Fixup buffer usage when handling VLANs. Also fix the logic
used to determine if the virNetDevVlanPtr is valid or not.
Fixes crashes in the latest code when using Open vSwitch
virtualports.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
* src/util/virnetdevopenvswitch.c (virNetDevOpenvswitchAddPort): avoid libvirtd
crash due to derefing a NULL virtVlan->tag.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=852383
Signed-off-by: Alex Jia <ajia@redhat.com>
getpwuid_r returns success but sets the return structure to NULL when it
fails to deliver data about the requested uid. In our helper code this
created following strange error messages:
" ... cannot getpwuid_r(1234): Success"
This patch creates a more helpful message:
" ... getpwuid_r failed to retrieve data for uid '1234'"
Previous commit 0b4b53bb80 defined 'inline' to prevent broken build on
systems with libnl1 headers. However, it broke build on systems with
libnl3 headers. Therefore we must make that fix conditional.
Ubuntu 10.04 shipped with out-of-the-box libnl1 headers, which
assumed the old gcc semantics of 'extern inline' as a C89 extension:
the function will _always_ be inline if it is used, and that
it may be declared extern inline in headers without a definition,
as long as the definition occurs before any use. But when C99
added 'extern inline' as a mandatory feature of the language, with
slightly different semantics than gcc (the function MUST have
external linkage, and the inline definition MUST be present
alongside any declaration, where the compiler can then choose
which of the two versions to use), this rendered the use of
'inline' in libnl's header obsolete. Most distros already solved
this by removing 'inline' (the resulting 'extern' is correct,
regardless of gcc semantics), and libnl-3 does not have the
problem (where it has switched to 'static inline' instead, again
with the definition present, and again, our hack will result in
plain 'static' with no ill effects). But for the case of building
out of the box, we hack around the broken Ubuntu header.
* src/util/virnetlink.h: Work around libnl issue.
Currently, when guest agent is configured but not responsive
(e.g. due to appropriate service not running in the guest)
we return VIR_ERR_INTERNAL_ERROR. Both are wrong. Therefore
we need to introduce new error code to reflect this case.
libvirt's network config documents that a bridge's STP "forward delay"
(called "delay" in the XML) should be specified in seconds, but
virNetDevBridgeSetSTPDelay() assumes that it is given a delay in
milliseconds (although the comment at the top of the function
incorrectly says "seconds".
This fixes the comment, and converts the delay to milliseconds before
calling virNetDevBridgeSetSTPDelay().
Several VIR_DEBUG()'s were changed to VIR_WARN() while I was testing
the firewalld support patch, and I neglected to change them back
before I pushed.
In the meantime I've decided that it would be useful to have them be
VIR_INFO(), just so there will be logged evidence of which method is
being used (firewall-cmd vs. (eb|ip)tables) without needing to crank
logging to 11. (at most this adds 2 lines to libvirtd's logs per
libvirtd start).
The virNetlinkEventAddClient / virNetlinkEventRemoveClient stub
impls had syntax errors in their parameter lists, using a ')'
after the second-to-last parameter instead of a ','
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch introduce virNetlinkEventServiceStopAll() to stop
all the monitors to receive netlink messages for libvirtd.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
This patch improve all the API in virnetlink.c to support
all kinds of netlink protocols, and make all netlink sockets
be able to join in groups.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
vcpu threads pin are implemented using sched_setaffinity(), but
not controlled by cgroup. This patch does the following things:
1) enable cpuset cgroup
2) reflect all the vcpu threads pin info to cgroup
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Introduce a new API to move tasks of one controller from a cgroup to another cgroup
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Introduce the function virCgroupForEmulator() to create sub directory
for simulator thread(include I/O thread, vhost-net thread)
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
When gcc atomic intrinsics are not available (such as on RHEL 5
with gcc 4.1.2), we were getting link errors due to multiple
definitions:
./.libs/libvirt_util.a(libvirt_util_la-virobject.o): In function `virAtomicIntXor':
/home/dummy/l,ibvirt/src/util/viratomoic.h:404: multiple definition of `virAtomicIntXor'
./.libs/libvirt_util.a(libvirt_util_la-viratomic.o):/home/dummy/libvirt/src/util/viratomic.h:404: first defined here
Solve this by conditionally marking the functions static (the
condition avoids falling foul of gcc warnings about unused
static function declarations).
* src/util/viratomic.h: When not using gcc intrinsics, use static
functions to avoid linker errors on duplicate functions.
We already skip out on building the LXC under RHEL 5, because the
kernel is too old (commits 4c18acf, 2dee896); but commit 9612e4b
moved some LXC-only code into common files, resulting in this
build failure:
util/virfile.c: In function 'virFileLoopDeviceAssociate':
util/virfile.c:580: error: 'LO_FLAGS_AUTOCLEAR' undeclared (first use in this function)
Unfortunately, the kernel folks only made it an enum, rather than
also a #define, so we have to modify configure.ac to record when
it is usable.
* configure.ac (with_lxc): Mark when LO_FLAGS_AUTOCLEAR was found.
* src/util/virfile.c (virFileLoopDeviceAssociate): Avoid
compilation when kernel is too old.
Fix possible double close in the child process after the fork in case
infd and outfd are equal, just like they are after being called from
virNetSocketNewConnectCommand.
* configure.ac, spec file: firewalld defaults to enabled if dbus is
available, otherwise is disabled. If --with_firewalld is explicitly
requested and dbus is not available, configure will fail.
* bridge_driver: add dbus filters to get the FirewallD1.Reloaded
signal and DBus.NameOwnerChanged on org.fedoraproject.FirewallD1.
When these are encountered, reload all the iptables reuls of all
libvirt's virtual networks (similar to what happens when libvirtd is
restarted).
* iptables, ebtables: use firewall-cmd's direct passthrough interface
when available, otherwise use iptables and ebtables commands. This
decision is made once the first time libvirt calls
iptables/ebtables, and that decision is maintained for the life of
libvirtd.
* Note that the nwfilter part of this patch was separated out into
another patch by Stefan in V2, so that needs to be revised and
re-reviewed as well.
================
All the configure.ac and specfile changes are unchanged from Thomas'
V3.
V3 re-ran "firewall-cmd --state" every time a new rule was added,
which was extremely inefficient. V4 uses VIR_ONCE_GLOBAL_INIT to set
up a one-time initialization function.
The VIR_ONCE_GLOBAL_INIT(x) macro references a static function called
vir(Ip|Eb)OnceInit(), which will then be called the first time that
the static function vir(Ip|Eb)TablesInitialize() is called (that
function is defined for you by the macro). This is
thread-safe, so there is no chance of any race.
IMPORTANT NOTE: I've left the VIR_DEBUG messages in these two init
functions (one for iptables, on for ebtables) as VIR_WARN so that I
don't have to turn on all the other debug message just to see
these. Even if this patch doesn't need any other modification, those
messages need to be changed to VIR_DEBUG before pushing.
This one-time initialization works well. However, I've encountered
problems with testing:
1) Whenever I have enabled the firewalld service, *all* attempts to
call firewall-cmd from within libvirtd end with firewall-cmd hanging
internally somewhere. This is *not* the case if firewall-cmd returns
non-0 in response to "firewall-cmd --state" (i.e. *that* command runs
and returns to libvirt successfully.)
2) If I start libvirtd while firewalld is stopped, then start
firewalld later, this triggers libvirtd to reload its iptables rules,
however it also spits out a *ton* of complaints about deletion failing
(I suppose because firewalld has nuked all of libvirt's rules). I
guess we need to suppress those messages (which is a more annoying
problem to fix than you might think, but that's another story).
3) I noticed a few times during this long line of errors that
firewalld made a complaint about "Resource Temporarily
unavailable. Having libvirtd access iptables commands directly at the
same time as firewalld is doing so is apparently problematic.
4) In general, I'm concerned about the "set it once and never change
it" method - if firewalld is disabled at libvirtd startup, causing
libvirtd to always use iptables/ebtables directly, this won't cause
*terrible* problems, but if libvirtd decides to use firewall-cmd and
firewalld is later disabled, libvirtd will not be able to recover.
This patch adds helper functions that enable us to use libssh2 in
conjunction with libvirt's virNetSockets for ssh transport instead of
spawning "ssh" client process.
This implemetation supports tunneled plaintext, keyboard-interactive,
private key, ssh agent based and null authentication. Libvirt's Auth
callback is used for interaction with the user. (Keyboard interactive
authentication, adding of host keys, private key passphrases). This
enables seamless integration into the application using libvirt. No
helpers as "ssh-askpass" are needed.
Reading and writing of OpenSSH style "known_hosts" files is supported.
Communication is done using SSH exec channel, where the user may specify
arbitrary command to be executed on the remote side and reads and writes
to/from stdin/out are sent through the ssh channel. Usage of stderr is
not (yet) supported.
The network pool should be able to keep track of both network device
names and PCI addresses, and return the appropriate one in the
actualDevice when networkAllocateActualDevice is called.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Move the functions the parse/format, and validate PCI addresses to
their own file so they can be conveniently used in other places
besides device_conf.c
Refactoring existing code without causing any functional changes to
prepare for new code.
This patch makes the code reusable.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Add the ability to support VLAN tags for Open vSwitch virtual port
types. To accomplish this, modify virNetDevOpenvswitchAddPort and
virNetDevTapCreateInBridgePort to take a virNetDevVlanPtr
argument. When adding the port to the OVS bridge, setup either a
single VLAN or a trunk port based on the configuration from the
virNetDevVlanPtr.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
When a network device that is a VF of an SR-IOV card was assigned to a
guest using <interface type='hostdev'>, only the MAC address was being
saved/restored, but the VLAN tag was left untouched. Up to now we
haven't actually used vlan tags on SR-IOV devices, so the guest would
have used whatever was set, and left it the same at the end.
The patch following this one will hook up the <vlan> element from the
interface config, so save/restore of the device state needs to also
include the vlan tag.
MAC address is being saved as a simple ASCII string in a file named
for the device under /var/run. The VLAN tag is now just added at the
end of that file, after a newline. It might be nicer if the file was
XML (in case it ever gets more complicated) but at the moment there's
nothing else on the horizon, and this makes backward compatibility
easier.
To allow for the possibility of vlan "trunks", which have more than
one vlan tag associated with them, we need a vlan struct. Since it
will be used by multiple files in src/util, src/conf, src/network, and
src/qemu, it must be defined in src/util. Unfortunately there isn't
currently a common file for simple netdev data definitions, so I
created a new file.
This caused compilation of virnetdevvportprofile.c to fail on systems
without IFLA support in netlink (these are netlink commands used to
configure the VF's of SR-IOV network devices).
It is desirable to be able to query the config params of
the thread pool, in order to save the server state. Add
virThreadPoolGetMinWorkers, virThreadPoolGetMaxWorkers
and virThreadPoolGetPriorityWorkers APIs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
While the QEMU monitor/agent do not want JSON strings pretty
printed, other parts of libvirt might. Instead of hardcoding
QEMU's desired behaviour in virJSONValueToString(), add a
boolean flag to control pretty printing
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Use of ldexp() requires -lm on some platforms; use gnulib to determine
this for our makefile. Also, optimize virRandomInt() for the case
of a power-of-two limit (actually rather common, given that Daniel
has a pending patch to replace virRandomBits(10) with code that will
default to virRandomInt(1024) on default SELinux settings).
* .gnulib: Update to latest, for ldexp.
* bootstrap.conf (gnulib_modules): Import ldexp.
* src/Makefile.am (libvirt_util_la_CFLAGS): Link with -lm when
needed.
* src/util/virrandom.c (virRandomInt): Optimize powers of 2.
This patch adds three utility functions that operate on
virNetDevVPortProfile objects.
* virNetDevVPortProfileCheckComplete() - verifies that all attributes
required for the type of the given virtport are specified.
* virNetDevVPortProfileCheckNoExtras() - verifies that there are no
attributes specified which are inappropriate for the type of the
given virtport.
* virNetDevVPortProfileMerge3() - merges 3 virtports into a single,
newly allocated virtport. If any attributes are specified in
more than one of the three sources, and do not exactly match,
an error is logged and the function fails.
These new functions depend on new fields in the virNetDevVPortProfile
object that keep track of whether or not each attribute was
specified. Since the higher level parse function doesn't yet set those
fields, these functions are not actually usable yet (but that's okay,
because they also aren't yet used - all of that functionality comes in
a later patch.)
Note that these three functions return 0 on success and -1 on
failure. This may seem odd for the first two Check functions, since
they could also easily return true/false, but since they actually log
an error when the requested condition isn't met (and should result in
a failure of the calling function), I thought 0/-1 was more
appropriate.
virNetDevVPortProfile has (had) a type field that can be set to one of
several values, and a union of several structs, one for each
type. When a domain's interface object is of type "network", the
domain config may not know beforehand which type of virtualport is
going to be provided in the actual device handed down from the network
driver at runtime, but may want to set some values in the virtualport
that may or may not be used, depending on the type. To support this
usage, this patch replaces the union of structs with toplevel fields
in the struct, making it possible for all of the fields to be set at
the same time.
Both of these functions returned void, but it's convenient for them to
return a const char* of the char* that is passed in. This was you can
call the function and use the result in the same expression/arg.
The current virRandomBits() API is only usable if the caller wants
a random number in the range [0, n-1) where n is a power of two.
This adds a virRandom() API which generates a double in the
range [0.0,1.0) with 48 bits of entropy. It then also adds a
virRandomInt(uint32_t max) API which generates an unsigned
in the range [0,@max)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
libvirt creates invalid commands if wrong locale is selected. For
example with locale that uses comma as a decimal point, JSON commands
created with decimal numbers are invalid because comma separates the
entries in JSON. Fortunately even when decimal point is affected,
thousands grouping is not, because for grouping to be enabled with
*printf, there has to be an apostrophe flag specified (and supported).
This patch adds specific internal function for converting doubles to
strings with C locale.
This patch introduces a new error code VIR_ERR_OPERATION_UNSUPPORTED to
mark error messages regarding operations that failed due to lack of
support on the hypervisor or other than libvirt issues.
The code is first used in reporting error if qemu does not support block
IO tuning variables yielding error message:
error: Unable to get block I/O throttle parameters
error: Operation not supported: block_io_throttle field
'total_bytes_sec' missing in qemu's output
instead of:
error: Unable to get block I/O throttle parameters
error: internal error cannot read total_bytes_sec
Otherwise, in locations like virobject.c where PROBE is used,
for certain configure options, the compiler warns:
util/virobject.c:110:1: error: 'intptr_t' undeclared (first use in this function)
As long as we are making this header always available, we can
clean up several other files.
* src/internal.h (includes): Pull in <stdint.h>.
* src/conf/nwfilter_conf.h: Rely on internal.h.
* src/storage/storage_backend.c: Likewise.
* src/storage/storage_backend.h: Likewise.
* src/util/cgroup.c: Likewise.
* src/util/sexpr.h: Likewise.
* src/util/virhashcode.h: Likewise.
* src/util/virnetdevvportprofile.h: Likewise.
* src/util/virnetlink.h: Likewise.
* src/util/virrandom.h: Likewise.
* src/vbox/vbox_driver.c: Likewise.
* src/xenapi/xenapi_driver.c: Likewise.
* src/xenapi/xenapi_utils.c: Likewise.
* src/xenapi/xenapi_utils.h: Likewise.
* src/xenxs/xenxs_private.h: Likewise.
* tests/storagebackendsheepdogtest.c: Likewise.
Both LVM volumes and SCSI LUNs have a globally unique
identifier associated with them. It is useful to be able
to query this identifier to then perform disk locking,
rather than try to figure out a stable pathname.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Prevents libvirt from treating RBD backing stores as files. Without this
patch, creating a domain with a qcow2 overlay on an RBD would fail.
This patch essentially extends 9c7c4a4fc5,
which allows nbd backing stores, to allow rbd backing stores.
From man poll(2), poll does not set errno=EAGAIN on interrupt, however
it does set errno=EINTR. Have libvirt retry on the appropriate errno.
Under heavy load, a program of mine kept getting libvirt errors 'poll on
socket failed: Interrupted system call'. The signals were SIGCHLD from
processes forked by threads unrelated to those using libvirt.
This patch is in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=818467
If a caller to virCommandRun doesn't ask for the exitstatus of the
program it's running, the virCommand functions assume that they should
log an error message and return failure if the exit code isn't
0. However, only the commandline and exit status are logged, while
potentially useful information sent by the program to stderr is
discarded.
Fortunately, virCommandRun is already checking if the caller had asked
for stderr to be saved and, if not, sets things up to save it in
*cmd->errbuf. This makes it fairly simple for virCommandWait to
include *cmd->errbuf in the error log (there are still other callers
that don't setup errbuf, and even virCommandRun won't set it up if the
command is being daemonized, so we have to check that it's non-zero).
This introduces a fairly basic reference counted virObject type
and an associated virClass type, that use atomic operations for
ref counting.
In a global initializer (recommended to be invoked using the
virOnceInit API), a virClass type must be allocated for each
object type. This requires a class name, a "dispose" callback
which will be invoked to free memory associated with the object's
fields, and the size in bytes of the object struct.
eg,
virClassPtr connclass = virClassNew("virConnect",
sizeof(virConnect),
virConnectDispose);
The struct for the object, must include 'virObject' as its
first member
eg
struct _virConnect {
virObject object;
virURIPtr uri;
};
The 'dispose' callback is only responsible for freeing
fields in the object, not the object itself. eg a suitable
impl for the above struct would be
void virConnectDispose(void *obj) {
virConnectPtr conn = obj;
virURIFree(conn->uri);
}
There is no need to reset fields to 'NULL' or '0' in the
dispose callback, since the entire object will be memset
to 0, and the klass pointer & magic integer fields will
be poisoned with 0xDEADBEEF before being free()d
When creating an instance of an object, one needs simply
pass the virClassPtr eg
virConnectPtr conn = virObjectNew(connclass);
if (!conn)
return NULL;
conn->uri = virURIParse("foo:///bar")
Object references can be manipulated with
virObjectRef(conn)
virObjectUnref(conn)
The latter returns a true value, if the object has been
freed (ie its ref count hit zero)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
All callers used the same initialization seed (well, the new
viratomictest forgot to look at getpid()); so we might as well
make this value automatic. And while it may feel like we are
giving up functionality, I documented how to get it back in the
unlikely case that you actually need to debug with a fixed
pseudo-random sequence. I left that crippled by default, so
that a stray environment variable doesn't cause a lack of
randomness to become a security issue.
* src/util/virrandom.c (virRandomInitialize): Rename...
(virRandomOnceInit): ...and make static, with one-shot call.
Document how to do fixed-seed debugging.
* src/util/virrandom.h (virRandomInitialize): Drop prototype.
* src/libvirt_private.syms (virrandom.h): Don't export it.
* src/libvirt.c (virInitialize): Adjust caller.
* src/lxc/lxc_controller.c (main): Likewise.
* src/security/virt-aa-helper.c (main): Likewise.
* src/util/iohelper.c (main): Likewise.
* tests/seclabeltest.c (main): Likewise.
* tests/testutils.c (virtTestMain): Likewise.
* tests/viratomictest.c (mymain): Likewise.
There are a few issues with the current virAtomic APIs
- They require use of a virAtomicInt struct instead of a plain
int type
- Several of the methods do not implement memory barriers
- The methods do not implement compiler re-ordering barriers
- There is no Win32 native impl
The GLib library has a nice LGPLv2+ licensed impl of atomic
ops that works with GCC, Win32, or pthreads.h that addresses
all these problems. The main downside to their code is that
the pthreads impl uses a single global mutex, instead of
a per-variable mutex. Given that it does have a Win32 impl
though, we don't expect anyone to seriously use the pthread.h
impl, so this downside is not significant.
* .gitignore: Ignore test case
* configure.ac: Check for which atomic ops impl to use
* src/Makefile.am: Add viratomic.c
* src/nwfilter/nwfilter_dhcpsnoop.c: Switch to new atomic
ops APIs and plain int datatype
* src/util/viratomic.h: inline impls of all atomic ops
for GCC, Win32 and pthreads
* src/util/viratomic.c: Global pthreads mutex for atomic
ops
* tests/viratomictest.c: Test validate to validate safety
of atomic ops.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove the use of a manually run virLogStartup and
virNodeSuspendInitialize methods. Instead make sure they
are automatically run using VIR_ONCE_GLOBAL_INIT
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add function virCommandNewVAList which is equivalent to the
virCommandNewArgList but with va_list instead of a variable number
of arguments.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Parallels Cloud Server is a cloud-ready virtualization
solution that allows users to simultaneously run multiple virtual
machines and containers on the same physical server.
More information can be found here: http://www.parallels.com/products/pcs/
Also beta version of Parallels Cloud Server can be downloaded there.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
This is a follow up patch of commit f9ce7dad6, it modifies all
the files which declare the copyright like "See COPYING.LIB for
the License of this software" to use the detailed/consistent one.
And deserts the outdated comments like:
* libvirt-qemu.h:
* Summary: qemu specific interfaces
* Description: Provides the interfaces of the libvirt library to handle
* qemu specific methods
*
* Copy: Copyright (C) 2010, 2012 Red Hat, Inc.
Uses the more compact style like:
* libvirt-qemu.h: Interfaces specific for QEMU/KVM driver
*
* Copyright (C) 2010, 2012 Red Hat, Inc.
Change the permissible minimum value of nodesuspend duration time
to 60 seconds. If option is less than the value, reports error.
Update virsh help and manpage the infomation.
No check for conn->uri being NULL in virAuthGetConfigFilePath (valid
state) made the client segfault. This happens for example with these
settings:
- no virtualbox driver installed (modifies conn->uri)
- no default URI set (VIRSH_DEFAULT_CONNECT_URI="",
LIBVIRT_DEFAULT_URI="", uri_default="")
- auth_sock_rw="sasl"
- virsh run as root
That are unfortunately the settings with fresh Fedora 17 installation
with VDSM.
The check ought to be enough as conn->uri being NULL is valid in later
code and is handled properly.
Per the FSF address could be changed from time to time, and GNU
recommends the following now: (http://www.gnu.org/licenses/gpl-howto.html)
You should have received a copy of the GNU General Public License
along with Foobar. If not, see <http://www.gnu.org/licenses/>.
This patch removes the explicit FSF address, and uses above instead
(of course, with inserting 'Lesser' before 'General').
Except a bunch of files for security driver, all others are changed
automatically, the copyright for securify files are not complete,
that's why to do it manually:
src/security/security_selinux.h
src/security/security_driver.h
src/security/security_selinux.c
src/security/security_apparmor.h
src/security/security_apparmor.c
src/security/security_driver.c
When reporting a system error (VIR_ERR_SYSTEM_ERROR) via
virReportSystemError, we should copy the errno value into
the 'int1' field of the virErrorPtr struct. This allows
callers to detect certain errno conditions & discard the
error
* src/util/virterror.c: Place errno value in int1 field
ensures that initialization will always take place when it is
needed, and guarantees it only occurs once. The problem is that
the code to setup a global initializer with proper error
propagation is tedious. This introduces VIR_ONCE_GLOBAL_INIT
macro to simplify this.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This removes nearly all the per-file error reporting macros
from the code in src/util/. A few custom macros remain for the
case, where the file needs to report errors with a variety of
different codes or parameters
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virnetdevtap.c and viruri.c files had two error report
messages which were not annotated with _(...)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Nearly every source file does something like
#define VIR_FROM_THIS VIR_FROM_FOO
#define virFooReportErorr(code, ...) \
virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \
__FUNCTION__, __LINE__, \
__VA_ARGS__)
This creates needless duplication and inconsistent error
reporting function names in each file. It is trivial to
just have virterror_internal.h provide a virReportError
macro that is equivalent
* src/util/virterror_internal.h: Define virReportError(code, ...)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce new members in the virMacAddr 'class'
- virMacAddrSet: set virMacAddr from a virMacAddr
- virMacAddrSetRaw: setting virMacAddr from raw 6 byte MAC address buffer
- virMacAddrGetRaw: writing virMacAddr into raw 6 byte MAC address buffer
- virMacAddrCmp: comparing two virMacAddr
- virMacAddrCmpRaw: comparing a virMacAddr with a raw 6 byte MAC address buffer
then replace raw MAC addresses by replacing
- 'unsigned char *' with virMacAddrPtr
- 'unsigned char ... [VIR_MAC_BUFLEN]' with virMacAddr
and introduce usage of above functions where necessary.
When building with --disable-debug, VIR_DEBUG expands to a nop.
But parameters to VIR_DEBUG can be variables that are passed only
to VIR_DEBUG. In the case the building system complains about unused
variables.
Instead of changing the existed virFileMakePath to accept mode
argument and modifying a pile of its uses, this patch introduces
virFileMakePathWithMode, and use it instead of mkdir() to create
the readline history dir.
While it is not currently used elsewhere in libvirt, the code
for finding a free loop device & associating a file with it
is not LXC specific. Move it into the viffile.{c,h} file where
potentially shared code is more commonly kept.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Hello,
This is a patch to fix vm's outbound traffic control problem.
Currently, vm's outbound traffic control by libvirt doesn't go well.
This problem was previously discussed at libvir-list ML, however
it seems that there isn't still any answer to the problem.
http://www.redhat.com/archives/libvir-list/2011-August/msg00333.html
I measured Guest(with virtio-net) to Host TCP throughput with the
command "netperf -H".
Here are the outbound QoS parameters and the results.
outbound average rate[kilobytes/s] : Guest to Host throughput[Mbit/s]
======================================================================
1024 (8Mbit/s) : 4.56
2048 (16Mbit/s) : 3.29
4096 (32Mbit/s) : 3.35
8192 (64Mbit/s) : 3.95
16384 (128Mbit/s) : 4.08
32768 (256Mbit/s) : 3.94
65536 (512Mbit/s) : 3.23
The outbound traffic goes down unreasonably and is even not controled.
The cause of this problem is too large mtu value in "tc filter" command run by
libvirt. The command uses burst value to set mtu and the burst is equal to
average rate value if it's not set. This value is too large. For example
if the average rate is set to 1024 kilobytes/s, the mtu value is set to 1024
kilobytes. That's too large compared to the size of network packets.
Here libvirt applies tc ingress filter to Host's vnet(tun) device.
Tc ingress filter is implemented with TBF(Token Buckets Filter) algorithm. TBF
uses mtu value to calculate the amount of token consumed by each packet. With too
large mtu value, the token consumption rate is set too large. This leads to
token starvation and deterioration of TCP throughput.
Then, should we use the default mtu value 2 kilobytes?
The anser is No, because Guest with virtio-net device uses 65536 bytes
as mtu to transmit packets to Host, and the tc filter with the default mtu
value 2k drops packets whose size is larger than 2k. So, the most packets
is droped and again leads to deterioration of TCP throughput.
The appropriate mtu value is 65536 bytes which is equal to the maximum value
of network interface device defined in <linux/netdevice.h>. The value is
not so large that it causes token starvation and not so small that it
drops most packets.
Therefore this patch set the mtu value to 64kb(== 65535 bytes).
Again, here are the outbound QoS parameters and the TCP throughput with
the libvirt patched.
outbound average rate[kilobytes/s] : Guest to Host throughput[Mbit/s]
======================================================================
1024 (8Mbit/s) : 8.22
2048 (16Mbit/s) : 16.42
4096 (32Mbit/s) : 32.93
8192 (64Mbit/s) : 66.85
16384 (128Mbit/s) : 133.88
32768 (256Mbit/s) : 271.01
65536 (512Mbit/s) : 547.32
The outbound traffic conforms to the given limit.
Thank you,
Signed-off-by: Eiichi Tsukata <eiichi.tsukata.xh@hitachi.com>
In order to retrieve some sysinfo data we need to parse /proc/sysinfo and
/proc/cpuinfo.
Signed-off-by: Thang Pham <thang.pham@us.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Some GNULIB headers (eg unistd.h) will often need to include
winsock2.h for various symbols. There is a rule that winsock2.h
must be included before windows.h. This means that any file
which does
#ifdef WIN32
#include <windows.h>
#endif
#include <unistd.h>
is potentially broken. A simple rule is that /all/ includes of
windows.h must be matched with a preceding include of winsock2.h
regardless of whether unistd.h is used currently
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A core use case of the hook scripts is to be able to do things
to a guest's network configuration. It is possible to hook into
the 'start' operation for a QEMU guest which runs just before
the guest is started. The TAP devices will exist at this point,
but the QEMU process will not. It can be desirable to have a
'started' hook too, which runs once QEMU has started.
If libvirtd is restarted it will re-populate firewall rules,
but there is no QEMU hook to trigger for existing domains.
This is solved with a 'reconnect' hook.
Finally, if attaching to an external QEMU process there needs
to be an 'attach' hook script.
This all also applies to the LXC driver
* docs/hooks.html.in: Document new operations
* src/util/hooks.c, src/util/hooks.c: Add 'started', 'reconnect'
and 'attach' operations for QEMU. Add 'prepare', 'started',
'release' and 'reconnect' operations for LXC
* src/lxc/lxc_driver.c: Add hooks for 'prepare', 'started',
'release' and 'reconnect' operations
* src/qemu/qemu_process.c: Add hooks for 'started', 'reconnect'
and 'reconnect' operations
Right now, the only way to get at the contents of a virBuffer is
to destroy it. But there are cases in my upcoming patches where
peeking at the contents makes life easier. I suppose this does
open up the potential for bad code to dereference a stale pointer,
by disregarding the docs that the return value is invalid on the
next virBuf operation, but such is life.
* src/util/buf.h (virBufferCurrentContent): New declaration.
* src/util/buf.c (virBufferCurrentContent): Implement it.
* src/libvirt_private.syms (buf.h): Export it.
* tests/virbuftest.c (testBufAutoIndent): Test it.
When libvirtd forks off a new child, the child then calls virLogReset(),
which ends up closing file descriptors used as log outputs. However, we
recently started logging closed file descriptors, which means we need to
lock logging mutex which was already locked by virLogReset(). We don't
really want to log anything when we are in the process of closing log
outputs.
There is a theoretical problem of an extreme bug where we can get
into deadlock due to command handshaking. Thanks to a pair of pipes,
we have a situation where the parent thinks the child reported an
error and is waiting for a message from the child to explain the
error; but at the same time the child thinks it reported success
and is waiting for the parent to acknowledge the success; so both
processes are now blocked.
Thankfully, I don't think this deadlock is possible without at
least one other bug in the code, but I did see exactly that sort
of situation prior to commit da831af - I saw a backtrace where a
double close bug in the parent caused the parent to read from the
wrong fd and assume the child failed, even though the child really
sent success.
This potential deadlock is not quite like commit 858c247 (a deadlock
due to multiple readers on one pipe preventing a write from completing),
although the solution is similar - always close unused pipe fds before
blocking, rather than after.
* src/util/command.c (virCommandHandshakeWait): Close unused fds
sooner.
It is possible to deadlock libvirt by having a domain with XML
longer than PIPE_BUF, and by writing a hook script that closes
stdin early. This is because libvirt was keeping a copy of the
child's stdin read fd open, which means the write fd in the
parent will never see EPIPE (remember, libvirt should always be
run with SIGPIPE ignored, so we should never get a SIGPIPE signal).
Since there is no error, libvirt blocks waiting for a write to
complete, even though the only reader is also libvirt. The
solution is to ensure that only the child can act as a reader
before the parent does any writes; and then dealing with the
fallout of dealing with EPIPE.
Thankfully, this is not a security hole - since the only way to
trigger the deadlock is to install a custom hook script, anyone
that already has privileges to install a hook script already has
privileges to do any number of other equally disruptive things
to libvirt; it would only be a security hole if an unprivileged
user could install a hook script to DoS a privileged user.
* src/util/command.c (virCommandRun): Close parent's copy of child
read fd earlier.
(virCommandProcessIO): Don't let EPIPE be fatal; the child may
be done parsing input.
* tests/commandhelper.c (main): Set up a SIGPIPE situation.
* tests/commandtest.c (test20): Trigger it.
* tests/commanddata/test20.log: New file.
EBADF errors are logged as warnings as they normally indicate a double
close bug. This patch also provides VIR_MASS_CLOSE helper to be user in
the only case of mass close after fork when EBADF should rather be
ignored.
KAMEZAWA Hiroyuki reported a nasty double-free bug when virCommand
is used to convert a string into input to a child command. The
problem is that the poll() loop of virCommandProcessIO would close()
the write end of the pipe in order to let the child see EOF, then
the caller virCommandRun() would also close the same fd number, with
the second close possibly nuking an fd opened by some other thread
in the meantime. This in turn can have all sorts of bad effects.
The bug has been present since the introduction of virCommand in
commit f16ad06f.
This is based on his first attempt at a patch, at
https://bugzilla.redhat.com/show_bug.cgi?id=823716
* src/util/command.c (_virCommand): Drop inpipe member.
(virCommandProcessIO): Add argument, to avoid closing caller's fd
without informing caller.
(virCommandRun, virCommandNewArgs): Adjust clients.
Currently, we are logging only one side of pipes we
create in virCommandRequireHandshake(); This is enough
in cases where pipe2() returns two consecutive FDs. However,
it is not guaranteed and it may return any FDs.
Therefore, it's wise to log the other ends as well.
To ensure consistent error reporting of invalid arguments,
provide a number of predefined helper methods & macros.
- An arg which must not be NULL:
virCheckNonNullArgReturn(argname, retvalue)
virCheckNonNullArgGoto(argname, label)
- An arg which must be NULL
virCheckNullArgGoto(argname, label)
- An arg which must be positive (ie 1 or greater)
virCheckPositiveArgGoto(argname, label)
- An arg which must not be 0
virCheckNonZeroArgGoto(argname, label)
- An arg which must be zero
virCheckZeroArgGoto(argname, label)
- An arg which must not be negative (ie 0 or greater)
virCheckNonNegativeArgGoto(argname, label)
* src/libvirt.c, src/libvirt-qemu.c,
src/nodeinfo.c, src/datatypes.c: Update to use
virCheckXXXX macros
* po/POTFILES.in: Add libvirt-qemu.c and virterror_internal.h
* src/internal.h: Define macros for checking invalid args
* src/util/virterror_internal.h: Define macros for reporting
invalid args
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add an impl of +virGetUserRuntimeDirectory, virGetUserCacheDirectory
virGetUserConfigDirectory and virGetUserDirectory for Win32 platform.
Also create stubs for non-Win32 platforms which lack getpwuid_r()
In adding these two helpers were added virFileIsAbsPath and
virFileSkipRoot, along with some macros VIR_FILE_DIR_SEPARATOR,
VIR_FILE_DIR_SEPARATOR_S, VIR_FILE_IS_DIR_SEPARATOR,
VIR_FILE_PATH_SEPARATOR, VIR_FILE_PATH_SEPARATOR_S
All this code was adapted from GLib2 under terms of LGPLv2+ license.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove the uid param from virGetUserConfigDirectory,
virGetUserCacheDirectory, virGetUserRuntimeDirectory,
and virGetUserDirectory
These functions were universally called with the
results of getuid() or geteuid(). To make it practical
to port to Win32, remove the uid parameter and hardcode
geteuid()
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a VIR_ERR_DOMAIN_LAST sentinel for virErrorDomain and
replace the virErrorDomainName function by a VIR_ENUM_IMPL
In the process the naming of error domains is sanitized
* src/util/virterror.c: Use VIR_ENUM_IMPL for converting
error domains to strings
* include/libvirt/virterror.h: Add VIR_ERR_DOMAIN_LAST
The libvirt_private.syms file exports virNetlinkEventServiceLocalPid
so there needs to be a no-op stub for Win32 to avoid linker errors
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
I'm tired of writing:
bool sep = false;
while (...) {
if (sep)
virBufferAddChar(buf, ',');
sep = true;
virBufferAdd(buf, str);
}
This makes it easier, allowing one to write:
while (...)
virBufferAsprintf(buf, "%s,", str);
virBufferTrim(buf, ",", -1);
to trim any remaining comma.
* src/util/buf.h (virBufferTrim): Declare.
* src/util/buf.c (virBufferTrim): New function.
* tests/virbuftest.c (testBufTrim): Test it.
We were being lazy - virnetlink.c was getting uint32_t as a
side-effect from glibc 2.14's <unistd.h>, but older glibc 2.11
does not provide uint32_t from <unistd.h>. In fact, POSIX states
that <unistd.h> need only provide intptr_t, not all of <stdint.h>,
so the bug really is ours. Reported by Jonathan Alescio.
* src/util/virnetlink.h: Include <stdint.h>.
This involves setting the cpuacct cgroup to a per-vcpu granularity,
as well as summing the each vcpu accounting into a common array.
Now that we are reading more than one cgroup file, we double-check
that cpus weren't hot-plugged between reads to invalidate our
summing.
Signed-off-by: Eric Blake <eblake@redhat.com>
Allow the logging APIs to be called with a va_list for format
args, instead of requiring var-args usage.
* src/util/logging.h, src/util/logging.c: Add virLogVMessage
The use of readlink() in lxc_container.c is intentional; we don't
want an absolute pathname there.
* src/util/cgroup.h (VIR_CGROUP_SYSFS_MOUNT): Indent properly.
* cfg.mk (exclude_file_name_regexp--sc_prohibit_readlink): Add
exemption.
Normal practice is for cgroups controllers to be mounted at
/sys/fs/cgroup. When setting up a container, /sys is mounted
with a new sysfs instance, thus we must re-mount all the
cgroups controllers. The complexity is that we must mount
them in the same layout as the host OS. ie if 'cpu' and 'cpuacct'
were mounted at the same location in the host we must preserve
this in the container. Also if any controllers are co-located
we must setup symlinks from the individual controller name to
the co-located mount-point
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Callers of virGetUser{Config,Runtime,Cache}Directory all
append further path component. We should not be
adding a trailing slash in the return path otherwise we
get paths containing '//'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Sometimes it is useful to see the callpath for log messages.
This change enhances the log filter syntax so that stack traces
can be show by setting '1:+NAME' instead of '1:NAME'.
This results in output like:
2012-05-09 14:18:45.136+0000: 13314: debug : virInitialize:414 : register drivers
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virInitialize+0xd6)[0x7f89188ebe86]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x431921]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3a21e21735]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x40a279]
2012-05-09 14:18:45.136+0000: 13314: debug : virRegisterDriver:775 : driver=0x7f8918d02760 name=Test
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virRegisterDriver+0x6b)[0x7f89188ec717]
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(+0x11b3ad)[0x7f891891e3ad]
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virInitialize+0xf3)[0x7f89188ebea3]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x431921]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3a21e21735]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x40a279]
* docs/logging.html.in: Document new syntax
* configure.ac: Check for execinfo.h
* src/util/logging.c, src/util/logging.h: Add support for
stack traces
* tests/testutils.c: Adapt to API change
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As defined in:
http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
This offers a number of advantages:
* Allows sharing a home directory between different machines, or
sessions (eg. using NFS)
* Cleanly separates cache, runtime (eg. sockets), or app data from
user settings
* Supports performing smart or selective migration of settings
between different OS versions
* Supports reseting settings without breaking things
* Makes it possible to clear cache data to make room when the disk
is filling up
* Allows us to write a robust and efficient backup solution
* Allows an admin flexibility to change where data and settings are stored
* Dramatically reduces the complexity and incoherence of the
system for administrators
Until now, the nl_pid of the source address of every message sent by
virNetlinkCommand has been set to the value of getpid(). Most of the
time this doesn't matter, and in the one case where it does
(communication with lldpad), it previously was the proper thing to do,
because the netlink event service (which listens on a netlink socket
for unsolicited messages from lldpad) coincidentally always happened
to bind with a local nl_pid == getpid().
With the fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
that particular nl_pid is now effectively a reserved value, so the
netlink event service will always bind to something else
(coincidentally "getpid() + (1 << 22)", but it really could be
anything). The result is that communication between lldpad and
libvirtd is broken (lldpad gets a "disconnected" error when it tries
to send a directed message).
The solution to this problem caused by a solution, is to query the
netlink event service's nlhandle for its "local_port", and send that
as the source nl_pid (but only when sending to lldpad, of course - in
other cases we maintain the old behavior of sending getpid()).
There are two cases where a message is being directed at lldpad - one
in virNetDevLinkDump, and one in virNetDevVPortProfileOpSetLink.
The case of virNetDevVPortProfileOpSetLink is simplest to explain -
only if !nltarget_kernel, i.e. the message isn't targetted for the
kernel, is the dst_pid set (by calling
virNetDevVPortProfileGetLldpadPid()), so only in that case do we call
virNetlinkEventServiceLocalPid() to set src_pid.
For virNetDevLinkDump, it's a bit more complicated. The call to
virNetDevVPortProfileGetLldpadPid() was effectively up one level (in
virNetDevVPortProfileOpCommon), although obscured by an unnecessary
passing of a function pointer. This patch removes the function
pointer, and calls virNetDevVPortProfileGetLldpadPid() directly in
virNetDevVPortProfileOpCommon - if it's doing this, it knows that it
should also call virNetlinkEventServiceLocalPid() to set src_pid too;
then it just passes src_pid and dst_pid down to
virNetDevLinkDump. Since (src_pid == 0 && dst_pid == 0) implies that
the kernel is the destination, there is no longer any need to send
nltarget_kernel as an arg to virNetDevLinkDump, so it's been removed.
The disparity between src_pid being int and dst_pid being uint32_t may
be a bit disconcerting to some, but I didn't want to complicate
virNetlinkEventServiceLocalPid() by having status returned separately
from the value.
This value will be needed to set the src_pid when sending netlink
messages to lldpad. It is part of the solution to:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
Note that libnl's port generation algorithm guarantees that the
nl_socket_get_local_port() will always be > 0 (since it is "getpid() +
(n << 22>" where n is always < 1024), so it is okay to cast the
uint32_t to int (thus allowing us to use -1 as an error sentinel).
Until now, virNetlinkCommand has assumed that the nl_pid in the source
address of outgoing netlink messages should always be the return value
of getpid(). In most cases it actually doesn't matter, but in the case
of communication with lldpad, lldpad saves this info and later uses it
to send netlink messages back to libvirt. A recent patch to fix Bug
816465 changed the order of the universe such that the netlink event
service socket is no longer bound with nl_pid == getpid(), so lldpad
could no longer send unsolicited messages to libvirtd. Adding src_pid
as an argument to virNetlinkCommand() is the first step in notifying
lldpad of the proper address of the netlink event service socket.
usbFindDevice():get usb device according to
idVendor, idProduct, bus, device
it is the exact match of the four parameters
usbFindDeviceByBus():get usb device according to bus, device
it returns only one usb device same as usbFindDevice
usbFindDeviceByVendor():get usb device according to idVendor,idProduct
it probably returns multiple usb devices.
usbDeviceSearch(): a helper function to do the actual search
These two functions are called from main() on all platforms, and
always return success on platforms that don't support libnl. They
still log an error message, though, which doesn't make sense - they
should just be NOPs on those platforms. (Per a suggestion during
review, I've turned the logs into debug messages rather than removing
them completely).
Error: STRING_NULL:
/libvirt/src/util/uuid.c:273:
string_null_argument: Function "getDMISystemUUID" does not terminate string "*dmiuuid".
/libvirt/src/util/uuid.c:241:
string_null_argument: Function "saferead" fills array "*uuid" with a non-terminated string.
/libvirt/src/util/util.c:101:
string_null_argument: Function "read" fills array "*buf" with a non-terminated string.
/libvirt/src/util/uuid.c:274:
string_null: Passing unterminated string "dmiuuid" to a function expecting a null-terminated string.
/libvirt/src/util/uuid.c:138:
var_assign_parm: Assigning: "cur" = "uuidstr". They now point to the same thing.
/libvirt/src/util/uuid.c:164:
string_null_sink_loop: Searching for null termination in an unterminated array "cur".
configure.ac: check for libnl-3 in addition to libnl-1
src/Makefile.am: link against libnl when needed
src/util/virnetlink.c:
support libnl3 api. To minimize impact on code flow, wrap the
differences under the virNetlink* namespace.
Unfortunately libnl3 moves netlink/msg.h to
/usr/include/libnl3/netlink/msg.h, so the LIBNL_CFLAGS need to be added
to a bunch of places where they weren't needed with libnl1.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Add function virJSONValueObjectKeysNumber, virJSONValueObjectGetKey
and virJSONValueObjectGetValue, which allow you to iterate over all
fields of json object: you can get number of fields and then get
name and value, stored in field with that name by index.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
In fact, the 'tapfd' is always NULL, the function 'virNetDevTapCreate()' hasn't
assign 'fd' to 'tapfd', when the function 'virNetDevSetMAC()' is failed then
goto 'error' label, finally, the VIR_FORCE_CLOSE() will deref a NULL 'tapfd'.
* util/virnetdevtap.c (virNetDevTapCreateInBridgePort): fix a NULL pointer derefing.
* How to reproduce?
$ cat > /tmp/net.xml <<EOF
<network>
<name>test</name>
<forward mode='nat'/>
<bridge name='br1' stp='off' delay='1' />
<mac address='00:00:00:00:00:00'/>
<ip address='192.168.100.1' netmask='255.255.255.0'>
<dhcp>
<range start='192.168.100.2' end='192.168.100.254' />
</dhcp>
</ip>
</network>
EOF
$ virsh net-define /tmp/net.xml
$ virsh net-start test
error: Failed to start network brTest
error: End of file while reading data: Input/output error
Signed-off-by: Alex Jia <ajia@redhat.com>
When libvirtd is started, we create "libvirt/qemu" directories under
hugetlbfs mount point. Only the "qemu" subdirectory is chowned to qemu
user and "libvirt" remains owned by root. If umask was too restrictive
when libvirtd started, qemu user may lose access to "qemu"
subdirectory. Let's explicitly grant search permissions to "libvirt"
directory for all users.
Below patch fixes the following coverity findings
Error: OVERRUN_STATIC:
/libvirt/src/qemu/qemu_command.c:152:
overrun-buffer-val: Overrunning static array "net->mac" of size 6 bytes by passing it as an argument to a function which indexes it at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:948:
access_dbuff_const: Calling "virNetDevMacVLanVPortProfileRegisterCallback" indexes array "macaddress" at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:773:
access_dbuff_const: Calling "memcpy" indexes array "macaddress" with index "16UL" at byte position 15.
Error: OVERRUN_STATIC:
/libvirt/src/qemu/qemu_migration.c:2744:
overrun-buffer-val: Overrunning static array "net->mac" of size 6 bytes by passing it as an argument to a function which indexes it at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:773:
access_dbuff_const: Calling "memcpy" indexes array "macaddress" with index "16UL" at byte position 15.
Error: OVERRUN_STATIC:
/libvirt/src/qemu/qemu_driver.c:435:
overrun-buffer-val: Overrunning static array "net->mac" of size 6 bytes by passing it as an argument to a function which indexes it at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:1036:
access_dbuff_const: Calling "virNetDevMacVLanVPortProfileRegisterCallback" indexes array "macaddress" at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:773:
access_dbuff_const: Calling "memcpy" indexes array "macaddress" with index "16UL" at byte position 15.
Some of the error messages in this function should have been
virReportSystemError (since they have an errno they want to log), but
were mistakenly written as netlinkError, which expects a libvirt error
code instead. The result was that when one of the errors was
encountered, "No error message provided" would be printed instead of
something meaningful (see
https://bugzilla.redhat.com/show_bug.cgi?id=816465 for an example).
This patch resolves https://bugzilla.redhat.com/show_bug.cgi?id=815270
The function virNetDevMacVLanVPortProfileRegisterCallback() takes an
arg "virtPortProfile", and was checking it for non-NULL before using
it. However, the prototype for
virNetDevMacVLanPortProfileRegisterCallback had marked that arg with
ATTRIBUTE_NONNULL(). Contrary to what one may think,
ATTRIBUTE_NONNULL() does not provide any guarantee that an arg marked
as such really is always non-null; the only effect to the code
generated by gcc, is that gcc *assumes* it is non-NULL; this results
in, for example, the check for a non-NULL value being optimized out.
(Unfortunately, this code removal only occurs when optimization is
enabled, and I am in the habit of doing local builds with optimization
off to ease debugging, so the bug didn't show up in my earlier local
testing).
In general, virPortProfile might always be NULL, so it shouldn't be
marked as ATTRIBUTE_NONNULL. One other function prototype made this
same error, so this patch fixes it as well.
Add 2 new functions to the virSocketAddr 'class':
- virSocketAddrEqual: tests whether two IP addresses and their ports are equal
- virSocketaddSetIPv4Addr: set a virSocketAddr given a 32 bit int
This patch improves the previously added virAtomicInt implementation
by using gcc-builtins if possible. The needed builtins are available
since GCC >= 4.1. At least the 4.0 docs don't mention them.
This patch introduces a new block job, useful for live storage
migration using pre-copy streaming. Justification for including
this under virDomainBlockRebase rather than adding a new command
includes: 1) there are now two possible block jobs in qemu, with
virDomainBlockRebase starting either type of command, and
virDomainBlockJobInfo and virDomainBlockJobAbort working to end
either type; 2) reusing this command allows distros to backport
this feature to the libvirt 0.9.10 API without a .so bump.
Note that a future patch may add a more powerful interface named
virDomainBlockJobCopy, dedicated to just the block copy job, in
order to expose even more options (such as setting an arbitrary
format type for the destination without having to probe it from a
pre-existing destination file); adding a new command for targetting
just block copy would be similar to how we already have
virDomainBlockPull for targetting just the block pull job.
Using a live VM with the backing chain:
base <- snap1 <- snap2
as the starting point, we have:
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY)
creates /path/to/copy with the same format as snap2, with no backing
file, so entire chain is copied and flattened
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
creates /path/to/copy as a raw file, so entire chain is copied and
flattened
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_SHALLOW)
creates /path/to/copy with the same format as snap2, but with snap1 as
a backing file, so only snap2 is copied.
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)
reuse existing /path/to/copy (must have empty contents, and format is
probed[*] from the metadata), and copy the full chain
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT|
VIR_DOMAIN_BLOCK_REBASE_SHALLOW)
reuse existing /path/to/copy (contents must be identical to snap1,
and format is probed[*] from the metadata), and copy only the contents
of snap2
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT|
VIR_DOMAIN_BLOCK_REBASE_SHALLOW|VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
reuse existing /path/to/copy (must be raw volume with contents
identical to snap1), and copy only the contents of snap2
Less useful combinations:
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_SHALLOW|
VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
fail if source is not raw, otherwise create /path/to/copy as raw and
the single file is copied (no chain involved)
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT|
VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
makes little sense: the destination must be raw but have no contents,
meaning that it is an empty file, so there is nothing to reuse
The other three flags are rejected without VIR_DOMAIN_BLOCK_COPY.
[*] Note that probing an existing file for its format can be a security
risk _if_ there is a possibility that the existing file is 'raw', in
which case the guest can manipulate the file to appear like some other
format. But, by virtue of the VIR_DOMAIN_BLOCK_REBASE_COPY_RAW flag,
it is possible to avoid probing of raw files, at which point, probing
of any remaining file type is no longer a security risk.
It would be nice if we could issue an event when pivoting from phase 1
to phase 2, but qemu hasn't implemented that, and we would have to poll
in order to synthesize it ourselves. Meanwhile, qemu will give us a
distinct job info and completion event when we either cancel or pivot
to end the job. Pivoting is accomplished via the new:
virDomainBlockJobAbort(dom, disk, VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)
Management applications can pre-create the copy with a relative
backing file name, and use the VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT
flag to have qemu reuse the metadata; if the management application
also copies the backing files to a new location, this can be used
to perform live storage migration of an entire backing chain.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_JOB_TYPE_COPY):
New block job type.
(virDomainBlockJobAbortFlags, virDomainBlockRebaseFlags): New enums.
* src/libvirt.c (virDomainBlockRebase): Document the new flags,
and implement general restrictions on flag combinations.
(virDomainBlockJobAbort): Document the new flag.
(virDomainSaveFlags, virDomainSnapshotCreateXML)
(virDomainRevertToSnapshot, virDomainDetachDeviceFlags): Document
restrictions.
* include/libvirt/virterror.h (VIR_ERR_BLOCK_COPY_ACTIVE): New
error.
* src/util/virterror.c (virErrorMsg): Define it.
virThreadSelf tries to access the virThreadPtr stored in TLS for the
current thread via TlsGetValue. When virThreadSelf is called on a thread
that was not created via virThreadCreate (e.g. the main thread) then
TlsGetValue returns NULL as TlsAlloc initializes TLS slots to NULL.
virThreadSelf can be called on the main thread via this call chain from
virsh
vshDeinit
virEventAddTimeout
virEventPollAddTimeout
virEventPollInterruptLocked
virThreadIsSelf
triggering a segfault as virThreadSelf unconditionally dereferences the
return value of TlsGetValue.
Fix this by making virThreadSelf check the TLS slot value for NULL and
setting the given virThreadPtr accordingly.
Reported by Marcel Müller.
Ensure we don't introduce any more lousy integer parsing in new
code, while avoiding a scrub-down of existing legacy code.
Note that we also need to enable sc_prohibit_atoi_atof (see cfg.mk
local-checks-to-skip) before we are bulletproof, but that also
entails scrubbing I'm not ready to do at the moment.
* src/util/util.c (virStrToLong_i, virStrToLong_ui)
(virStrToLong_l, virStrToLong_ul, virStrToLong_ll)
(virStrToLong_ull, virStrToDouble): Mark exemptions.
* src/util/virmacaddr.c (virMacAddrParse): Likewise.
* cfg.mk (sc_prohibit_strtol): New syntax check.
(exclude_file_name_regexp--sc_prohibit_strtol): Ignore files that
I'm not willing to fix yet.
(local-checks-to-skip): Re-enable sc_prohibit_atoi_atof.
https://bugzilla.redhat.com/show_bug.cgi?id=617711 reported that
even with my recent patched to allow <memory unit='G'>1</memory>,
people can still get away with trying <memory>1G</memory> and
silently get <memory unit='KiB'>1</memory> instead. While
virt-xml-validate catches the error, our C parser did not.
Not to mention that it's always fun to fix bugs while reducing
lines of code. :)
* src/conf/domain_conf.c (virDomainParseMemory): Check for parse error.
(virDomainDefParseXML): Avoid strtoll.
* src/conf/storage_conf.c (virStorageDefParsePerms): Likewise.
* src/util/xml.c (virXPathLongBase, virXPathULongBase)
(virXPathULongLong, virXPathLongLong): Likewise.
DBus connection. The HAL device code further requires that
the DBus connection is integrated with the event loop and
provides such glue logic itself.
The forthcoming FirewallD integration also requires a
dbus connection with event loop integration. Thus we need
to pull the current event loop glue out of the HAL driver.
Thus we create src/util/virdbus.{c,h} files. This contains
just one method virDBusGetSystemBus() which obtains a handle
to the single shared system bus instance, with event glue
automagically setup.
Currently upon a migration a callback is created when a 802.1qbg link
is set to PREASSOCIATE, this should not happen because this is a no-op
on most switches, and does not lead to an ASSOCIATE state. This patch
only creates callbacks when CREATE or RESTORE is requested. Migration
and libvirtd restart scenarios are already handled elsewhere.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
The linux-2.6.32 kernel header does not yet define IFLA_VF_MAX and others,
which breaks compiling a new libvirt on old systems like Debian Squeeze.
(I also have to add --without-macvtap --disable-werror --without-virtualport to
./configure to get it to compile.)
Signed-off-by: Philipp Hahn <hahn@univention.de>
This patch adds a netlink callback when migrating a VEPA enabled
virtual machine. It fixes a Bug where a VM would not request a port
association when it was cleared by lldpad.
This patch requires the latest git version of lldpad to work.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
Leak introduced in commit 0436d32. If we allocate an actions array,
but fail early enough to never consume it with the qemu monitor
transaction call, we leaked memory.
But our semantics of making the transaction command free the caller's
memory is awkward; avoiding the memory leak requires making every
intermediate function in the call chain check for error. It is much
easier to fix things so that the function that allocates also frees,
while the call chain leaves the caller's data intact. To do that,
I had to hack our JSON data structure to make it easy to protect a
portion of an arbitrary JSON tree from being freed.
* src/util/json.h (virJSONType): Name the enum.
(_virJSONValue): New field.
* src/util/json.c (virJSONValueFree): Use it to protect a portion
of an array.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONTransaction): Avoid
freeing caller's data.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive):
Free actions array on failure.
https://bugzilla.redhat.com/show_bug.cgi?id=808979
The leak is really in virProcessInfoGetAffinity, as shown in the
valgrind output given in the above bug report - it calls CPU_ALLOC(),
but then fails to call CPU_FREE().
This leak has existed in every version of libvirt since 0.7.5.
With latest gnulib we are checking even the lowest level functions
whether they check flags. Moreover, we are shadowing the real error
on system without TUNSETIFF support.
The code is splattered with a mix of
sizeof foo
sizeof (foo)
sizeof(foo)
Standardize on sizeof(foo) and add a syntax check rule to
enforce it
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A handful of places used %zd for format specifiers even
though the args was size_t, not ssize_t.
* src/remote/remote_driver.c, src/util/xml.c: s/%zd/%zu/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
An upstream gnulib bug[1] meant that some of our syntax checks
weren't being run. Fix up our offenders before we upgrade to
a newer gnulib.
[1] https://lists.gnu.org/archive/html/bug-gnulib/2012-03/msg00194.html
* src/util/virnetdevtap.c (virNetDevTapCreate): Use flags.
* tests/lxcxml2xmltest.c (mymain): Strip useless ().
When libvirtd is restarted, also restart the netlink event
message callbacks for existing VEPA connections and send
a message to lldpad for these existing links, so it learns
the new libvirtd pid.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
Return statements with parameter enclosed in parentheses were modified
and parentheses were removed. The whole change was scripted, here is how:
List of files was obtained using this command:
git grep -l -e '\<return\s*([^()]*\(([^()]*)[^()]*\)*)\s*;' | \
grep -e '\.[ch]$' -e '\.py$'
Found files were modified with this command:
sed -i -e \
's_^\(.*\<return\)\s*(\(\([^()]*([^()]*)[^()]*\)*\))\s*\(;.*$\)_\1 \2\4_' \
-e 's_^\(.*\<return\)\s*(\([^()]*\))\s*\(;.*$\)_\1 \2\3_'
Then checked for nonsense.
The whole command looks like this:
git grep -l -e '\<return\s*([^()]*\(([^()]*)[^()]*\)*)\s*;' | \
grep -e '\.[ch]$' -e '\.py$' | xargs sed -i -e \
's_^\(.*\<return\)\s*(\(\([^()]*([^()]*)[^()]*\)*\))\s*\(;.*$\)_\1 \2\4_' \
-e 's_^\(.*\<return\)\s*(\([^()]*\))\s*\(;.*$\)_\1 \2\3_'
When qparams support was dropped in commit bc1ff160, we forgot
to add tests to ensure that viruri can do the same round trip
handling of a URI. This round trip was broken, due to use
of the old 'query' field of xmlUriPtr, instead of the new
'query_raw'
Also, we forgot to report an OOM error.
* tests/viruritest.c (mymain): Add tests based on just-deleted
qparamtest.
(testURIParse): Allow difference in input and expected output.
* src/util/viruri.c (virURIFormat): Add missing error. Use
query_raw, instead of query for xmlUriPtr object.
Libvirt on x86 parses 'dmidecode' to gather characteristics of host
system. On PowerPC, this is now implemented by reading /proc/cpuinfo
NOTE: memory-DIMM information is not presently implemented.
Acked-by: Daniel Veillard <veillard@redhat.com>
Acked-by: Daniel P Berrange <berrange@redhat.com>
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
When SASL requests auth credentials, try to look them up in the
config file first. If any are found, remove them from the list
that the user is prompted for
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that the functions in virauth.h have names matching the file
prefix, by renaming virRequest{Username,Password} to
virAuthGet{Username,Password}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To follow latest naming conventions, rename src/util/authhelper.[ch]
to src/util/virauth.[ch].
* src/util/authhelper.[ch]: Rename to src/util/virauth.[ch]
* src/esx/esx_driver.c, src/hyperv/hyperv_driver.c,
src/phyp/phyp_driver.c, src/xenapi/xenapi_driver.c: Update
for renamed include files
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The '.ini' file format is a useful alternative to the existing
config file style, when you need to have config files which
are hashes of hashes. The 'virKeyFilePtr' object provides a
way to parse these file types.
* src/Makefile.am, src/util/virkeyfile.c,
src/util/virkeyfile.h: Add .ini file parser
* tests/Makefile.am, tests/virkeyfiletest.c: Test
basic parsing capabilities
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert drivers currently using the qparams APIs, to instead
use the virURIPtr query parameters directly.
* src/esx/esx_util.c, src/hyperv/hyperv_util.c,
src/remote/remote_driver.c, src/xenapi/xenapi_utils.c: Remove
use of qparams
* src/util/qparams.h, src/util/qparams.c: Delete
* src/Makefile.am, src/libvirt_private.syms: Remove qparams
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Avoid the need for each driver to parse query parameters itself
by storing them directly in the virURIPtr struct. The parsing
code is a copy of that from src/util/qparams.c The latter will
be removed in a later patch
* src/util/viruri.h: Add query params to virURIPtr
* src/util/viruri.c: Parse query parameters when creating virURIPtr
* tests/viruritest.c: Expand test to cover params
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of just typedef'ing the xmlURIPtr struct for virURIPtr,
use a custom libvirt struct. This allows us to fix various
problems with libxml2. This initially just fixes the query vs
query_raw handling problems.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The parameter in the virURIFormat impl mistakenly used the
xmlURIPtr type, instead of virURIPtr. Since they will soon
cease to be identical, this needs fixing
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since we defined a custom virURIPtr type, we should use a
virURIFree method instead of assuming it will always be
a typedef for xmlURIPtr
* src/util/viruri.c, src/util/viruri.h, src/libvirt_private.syms:
Add a virURIFree method
* src/datatypes.c, src/esx/esx_driver.c, src/libvirt.c,
src/qemu/qemu_migration.c, src/vmx/vmx.c, src/xen/xend_internal.c,
tests/viruritest.c: s/xmlFreeURI/virURIFree/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A few times libvirt users manually setting mac addresses have
complained of a networking failure that ends up being due to a multicast
mac address being used for a guest interface. This patch prevents that
by logging an error and failing if a multicast mac address is
encountered in each of the three following cases:
1) domain xml <interface> mac address.
2) network xml bridge mac address.
3) network xml dhcp/host mac address.
There are several other places where a mac address can be input that
aren't controlled in this manner because failure to do so has no
consequences (e.g., if the address will be used to search through
existing interfaces for a match).
The RNG has been updated to add multiMacAddr and uniMacAddr along with
the existing macAddr, and macAddr was switched to uniMacAddr where
appropriate.
This patch is in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=798467
If a guest's tap device is created using the same MAC address the
guest uses for its own network card (which connects to the tap
device), the Linux kernel will log the following message and traffic
will not pass:
kernel: vnet9: received packet with own address as source address
This patch disallows MAC addresses with a first byte of 0xFE, but only in
the case that the MAC address is used for a guest interface that's
connected by way of a standard tap device. (In other words, the
validation is done at runtime at the same place the MAC address is
modified for the tap device, rather than when mac address is parsed,
the idea being that it is then we know for sure the address will be
problematic.)
This patch fixes a NULL pointer check that was causing SegFault on
some specific configurations. It also reverts commit 59d0c9801c
that was checking for this value in one place.
Thanks to cgroups, providing user vs. system time of the overall
guest is easy to add to our existing API.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_CPU_STATS_USERTIME)
(VIR_DOMAIN_CPU_STATS_SYSTEMTIME): New constants.
* src/util/virtypedparam.h (virTypedParameterArrayValidate)
(virTypedParameterAssign): Enforce checking the result.
* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Fix offender.
(qemuDomainGetTotalcpuStats): Implement new parameters.
* tools/virsh.c (cmdCPUStats): Tweak output accordingly.
As documented in linux.git/Documentation/cgroups/cpuacct.txt,
cpuacct.stat returns user and system time in ticks (the same
unit used in times(2)). It would be a bit nicer if it were like
getrusage(2) and reported timeval contents, or like cpuacct.usage
and in nanoseconds, but we can't be picky.
* src/util/cgroup.h (virCgroupGetCpuacctStat): New function.
* src/util/cgroup.c (virCgroupGetCpuacctStat): Implement it.
(virCgroupGetValueStr): Allow for multi-line files.
* src/libvirt_private.syms (cgroup.h): Export it.
If there is a disk file with a comma in the name, QEmu expects a double
comma instead of a single one (e.g., the file "virtual,disk.img" needs
to be specified as "virtual,,disk.img" in QEmu's command line). This
patch fixes libvirt to work with that feature. Fix RHBZ #801036.
Based on an initial patch by Crístian Viana.
* src/util/buf.h (virBufferEscape): Alter signature.
* src/util/buf.c (virBufferEscape): Add parameter.
(virBufferEscapeSexpr): Fix caller.
* src/qemu/qemu_command.c (qemuBuildRBDString): Likewise. Also
escape commas in file names.
(qemuBuildDriveStr): Escape commas in file names.
* docs/schemas/basictypes.rng (absFilePath): Relax RNG to allow
commas in input file names.
* tests/qemuxml2argvdata/*-disk-drive-network-sheepdog.*: Update
test.
Signed-off-by: Eric Blake <eblake@redhat.com>
This is nearly identical to an earlier patch for virnetlink.c.
There are special stub versions of all public functions in this file
that are compiled when the platform isn't linux. Each of these
functions had an almost identical message, differing only in the
function name included in the message. Since log messages already
contain the function name, we can just define a const char* with the
common part of the string, and use that same string for all the log
messages.
If nothing else, this at least makes for less strings that need
translating...
There are several functions that call virNetlinkCommand, and they all
follow a common pattern, with three exit labels: err_exit (or
cleanup), malformed_resp, and buffer_too_small. All three of these
labels do their own cleanup and have their own return. However, the
malformed_resp label usually frees the same items as the
cleanup/err_exit label, and the buffer_too_small label just doesn't
free recvbuf (because it's known to always be NULL at the time we goto
buffer_too_small.
In order to simplify and standardize the code, I've made the following
changes to all of these functions:
1) err_exit is replaced with the more libvirt-ish "cleanup", which
makes sense because in all cases this code is also executed in the
case of success, so labelling it err_exit may be confusing.
2) rc is initialized to -1, and set to 0 just before the cleanup
label. Any code that currently sets rc = -1 is made to instead goto
cleanup.
3) malformed_resp and buffer_too_small just log their error and goto
cleanup. This gives us a single return path, and a single place to
free up resources.
4) In one instance, rather then logging an error immediately, a char*
msg was pointed to an error string, then goto cleanup (and cleanup
would log an error if msg != NULL). It takes no more lines of code
to just log the message as we encounter it.
This patch should have 0 functional effects.
There are special stub versions of all public functions in this file
that are compiled when either libnl isn't available or the platform
isn't linux. Each of these functions had two almost identical message,
differing only in the function name included in the message. Since log
messages already contain the function name, we can just define a const
char* with the common part of the string, and use that same string for
all the log messages.
Also, rather than doing #if defined ... #else ... #endif *inside the
error log macro invocation*, this patch does #if defined ... just
once, using it to decide which single string to define. This turns the
error log in each function from 6 lines, to 1 line.
This patch will allow OpenFlow controllers to identify which interface
belongs to a particular VM by using the Domain UUID.
ovs-vsctl get Interface vnet0 external_ids
{attached-mac="52:54:00:8C:55:2C", iface-id="83ce45d6-3639-096e-ab3c-21f66a05f7fa", iface-status=active, vm-id="142a90a7-0acc-ab92-511c-586f12da8851"}
V2 changes:
Replaced vm-uuid with vm-id. There was a discussion in Open vSwitch
mailinglist that we should stick with the same DB key postfixes for the
sake of consistency (e.g iface-id, vm-id ...).
The indentation on the final lines of the function was off by four
spaces, making me wonder for a second if there was something
missing. (There wasn't.)
If we need to virFork() to check assess() under different
UID+GID we need to translate returned status via WEXITSTATUS().
Otherwise, we may return values greater than 255 which is
obviously wrong.
Scaling an integer based on a suffix is something we plan on reusing
in several contexts: XML parsing, virsh CLI parsing, and possibly
elsewhere. Make it easy to reuse, as well as adding in support for
powers of 1000.
* src/util/util.h (virScaleInteger): New function.
* src/util/util.c (virScaleInteger): Implement it.
* src/libvirt_private.syms (util.h): Export it.
Overflow can be user-induced, so it deserves more than being called
an internal error. Note that in general, 32-bit platforms have
far more places to trigger this error (anywhere the public API
used 'unsigned long' but the other side of the connection is a
64-bit server); but some are possible on 64-bit platforms (where
the public API computes the product of two numbers).
* include/libvirt/virterror.h (VIR_ERR_OVERFLOW): New error.
* src/util/virterror.c (virErrorMsg): Translate it.
* src/libvirt.c (virDomainSetVcpusFlags, virDomainGetVcpuPinInfo)
(virDomainGetVcpus, virDomainGetCPUStats): Use it.
* daemon/remote.c (HYPER_TO_TYPE): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockResize): Likewise.
ATTRIBUTE_UNUSED was accidentally forgotten on one arg of a stub
function for functionality that's not present on non-linux
platforms. This causes a non-linux build with
--enable-compile-warnings=error to fail.
* For now, only "cpu_time" is supported.
* cpuacct cgroup is used for providing percpu cputime information.
* src/qemu/qemu.conf - take care of cpuacct cgroup.
* src/qemu/qemu_conf.c - take care of cpuacct cgroup.
* src/qemu/qemu_driver.c - added an interface
* src/util/cgroup.c/h - added interface for getting percpu cputime
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
This patch includes the following changes to virnetdevmacvlan.c and
virnetdevvportprofile.c:
- removes some netlink functions which are now available in
virnetdev.c
- Adds a vf argument to all port profile functions.
For 802.1Qbh devices, the port profile calls can use a vf argument if
passed by the caller. If the vf argument is -1 it will try to derive the vf
if the device passed is a virtual function.
For 802.1Qbg devices, This patch introduces a null check for the device
argument because during port profile assignment on a hostdev, this argument
can be null.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
This patch adds the following:
- functions to set and get vf configs
- Functions to replace and store vf configs (Only mac address is handled today.
But the functions can be easily extended for vlans and other vf configs)
- function to dump link dev info (This is moved from virnetdevvportprofile.c)
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
pciDeviceGetVirtualFunctionInfo returns pf netdevice name and virtual
function index for a given vf. This is just a wrapper around existing functions
to return vf's pf and vf_index with one api call
pciConfigAddressToSysfsfile returns the sysfile pci device link
from a 'struct pci_config_address'
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Commit 723d5c (added after the release of 0.9.10) adds a
NetlinkEventClient for each interface sent to
virNetDevMacVLanCreateWithVPortProfile. This should only be done if
the interface actually *has* a virtPortProfile, otherwise the event
handler would be a NOP. The bigger problem is that part of the setup
to create the NetlinkEventClient is to do a memcpy of virtPortProfile
- if it's NULL, this triggers a segv.
This patch just qualifies the code that adds the client - if
virtPortProfile is NULL, it's skipped.
With an additional new bool added to determine whether or not to
discourage the use of the supplied MAC address by the bridge itself,
virNetDevTapCreateInBridgePort had three booleans (well, 2 bools and
an int used as a bool) in the arg list, which made it increasingly
difficult to follow what was going on. This patch combines those three
into a single flags arg, which not only shortens the arg list, but
makes it more self-documenting.
When a tap device for a domain is created and attached to a bridge,
the first byte of the tap device MAC address is set to 0xFE, while the
rest is set to match the MAC address that will be presented to the
guest as its network device MAC address. Setting this high value in
the tap's MAC address discourages the bridge from using the tap
device's MAC address as the bridge's own MAC address (Linux bridges
always take on the lowest numbered MAC address of all attached devices
as their own).
In one case within libvirt, a tap device is created and attached to
the bridge with the intent that its MAC address be taken on by the
bridge as its own (this is used to assure that the bridge has a fixed
MAC address to prevent network outages created by the bridge MAC
address "flapping" as guests are started and stopped). In this case,
the first byte of the mac address is *not* altered to 0xFE.
In the current code, callers to virNetDevTapCreateInBridgePort each
make the MAC address modification themselves before calling, which
leads to code duplication, and also prevents lower level functions
from knowing the real MAC address being used by the guest. The problem
here is that openvswitch bridges must be informed about this MAC
address, or they will be unable to pass traffic to/from the guest.
This patch centralizes the location of the MAC address "0xFE fixup"
into virNetDevTapCreateInBridgePort(), meaning 1) callers of this
function no longer need the extra strange bit of code, and 2)
bitNetDevTapCreateBridgeInPort itself now is called with the guest's
unaltered MAC address, and can pass it on, unmodified, to
virNetDevOpenvswitchAddPort.
There is no other behavioral change created by this patch.
Nuke the last vestiges of printing pid_t values with the wrong
types, at least in code compiled on mingw64. There may be other
places, but for now they are only compiled on systems where the
existing %d doesn't trigger gcc warnings.
* src/rpc/virnetsocket.c (virNetSocketNew): Use %lld and casting,
rather than assuming any particular int type for pid_t.
* src/util/command.c (virCommandRunAsync, virPidWait)
(virPidAbort): Likewise.
(verify): Drop a now stale assertion.
No thanks to 64-bit windows, with 64-bit pid_t, we have to avoid
constructs like 'int pid'. Our API in libvirt-qemu cannot be
changed without breaking ABI; but then again, libvirt-qemu can
only be used on systems that support UNIX sockets, which rules
out Windows (even if qemu could be compiled there) - so for all
points on the call chain that interact with this API decision,
we require a different variable name to make it clear that we
audited the use for safety.
Adding a syntax-check rule only solves half the battle; anywhere
that uses printf on a pid_t still needs to be converted, but that
will be a separate patch.
* cfg.mk (sc_correct_id_types): New syntax check.
* src/libvirt-qemu.c (virDomainQemuAttach): Document why we didn't
use pid_t for pid, and validate for overflow.
* include/libvirt/libvirt-qemu.h (virDomainQemuAttach): Tweak name
for syntax check.
* src/vmware/vmware_conf.c (vmwareExtractPid): Likewise.
* src/driver.h (virDrvDomainQemuAttach): Likewise.
* tools/virsh.c (cmdQemuAttach): Likewise.
* src/remote/qemu_protocol.x (qemu_domain_attach_args): Likewise.
* src/qemu_protocol-structs (qemu_domain_attach_args): Likewise.
* src/util/cgroup.c (virCgroupPidCode, virCgroupKillInternal):
Likewise.
* src/qemu/qemu_command.c(qemuParseProcFileStrings): Likewise.
(qemuParseCommandLinePid): Use pid_t for pid.
* daemon/libvirtd.c (daemonForkIntoBackground): Likewise.
* src/conf/domain_conf.h (_virDomainObj): Likewise.
* src/probes.d (rpc_socket_new): Likewise.
* src/qemu/qemu_command.h (qemuParseCommandLinePid): Likewise.
* src/qemu/qemu_driver.c (qemudGetProcessInfo, qemuDomainAttach):
Likewise.
* src/qemu/qemu_process.c (qemuProcessAttach): Likewise.
* src/qemu/qemu_process.h (qemuProcessAttach): Likewise.
* src/uml/uml_driver.c (umlGetProcessInfo): Likewise.
* src/util/virnetdev.h (virNetDevSetNamespace): Likewise.
* src/util/virnetdev.c (virNetDevSetNamespace): Likewise.
* tests/testutils.c (virtTestCaptureProgramOutput): Likewise.
* src/conf/storage_conf.h (_virStoragePerms): Use mode_t, uid_t,
and gid_t rather than int.
* src/security/security_dac.c (virSecurityDACSetOwnership): Likewise.
* src/conf/storage_conf.c (virStorageDefParsePerms): Avoid
compiler warning.
Commit 7c90026 added #include "conf/domain_conf.h" to
util/virrandom.c. Fortunately it didn't actually use anything from
domain_conf.h, since as far as I'm aware, files in util aren't allowed
to reference anything in conf (although the opposite is allowed). So
this #include is unnecessary.
I verified it still compiles with the line removed, but have placed a
one day moratorium on me doing any "trivial rule" pushes, so will
wait for someone else to verify/ACK before pushing.
Add de-association handling for 802.1qbg (vepa) via lldpad
netlink messages. Also adds the possibility to perform an
association request without waiting for a confirmation.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
This code adds a netlink event interface to libvirt.
It is based upon the event_poll code and makes use of
it. An event is generated for each netlink message sent
to the libvirt pid.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
This patch changes behavior of virPidFileRead to enable passing NULL as
path to the binary the pid file should be checked against to skip this
check. This enables using this function for reading files that have same
semantics as pid files, but belong to unknown processes.
Function xmlParseURI does not remove square brackets around IPv6
address when parsing. One of the solutions is making wrappers around
functions working with xmlURI*. This assures that uri->server will be
always properly assigned and it doesn't have to be changed when used
on some new place in the code.
For this purpose, functions virParseURI and virSaveURI were
added. These function are wrappers around xmlParseURI and xmlSaveUri
respectively.
Also there is one new syntax check function to prohibit these functions
anywhere else.
File changes:
- src/util/viruri.h -- declaration
- src/util/viruri.c -- definition
- src/libvirt_private.syms -- symbol export
- src/Makefile.am -- added source and header files
- cfg.mk -- added sc_prohibit_xmlURI
- all others -- ID name and include fixes
[forwarding this here from RH bug #796732]
When creating a network (virsh net-create) with an erroneous XML
containing an empty <name> element, the error message is misleading:
error: Failed to create network from foo.xml
error: missing domain name information
It took me a bit of time to figure out that it was the *network* name
that was missing (I generate this xml and didn't look at it, first).
I realized that the same message is used for missing name when creating
a domain, network, or device node.
This patch adds VIR_MIGRATE_UNSAFE flag for migration APIs and new
VIR_ERR_MIGRATION_UNSAFE error code. The error code should be returned
whenever migrating a domain is considered unsafe (e.g., it's configured
in a way that does not ensure data integrity once it is migrated).
VIR_MIGRATE_UNSAFE flag may be used to force migration even though it
would normally be considered unsafe and forbidden.
AC_CHECK_PROG checks for program in given path. However, if it doesn't
exists, [variable] is set to [value-if-not-found]. We don't want this
to be the empty string in case of 'modprobe' and 'scrub' as we want to
fallback to runtime detection.
* src/util/virfile.h: the virFileWrapperFdFlags being defined as
a globa variable instead of a type ended up generating a duplicate
symbol error.
* AUTHORS: added Lincoln Myers
This patch allows libvirt to add interfaces to already
existing Open vSwitch bridges. The following syntax in
domain XML file can be used:
<interface type='bridge'>
<mac address='52:54:00:d0:3f:f2'/>
<source bridge='ovsbr'/>
<virtualport type='openvswitch'>
<parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'/>
</virtualport>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
or if libvirt should auto-generate the interfaceid use
following syntax:
<interface type='bridge'>
<mac address='52:54:00:d0:3f:f2'/>
<source bridge='ovsbr'/>
<virtualport type='openvswitch'>
</virtualport>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
It is also possible to pass an optional profileid. To do that
use following syntax:
<interface type='bridge'>
<source bridge='ovsbr'/>
<mac address='00:55:1a:65:a2:8d'/>
<virtualport type='openvswitch'>
<parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'
profileid='test-profile'/>
</virtualport>
</interface>
To create Open vSwitch bridge install Open vSwitch and
run the following command:
ovs-vsctl add-br ovsbr
The auto-generated WWN comply with the new addressing schema of WWN:
<quote>
the first nibble is either hex 5 or 6 followed by a 3-byte vendor
identifier and 36 bits for a vendor-specified serial number.
</quote>
We choose hex 5 for the first nibble. And for the 3-bytes vendor ID,
we uses the OUI according to underlying hypervisor type, (invoking
virConnectGetType to get the virt type). e.g. If virConnectGetType
returns "QEMU", we use Qumranet's OUI (00:1A:4A), if returns
ESX|VMWARE, we use VMWARE's OUI (00:05:69). Currently it only
supports qemu|xen|libxl|xenapi|hyperv|esx|vmware drivers. The last
36 bits are auto-generated.
Now that no one is relying on the return value being a pointer to
somewhere inside of the passed-in argument, we can simplify the
callers to simply return success or failure. Also wrap some long
lines and add some const-correctness.
* src/util/sysinfo.c (virSysinfoParseBIOS, virSysinfoParseSystem)
(virSysinfoParseProcessor, virSysinfoParseMemory): Change return.
(virSysinfoRead): Adjust caller.
virFileDirectFd was used for accessing files opened with O_DIRECT using
libvirt_iohelper. We will want to use the helper for accessing files
regardless on O_DIRECT and thus virFileDirectFd was generalized and
renamed to virFileWrapperFd.
dmidecode displays processor information, followed by BIOS, system and
memory-DIMM details.
Calls to virSysinfoParseBIOS(), virSysinfoParseSystem() would update
the buffer pointer 'base', so the processor information would be lost
before virSysinfoParseProcessor() was called. Sysinfo would therefore
not be able to display processor details -- It only described <bios>,
<system> and <memory_device> details.
This patch attempts to insulate sysinfo from ordering of dmidecode
output.
Before the fix:
---------------
virsh # sysinfo
<sysinfo type='smbios'>
<bios>
....
</bios>
<system>
....
</system>
<memory_device>
....
</memory_device>
After the fix:
-------------
virsh # sysinfo
<sysinfo type='smbios'>
<bios>
....
</bios>
<system>
....
</system>
<processor>
....
</processor>
<memory_device>
....
</memory_device>
gcc 4.7 complains:
util/virhashcode.c:49:17: error: always_inline function might not be inlinable [-Werror=attributes]
util/virhashcode.c:35:17: error: always_inline function might not be inlinable [-Werror=attributes]
Normal 'inline' is a hint that the compiler may ignore; the fact
that the function is static is good enough. We don't care if the
compiler decided not to inline after all.
* src/util/virhashcode.c (getblock, fmix): Relax attribute.
virFileOpenAs previously would only try opening a file as the current
user, or as a different user, but wouldn't try both methods in a
single call. This made it cumbersome to use as a replacement for
open(2). Additionally, it had a lot of historical baggage that led to
it being difficult to understand.
This patch refactors virFileOpenAs in the following ways:
* reorganize the code so that everything dealing with both the parent
and child sides of the "fork+setuid+setgid+open" method are in a
separate function. This makes the public function easier to understand.
* Allow a single call to virFileOpenAs() to first attempt the open as
the current user, and if that fails to automatically re-try after
doing fork+setuid (if deemed appropriate, i.e. errno indicates it
would now be successful, and the file is on a networkFS). This makes
it possible (in many, but possibly not all, cases) to drop-in
virFileOpenAs() as a replacement for open(2).
(NB: currently qemuOpenFile() calls virFileOpenAs() twice, once
without forking, then again with forking. That unfortunately can't
be changed without at least some discussion of the ramifications,
because the requested file permissions are different in each case,
which is something that a single call to virFileOpenAs() can't deal
with.)
* Add a flag so that any fchown() of the file to a different uid:gid
is explicitly requested when the function is called, rather than it
being implied by the presence of the O_CREAT flag. This just makes
for less subtle surprises to consumers. (Commit
b1643dc15c added the check for O_CREAT
before forcing ownership. This patch just makes that restriction
more explicit.)
* If either the uid or gid is specified as "-1", virFileOpenAs will
interpret this to mean "the current [gu]id".
All current consumers of virFileOpenAs should retain their present
behavior (after a few minor changes to their setup code and
arguments).
Rename the src/util/netlink files to src/util/virnetlink to
better fit the naming scheme. Also rename nlComm to virNetlinkCommand.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
Sometimes, its easier to run children with 2>&1 in shell notation,
and just deal with stdout and stderr interleaved. This was already
possible for fd handling; extend it to also work when doing string
capture of a child process.
* docs/internals/command.html.in: Document this.
* src/util/command.c (virCommandSetErrorBuffer): Likewise.
(virCommandRun, virExecWithHook): Implement it.
* tests/commandtest.c (test14): Test it.
* daemon/remote.c (remoteDispatchAuthPolkit): Use new command
feature.
This patch adds API to modify domain metadata for running and stopped
domains. The api supports changing description, title as well as the
newly added <metadata> element. The API has support for storing data in
the metadata element using xml namespaces.
* include/libvirt/libvirt.h.in
* src/libvirt_public.syms
- add function headers
- add enum to select metadata to operate on
- export functions
* src/libvirt.c
- add public api implementation
* src/driver.h
- add driver support
* src/remote/remote_driver.c
* src/remote/remote_protocol.x
- wire up the remote protocol
* include/libvirt/virterror.h
* src/util/virterror.c
- add a new error message note that metadata for domain are
missing
If we are building not on a WIN32 architecture and without HAVE_CAPNG
virSetCapabilities has unused argument and virClearCapabilities
is unused as well.
This patch introduces virSetCapabilities() function and implements
virCommandAllowCap() function.
Existing virClearCapabilities() is function to clear all capabilities.
Instead virSetCapabilities() is function to set arbitrary capabilities.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Shota Hirae <m11g1401@hibikino.ne.jp>
And hook it up for policykit auth. This allows virt-manager to detect
that the user clicked the policykit 'cancel' button and not throw
an 'authentication failed' error message at the user.
The virEmitXMLWarning function should always have been in
the xml.[hc] files, and should use virXML as its name
prefix
* src/util/util.c, src/util/util.h: Remove virEmitXMLWarning
* src/util/xml.c, src/util/xml.h: Add virXMLEmitWarning
On RHEL5, I got:
util/virrandom.c:66: warning: nested extern declaration of '_gl_verify_function66' [-Wnested-externs]
The fix is to hoist the verify earlier. Also some other hodge-podge
fixes I noticed while reviewing Dan's recent series.
* .gitignore: Ignore new test.
* src/util/cgroup.c: Bump copyright year.
* src/util/virhash.c: Fix typo in description.
* src/util/virrandom.c (virRandomBits): Mark doc comment, and
hoist assert to silence older gcc.
Recent discussions have illustrated the potential for DOS attacks
with the hash table implementations used by most languages and
libraries.
https://lwn.net/Articles/474912/
libvirt has an internal hash table impl, and uses hash tables for
a variety of purposes. The hash key generation code is pretty
simple and thus not strongly collision resistant.
This patch replaces the current libvirt hash key generator with
the (public domain) Murmurhash3 code. In addition every hash
table now gets a random seed value which is used to perturb the
hashing code. This should make it impossible to mount any
practical attack against libvirt hashing code.
* bootstrap.conf: Import bitrotate module
* src/Makefile.am: Add virhashcode.[ch]
* src/util/util.c: Make virRandom() return a fixed 32 bit
integer value.
* src/util/hash.c, src/util/hash.h, src/util/cgroup.c: Replace
hash code generation with a call to virHashCodeGen()
* src/util/virhashcode.h, src/util/virhashcode.c: Add a new
virHashCodeGen() API using the Murmurhash3 algorithm.
In preparation for the patch to include Murmurhash3, which
introduces a virhashcode.h and virhashcode.c files, rename
the existing hash.h and hash.c to virhash.h and virhash.c
respectively.
In preparation for conversion over to use the Murmurhash3
algorithm, convert various virHash APIs to use size_t or
uint32 for their return values/parameters, instead of the
variable size 'unsigned long' or 'int' types
The old virRandom() API was not generating good random numbers.
Replace it with a new API virRandomBits which instead of being
told the upper limit, gets told the number of bits of randomness
required.
* src/util/virrandom.c, src/util/virrandom.h: Add virRandomBits,
and move virRandomInitialize
* src/util/util.h, src/util/util.c: Delete virRandom and
virRandomInitialize
* src/libvirt.c, src/security/security_selinux.c,
src/test/test_driver.c, src/util/iohelper.c: Update for
changes from virRandom to virRandomBits
* src/storage/storage_backend_iscsi.c: Remove bogus call
to virRandomInitialize & convert to virRandomBits
In file included from ../gnulib/lib/unistd.h:51:0,
from ../src/util/util.h:30,
from rpc/virkeepalive.c:29:
/usr/x86_64-w64-mingw32/sys-root/mingw/include/winsock2.h:15:2: warning: #warning Please include winsock2.h before windows.h [-Wcpp]
Reported by Marc-André Lureau.
* src/util/threads-win32.h (includes): Pick up winsock2.h before
windows.h, as required by mingw64.
Add a virFileTouch API which ensures that a file will always
exist, even if zero length
* src/util/virfile.c, src/util/virfile.h,
src/libvirt_private.syms: Introduce virFileTouch
POLLIN and POLLHUP are not mutually exclusive. Currently the following
seems possible: the child writes 3K to its stdout or stderr pipe, and
immediately closes it. We get POLLIN|POLLHUP (I'm not sure that's possible
on Linux, but SUSv4 seems to allow it). We read 1K and throw away the
rest.
When poll() returns and we're about to check the /revents/ member in a
given array element, let's map all the revents bits to two (independent)
ideas: "let's attempt to read()", and "let's attempt to write()". This
should cover all errors, EOFs, and normal conditions; the read()/write()
call should report any pending error.
Under this approach, both POLLHUP and POLLERR are mapped to "needs read()"
if we're otherwise prepared for POLLIN. POLLERR also maps to "needs
write()" if we're otherwise prepared for POLLOUT. The rest of the mappings
(POLLPRI etc.) would be easy, but probably useless for pipes.
Additionally, SUSv4 doesn't appear to forbid POLLIN|POLLERR (or
POLLOUT|POLLERR) set simultaneously. One could argue that the read() or
write() call would return without blocking in these cases (with an error),
so POLLIN / POLLOUT would be justified beside POLLERR.
The code now penalizes POLLIN|POLLERR differently from plain POLLERR. The
former (ie. read() returning -1) is terminal and we jump to cleanup, while
plain POLLERR masks only the affected file descriptor for the future.
Let's unify those.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
When converting a linear enum to a string, we have checks in
place in the VIR_ENUM_IMPL macro to ensure that there is one
string for every value, which lets us quickly flag if a user
added a value but forgot to add a counterpart string. However,
this only works if we use the _LAST marker.
* cfg.mk (sc_require_enum_last_marker): New syntax check.
* src/conf/domain_conf.h (virDomainSnapshotState): Add new marker.
* src/conf/domain_conf.c (virDomainSnapshotState): Fix offender.
* src/qemu/qemu_monitor_json.c (qemuMonitorWatchdogAction)
(qemuMonitorIOErrorAction, qemuMonitorGraphicsAddressFamily):
Likewise.
* src/util/virtypedparam.c (virTypedParameter): Likewise.
While we still don't want to enable gcc's new -Wformat-literal
warning, I found a rather easy case where the warning could be
reduced, by getting rid of obsolete error-reporting practices.
This is the last place where we were passing the (unused) net
and conn arguments for constructing an error.
* src/util/virterror_internal.h (virErrorMsg): Delete prototype.
(virReportError): Delete macro.
* src/util/virterror.c (virErrorMsg): Make static.
* src/libvirt_private.syms (virterror_internal.h): Drop export.
* src/util/conf.c (virConfError): Convert to macro.
(virConfErrorHelper): New function, and adjust error calls.
* src/xen/xen_hypervisor.c (virXenErrorFunc): Delete.
(xenHypervisorGetSchedulerType)
(xenHypervisorGetSchedulerParameters)
(xenHypervisorSetSchedulerParameters)
(xenHypervisorDomainBlockStats)
(xenHypervisorDomainInterfaceStats)
(xenHypervisorDomainGetOSType)
(xenHypervisorNodeGetCellsFreeMemory, xenHypervisorGetVcpus):
Update callers.
Preparation for another patch that refactors common patterns
into the new file for fewer lines of code overall.
* src/util/util.h (virTypedParameterArrayClear): Move...
* src/util/virtypedparam.h: ...to new file.
(virTypedParameterArrayValidate, virTypedParameterAssign): New
prototypes.
* src/util/util.c (virTypedParameterArrayClear): Likewise.
* src/util/virtypedparam.c: New file.
* po/POTFILES.in: Mark file for translation.
* src/Makefile.am (UTIL_SOURCES): Build it.
* src/libvirt_private.syms (util.h): Split...
(virtypedparam.h): to new section.
(virkeycode.h): Sort.
* daemon/remote.c: Adjust callers.
* tools/virsh.c: Likewise.
We had a memory leak on a very arcane OOM situation (unlikely to ever
hit in practice, but who knows if libvirt.so would ever be linked
into some other program that exhausts all thread-local storage keys?).
I found it by code inspection, while analyzing a valgrind report
generated by Alex Jia.
* src/util/threads.h (virThreadLocalSet): Alter signature.
* src/util/threads-pthread.c (virThreadHelper): Reduce allocation
lifetime.
(virThreadLocalSet): Detect failure.
* src/util/threads-win32.c (virThreadLocalSet): Likewise.
(virCondWait): Fix caller.
* src/util/virterror.c (virLastErrorObject): Likewise.
Given an LXC guest with a root filesystem path of
/export/lxc/roots/helloworld/root
During startup, we will pivot the root filesystem to end up
at
/.oldroot/export/lxc/roots/helloworld/root
We then try to open
/.oldroot/export/lxc/roots/helloworld/root/dev/pts
Now consider if '/export/lxc' is an absolute symlink pointing
to '/media/lxc'. The kernel will try to open
/media/lxc/roots/helloworld/root/dev/pts
whereas it should be trying to open
/.oldroot//media/lxc/roots/helloworld/root/dev/pts
To deal with the fact that the root filesystem can be moved,
we need to resolve symlinks in *any* part of the filesystem
source path.
* src/libvirt_private.syms, src/util/util.c,
src/util/util.h: Add virFileResolveAllLinks to resolve
all symlinks in a path
* src/lxc/lxc_container.c: Resolve all symlinks in filesystem
paths during startup
pciTrySecondaryBusReset checks if there is active device on the
same bus, however, qemu driver doesn't maintain an effective
list for the inactive devices, and it passes meaningless argument
for parameter "inactiveDevs". e.g. (qemuPrepareHostdevPCIDevices)
if (!(pcidevs = qemuGetPciHostDeviceList(hostdevs, nhostdevs)))
return -1;
..skipped...
if (pciResetDevice(dev, driver->activePciHostdevs, pcidevs) < 0)
goto reattachdevs;
NB, the "pcidevs" used above are extracted from domain def, and
thus one won't be able to attach a device of which bus has other
device even detached from host (nodedev-detach). To see more
details of the problem:
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=773667
This patch is to resolve the problem by introducing an inactive
PCI device list (just like qemu_driver->activePciHostdevs), and
the whole logic is:
* Add the device to inactive list during nodedev-dettach
* Remove the device from inactive list during nodedev-reattach
* Remove the device from inactive list during attach-device
(for non-managed device)
* Add the device to inactive list after detach-device, only
if the device is not managed
With the above, we have a sufficient inactive PCI device list, and thus
we can use it for pciResetDevice. e.g.(qemuPrepareHostdevPCIDevices)
if (pciResetDevice(dev, driver->activePciHostdevs,
driver->inactivePciHostdevs) < 0)
goto reattachdevs;
Detected by Coverity. Although unlikely, if we are ever started
with stdin closed, we could reach a situation where we open a
uuid file but then fail to close it, making that file the new
stdin for the rest of the process.
* src/util/uuid.c (getDMISystemUUID): Allow for stdin.
This functions enables us to get the Virtual Functions attached to
a Physical function given the name of a SR-IOV physical functio.
In order to accomplish the task, added a getter function pciGetDeviceAddrString
to get the BDF of the Virtual Function in a char array.
Commit db371a2 mistakenly added new functions inside a #ifndef WIN32
guard, even though they are needed on all platforms.
* src/util/command.c (virCommandFDSet): Move outside WIN32
conditional.
Currently, virCommand implementation uses FD_ macros from
sys/select.h. However, those cannot handle more opened files
than FD_SETSIZE. Therefore switch to generalized implementation
based on array of integers.
It is a good practise to set revents to zero before doing any poll().
Moreover, we should check if event we waited for really occurred or
if any of fds we were polling on didn't encountered hangup.