This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=851411https://bugzilla.redhat.com/show_bug.cgi?id=955500
The first problem was that virFileOpenAs was returning fd (-1) in one
of the error cases rather than ret (-errno), so the caller thought
that the error was EPERM rather than ENOENT.
The second problem was that some log messages in the general purpose
qemuOpenFile() function would always say "Failed to create" even if
the caller hadn't included O_CREAT (i.e. they were trying to open an
existing file).
This fixes virFileOpenAs to jump down to the error return (which
returns ret instead of fd) in the previously mentioned incorrect
failure case of virFileOpenAs(), removes all error logging from
virFileOpenAs() (since the callers report it), and modifies
qemuOpenFile to appropriately use "open" or "create" in its log
messages.
NB: I seriously considered removing logging from all callers of
virFileOpenAs(), but there is at least one case where the caller
doesn't want virFileOpenAs() to log any errors, because it's just
going to try again (qemuOpenFile()). We can't simply make a silent
variation of virFileOpenAs() though, because qemuOpenFile() can't make
the decision about whether or not it wants to retry until after
virFileOpenAs() has already returned an error code.
Likewise, I also considered changing virFileOpenAs() to return -1 with
errno set on return, and may still do that, but only as a separate
patch, as it obscures the intent of this patch too much.
The individual hypervisor drivers were directly referencing
APIs in virnodesuspend.c in their virDriverPtr struct. Separate
these methods, so there is always a wrapper in the hypervisor
driver. This allows the unused virConnectPtr args to be removed
from the virnodesuspend.c file. Again this will ensure that
ACL checks will only be performed on invocations that are
directly associated with public API usage.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the virGetHostname() API has a bogus virConnectPtr
parameter. This is because virtualization drivers directly
reference this API in their virDriverPtr tables, tieing its
API design to the public virConnectGetHostname API design.
This also causes problems for access control checks since
these must only be done for invocations from the public
API, not internal invocation.
Remove the bogus virConnectPtr parameter, and make each
hypervisor driver provide a dedicated function for the
driver API impl. This will allow access control checks
to be easily inserted later.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since PIDs can be reused, polkit prefers to be given
a (PID,start time) pair. If given a PID on its own,
it will attempt to lookup the start time in /proc/pid/stat,
though this is subject to races.
It is safer if the client app resolves the PID start
time itself, because as long as the app has the client
socket open, the client PID won't be reused.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are various methods named "virXXXXSecurityContext",
which are specific to SELinux. Rename them all to
"virXXXXSELinuxContext". They will still raise errors at
runtime if SELinux is not compiled in
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
While reviewing proposed VIR_STRDUP conversions, I've already noticed
several places that do:
if (str && VIR_STRDUP(dest, str) < 0)
which can be simplified by allowing str to be NULL (something that
strdup() doesn't allow). Meanwhile, code that wants to ensure a
non-NULL dest regardless of the source can check for <= 0.
Also, make it part of the VIR_STRDUP contract that macro arguments
are evaluated exactly once.
* src/util/virstring.h (VIR_STRDUP, VIR_STRDUP_QUIET, VIR_STRNDUP)
(VIR_STRNDUP_QUIET): Improve contract.
* src/util/virstring.c (virStrdup, virStrndup): Change return
conventions.
* docs/hacking.html.in: Document this.
* HACKING: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
VIR_APPEND_ELEMENT(array, size, elem) was not safe if the expression
for 'size' had side effects. While no one in the current code base
was trying to pass side effects, we might as well be robust and
explicitly document our intentions.
* src/util/viralloc.c (virInsertElementsN): Add special case.
* src/util/viralloc.h (VIR_APPEND_ELEMENT): Use it.
(VIR_ALLOC, VIR_ALLOC_N, VIR_REALLOC_N, VIR_EXPAND_N)
(VIR_RESIZE_N, VIR_SHRINK_N, VIR_INSERT_ELEMENT)
(VIR_DELETE_ELEMENT, VIR_ALLOC_VAR, VIR_FREE): Document
which macros are safe in the presence of side effects.
* docs/hacking.html.in: Document this.
* HACKING: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
The code adaptation is not done right now, but in subsequent patches.
Hence I am not implementing syntax-check rule as it would break
compilation. Developers are strongly advised to use these new macros.
They are similar to VIR_ALLOC() logic: VIR_STRDUP(dst, src) returns zero
on success, -1 otherwise. In case you don't want to report OOM error,
use the _QUIET variant of a macro.
POSIX says pthread_t is opaque. We can't guarantee if it is scaler
or a pointer, nor what size it is; and BSD differs from Linux.
We've also had reports of gcc complaining on attempts to cast it,
if we use a cast to the wrong type (for example, pointers have to be
cast to void* or intptr_t before being narrowed; while casting a
function return of scalar pthread_t to void* triggers a different
warning).
Give up on casts, and use unions to get at decent bits instead. And
rather than futz around with figuring which 32 bits of a potentially
64-bit pointer are most likely to be unique, convert the rest of
the code base to use 64-bit values when using a debug id.
Based on a report by Guido Günther against kFreeBSD, but with a
fix that doesn't regress commit 4d970fd29 for FreeBSD.
* src/util/virthreadpthread.c (virThreadSelfID, virThreadID): Use
union to get at a decent bit representation of thread_t bits.
* src/util/virthread.h (virThreadSelfID, virThreadID): Alter
signature.
* src/util/virthreadwin32.c (virThreadSelfID, virThreadID):
Likewise.
* src/qemu/qemu_domain.h (qemuDomainJobObj): Alter type of owner.
* src/qemu/qemu_domain.c (qemuDomainObjTransferJob)
(qemuDomainObjSetJobPhase, qemuDomainObjReleaseAsyncJob)
(qemuDomainObjBeginNestedJob, qemuDomainObjBeginJobInternal): Fix
clients.
* src/util/virlog.c (virLogFormatString): Likewise.
* src/util/vireventpoll.c (virEventPollInterruptLocked):
Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 776d49f4 added a static function that is only called
conditionally; leading to this compile error on mingw:
CC libvirt_util_la-virprocess.lo
../../src/util/virprocess.c:624:26: error: 'struct rlimit' declared inside parameter list [-Werror]
../../src/util/virprocess.c:624:26: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
../../src/util/virprocess.c:622:1: error: 'virProcessPrLimit' defined but not used [-Werror=unused-function]
* src/util/virprocess.c (virProcessPrLimit): Only declare
virProcessPrLimit when used.
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 7c9a2d88 cleaned up too many headers; FreeBSD builds
failed due to:
util/virutil.c:556: warning: implicit declaration of function 'canonicalize_file_name'
(Not sure which Linux header leaked this declaration, but gnulib
only guarantees it in stdlib.h)
libvirt.c:956: warning: implicit declaration of function 'virGetUserConfigDirectory'
(Here, a build on Linux was picking up virutil.h indirectly via
one of the conditional driver headers, where that driver was not
being built on my FreeBSD setup)
* src/util/virutil.c (includes): Need <stdlib.h> for
canonicalize_file_name.
* src/libvirt.c (includes): Use "virutil.h" unconditionally,
rather than relying on conditional indirect inclusion.
Signed-off-by: Eric Blake <eblake@redhat.com>
The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
virPCIDeviceReattach and virPCIDeviceUnbindFromStub (called by
virPCIDeviceReattach) had previously required the name of the stub
driver as input. This is unnecessary, because the name of the driver
the device is currently bound to can be found by looking at the link:
/sys/bus/pci/dddd:bb:ss.ff/driver
Instead of requiring that the name of the expected stub driver name
and only unbinding if that one name is matched, we no longer take a
driver name in the arglist for either of these
functions. virPCIDeviceUnbindFromStub just compares the name of the
currently bound driver to a list of "well known" stubs (right now
contains "pci-stub" and "vfio-pci" for qemu, and "pciback" for xen),
and only performs the unbind if it's one of those devices.
This allows virsh nodedevice-reattach to work properly across a
libvirtd restart, and fixes a couple of cases where we were
erroneously still hard-coding "pci-stub" as the drive name.
For some unknown reason, virPCIDeviceReattach had been calling
modprobe on the stub driver prior to unbinding the device. This was
problematic because we no longer know the name of the stub driver in
that function. However, it is pointless to probe for the stub driver
at that time anyway - because the device is bound to the stub driver,
we are guaranteed that it is already loaded, and so that call to
modprobe has been removed.
On cygwin, compilation failed because SIOCSIFHWADDR is undefined.
* src/util/virnetdev.c (virNetDevSetMAC): Cygwin can query but not
set mac address.
Signed-off-by: Eric Blake <eblake@redhat.com>
FreeBSD (and maybe other BSDs) have different member
names in struct ifreq when compared to Linux, such as:
- uses ifr_data instead of ifr_newname for setting
interface names
- uses ifr_index instead of ifr_ifindex for interface
index
Also, add a check for SIOCGIFHWADDR for virNetDevValidateConfig().
Use AF_LOCAL if AF_PACKET is not available.
Signed-off-by: Eric Blake <eblake@redhat.com>
These fixes solve a compilation failure on FreeBSD:
util/virnetdevtap.c: In function 'virNetDevTapGetName':
util/virnetdevtap.c:56: warning: unused parameter 'tapfd' [-Wunused-parameter]
util/virnetdevtap.c:56: warning: unused parameter 'ifname' [-Wunused-parameter]
* src/util/virnetdevtap.c (virNetDevTapGetName): Add attributes
when TUNGETIFF is not present.
Signed-off-by: Eric Blake <eblake@redhat.com>
This will be used on a tap file descriptor returned by the bridge helper
to populate the <target> element, because the helper does not provide
the interface name.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds two sets of functions:
1) lower level virProcessSet*() functions that will immediately set
the RLIMIT_MEMLOCK. RLIMIT_NPROC, or RLIMIT_NOFILE of either the
current process (using setrlimit()) or any other process (using
prlimit()). "current process" is indicated by passing a 0 for pid.
2) functions for virCommand* that will setup a virCommand object to
set those limits at a later time just after it has forked a new
process, but before it execs the new program.
configure.ac has prlimit and setrlimit added to the list of functions
to check for, and the low level functions log an "unsupported" error)
on platforms that don't support those functions.
If a user cgroup name begins with "cgroup.", "_" or with any of
the controllers from /proc/cgroups followed by a dot, then they
need to be prefixed with a single underscore. eg if there is
an object "cpu.service", then this would end up as "_cpu.service"
in the cgroup filesystem tree, however, "waldo.service" would
stay "waldo.service", at least as long as nobody comes up with
a cgroup controller called "waldo".
Since we require a '.XXXX' suffix on all partitions, there is
no scope for clashing with the kernel 'tasks' and 'release_agent'
files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the partition named passed in the XML does not already have
a suffix, ensure it gets a '.partition' added to each component.
The exceptions are /machine, /user and /system which do not need
to have a suffix, since they are fixed partitions at the top
level.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Recently we changed to create VM cgroups with the naming pattern
$VMNAME.$DRIVER.libvirt. Following discussions with the systemd
community it was decided that only having a single '.' in the
names is preferrable. So this changes the naming scheme to be
$VMNAME.libvirt-$DRIVER. eg for LXC 'mycontainer.libvirt-lxc' or
for KVM 'myvm.libvirt-qemu'.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This can be set when the virPCIDevice is created and placed on a list,
then used later when traversing the list to determine which stub
driver to bind/unbind for managed devices.
The existing Detach and Attach functions' signatures haven't been
changed (they still accept a stub driver name in the arg list), but if
the arg list has NULL for stub driver and one is available in the
device's object, that will be used. (we may later deprecate and remove
the arg from those functions).
POSIX says that both basename() and dirname() may return static
storage (aka they need not be thread-safe); and that they may but
not must modify their input argument. Furthermore, <libgen.h>
is not available on all platforms. For these reasons, you should
never use these functions in a multi-threaded library.
Gnulib instead recommends a way to avoid the portability nightmare:
gnulib's "dirname.h" provides useful thread-safe counterparts. The
obvious dir_name() and base_name() are GPL (because they malloc(),
but call exit() on failure) so we can't use them; but the LGPL
variants mdir_name() (malloc's or returns NULL) and last_component
(always points into the incoming string without modifying it,
differing from basename semantics only on corner cases like the
empty string that we shouldn't be hitting in the first place) are
already in use in libvirt. This finishes the swap over to the safe
functions.
* cfg.mk (sc_prohibit_libgen): New rule.
* src/util/vircgroup.c: Fix offenders.
* src/parallels/parallels_storage.c (parallelsPoolAddByDomain):
Likewise.
* src/parallels/parallels_network.c (parallelsGetBridgedNetInfo):
Likewise.
* src/node_device/node_device_udev.c (udevProcessSCSIHost)
(udevProcessSCSIDevice): Likewise.
* src/storage/storage_backend_disk.c
(virStorageBackendDiskDeleteVol): Likewise.
* src/util/virpci.c (virPCIGetDeviceAddressFromSysfsLink):
Likewise.
* src/util/virstoragefile.h (_virStorageFileMetadata): Avoid false
positive.
Signed-off-by: Eric Blake <eblake@redhat.com>
Refactoring done in 19c6ad9ac7e7eb2fd3c8262bff5f087b508ad07f didn't
correctly take into account the order cgroup limit modification needs to
be done in. This resulted into errors when decreasing the limits.
The operations need to take place in this order:
decrease hard limit
change swap hard limit
or
change swap hard limit
increase hard limit
This patch also fixes the check if the hard_limit is less than
swap_hard_limit to print better error messages. For this purpose I
introduced a helper function virCompareLimitUlong to compare limit
values where value of 0 is equal to unlimited. Additionally the check is
now applied also when the user does not provide all of the tunables
through the API and in that case the currently set values are used.
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=950478
Create the utility function virSocketAddrGetIpPrefix() to
determine the prefix for this network. The code in this
function was adapted from virNetworkIpDefPrefix().
Update virNetworkIpDefPrefix() in src/conf/network_conf.c
to use the new utility function.
Signed-off-by: Gene Czarcinski <gene@czarc.net>
http://www.uhv.edu/ac/newsletters/writing/grammartip2009.07.01.htm
(and several other sites) give hints that 'onto' is best used if
you can also add 'up' just before it and still make sense. In many
cases in the code base, we really want the two-word form, or even
a simplification to just 'on' or 'to'.
* docs/hacking.html.in: Use correct 'on to'.
* python/libvirt-override.c: Likewise.
* src/lxc/lxc_controller.c: Likewise.
* src/util/virpci.c: Likewise.
* daemon/THREADS.txt: Use simpler 'on'.
* docs/formatdomain.html.in: Better usage.
* docs/internals/rpc.html.in: Likewise.
* src/conf/domain_event.c: Likewise.
* src/rpc/virnetclient.c: Likewise.
* tests/qemumonitortestutils.c: Likewise.
* HACKING: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
When running unprivileged, virSetUIDGIDWithCaps will fail because it
tries to add the requested capabilities to the permitted and effective
sets.
Detect this case, and invoke the child with cleared permitted and
effective sets. If it is a setuid program, it will get them.
Some care is needed also because you cannot drop capabilities from the
bounding set without CAP_SETPCAP. Because of that, ignore errors from
setting the bounding set.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The need_prctl variable is not really needed. If it is false,
capng_apply will be called twice with the same set, causing
a little extra work but no problem. This keeps the code a bit
simpler.
It is also clearer to invoke capng_apply(CAPNG_SELECT_BOUNDS)
separately, to make sure it is done while we have CAP_SETPCAP.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The recent qemu requires "0x" prefix for the disk wwn, this patch
changes virValidateWWN to allow the prefix, and prepend "0x" if
it's not specified. E.g.
qemu-kvm: -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,\
drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,wwn=6000c60016ea71ad:
Property 'scsi-hd.wwn' doesn't take value '6000c60016ea71ad'
Though it's a qemu regression, but it's nice to allow the prefix,
and doesn't hurt for us to always output "0x".
Detected by a simple Shell script:
for i in $(git ls-files -- '*.[ch]'); do
awk 'BEGIN {
fail=0
}
/# *include.*\.h/{
match($0, /["<][^">]*[">]/)
arr[substr($0, RSTART+1, RLENGTH-2)]++
}
END {
for (key in arr) {
if (arr[key] > 1) {
fail=1
printf("%d %s\n", arr[key], key)
}
}
if (fail == 1)
exit 1
}' $i
if test $? != 0; then
echo "Duplicate header(s) in $i"
fi
done;
A later patch will add the syntax-check to avoid duplicate
headers.
Fix the error
util/vircgroup.c: In function 'virCgroupNewDomainPartition':
util/vircgroup.c:1299:11: error: declaration of 'dirname' shadows a global declaration [-Werror=shadow]
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Add a virCgroupIsolateMount method which looks at where the
current process is place in the cgroups (eg /system/demo.lxc.libvirt)
and then remounts the cgroups such that this sub-directory
becomes the root directory from the current process' POV.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If a cgroup controller is co-mounted with another, eg
/sys/fs/cgroup/cpu,cpuacct
Then it is a requirement that there exist symlinks at
/sys/fs/cgroup/cpu
/sys/fs/cgroup/cpuacct
pointing to the real mount point. Add support to virCgroupPtr
to detect and track these symlinks
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCgroupNewDriver method had a 'bool privileged' param.
If a false value was ever passed in, it would simply not
work, since non-root users don't have any privileges to create
new cgroups. Just delete this broken code entirely and make
the QEMU driver skip cgroup setup in non-privileged mode
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A resource partition is an absolute cgroup path, ignoring the
current process placement. Expose a virCgroupNewPartition API
for constructing such cgroups
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently if virCgroupMakeGroup fails, we can get in a situation
where some controllers have been setup, but others not. Ensure
we call virCgroupRemove to remove what we've done upon failure
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the virCgroupPtr struct contains 3 pieces of
information
- path - path of the cgroup, relative to current process'
cgroup placement
- placement - current process' placement in each controller
- mounts - mount point of each controller
When reading/writing cgroup settings, the path & placement
strings are combined to form the file path. This approach
only works if we assume all cgroups will be relative to
the current process' cgroup placement.
To allow support for managing cgroups at any place in the
heirarchy a change is needed. The 'placement' data should
reflect the absolute path to the cgroup, and the 'path'
value should no longer be used to form the paths to the
cgroup attribute files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Rename all the virCgroupForXXX methods to use the form
virCgroupNewXXX since they are all constructors. Also
make sure the output parameter is the last one in the
list, and annotate all pointers as non-null. Fix up
all callers, and make sure they use true/false not 0/1
for the boolean parameters
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>