The sd_notify method is used to tell systemd when libvirtd
has finished starting up. All it does is send a datagram
containing the string parameter to systemd on a UNIX socket
named in the NOTIFY_SOCKET environment variable. Rather than
pulling in the systemd libraries for this, just code the
notification directly in libvirt as this is a stable ABI
from systemd's POV which explicitly allows independant
implementations:
See "Reimplementable Independently" column in the
"$NOTIFY_SOCKET Daemon Notifications" row:
https://www.freedesktop.org/wiki/Software/systemd/InterfacePortabilityAndStabilityChart/
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1314881
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the module from qemu_command.c to a new module virqemu.c and
rename the API to virQEMUBuildObjectCommandline.
This API will then be shareable with qemu-img and the need to build
a security object for luks support.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When building using -Og, gcc sees that some variables can be used
uninitialized It can be debatable whether it is possible with our
codeflow, but functions should be self-contained and initializations are
always good. The return instead of goto is due to actualType being used
in the cleanup.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
So imagine the following. You connect read only to a daemon and
try to fetch stats for a shut off domain, e.g.:
virsh -r domstats $dom
but all of a sudden, virsh instead of printing the stats throws
the following error at you:
error: Disconnected from qemu:///system due to I/O error
error: End of file while reading data: Input/output error
The daemon crashed. This is its backtrace:
#0 0x00007fa43e3751a8 in virPerfEventIsEnabled (perf=0x0, type=VIR_PERF_EVENT_MBMT) at util/virperf.c:241
#1 0x00007fa424a9f042 in qemuDomainGetStatsPerf (driver=0x7fa3f4022a30, dom=0x7fa3f40e24c0, record=0x7fa41c000e20, maxparams=0x7fa4360b38d0, privflags=1) at qemu/qemu_driver.c:19110
#2 0x00007fa424a9f2e7 in qemuDomainGetStats (conn=0x7fa41c001b20, dom=0x7fa3f40e24c0, stats=127, record=0x7fa4360b3970, flags=1) at qemu/qemu_driver.c:19213
#3 0x00007fa424a9f672 in qemuConnectGetAllDomainStats (conn=0x7fa41c001b20, doms=0x7fa41c0017f0, ndoms=1, stats=127, retStats=0x7fa4360b3a50, flags=0) at qemu/qemu_driver.c:19303
#4 0x00007fa43e4e15f6 in virDomainListGetStats (doms=0x7fa41c0017f0, stats=0, retStats=0x7fa4360b3a50, flags=0) at libvirt-domain.c:11615
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f28d1a38700 (LWP 16154)]
0x00007f28da4fa1a8 in virPerfEventIsEnabled (perf=0x0, type=VIR_PERF_EVENT_MBMT) at util/virperf.c:241
241 return event->enabled;
Problem is, shut off domains don't have priv->perf allocated.
Therefore if in frame #1 qemuDomainGetStatsPerf() tries to check
if perf events are enabled, NULL is passed to
virPerfEventIsEnabled() which due to some incredible
implementation dereference it. Fix this by checking whether
passed object is not NULL.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This function is not used anywhere. Moreover, the code that would
use lives in virperf.c and therefore has access to the FD anyway.
Well, for instance virPerfReadEvent is doing just that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far, this function has just three callers. Two of them call
virNetDevSetupControl to create a socket that we can then
optionally use for ioctl() to fetch data. However, querying sysfs
is preferred. Therefore it doesn't make much sense to require
users to set up the socket if they don't even know it will be
used in favour of sysfs. We can set up the socket iff we need to.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Yet another one of those where signed int (or long int) is not
enough. And useless to as we're aiming at unsigned anyway.
../../src/util/virsocketaddr.c: In function 'virSocketAddrIsPrivate':
../../src/util/virsocketaddr.c:289:45: error: result of '192l << 24' requires 33 bits to represent, but 'long int' only has 32 bits [-Werror=shift-overflow=]
return ((val & 0xFFFF0000) == ((192L << 24) + (168 << 16)) ||
^~
../../src/util/virsocketaddr.c:290:45: error: result of '172l << 24' requires 33 bits to represent, but 'long int' only has 32 bits [-Werror=shift-overflow=]
(val & 0xFFF00000) == ((172L << 24) + (16 << 16)) ||
^~
cc1: all warnings being treated as errors
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Apparently, 1 << 31 is signed which in turn does not fit into
a signed integer variable:
../../include/libvirt/libvirt-domain.h:1881:57: error: result of '1 << 31' requires 33 bits to represent, but 'int' only has 32 bits [-Werror=shift-overflow=]
VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS = 1 << 31, /* enforce requested stats */
^~
cc1: all warnings being treated as errors
The solution is to make it an unsigned value. I've found only two
such occurrences in our code base.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit c8b1a83605 changed the function, making it
impossible for callers to be able to tell whether a
non-negative return value means "physical function
address found and parsed correctly" or "couldn't find
corresponding physical function".
The important difference between the two being that,
in the latter case, the returned pointer is NULL and
should never, ever be dereferenced.
In order to cope with these changes, the callers
have to be updated.
Move the logic from qemuDomainGenerateRandomKey into this new
function, altering the comments, variable names, and error messages
to keep things more generic.
NB: Although perhaps more reasonable to add soemthing to virrandom.c.
The virrandom.c was included in the setuid_rpc_client, so I chose
placement in vircrypto.
Introduce virCryptoHaveCipher and virCryptoEncryptData to handle
performing encryption.
virCryptoHaveCipher:
Boolean function to determine whether the requested cipher algorithm
is available. It's expected this API will be called prior to
virCryptoEncryptdata. It will return true/false.
virCryptoEncryptData:
Based on the requested cipher type, call the specific encryption
API to encrypt the data.
Currently the only algorithm support is the AES 256 CBC encryption.
Adjust tests for the API's
Seems recent versions of Coverity have (mostly) resolved the issue using
ternary operations in VIR_FREE (and now VIR_DISPOSE*) macros. So let's
just remove it and if necessary handle one off issues as the arise.
Rather than return 0/-1 and/or a pointer to some memory, adjust the
helper to just return the allocated structure or NULL on failure.
Adjust the callers in order to handle that
Signed-off-by: John Ferlan <jferlan@redhat.com>
If we get to the error: label and clear out the *virtual_functions[]
pointers and then return w/ error to the caller - the caller has it's
own cleanup of the same array in the out: label which is keyed off the
value of num_virt_fns, which wasn't reset to 0 in the called function
leading to a possible problem.
Just clear the value (found by Coverity)
Signed-off-by: John Ferlan <jferlan@redhat.com>
Some Intel processor families (e.g. the Intel Xeon processor E5 v3
family) introduced some RDT (Resource Director Technology) features
to monitor or control shared resource. Among these features, MBM
(Memory Bandwidth Monitoring), which is build on the CMT (Cache
Monitoring Technology) infrastructure, provides OS/VMM a way to
monitor bandwidth from one level of cache to another.
With current perf framework, this patch adds support to perf event
for MBM.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1331552
Instead of disabling auto-login of all scsi targets (even those
that do not "belong" to libvirt), use iscsiadm's "--op nonpersistent"
during discovery of iSCSI targets (e.g. "iscsiadm --mode discovery
--type sendtargets") in order to avoid the node database being altered
which led to the need for the "large hammer" approach taken by
commit id '3c12b654'.
This commit removes the virISCSITargetAutologin adjustment (eg. the setting
of node.startup to "manual"). The iscsiadm command has supported this mode
of operation as of commit id 'ad873767' to open-iscsi.
Utilize the exit status parameter for virCommandRunRegex in order to
check the return error from the 'iscsiadm --mode session' command.
Without this enabled, if there are no sessions running then virCommandRun
would have displayed an error such as:
2016-05-13 15:17:15.165+0000: 10920: error : virCommandWait:2553 :
internal error: Child process (iscsiadm --mode session)
unexpected exit status 21: iscsiadm: No active sessions.
It is possible that for certain paths (when probe is true) we only care
whether it's running or not to make certain decisions. Spitting out
the error for those paths is unnecessary.
If we do have a situation where probe = false and there's an error,
then display the error from iscsiadm if it's there.
Rather than have virCommandRun just spit out the error, allow callers
to decide to pass the exitstatus so the caller can make intelligent
decisions based on the error.
For a few cases where we handle secret information it's good to clear
the buffers containing sensitive data before freeing them.
Introduce VIR_DISPOSE, VIR_DISPOSE_N and VIR_DISPOSE_STRING that allow
simple clearing fo the buffers holding sensitive information on cleanup
paths.
Move some parts of virStorageFileRemoveLastPathComponent
into a separate function so they can be reused.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Both virGetLastError and virGetLastErrorMessage call virLastErrorObject method
that returns a thread-local error object. However, if a direct call to malloc
or pthread_setspecific (probably also due to malloc, since it sets ENOMEM)
fail, virLastErrorObject returns NULL which, although incorrectly interpreted
by virGetLastError as no error, still requires the caller to check for NULL
pointer. This isn't the case with virGetLastErrorMessage that also treated it
incorrectly as no error, but returned the literal "no error".
This patch tweaks the checks in the virGetLastErrorMessage function, so that
if virLastErrorObject failed, it returned "unknown error" which is equivalent
to the current approach with virGetLastError and if it returned NULL,
"unknown error" was set.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1318993
Commit id 'dd519a294' caused a regression cloning a volume into a
logical pool by removing just the 'allocation' adjustment during
storageVolCreateXMLFrom. Combined with the change to not require the
new volume input XML to have a capacity listed (commit id 'e3f1d2a8')
left the possibility that a zero allocation value (e.g., not provided)
would create a thin/sparse logical volume. When a thin lv becomes fully
populated, then LVM sets the partition 'inactive' and the subsequent
fdatasync() fails.
Add a new 'has_allocation' flag to be set at XML parse time to indicate
that allocation was provided. This is done so that if it's not provided
the create-from code uses the capacity value since we document that if
omitted, the volume will be fully allocated at time of creation.
For a logical backend, that creation time is 'createVol', while for a
file backend, creation doesn't set the size, but the 'createRaw' called
during buildVolFrom will decide whether the file is sparse or not based
on the provided capacity and allocation value.
For volume clones that provide different allocation and capacity values
to allow for sparse files, there is no change.
Usage of this keyword in front of function declaration that is exported via a
header file is unnecessary, since internally, this has been the default for most
compilers for quite some time.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
SRIOV VFs used in macvtap passthrough mode can take advantage of the
SRIOV card's transparent vlan tagging. All the code was there to set
the vlan tag, and it has been used for SRIOV VFs used for hostdev
interfaces for several years, but for some reason, the vlan tag for
macvtap passthrough devices was stubbed out with a -1.
This patch moves a bit of common validation down to a lower level
(virNetDevReplaceNetConfig()) so it is shared by hostdev and macvtap
modes, and updates the macvtap caller to actually send the vlan config
instead of -1.
Commit 0b36b0e9 broke polkit agent startup when attempting to fix a
coverity warning. Refactor it properly so that we don't need the 'cmd'
intermediate variable.
For disks sources described by a libvirt volume we don't need to do a
complicated check since virStorageTranslateDiskSourcePool already
correctly determines the actual disk type.
Replace the checks using a new accessor that does not open-code the
whole logic.
Fron c3bd0019c0 on instead of creating the following path for
cgroups:
/sys/fs/cgroupX/$name.libvirt-$driver
we generate rather more verbose one:
/sys/fs/cgroupX/$driver-$id-$name.libvirt-$driver
where $name is optional and included iff contains allowed chars.
See original commit for more reasoning. Now, problem with the
original commit is that we are unable to start any LXC domain
after it. Because when starting LXC container, the CGroup layout
is created by our lxc_controller process and then detected and
validated by libvirtd. The validation is done by trying to match
detected layout against all the possible patterns for cgroup
paths that we've ever had. And the commit in question forgot to
update this part of the code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
json_reformat uses two spaces for when indenting nested objects, let's
do the same. The result of virJSONValueToString will be exactly the same
as json_reformat would produce.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Our socket address format is in a rather non-standard format and that is
because sasl library requires the IP address and service to be delimited by a
semicolon. The string form is a completely internal matter, however once the
admin interfaces to retrieve client identity information are merged, we should
return the socket address string in a common format, e.g. format defined by
URI rfc-3986, i.e. the IP address and service are delimited by a colon and
in case of an IPv6 address, square brackets are added:
Examples:
127.0.0.1:1234
[::1]:1234
This patch changes our default format to the one described above, while adding
separate methods to request the non-standard SASL format using semicolon as a
delimiter.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Just like with server-related APIs, before any of client-based APIs can be
called, a reference to a client-side client object needs to be obtained. For
this purpose, a lookup method should exist. Apart from the client retrieval
logic, a new error code for non-existent client had to be added as well.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
We had both and the only difference was that the latter also included
information about multifunction setting. The problem with that was that
we couldn't use functions made for only one of the structs (e.g.
parsing). To consolidate those two structs, use the one in virpci.h,
include that in domain_conf.h and add the multifunction member in it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
$ echo -n 'log_level=1' > ~/.config/libvirt/libvirtd.conf
$ libvirtd --timeout=10
2014-10-10 10:30:56.394+0000: 6626: info : libvirt version: 1.1.3.6, package: 1.fc20 (Fedora Project, 2014-09-08-17:50:42, buildvm-05.phx2.fedoraproject.org)
2014-10-10 10:30:56.394+0000: 6626: error : main:1261 : Can't load config file: configuration file syntax error: /home/rjones/.config/libvirt/libvirtd.conf:1: expecting a value: /home/rjones/.config/libvirt/libvirtd.conf
Rather than try to fix this in the depths of the parser, just catch
the case when a config file doesn't end in a newline, and manually
append a newline to the content before parsing
https://bugzilla.redhat.com/show_bug.cgi?id=1151409
This an ubuntu/debian packaging convention. At one point it may have
been an actually different binary, but at least as of ubuntu precise
(the oldest supported ubuntu distro, released april 2012) kvm-img is
just a symlink to qemu-img for back compat.
I think it's safe to drop support for it
QEMU introduced the query-gic-capabilities QMP command
with commit 4468d4e0f383: use the command, if available,
to probe available GIC capabilities.
The information obtained is stored in a virQEMUCaps
instance, and will be later used to fill in a
virDomainCaps instance.
So in glibc-2.23 sys/sysmacros.h is no longer included from sys/types.h
and we don't build because of the usage of major/minor/makedev macros.
Autoconf already has AC_HEADER_MAJOR macro that check where exactly
these functions/macros are defined, so let's use that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since threadpool increments the current number of threads according to current
load, i.e. how many jobs are waiting in the queue. The count however, is
constrained by max and min limits of workers. The logic of this new API works
like this:
1) setting the minimum
a) When the limit is increased, depending on the current number of
threads, new threads are possibly spawned if the current number of
threads is less than the new minimum limit
b) Decreasing the minimum limit has no possible effect on the current
number of threads
2) setting the maximum
a) Icreasing the maximum limit has no immediate effect on the current
number of threads, it only allows the threadpool to spawn more
threads when new jobs, that would otherwise end up queued, arrive.
b) Decreasing the maximum limit may affect the current number of
threads, if the current number of threads is less than the new
maximum limit. Since there may be some ongoing time-consuming jobs
that would effectively block this API from killing any threads.
Therefore, this API is asynchronous with best-effort execution,
i.e. the necessary number of workers will be terminated once they
finish their previous job, unless other workers had already
terminated, decreasing the limit to the requested value.
3) setting priority workers
- both increase and decrease in count of these workers have an
immediate impact on the current number of workers, new ones will be
spawned or some of them get terminated respectively.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In order for the client to see all thread counts and limits, current total
and free worker count getters need to be introduced. Client might also be
interested in the job queue length, so provide a getter for that too. As with
the other getters, preparing for the admin interface, mutual exclusion is used
within all getters.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
So far, the values the affected getters retrieve are static, i.e. there's no
way of changing them during runtime. But admin interface will later enable
not only getting but changing them as well. So to prevent phenomenons like
torn reads or concurrent reads and writes of unaligned values, use mutual
exclusion when getting these values (writes do, understandably, use them
already).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When either creating a threadpool, or creating a new thread to accomplish a job
that had been placed into the jobqueue, every time thread-specific data need to
be allocated, threadpool needs to be (re)-allocated and thread count indicators
updated. Make the code clearer to read by compressing these operations into a
more complex one.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
virTypedParamsValidate currently uses an index based check to find
duplicate parameters. This check does not work. Consider the following
simple example:
We have only 2 keys
A (multiples allowed)
B (multiples NOT allowed)
We are given the following list of parameters to check:
A
A
B
If you work through the validation loop you will see that our last iteration
through the loop has i=2 and j=1. In this case, i > j and keys[j].value.i will
indicate that multiples are not allowed. Both conditionals are satisfied so
an incorrect error will be given: "parameter '%s' occurs multiple times"
This patch replaces the index based check with code that remembers
the name of the last parameter seen and only triggers the error case if
the current parameter name equals the last one. This works because the
list is sorted and duplicate parameters will be grouped together.
In reality, we hit this bug while using selective block migration to migrate
a guest with 5 disks. 5 was apparently just the right number to push i > j
and hit this bug.
virsh migrate --live guestname --copy-storage-all
--migrate-disks vdb,vdc,vdd,vde,vdf
qemu+ssh://dsthost/system
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com>
In a few places in libvirt we busy-wait for events, for example qemu
creating a monitor socket. This is problematic because:
- We need to choose a sufficiently small polling period so that
libvirt doesn't add unnecessary delays.
- We need to choose a sufficiently large polling period so that
the effect of busy-waiting doesn't affect the system.
The solution to this conflict is to use an exponential backoff.
This patch adds two functions to hide the details, and modifies a few
places where we currently busy-wait.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Refreshes meta-information such as allocation, capacity, format, etc.
Ploop volumes differ from other volume types. Path to volume is the path
to directory with image file root.hds and DiskDescriptor.xml.
https://openvz.org/Ploop/format
Due to this fact, operations of opening the volume have to be done once
again. get the information.
To decide whether the given volume is ploops one, it is necessary to check
the presence of root.hds and DiskDescriptor.xml files in volumes' directory.
Only in this case the volume can be manipulated as the ploops one.
Such strategy helps us to resolve problems that might occure, when we
upload some other volume type from ploop source.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
virSocketAddrFormat() wants a single pointer, not a double pointer.
Fixes the following compilation error on FreeBSD:
util/virnetdev.c:1448:72: error: incompatible pointer types passing
'virSocketAddr **' to parameter of type 'const virSocketAddr *';
remove & [-Werror,-Wincompatible-pointer-types]
if (VIR_SOCKET_ADDR_VALID(peer) && !(peerstr = virSocketAddrFormat(&peer)))
^~~~~
./util/virsocketaddr.h:92:48: note: passing argument to parameter 'addr' here
char *virSocketAddrFormat(const virSocketAddr *addr);
^
FreeBSD lacks ENODATA, and viruuid.c redefines it to EIO, but it's not
actually using it. On the other hand, we have virrandom.c that's using
ENODATA. So make this re-definition common by moving it to internal.h,
so all the current and possible future users don't need to care about
that.
Using the existing virUUIDGenerateRandomBytes, move API to virrandom.c
rename it to virRandomBytes and add it to libvirt_private.syms.
This will be used as a fallback for generating a domain master key.
This reverts commit ee4cfb5643.
Since we're still not persisting our bookkeeping lists across
daemon restarts, we might have lost some information
virPCIDeviceReattach() relies on, for example whether the
device needs to be unbound from the stub driver.
As a result, if the daemon has been restarted in the meantime,
the device might end up remaining bound to the stub driver even
after 'virsh nodedev-reattach' or similar has been called, with
no way of giving it back to the host short of messing with
sysfs behind libvirt's back.
Revert back to the previous behavior of always trying to bind
the device to the host driver, regardless of its status when it
was detached, until persistent bookkeeping lists have been
implemented.
Do I really need to explain why?
Well, if read() is interrupted int the middle of reading, we will
never read the rest (even though it's highly unlikely as we are
reading just 8 bytes).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have @flags we can support changing perf events just
in active or inactive configuration regardless of the other.
Previously, calling virDomainSetPerfEvents set events in both
active and inactive configuration at once. Even though we allow
users to set perf events that are to be enabled once domain is
started up. The virDomainGetPerfEvents API was flawed too. It
returned just runtime info.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In some cases it's impractical to use the regular APIs as the bitmap
size needs to be pre-declared. These new APIs allow to use bitmaps that
self expand.
The new code adds a property to the bitmap to track the allocation of
memory so that VIR_RESIZE_N can be used.
This patch implement a set of interfaces for perf event. Based on
these interfaces, we can implement internal driver API for perf,
and get the results of perf conuter you care about.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Message-id: 1459171833-26416-4-git-send-email-qiaowei.ren@intel.com
After the patches that added tracking of in-use macvtap names (commit
370608, first appearing in libvirt-1.3.2), if the function to allocate
a new macvtap device came to a device name created outside libvirt, it
would retry the same device name MACVLAN_MAX_ID (8191) times before
finally giving up in failure.
The problem was that virBitmapNextClearBit was always being called
with "0" rather than the value most recently checked (which would
increment each time through the loop), so it would always return the
same id (since we dutifully release that id after failing to create a
new device using it).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1321546
Signed-off-by: Laine Stump <laine@laine.org>
Instead of forcing the values for the unbind_from_stub, remove_slot
and reprobe properties, look up the actual device and use that when
calling virPCIDeviceReattach().
This ensures the device is restored to its original state after
reattach: for example, if it was not bound to any driver before
detach, it will not be bound forcefully during reattach.
We would be just fine looking up the information in pcidevs most
of the time; however, some corner cases would not be handled
properly, so look up the actual device instead.
After this patch, ownership of virPCIDevice instances is very easy
to keep track of: for each host PCI device, the only instance that
actually matters is the one inside one of the bookkeeping list.
Whenever some operation needs to be performed on a PCI device, the
actual device is looked up first; when this is not the case, a
comment explains the reason.
Unmanaged devices, as the name suggests, are not detached
automatically from the host by libvirt before being attached to a
guest: it's the user's responsability to detach them manually
beforehand. If that preliminary step has not been performed, the
attach operation can't complete successfully.
Instead of relying on the lower layers to error out with cryptic
messages such as
error: Failed to attach device from /tmp/hostdev.xml
error: Path '/dev/vfio/12' is not accessible: No such file or directory
prevent the situation altogether and provide the user with a more
useful error message.
Unmanaged devices are attached to guests in two steps: first,
the device is detached from the host and marked as inactive;
subsequently, it is marked as active and attached to the guest.
If the daemon is restarted between these two operations, we lose
track of the inactive device.
Steps 5 and 6 of virHostdevPreparePCIDevices() already subtly
take care of this situation, but some planned changes will make
it so that's no longer the case. Plus, explicit is always better
than implicit.
This allows setting the address in host and/or network order and makes
the naming consistent. Now you don't need to call [hn]to[nh]l()
functions as that is taken care of by these functions. Also, now
the *NetOrder take the address in network order, the other functions in
host order so the naming and usage is consistent. Some places were
having the address in network order and calling ntohl() just so the
original function can call htonl() again. This makes it nicer to read.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If we expose this information, which is one byte in every PCI config
file, we let all mgmt apps know whether the device itself is an endpoint
or not so it's easier for them to decide whether such device can be
passed through into a VM (endpoint) or not (*-bridge).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1317531
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The implementation is pretty straightforward. Moreover, because
of the nature of things, gethostbyname_r and gethostbyname2_r can
be implemented at the same time too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is a missing counterpart for virSocketAddrSetIPv4Addr()
and is going to be needed later in the tests.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This function is going to be used later in such context where the
argument makes no sense. Teach this function to cope with that
instead of the caller having to deal with passing some dummy
argument.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
These functions are going to be reused very shortly. So instead
of duplicating the code, lets move them into utils module.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We include the file in plenty of places. This is mostly due to
historical reasons. The only place that needs something from the
header file is storage_backend_fs which opens _PATH_MOUNTED. But
it gets the file included indirectly via mntent.h. At no other
place in our code we need _PATH_.*. Drop the include and
configure check then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Refactor series 0b231195 worked with virLogDestination type which, depending
on the compiler, might be (and probably will be) an unsigned data type.
However, virEnumFromString may return -1 in case of an error. So, when enum
happens to be unsigned, some compilers will naturally complain about foo:
'if (foo < 0)'
The problem with the original virLogParseOutputs method was that the way it
parsed the input, walking the string char by char and using absolute jumps
depending on the virLogDestination type, was rather complicated to read.
This patch utilizes virStringSplit method twice, first time to filter out any
spaces and split the input to individual log outputs and then for each
individual output to tokenize it by to the parts according to our
PRIORITY:DESTINATION?(:DATA) format. Also, to STREQLEN for matching destination
was replaced with virDestinationTypeFromString call.
In order to refactor the ugly virLogParseOutputs method, this is a neat way of
finding out whether the destination type (in the form of a string) user
provided is a valid one. As a bonus, if it turns out it is valid, we get the
actual enum which will later be passed to any of virLogAddOutput methods right
away.
These comments explain the difference between a virPCIDevice
instance used for lookups and an actual device instance; some
information is also provided for specific uses.
virHostdevGetPCIHostDeviceList() is similar but does not filter out
devices that are not in the active list; that said, we are looking
up the device in the active list just a few lines after anyway, so
we might as well just keep a single function around.
This also helps stress the fact the objects contained in pcidevs are
only for looking up the actual devices, which is something later
commits will make even more explicit.
We're in the hostdev module, so mgr is not an ambiguous name, and
in fact it's already used in some cases. Switch all the code over.
Take the chance to shorten declaration of
virHostdevIsPCINodeDeviceUsedData structures.
When we want to look up a device in a device list and we already
have the IDs from another source, we can simply use
virPCIDeviceListFindByIDs() instead of creating a temporary device
object.
If 'last_processed_hostdev_vf != -1' is false then, since the
loop counter 'i' starts at 0, 'i <= last_processed_hostdev_vf'
can't possibly be true and the loop body will never be executed.
However, since 'i' is unsigned and 'last_processed_hostdev_vf'
is signed, we can't just get rid of the check completely; what
we can do is move it outside of the loop to avoid checking its
value on every iteration and cluttering the actual loop
condition.
This serves the same purpose as VIR_ERR_NO_xxx where xxx is any object
that API can be called upon. Only this particular one is used for
daemon's servers.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The comment claimed that virPCIDeviceReattach() does not reattach
a device to the host driver; except it actually does, so the
comment is just confusing and we're better off removing it.
Replace the term "loop" with the more generic "step". This allows us
to be more flexible and eg. have a step that consists in a single
function call.
Don't include the number of steps in the first comment of the
function, so that we can add or remove steps without having to worry
about keeping that comment in sync.
For the same reason, remove the summary contained in that comment.
Clean up some weird vertical spacing while we're at it.
While trying to build with -Os I've encountered some build
failures.
util/vircommand.c: In function 'virCommandAddEnvFormat':
util/vircommand.c:1257:1: error: inlining failed in call to 'virCommandAddEnv': call is unlikely and code size would grow [-Werror=inline]
virCommandAddEnv(virCommandPtr cmd, char *env)
^
util/vircommand.c:1308:5: error: called from here [-Werror=inline]
virCommandAddEnv(cmd, env);
^
This function is big enough for the compiler to be not inlined.
This is the error message I'm seeing:
Then virDomainNumatuneNodeSpecified is exported and called from
other places. It shouldn't be inlined then.
In file included from network/bridge_driver_platform.h:30:0,
from network/bridge_driver_platform.c:26:
network/bridge_driver_linux.c: In function 'networkRemoveRoutingFirewallRules':
./conf/network_conf.h:350:1: error: inlining failed in call to 'virNetworkDefForwardIf.constprop': call is unlikely and code size would grow [-Werror=inline]
virNetworkDefForwardIf(const virNetworkDef *def, size_t n)
^
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
More fallout from changing to using virPolkitAgent and handling error
paths. Needed to clear the 'cmd' once stored and of course add the
virCommandFree(cmd) in the error: label.
In virPolkitAgentCreate neglected to initialize agent to NULL. If
there was an error in the pipe, then we jump to error and would have
an issue. Found by coverity.
qemuProcessSetupEmulator runs at a point in time where there is only
the qemu main thread. Use virCgroupAddTask to put just that one task
into the emulator cgroup. That patch makes virCgroupMoveTask and
virCgroupAddTaskStrController obsolete.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Introduce virPolkitAgentCreate and virPolkitAgentDestroy
virPolkitAgentCreate will run the polkit pkttyagent image as an asynchronous
command in order to handle the local agent authentication via stdin/stdout.
The code makes use of the pkttyagent --notify-fd mechanism to let it know
when the agent is successfully registered.
virPolkitAgentDestroy will close the command effectively reaping our
child process
When there isn't a ssh -X type session running and a user has not
been added to the libvirt group, attempts to run 'virsh -c qemu:///system'
commands from an otherwise unprivileged user will fail with rather
generic or opaque error message:
"error: authentication failed: no agent is available to authenticate"
This patch will adjust the error code and message to help reflect the
situation that the problem is the requested mechanism is UNAVAILABLE and
a slightly more descriptive error. The result on a failure then becomes:
"error: authentication unavailable: no polkit agent available to
authenticate action 'org.libvirt.unix.manage'"
A bit more history on this - at one time a failure generated the
following type message when running the 'pkcheck' as a subprocess:
"error: authentication failed: polkit\56retains_authorization_after_challenge=1
Authorization requires authentication but no agent is available."
but, a patch was generated to adjust the error message to help provide
more details about what failed. This was pushed as commit id '96a108c99'.
That patch prepended a "polkit: " to the output. It really didn't solve
the problem, but gave a hint.
After some time it was deemed using DBus API calls directly was a
better way to go (since pkcheck calls them anyway). So, commit id
'1b854c76' (more or less) copied the code from remoteDispatchAuthPolkit
and adjusted it. Then commit id 'c7542573' adjusted the remote.c
code to call the new API (virPolkitCheckAuth). Finally, commit id
'308c0c5a' altered the code to call DBus APIs directly. In doing
so, it reverted the failing error message to the generic message
that would have been received from DBus anyway.
Use virCgroupAddTaskController in virCgroupAddTask so we have one
single point where we add tasks to cgroups.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
The virHostdevIsVirtualFunction() was called exactly twice, and in
both cases the return value was saved to a temporary variable before
being checked. This would be okay if it improved readability, but in
this case is pretty pointless.
Get rid of the temporary variable and check the return value
directly; while at it, change the check from '<= 0' to '!= 1' to
align it with the way other similar *IsVirtualFunction() functions
are used thorough the code.
virNetDevIsVirtualFunction() returns 1 if the interface is a
virtual function, 0 if it isn't and -1 on error. This means that,
despite the name suggesting otherwise, using it as a predicate is
not correct.
Fix two callers that were doing so adding an explicit check on
the return value.
It may be useful in some cases to call TristateSwitch helper with TristateBool.
Document that enum values equivalency in the code.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
GIC v2 is the default, but checking against that specific version when
we want to know whether the default has been selected is potentially
error prone; using an alias instead makes it safer.
In cf113e8d we changed the declaration of
virCgroupAllowDevicePath() and virCgroupDenyDevicePath().
However, while updating the stub for non-cgroup platforms for the
former we forgot to update the latter too causing a build
failure.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The method will now return 0 on success and -1 on error, rather than number of
items which it iterated over before it returned back to the caller. Since the
only place where we actually check the number of elements iterated is in
virhashtest, return value of 0 and -1 can be a pretty accurate hint that it
iterated over all the items. However, if we really want to know the number of
items iterated over (like virhashtest does), a counter has to be provided
through opaque data to each iterator call. This patch adjusts return value of
virHashForEach, refactors the body, so it returns as soon as one of the
iterators fail and adjusts virhashtest to reflect these changes.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Our existing virHashForEach method iterates through all items disregarding the
fact, that some of the iterators might have actually failed. Errors are usually
dispatched through an error element in opaque data which then causes the
original caller of virHashForEach to return -1. In that case, virHashForEach
could return as soon as one of the iterators fail. This patch changes the
iterator return type and adjusts all of its instances accordingly, so the
actual refactor of virHashForEach method can be dealt with later.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When adding disk images to ACL we may call those functions on NFS
shares. In that case we might get an EACCES, which isn't really relevant
since NFS would not hold a block device. This patch adds a flag that
allows to stop reporting an error on EACCES to avoid spaming logs.
Currently there's no functional change.
Since commit 47e5b5ae virCgroupAllowDevice allows to pass -1 as either
the minor or major device number and it automatically uses '*' in place
of that. Reuse the new approach through the code and drop the duplicated
functions.
We currently blindly accept any numeric value as a GIC version, even
though only GIC v2 and GIC v3 actually exist; on the other hand, we
reject "host", which is a perfectly legitimate value for QEMU guests.
This new enumeration contains all GIC versions libvirt is aware of.
The existing log messages for this have several problems; there are
two lines of log when one will suffice, they duplicate the function
name in log message (when it's already included by VIR_DEBUG), they're
missing some useful bits, they get logged even when the call is a NOP.
This patch cleans up the problems with those existing logs, and also
adds a new VIR_INFO-level log down at the function that is actually
creating and sending the netlink message that logs *everything* going
into the netlink message (which turns out to be much more useful in
practice for me; I didn't want to eliminate the logs at the existing
location though, in case they are useful in some scenario I'm
unfamiliar with; anyway those logs are remaining at debug level, so it
shouldn't be a bother to anyone).
Apparently we are not the only ones with dumb free functions
because dbus_message_unref() does not accept NULL either. But if
I were to vote, this one is even more evil. Instead of returning
an error just like we do it immediately dereference any pointer
passed and thus crash you app. Well done DBus!
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f878ebda700 (LWP 31264)]
0x00007f87be4016e5 in ?? () from /usr/lib64/libdbus-1.so.3
(gdb) bt
#0 0x00007f87be4016e5 in ?? () from /usr/lib64/libdbus-1.so.3
#1 0x00007f87be3f004e in dbus_message_unref () from /usr/lib64/libdbus-1.so.3
#2 0x00007f87bf6ecf95 in virSystemdGetMachineNameByPID (pid=9849) at util/virsystemd.c:228
#3 0x00007f879761bd4d in qemuConnectCgroup (driver=0x7f87600a32a0, vm=0x7f87600c7550) at qemu/qemu_cgroup.c:909
#4 0x00007f87976386b7 in qemuProcessReconnect (opaque=0x7f87600db840) at qemu/qemu_process.c:3386
#5 0x00007f87bf6edfff in virThreadHelper (data=0x7f87600d5580) at util/virthread.c:206
#6 0x00007f87bb602334 in start_thread (arg=0x7f878ebda700) at pthread_create.c:333
#7 0x00007f87bb3481bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) frame 2
#2 0x00007f87bf6ecf95 in virSystemdGetMachineNameByPID (pid=9849) at util/virsystemd.c:228
228 dbus_message_unref(reply);
(gdb) p reply
$1 = (DBusMessage *) 0x0
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The virStringListLength function does not ever modify the passed
string list. It merely counts the items in it. Make sure that we
reflect this bit in the function header.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(crobinso: fix up spacing and squash in sheepdog bit suggested
by Andrea)
In the commit 7938b533 we've changed the function signature,
however forgot to update stump that's used on systems without
CGroups causing a build failure.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I've noticed that variable @reply is not initialized and if
something at the beginning of the function fails, e.g.
virDBusGetSystemBus(), the control jump straight to cleanup label
where dbus_message_unref() is then called over this uninitialized
variable.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I've noticed couple of warning in dmesg while debugging
something:
[ 9683.973754] HTB: quantum of class 10001 is big. Consider r2q change.
[ 9683.976460] HTB: quantum of class 10002 is big. Consider r2q change.
I've read the HTB documentation and linux kernel code to find out
what's wrong. Basically we need to pass another argument
"quantum" to our tc cmd line because the default computed by HTB
does not always work in which case the warning message is printed
out.
You can read more details here:
http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm#sharing
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So, systemd-machined has this philosophy that machine names are like
hostnames and hence should follow the same rules. But we always allowed
international characters in domain names. Thus we need to modify the
machine name we are passing to systemd.
In order to change some machine names that we will be passing to systemd,
we also need to call TerminateMachine at the end of a lifetime of a
domain. Even for domains that were started with older libvirt. That
can be achieved thanks to virSystemdGetMachineNameByPID(). And because
we can change machine names, we can get rid of the inconsistent and
pointless escaping of domain names when creating machine names.
So this patch modifies the naming in the following way. It creates the
name as <drivername>-<id>-<name> where invalid hostname characters are
stripped out of the name and if the resulting name is longer, it
truncates it to 64 characters. That way we can start domains we
couldn't start before. Well, at least on systemd.
To make it work all together, the machineName (which is needed only with
systemd) is saved in domain's private data. That way the generation is
moved to the driver and we don't need to pass various unnecessary
arguments to cgroup functions.
The only thing this complicates a bit is the scope generation when
validating a cgroup where we must check both old and new naming, so a
slight modification was needed there.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Same as for deserializer, this method might get handy for admin one day.
The major reason for this patch is to stay consistent with idea, i.e.
when deserializer can be shared, why not serializer as well. The only
problem to be solved was that the daemon side serializer uses a code
snippet which handles sparse arrays returned by some APIs as well as
removes any string parameters that can't be returned to older clients.
This patch makes of the new virTypedParameterRemote datatype introduced
by one of the pvious patches.
Since the method is static to remote_driver, it can't even be used by our
daemon. Other than that, it would be useful to be able to use it with admin as
well. This patch uses the new virTypedParameterRemote datatype introduced in
one of previous patches.
Currently, the deserializer is hardcoded into remote_driver which makes
it impossible for admin to use it. One way to achieve a shared implementation
(besides moving the code to another module) would be pass @ret_params_val as a
void pointer as opposed to the remote_typed_param pointer and add a new extra
argument specifying which of those two protocols is being used and typecast
the pointer at the function entry. An example from remote_protocol:
struct remote_typed_param_value {
int type;
union {
int i;
u_int ui;
int64_t l;
uint64_t ul;
double d;
int b;
remote_nonnull_string s;
} remote_typed_param_value_u;
};
typedef struct remote_typed_param_value remote_typed_param_value;
struct remote_typed_param {
remote_nonnull_string field;
remote_typed_param_value value;
};
That would leave us with a bunch of if-then-elses that needed to be used across
the method. This patch takes the other approach using the new datatype
introduced in one of earlier commits.
Both admin and remote protocols define their own types
(remote_typed_param vs admin_typed_param). Because of the naming convention,
admin typed params wouldn't be able to reuse the serialization/deserialization
methods, which are tailored for use by remote protocol, even if those method
were exported properly. In that case, introduce a new internal data type
structurally copying both admin and remote protocols which, eventually, would
allow serializer and deserializer to be used in a more generic way.
Use 'ret' for return variable name, clarify use of 'param_idx' and avoid
unnecessary 'success' label. No functional changes. Also document the
function.
The affected functions are:
virPCIDeviceGetManaged()
virPCIDeviceGetUnbindFromStub()
virPCIDeviceGetRemoveSlot()
virPCIDeviceGetReprobe()
Change their return type from unsigned int to bool: the corresponding
members in struct _virPCIDevice are defined as bool, and even the
corresponding virPCIDeviceSet*() functions take a bool value as input
so there's no point in these functions having unsigned int as return
type.
Suggested-by: John Ferlan <jferlan@redhat.com>
Unbinding a PCI device from the stub driver can require several steps,
and it can be useful for debugging to be able to trace which of these
steps are performed and which are skipped for each device.
The name is confusing, and there are just two uses: one is a test case,
and the other will be removed as part of an upcoming refactoring of
the hostdev code.
Commit 871e10f fixed a memory corruption error, but called strlen()
twice on the same string to do so. Even though the compiler is
probably smart enough to optimize the second call away, having a
single invocation makes the code slightly cleaner.
Suggested-by: Michal Privoznik <mprivozn@redhat.com>
In 370608b4c7 we have introduced two new internal APIs.
However, there are no stubs for build without macvtap. Therefore
build on systems lacking macvtap support (e.g. mingw or freebds)
fails when trying to link.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
libvirtd crashes on free()ing portData for an open vswitch port if that port
was deleted. To reproduce:
ovs-vsctl del-port vnet0
virsh migrate --live kvm1 qemu+ssh://dstHost/system
Error message:
libvirtd: *** Error in `/usr/sbin/libvirtd': free(): invalid pointer: 0x000003ff90001e20 ***
The problem is that virCommandRun can return an empty string in the event that
the port being queried does not exist. When this happens then we are
unconditionally overwriting a newline character at position strlen()-1. When
strlen is 0, we overwrite memory that does not belong to the string.
The fix: Only overwrite the newline if the string is not empty.
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
This patch creates two bitmaps, one for macvlan device names and one
for macvtap. The bitmap position is used to indicate that libvirt is
currently using a device with the name macvtap%d/macvlan%d, where %d
is the position in the bitmap. When requested to create a new
macvtap/macvlan device, libvirt will now look for the first clear bit
in the appropriate bitmap and derive the device name from that rather
than just starting at 0 and counting up until one works.
When libvirtd is restarted, the qemu driver code that reattaches to
active domains calls the appropriate function to "re-reserve" the
device names as it is scanning the status of running domains.
Note that it may seem strange that the retry counter now starts at
8191 instead of 5. This is because we now don't do a "pre-check" for
the existence of a device once we've reserved it in the bitmap - we
move straight to creating it; although very unlikely, it's possible
that someone has a running system where they have a large number of
network devices *created outside libvirt* named "macvtap%d" or
"macvlan%d" - such a setup would still allow creating more devices
with the old code, while a low retry max in the new code would cause a
failure. Since the objective of the retry max is just to prevent an
infinite loop, and it's highly unlikely to do more than 1 iteration
anyway, having a high max is a reasonable concession in order to
prevent lots of new failures.
In the following cases nl_recv() was returning the error "No buffer
space available":
* When switching CPUs to offline/online in a system more than 128 cpus
* When using virsh to destroy domain in a system with many interfaces
This patch sets the buffer size for all netlink sockets created by
libnl to 128K and turns on message peeking for nl_recv(). This
eliminates the "No buffer space available" errors seen in the cases
above, and also preempts other future errors the smaller buffers could
have caused.
Signed-off-by: Leno Hou <houqy@linux.vnet.ibm.com>
Signed-off-by: Laine Stump <laine@laine.org>
In dc576025c3 we renamed virCgroupIsolateMount function to
virCgroupBindMount. However, we forgot about one occurrence in
section of the code which provides stubs for platforms without
support for CGroups like *BSD for instance.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On the host when we start a container, it will be
placed in a cgroup path of
/machine.slice/machine-lxc\x2ddemo.scope
under /sys/fs/cgroup/*
Inside the containers' namespace we need to setup
/sys/fs/cgroup mounts, and currently will bind
mount /machine.slice/machine-lxc\x2ddemo.scope on
the host to appear as / in the container.
While this may sound nice, it confuses applications
dealing with cgroups, because /proc/$PID/cgroup
now does not match the directory in /sys/fs/cgroup
This particularly causes problems for systems and
will make it create repeated path components in
the cgroup for apps run in the container eg
/machine.slice/machine-lxc\x2ddemo.scope/machine.slice/machine-lxc\x2ddemo.scope/user.slice/user-0.slice/session-61.scope
This also causes any systemd service that uses
sd-notify to fail to start, because when systemd
receives the notification it won't be able to
identify the corresponding unit it came from.
In particular this break rabbitmq-server startup
Future kernels will provide proper cgroup namespacing
which will handle this problem, but until that time
we should not try to play games with hiding parent
cgroups.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
libvirt always resets the MAC address of the physdev used for macvtap
passthrough when the guest is finished with it. This was happening
prior to the 802.1Qb[gh] DISASSOCIATE command, and was quite often
failing, presumably because the driver wouldn't allow the MAC address
to be reset while the association was still active, with a log message
like this:
virNetDevSetMAC:168 : Cannot set interface MAC to 00:00:00:00:00:00 on 'eth13': Cannot assign requested address
This patch changes the order - we now do the 802.1Qb[gh] disassociate
and delete the macvtap interface first, then and reset the MAC
address.
if instanceId is NULL
When virNetDevVPortProfileGetStatus() was called with instanceId =
NULL (which is the case for all DISASSOCIATE requests in 802.1Qbh) it
would log the following error:
Could not find netlink response with expected parameters
even though the disassociate had been successfully completely. Then,
due to the fortunate coincidence of status having been initialized to
0 and then not changed when the "failure" was encountered, it would
still return a status of 0 (PORT_VDP_RESPONSE_SUCCESS), so the caller
would assume a successful operation.
This would result in a spurious log message though, and would fill in
LastErrorMessage, so that the API would return that error if it
happened during cleanup from some other error. That, in turn, would
lead to an incorrect supposition that the response to the port profile
disassociate was the cause of the failure.
During debugging, I noticed that the VF in question usually had *no
uuid* associated with it (big surprise)by the time the disassociate
completed, so the solution is *not* to send the previous instanceId
down.
This patch fixes virNetDevVPortProfileGetStatus() to only check the
VF's uuid in the status if it was given an instanceId to check against
when originally called. Otherwise it only checks that the particular
VF is present (it will be).
This does cause a slight difference in behavior - rather than
returning with status unchanged (and thus always 0) it will actually
get the IFLA_PORT_RESPONSE. This could lead to revelation of error
conditions we were previously ignoring. Or not. So far "not".
Some of the protocol files already include handing of the missing int
types such as xdr_uint64_t, some don't. To fix it everywhere, move out
of the appropriate defines to the utils/virxdrdefs.h file and include
it where needed.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
As cgroup implementation only works on Linux, it does not
make much sense to include sys/mount.h if other requirements are
not met, such as HAVE_MNTENT_H and HAVE_GETMNTENT_R.
Also, it fixes build on OpenBSD that requires to include sys/param.h
along with sys/mount.h.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Most of the changes to the list of active and inactive PCI devices
happen in virHostdev, where they are properly logged.
virPCIDeviceDetach() and virPCIDeviceReattach(), however, change the
inactive list as well, so they should be logging similar messages.
Instead of misusing a const string to hold up runtime allocated
data, introduce new variable @hoststr and obey const correctness.
==6879== 15 bytes in 1 blocks are definitely lost in loss record 68 of 1,064
==6879== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6879== by 0xA7DDF97: vasprintf (in /lib64/libc-2.21.so)
==6879== by 0x552BBC6: virVasprintfInternal (virstring.c:493)
==6879== by 0x552BCDB: virAsprintfInternal (virstring.c:514)
==6879== by 0x54FA44C: virLogHostnameString (virlog.c:468)
==6879== by 0x54FAB0F: virLogVMessage (virlog.c:645)
==6879== by 0x54FA680: virLogMessage (virlog.c:531)
==6879== by 0x54FBBF4: virLogParseOutputs (virlog.c:1130)
==6879== by 0x11CB4F: daemonSetupLogging (libvirtd.c:685)
==6879== by 0x11E137: main (libvirtd.c:1297)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Once @hostname is printed into @hoststr we don't need it anymore.
==6879== 5 bytes in 1 blocks are definitely lost in loss record 10 of 1,064
==6879== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6879== by 0xA7ED599: strdup (in /lib64/libc-2.21.so)
==6879== by 0x552C126: virStrdup (virstring.c:726)
==6879== by 0x553B13E: virGetHostnameImpl (virutil.c:720)
==6879== by 0x553B1BF: virGetHostnameQuiet (virutil.c:741)
==6879== by 0x54FA3FD: virLogHostnameString (virlog.c:462)
==6879== by 0x54FAB0F: virLogVMessage (virlog.c:645)
==6879== by 0x54FA680: virLogMessage (virlog.c:531)
==6879== by 0x54FBBF4: virLogParseOutputs (virlog.c:1130)
==6879== by 0x11CB4F: daemonSetupLogging (libvirtd.c:685)
==6879== by 0x11E137: main (libvirtd.c:1297)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If no port number was provided for a storage pool libvirt defaults to
port 6789; however, librbd/librados already default to 6789 when no port
number is provided.
In the future Ceph will switch to a new port for the Ceph monitors since
port 6789 is already assigned to a different application by IANA.
Port 6789 is assigned to SMC-HTTPS and Ceph now has port 3300 assigned as
the 'Ceph monitor' port.
In this case it is the best solution to not hardcode any port number into
libvirt and let librados handle the connection.
Only if a user specifies a different port number we pass it down to librados,
otherwise we leave it blank.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
merge
Due to debug logs like this:
virPCIGetDeviceAddressFromSysfsLink:2432 : Attempting to resolve device path from device link '/sys/class/net/eth1/device/virtfn6'
logStrToLong_ui:2369 : Converted '0000:07:00.7' to unsigned int 0
logStrToLong_ui:2369 : Converted '07:00.7' to unsigned int 7
logStrToLong_ui:2369 : Converted '00.7' to unsigned int 0
logStrToLong_ui:2369 : Converted '7' to unsigned int 7
virPCIGetDeviceAddressFromSysfs:1947 : virPCIDeviceAddress 0000:07:00.7
virPCIGetVirtualFunctions:2554 : Found virtual function 7
printed *once for each SR-IOV Virtual Function* of a Physical Function
each time libvirt retrieved the list of VFs (so if the system has 128
VFs, there would be 900 lines of log for each call), the debug logs on
any system with a large number of VFs was dominated by "information"
that was possibly useful for debugging when the code was being
written, but is now useless for debugging of any problem on a running
system, and only serves to obscure the real useful information. This
overkill has no place in production code, so this patch removes it.
The previous error message just indicated that the desired response
couldn't be found, this patch tells what was desired, as well as
listing out the entire table that had been in the netlink response, to
give some kind of idea why it failed.
I noticed in a log file that we had failed to set a MAC address. The
log said which interface we were trying to set, but didn't give the
offending MAC address, which could have been useful in determining the
source of the problem. This patch modifies all three places in the
code that set MAC addresses to report the failed MAC as well as
interface.
The manpage for sysconf() suggest including unistd.h as the
function is declared there. Even though we are not hitting any
compile issues currently, let's include the correct header file
instead of relying on some hidden include chain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We want to eventually factor out the code dealing with device detaching
and reattaching, so that we can share it and make sure it's called eg.
when 'virsh nodedev-detach' is used.
For that to happen, it's important that the lists of active and inactive
PCI devices are updated every time a device changes its state.
Instead of passing NULL as the last argument of virPCIDeviceDetach() and
virPCIDeviceReattach(), pass the proper list so that it can be updated.
This replaces the virPCIKnownStubs string array that was used
internally for stub driver validation.
Advantages:
* possible values are well-defined
* typos in driver names will be detected at compile time
* avoids having several copies of the same string around
* no error checking required when setting / getting value
The names used mirror those in the
virDomainHostdevSubsysPCIBackendType enumeration.
This internal function supports, in theory, binding to a different
stub driver than the one the PCI device has been configured to use.
In practice, it is only ever called like
virPCIDeviceBindToStub(dev, dev->stubDriver);
which makes its second parameter redundant. Get rid of it, along
with the extra string copy required to support it.
This function can be used to retrieve the current locked memory
limit for a process, so that the setting can be later restored.
Add a configure check for getrlimit(), which we now use.
The prlimit() function allows both getting and setting limits for
a process; expose the same functionality in our wrapper.
Add the const modifier for new_limit, in accordance with the
prototype for prlimit().
Instead of replicating the information (domain, bus, slot, function)
inside the virPCIDevice structure, use the already-existing
virPCIDeviceAddress structure.
For users of the module, this means that the object returned by
virPCIDeviceGetAddress() can no longer be NULL and must no longer
be freed by the caller.
virCgroupNewMachine used to add the pidleader to the newly created
machine cgroup. Do not do this implicit anymore.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Firstly, there's a bug (or typo) in the only place where we call
this function: @multiqueue is set whenever @tapfdSize is greater
than zero, while in fact the condition should have been 'greater
than one'.
Then, secondly, since the condition depends on just one
variable, that we are even passing down to the function, we can
move the condition into the function and drop useless argument.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Some older systems, e.g. RHEL-6 do not have IFF_MULTI_QUEUE flag
which we use to enable multiqueue feature. Therefore one gets the
following compile error there:
CC util/libvirt_util_la-virnetdevmacvlan.lo
util/virnetdevmacvlan.c: In function 'virNetDevMacVLanTapSetup':
util/virnetdevmacvlan.c:338: error: 'IFF_MULTI_QUEUE' undeclared (first use in this function)
util/virnetdevmacvlan.c:338: error: (Each undeclared identifier is reported only once
util/virnetdevmacvlan.c:338: error: for each function it appears in.)
make[3]: *** [util/libvirt_util_la-virnetdevmacvlan.lo] Error 1
So, whenever user wants us to enable the feature on such systems,
we will just throw a runtime error instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit id '56e2171c6' removed a variable from the argument list, but
neglected to update the ATTRIBUTE_NONNULL values, so when commit id
'08da97bfb' added a couple of arguments, the values were off.
Always return LLONG_MAX even on 32 bit systems. The limitation
originates from our use of "unsigned long" in several APIs. The internal
data type is unsigned long long. Make the test suite deterministic by
removing the architecture difference.
Flaw was introduced in 645881139b where
I've added a test that uses too large numbers.
For the multiqueue on macvtaps we are going to need to open
the device multiple times. Currently, this is not supported.
Rework the function, so that upper layers can be reworked too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Like we are doing for TUN/TAP devices, we should do the same for
macvtaps. Although, it's not as critical as in that case, we
should do it for the consistency.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For the multiqueue on macvtaps we are going to need to open
the device multiple times. Currently, this is not supported.
Rework the function, so that upper layers can be reworked too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For the multiqueue on macvtaps we are going to need to open
the device multiple times. Currently, this is not supported.
Rework the function, so that upper layers can be reworked too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There are few outdated things. Firstly, we don't need to undergo
the torture of fopen, fscanf and fclose just to get the interface
index when we have nice wrapper over that: virNetDevGetIndex.
Secondly, we don't need to have statically allocated buffer for
the path we are opening.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So yet again one of integer arguments that we use as a boolean.
Since the argument count of the function is unbearably long
enough, lets turn those booleans into flags.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On the very first log message we send to any output, we include
the libvirt version number and package string. In some bug reports
we have been given libvirtd.log files that came from a different
host than the corresponding /var/log/libvirt/qemu log files. So
extend the initial log message to include the hostname too.
eg on first log message we would now see:
$ libvirtd
2015-12-04 17:35:36.610+0000: 20917: info : libvirt version: 1.3.0
2015-12-04 17:35:36.610+0000: 20917: info : hostname: dhcp-1-180.lcy.redhat.com
2015-12-04 17:35:36.610+0000: 20917: error : qemuMonitorIO:687 : internal error: End of file from monitor
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The log file descriptor associated with the virRotatingFile
struct should be marked close-on-exec, as even when virtlogd
re-exec's itself it expect to open the log file fresh. It
does not need to preserve the logfile handles, only the network
client FDs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 0f7436ca54 "network: wait for DAD to finish for bridge IPv6 addresses"
results in:
CC util/libvirt_util_la-virnetdevmacvlan.lo
util/virnetdev.c: In function 'virNetDevParseDadStatus':
util/virnetdev.c:1319:188: error: cast increases required alignment of target type [-Werror=cast-align]
util/virnetdev.c:1332:41: error: cast increases required alignment of target type [-Werror=cast-align]
util/virnetdev.c:1334:92: error: cast increases required alignment of target type [-Werror=cast-align]
cc1: all warnings being treated as errors
on at least ARM platforms.
The three macros involved (NLMSG_NEXT, IFA_RTA and RTA_NEXT) all appear to
correctly take care of alignment, therefore suppress Wcast-align around their
uses.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Maxim Perevedentsev <mperevedentsev@virtuozzo.com>
Cc: Laine Stump <laine@laine.org>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Our domain_conf.* files are big enough. Not only they contain XML
parsing code, but they served as a storage of all functions whose
name is virDomain prefixed. This is just wrong as it gathers not
related functions (and modules) into one big file which is then
harder to maintain. Split virDomainObjList module into a separate
file called virdomainobjlist.[ch].
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If for some reason there is an existing log file, that is larger then
max length of log file, we need to rollover that file immediately.
Trying to figure out how much data we could write will resolve in
overflow of unsigned variable 'towrite' and this leads to segfault.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
As we need to provide support for URI aliases in libvirt-admin as well, URI
alias matching needs to be internally visible. Since
virConnectOpenResolveURIAlias does have a compatible signature, it could be
easily reused by libvirt-admin. This patch moves URI alias matching to util,
renaming it accordingly.
virConnectGetConfig and virConnectGetConfigPath were static libvirt
methods, merely because there hasn't been any need for having them
internally exported yet. Since libvirt-admin also needs to reference
its config file, 'xGetConfig' should be exported.
Besides moving, this patch also renames the methods accordingly,
as they are libvirt config specific.
Machine name escaping follows the same rules as serice name escape,
except that '.' and '-' must not be escaped in machine names, due
to a bug in systemd-machined.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Copy the virtlockd codebase across to form the initial virlogd
code. Simple search & replace of s/lock/log/ and gut the remote
protocol & dispatcher. This gives us a daemon that starts up
and listens for connections, but does nothing with them.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add virRotatingFileReader and virRotatingFileWriter objects
which allow reading & writing from/to files with automation
rotation to N backup files when a size limit is reached. This
is useful for guest logging when a guaranteed finite size
limit is required. Use of external tools like logrotate is
inadequate since it leaves the possibility for guest to DOS
the host in between invokations of logrotate.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
According to the documentation, CreateMachine accepts only 7bit ASCII
characters in the machinename parameter, so let's make sure we can start
machines with unicode names with systemd. We already have a function
for that, we just forgot to use it.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1062943
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
A PCI device may have the capability to setup virtual functions (VFs)
but have them currently all disabled. Prior to this patch, if that was
the case the the node device XML for the device wouldn't report any
virtual_functions capability.
With this patch, if a file called "sriov_totalvfs" is found in the
device's sysfs directory, its contents will be interpreted as a
decimal number, and that value will be reported as "maxCount" in a
capability element of the device's XML, e.g.:
<capability type='virtual_functions' maxCount='7'/>
This will be reported regardless of whether or not any VFs are
currently enabled for the device.
NB: sriov_numvfs (the number of VFs currently active) is also
available in sysfs, but that value is implied by the number of items
in the list that is inside the capability element, so there is no
reason to explicitly provide it as an attribute.
sriov_totalvfs and sriov_numvfs are available in kernels at least as far
back as the 2.6.32 that is in RHEL6.7, but in the case that they
simply aren't there, libvirt will behave as it did prior to this patch
- no maxCount will be displayed, and the virtual_functions capability
will be absent from the device's XML when 0 VFs are enabled.
Introduce a new helper function "virDiskNameParse" which extends
virDiskNameToIndex but handling both disk index and partition index.
Also rework virDiskNameToIndex to be based on virDiskNameParse.
A test is also added for this function testing both valid and
invalid disk names.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
The LXC driver uses virSetUIDGID() to become UID/GID 0.
It passes an empty groups list to virSetUIDGID()
to get rid of all supplementary groups from the host side.
But virSetUIDGID() calls setgroups() only if the supplied list
is larger than 0.
This leads to a container root with unrelated supplementary groups.
In most cases this issue is unoticed as libvirtd runs as UID/GID 0
without any supplementary groups.
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch addresses BZ 1244895.
Adapt the sysfs TPM command cancel path for the TPM driver that
does not use a miscdevice anymore since Linux 4.0. Support old
and new paths and check their availability.
Add a mockup for the test cases to avoid the testing for
availability of the cancel path.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Use virNetDevSetupControl instead of open coding using socket(AF_LOCAL...)
and clearing virIfreq.
By using virNetDevSetupControl, the socket is then opened using
AF_PACKET which requires being privileged (effectively root) in
order to complete successfully. Since that's now a requirement,
then the ioctl(SIOCETHTOOL) should not fail with EPERM, thus it
is removed from the filtered listed of failure codes.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since the SIOCETHTOOL ioctl only works for privileged daemons, if called
when not root, then virNetDevGetFeatures will VIR_DEBUG a message and
return 0 as if the functions were not available for the architecture.
This effectively returns an empty bitmap indicating no features available.
Introduced by commit id 'c9027d8f4'
Signed-off-by: John Ferlan <jferlan@redhat.com>
In commit id 'c9027d8f4' when updating the posted patch to generate
a bitmap instead of an array of named feature bits, adjustment of
the args was missed
Recently reverted commit id '6f2a0198' showed a need to add extra
comments when dealing with filtering of potential "non-issues".
Scanning through upstream patch postings indicates early on the
reasons for the filtering of specific ioctl failures were provided;
however, when converted from causing an error to VIR_DEBUG's the
reasons were missing. A future read/change of the code incorrectly
assumed they could or should be removed.
This reverts commit 6f2a0198e9.
This commit removed error reporting from virNetDevSendEthtoolIoctl
pushing responsibility onto the callers. This is wrong, however,
since virNetDevSendEthtoolIoctl calls virNetDevSetupControl
which can still report errors. So as a result virNetDevSendEthtoolIoctl
may or may not report errors depending on which bit of it fails, and as
a result callers now overwrite some errors.
It also introduced a regression causing unprivileged libvirtd to
spew error messages to the console due to inability to query the
NIC features, an error which was previously ignored.
virNetDevSetupControlFull:148 : Cannot open network interface control socket: Operation not permitted
virNetDevFeatureAvailable:3062 : Cannot get device wlp3s0 flags: Operation not permitted
virNetDevSetupControlFull:148 : Cannot open network interface control socket: Operation not permitted
virNetDevFeatureAvailable:3062 : Cannot get device wlp3s0 flags: Operation not permitted
virNetDevSetupControlFull:148 : Cannot open network interface control socket: Operation not permitted
virNetDevFeatureAvailable:3062 : Cannot get device wlp3s0 flags: Operation not permitted
virNetDevSetupControlFull:148 : Cannot open network interface control socket: Operation not permitted
virNetDevFeatureAvailable:3062 : Cannot get device wlp3s0 flags: Operation not permitted
Looking back at the original posting I see no explanation of why
thsi refactoring was needed, so reverting the clearly broken
error reporting logic looks like the best option.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit id '0f7436ca' added virNetDevWaitDadFinish using ATTRIBUTE_NONNULL
for both arguments, although one is a non-null argument. A Coverity build
balks at that.
Rather than "if (virNetDevFeatureAvailable(ifname, &cmd))" change the
success criteria to "if (virNetDevFeatureAvailable(ifname, &cmd) == 1)".
The called helper returns -1 on failure, 0 on not found, and 1 on found.
Thus a failure was setting bits.
Introduced by commit ac3ed20 which changed the helper's return
values without adjusting its callers
Signed-off-by: John Ferlan <jferlan@redhat.com>
VIR_DEBUG and VIR_WARN will automatically add a new line to the message,
having "\n" at the end or at the beginning of the message results in
empty lines.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This was originally set to 5 seconds, but times of 5.5 to 7 seconds
were experienced. Since it's an arbitrary number intended to prevent
an infinite hang, having it a bit too high won't hurt anything, and 20
seconds looks to be adequate (i.e. I think/hope we don't need to make
it tunable in libvirtd.conf)
If DAD not finished in 5 seconds, user will get an
unknown error like this:
# virsh net-start ipv6
error: Failed to start network ipv6
error: An error occurred, but the cause is unknown
Call virReportError to set an error.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Build on non-Linux fails because the virNetDevWaitDadFinish() stub
has unused parameters. Fix by adding appropriate ATTRIBUTE_UNUSED
for these parameters.
Pushing under build-breaker rule.
commit db488c79 assumed that dnsmasq would complete IPv6 DAD before
daemonizing, but in reality it doesn't wait, which creates problems
when libvirt's bridge driver sets the matching "dummy tap device" to
IFF_DOWN prior to DAD completing.
This patch waits for DAD completion by periodically polling the kernel
using netlink to check whether there are any IPv6 addresses assigned
to bridge which have a 'tentative' state (if there are any in this
state, then DAD hasn't yet finished). After DAD is finished, execution
continues. To avoid an endless hang in case something was wrong with
the kernel's DAD, we wait a maximum of 5 seconds.
Commit id '1c24cfe9' added error messages for virNumaSetPagePoolSize;
however, virNumaGetHugePageInfo also uses virNumaGetHugePageInfoPath
in order to build the path, but it never checked upon return if
the built path exists which could lead to an error message as follows:
$ virsh freepages 0 1
error: Failed to open file
'/sys/devices/system/node/node0/hugepages/hugepages-1kB/free_hugepages':
No such file or directory
Rather than add the same message for the other two callers, adjust
the virNumaGetHugePageInfoPath in order not only build the path, but
also check if the built path exists. If the path does not exist,
then generate the error message and return failure.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Commit id '1c24cfe9' added new checks and error messaes for failure
scenarios. Let's adjust those error messages to after the call to
virNumaGetHugePageInfoPath in order to provide a more specific error
message depending on node and page_size
After this patch:
# virsh allocpages --pagesize 2047 --pagecount 1 --cellno 0
error: operation failed: page size 2047 is not available on node 0
# virsh allocpages --pagesize 2047 --pagecount 1
error: operation failed: page size 2047 is not available
Signed-off-by: Luyao Huang <lhuang@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1265114
Refactor helper virNumaGetHugePageInfoPath to handle returning a directory
path when passed a page_size of 0 and suffix == NULL into a new helper
virNumaGetHugePageInfoDir which will only be called when a directory
path is expected to be returned. This solves the issue where the helper
was called with page_size == 0 expecting a file path in return, but
instead got a directory path and failed in virFileReadAll with:
error : virFileReadAll:1358 : Failed to read file
'/sys/devices/system/node/node0/hugepages/': Is a directory
Since virNumaGetPages API expects to return a directory by passing
page_size == 0 and suffix == NULL, it will now call the new helper.
Callers to virNumaGetHugePageInfoPath expect to return a file path
which could then be used in the call to virFileReadAll.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
We have macros for both positive and negative string matching.
Therefore there is no need to use !STREQ or !STRNEQ. At the same
time as we are dropping this, new syntax-check rule is
introduced to make sure we won't introduce it again.
Signed-off-by: Ishmanpreet Kaur Khera <khera.ishman@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Event implementations need to be registered before a connection to the
Hypervisor is opened, otherwise event handling can be impaired (e.g.
delayed messages). This fact is referenced in an e-mail [1], but should
also be noted in the documentation of the registration functions.
[1] https://www.redhat.com/archives/libvirt-users/2014-April/msg00011.html
After a successful creation of a directory, if some other call results
in returning a failure, let's remove the directory we created to
prevent another round trip or confusion in the caller. In particular, this
function can be called during a storage backend buildVol, so in order
to ensure that caller doesn't need to distinguish between failed create
or some other failure after create, just remove the directory we created.
Signed-off-by: John Ferlan <jferlan@redhat.com>
After a successful creation of a file, if some other call results
in returning a failure, let's unlink the file we created to prevent
another round trip or confusion in the caller. In particular, this
function can be called during a storage backend buildVol, so in order
to ensure that caller doesn't need to distinguish between failed create
or some other failure after create, just remove the volume we created.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The internal representation of a JSON array counts the items in
size_t. However, for some reason, when asking for the count it's
reported as int. Firstly, we need the function to return a signed
type as it's returning -1 on an error. But, not every system has
integer the same size as size_t. Therefore, lets return ssize_t.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As it turns out the caller in this case expects a return < 0 for failure
and to get/use "errno" rather than using the negative of returned status.
Again different than the create path.
If someone "deleted" a file from the pool without using virsh vol-delete,
then the unlink/rmdir would return an error (-1) and set errno to ENOENT.
The caller checks errno for ENOENT when determining whether to throw an
error message indicating the failure. Without the change, the error
message is:
error: Failed to delete vol $vol
error: cannot unlink file '/$pathto/$vol': Success
This patch thus allows the fork path to follow the non-fork path
where unlink/rmdir return -1 and errno.
Unlike create options, if the file to be removed is already in the
pool, then the uid/gid will come from the pool. If it's the same as the
currently running process, then just do the unlink/rmdir directly
rather than going through the fork processing unnecessarily
Similar to commit id '35847860', it's possible to attempt to create
a 'netfs' directory in an NFS root-squash environment which will cause
the 'vol-delete' command to fail. It's also possible error paths from
the 'vol-create' would result in an error to remove a created directory
if the permissions were incorrect (and disallowed root access).
Thus rename the virFileUnlink to be virFileRemove to match the C API
functionality, adjust the code to following using rmdir or unlink
depending on the path type, and then use/call it for the VIR_STORAGE_VOL_DIR
Commit id 'f1f68ca33' added code to remove the directory paths for
auto-generated sockets, but that code could be called before the
paths were created resulting in generating error messages from
virFileDeleteTree indicating that the file doesn't exist.
Rather than "enforce" all callers to make the non-NULL and existence
checks, modify the virFileDeleteTree API to silently ignore NULL on
input and non-existent directory trees.
Commit 35847860f6 Added the virFileUnlink function, but failed to add
a version for mingw build, causing the following error:
Cannot export virFileUnlink: symbol not defined
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Coverity claims it could be possible to call virDBusTypeStackFree with
*stack == NULL and although the two API's that call it don't appear to
allow that - I suppose it's better to be safe than sorry
In virFileNBDDeviceFindUnused if virFileNBDDeviceIsBusy returns 0,
then both branches jumped to cleanup, so just use ignore_value
since the function returns NULL or some memory and the caller
handles the error.
Before libvirt sets the MAC address of the physdev (the physical
ethernet device) linked to a macvtap passthrough device, it always
saves the previous MAC address to restore when the guest is finished
(following a "leave nothing behind" policy). For a long time it
accomplished the save/restore with a combination of
ioctl(SIOCGIFHWADDR) and ioctl(SIOCSIFHWADDR), but in commit cbfe38c
(first in libvirt 1.2.15) this was changed to use netlink RTM_GETLINK
and RTM_SETLINK commands sent to the Physical Function (PF) of any
device that was detected to be a Virtual Function (VF).
We later found out that this caused problems with any devices using
the Cisco enic driver (e.g. vmfex cards) because the enic driver
hasn't implemented the function that is called to gather the
information in the IFLA_VFINFO_LIST attribute of RTM_GETLINK
(ndo_get_vf_config() for those keeping score), so we would never get
back a useful response.
In an ideal world, all drivers would implement all functions, but it
turns out that in this case we can work around this omission without
any bad side effects - since all macvtap passthrough <interface>
definitions pointing to a physdev that uses the enic driver *must*
have a <virtualport type='802.1Qbh'>, and since no other type of
ethernet devices use 802.1Qbh, libvirt can change its behavior in this
case to use the old-style. ioctl(SIOC[GS]IFHWADDR). That's what this
patch does.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1257004
These functions were made static as a part of commit cbfe38c since
they were no longer called from outside virnetdev.c. We once again
need to call them from another file, so this patch makes them once
again public.
In an NFS root-squashed environment the 'vol-delete' command will fail to
'unlink' the target volume since it was created under a different uid:gid.
This code continues the concepts introduced in virFileOpenForked and
virDirCreate[NoFork] with respect to running the unlink command under
the uid/gid of the child. Unlike the other two, don't retry on EACCES
(that's why we're here doing this now).
This will only be seen when debugging, but in order to help determine
whether a virFileOpenForceOwnerMode failed during an NFS root-squash
volume/file creation, add an error message from the child.
commit 09778e09 switched from using ioctl(SIOCBRDELBR) for bridge
device deletion to using a netlink RTM_DELLINK message, which is the
more modern way to delete a bridge (and also doesn't require the
bridge to be ~IFF_UP to succeed). However, although older kernels
(e.g. 2.6.32, in RHEL6/CentOS6) support deleting *some* link types
with RTM_NEWLINK, they don't support deleting bridges, and there is no
compile-time way to figure this out.
This patch moves the body of the SIOCBRDELBR version of
virNetDevBridgeDelete() into a static function, calls the new function
from the original, and also calls the new function from the
RTM_DELLINK version if the RTM_DELLINK message generates an EOPNOTSUPP
error. Since RTM_DELLINK is done from the subordinate function
virNetlinkDelLink, which is also called for other purposes (deleting a
macvtap interface), a function pointer called "fallback" has been
added to the arglist of virNetlinkDelLink() - if that arg != NULL, the
provided function will be called when (and only when) RTM_DELLINK
fails with EOPNOTSUPP.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1252780 (part 2)
commit fc7b23db switched from using ioctl(SIOCBRADDBR) for bridge
creation to using a netlink RTM_NEWLINK message with IFLA_INFO_KIND =
"bridge", which is the more modern way to create a bridge. However,
although older kernels (e.g. 2.6.32, in RHEL6/CentOS6) support
creating *some* link types with RTM_NEWLINK, they don't support
creating bridges, and there is no compile-time way to figure this out
(since the "type" isn't an enum, but rather a character string).
This patch moves the body of the SIOCBRADDBR version of
virNetDevBridgeCreate() into a static function, calls the new function
from the original, and also calls the new function from the
RTM_NEWLINK version if the RTM_NEWLINK message generates an EOPNOTSUPP
error.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1252780
So far, the virProcessSetNamespaces() takes an array of FDs that
it tries to set namespace on. However, in the very next commit
this array may be sparse, having some -1's in it. Teach the
function to cope with that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The ACS checks are meaningless when using the more modern VFIO driver
for device assignment since VFIO has its own more complete and exact
checks, but I didn't realize that when I added support for VFIO. This
patch eliminates the ACS check when preparing PCI devices for
assignment if VFIO is being used.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1256486
Commit 89c509a0 added getters for cgroup block device I/O throttling,
however stub versions of these functions have not matching function
prototypes that result in compilation fail on platforms not supporting
cgroup.
Fix build by correcting prototypes of the stubbed functions.
Pushing under build-breaker rule.
This function translates device paths to "major:minor " string, and all
virCgroupSetBlkioDevice* functions are modified to use it. It's a
cleanup with no functional change.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
That function takes string list and returns first string in that list
that starts with the @prefix parameter with that prefix being skipped as
the caller knows what it starts with (also for easier manipulation in
future).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Fix inconsistency between function description and actual
parameter name in virConfGetValue/virConfSetValue.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Ever since commit e44b0269, 64-bit mingw compilation fails with:
../../src/util/virprocess.c: In function 'virProcessGetPids':
../../src/util/virprocess.c:628:50: error: passing argument 4 of 'virStrToLong_i' from incompatible pointer type [-Werror=incompatible-pointer-types]
if (virStrToLong_i(ent->d_name, NULL, 10, &tmp_pid) < 0)
^
In file included from ../../src/util/virprocess.c:59:0:
../../src/util/virstring.h:53:5: note: expected 'int *' but argument is of type 'pid_t * {aka long long int *}'
int virStrToLong_i(char const *s,
^
cc1: all warnings being treated as errors
Although mingw won't be using this function, it does compile the
file, and the fix is relatively simple.
* src/util/virprocess.c (virProcessGetPids): Don't assume pid_t
fits in int.
Signed-off-by: Eric Blake <eblake@redhat.com>
If this function fails, the error message is reported only in
some cases (e.g. OOM), but in some it's not (e.g. duplicate key).
This fact is painful and we should either not report error at all
or report the error in all possible cases. I vote for the latter.
Unfortunately, since the key may be an arbitrary value (not
necessarily a string) we can't report it in the error message.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In 9190f0b0 we've tried to fix an OOM. And boy, was that fix
successful. But back then, the hash table implementation worked
strictly over string keys, which is not the case anymore. Hash
table have this function keyCopy() which returns void *.
Therefore a local variable that is temporarily holding the
intermediate return value from that function should be void *
too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In order to share as much virsh' logic as possible with upcomming
virt-admin client we need to split virsh logic into virsh specific and
client generic features.
Since majority of virsh methods should be generic enough to be used by
other clients, it's much easier to rename virsh specific data to virshX
than doing this vice versa. It moved generic virsh commands (including info
and opts structures) to generic module vsh.c.
Besides renaming methods and structures, this patch also involves introduction
of a client specific control structure being referenced as private data in the
original control structure, introduction of a new global vsh Initializer,
which currently doesn't do much, but there is a potential for added
functionality in the future.
Lastly it introduced client hooks which are especially necessary during
client connecting phase.
This fixes the crash described here:
https://www.redhat.com/archives/libvir-list/2015-August/msg00162.html
In short, we were calling ioctl(SIOCETHTOOL) pointing to a too-short
object that was a local on the stack, resulting in the memory past the
end of the object being overwritten. This was because the struct used
by the ETHTOOL_GFEATURES command of SIOCETHTOOL ends with a 0-length
array, but we were telling ethtool that it could use 2 elements on the
array.
The fix is to allocate the necessary memory with VIR_ALLOC_VAR(),
including the extra length needed for a 2 element array at the end.
This is no functional change. It's just that later in the series we
will need to pass class_id as an integer.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There is no guarantee that an enum start it mapped onto a value
of zero. However, we are guaranteed that enum items are
consecutive integers. Moreover, it's a pity to define an enum to
avoid using magical constants but then using them anyway.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch modifies virSocketAddrGetRange() to function properly when
the containing network/prefix of the address range isn't known, for
example in the case of the NAT range of a virtual network (since it is
a range of addresses on the *host*, not within the network itself). We
then take advantage of this new functionality to validate the NAT
range of a virtual network.
Extra test cases are also added to verify that virSocketAddrGetRange()
works properly in both positive and negative cases when the network
pointer is NULL.
This is the *real* fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=985653
Commits 1e334a and 48e8b9 had earlier been pushed as fixes for that
bug, but I had neglected to read the report carefully, so instead of
fixing validation for the NAT range, I had fixed validation for the
DHCP range. sigh.
Qemu reports physical size 0 for block devices. As 15fa84acbb
changed the behavior of qemuDomainGetBlockInfo to just query the monitor
this created a regression since we didn't report the size correctly any
more.
This patch adds code to refresh the physical size of a block device by
opening it and seeking to the end and uses it both in
qemuDomainGetBlockInfo and also in qemuDomainGetStatsOneBlock that was
broken since it was introduced in this respect.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1250982
The commit 7e72de4 didn't consider the hotplug scenarios. The patch addresses
the hotplug case whereby if atleast one of the pci function is owned by a
guest, the hotplug of other functions/devices in the same iommu group to the
same guest goes through successfully.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
So far qemu-nbd is run even if the nbd kernel module isn't loaded. This
leads to errors when the user starts his lxc container while libvirt
could easily load the nbd module automatically.
Commit id 'ac3ed2085' causes 'virsh nodedev-list --cap net' to fail
on any system without SYSFS_INFINIBAND_DIR (/sys/class/infiniband).
Rather than assume it's there and fail on the attempt to open the
non-existent directory, check if it's there - if not, return
success and move on. Also fix caller to check < 0 upon return.
As reported by Suren Hajyan <shajyan@redhat.com> from run of unit tests
Commit ac3ed20 breaks build on FreeBSD with:
CC util/libvirt_util_la-virnetdev.lo
util/virnetdev.c:2967:1: error: unused function 'virNetDevRDMAFeature' [-Werror,-Wunused-function]
virNetDevRDMAFeature(const char *ifname,
^
So hide virNetDevRDMAFeature function under the #ifdef 'SIOCETHTOOL'
and 'HAVE_STRUCT_IFREQ' section.
Pushed under the build breaker rule.
The scope name, even according to our docs is
"machine-$DRIVER\x2d$VMNAME.scope" virSystemdMakeScopeName would use the
resource partition name instead of "machine-" if it was specified thus
creating invalid scope paths.
This makes libvirt drop cgroups for a VM that uses custom resource
partition upon reconnecting since the detected scope name would not
match the expected name generated by virSystemdMakeScopeName.
The error is exposed by the following log entry:
debug : virCgroupValidateMachineGroup:302 : Name 'machine-qemu\x2dtestvm.scope' for controller 'cpu' does not match 'testvm', 'testvm.libvirt-qemu' or 'machine-test-qemu\x2dtestvm.scope'
for a "/machine/test" resource and "testvm" vm.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1238570
Adding functionality to libvirt that will allow
it query the interface for the availability of RDMA and
tx-udp_tnl-segmentation Offloading NIC capabilities
Here is an example of the feature XML definition:
<device>
<name>net_eth4_90_e2_ba_5e_a5_45</name>
<path>/sys/devices/pci0000:00/0000:00:03.0/0000:08:00.1/net/eth4</path>
<parent>pci_0000_08_00_1</parent>
<capability type='net'>
<interface>eth4</interface>
<address>90:e2:ba:5e:a5:45</address>
<link speed='10000' state='up'/>
<feature name='rx'/>
<feature name='tx'/>
<feature name='sg'/>
<feature name='tso'/>
<feature name='gso'/>
<feature name='gro'/>
<feature name='rxvlan'/>
<feature name='txvlan'/>
<feature name='rxhash'/>
<feature name='rdma'/>
<feature name='txudptnl'/>
<capability type='80203'/>
</capability>
</device>
virDomainMigrateFinish* APIs were unfortunately designed to return the
pointer to the domain on destination and NULL on error. This looks OK in
normal cases but the same API is also called when we know migration
failed and thus we expect Finish to return NULL even if it actually did
all it was supposed to do without any error. The call is defined to
return nonnull domain pointer over RPC, which means returning NULL will
always result in an error being send. If this was not in fact an error,
the API itself wouldn't set anything to the thread local virError, which
makes the RPC layer come up with it's own "Library function returned
error but did not set virError" error.
This is quite confusing and also hard to detect by the caller. This
patch adds a special error code which can be used to check that Finish
successfully aborted migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This is a self-locking wrapper around virHashTable. Only a limited set
of APIs are implemented now (the ones which are used in the following
patch) as more can be added on demand.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Optimize the virBitmap to array-of-char bitmap conversion by skipping
trailing zero bytes.
This also fixes a regression when requesting iothread information from a
live VM since after commit 825df8c315 the
bitmap returned from virProcessGetAffinity is too big to be formatted
properly via RPC. A user would get the following error:
error: Unable to get domain IOThreads information
error: Unable to encode message payload
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1238589
Avoid a false positive since Coverity find a path in virResizeN which
could return 0 prior to the allocation of memory and thus flags a
possible NULL dereference. Instead allocate the output buffer based
on 'nparams' and only fill it partially if need be - shouldn't be too
much a waste of space. Quicker than multiple VIR_RESIZE_N calls or
two loops of STREQ's sandwiched around a single VIR_ALLOC_N using
'n' matches from a first loop to generate the 'n' addresses to return
Signed-off-by: John Ferlan <jferlan@redhat.com>
Convert virPCIDriverDir to return the buffer allocated (or not) and make the
appropriate check in the caller.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Convert virPCIDriverFile to return the buffer allocated (or not) and make the
appropriate check in the caller.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Convert virPCIFile to return the buffer allocated (or not) and make the
appropriate check in the caller.
Signed-off-by: John Ferlan <jferlan@redhat.com>
We already enable the parser option to detect invalid UTF-8, but
didn't test it. Also, JSON states that behavior of an object
with a duplicated key is undefined; we chose to reject it, but
were not testing it.
With the enhanced tests in place, we can simplify yajl2
initialization by relying on parser defaults being sane.
* src/util/virjson.c (virJSONValueFromString): Simplify.
* tests/jsontest.c (mymain): Test more bad usage.
Signed-off-by: Eric Blake <eblake@redhat.com>