Add the function virHostdevIsSCSIDevice() which detects whether a
hostdev is a SCSI device or not.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
After commit f2bf5fbb04, MinGW strikes again. Simply print pid as any
other place after commit b7d2d4af2b.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit 94cc577807 tried fixing build on systems that did not have
SCHED_BATCH or SCHED_IDLE defined. But instead of changing it to
conditional support, it rather completely disabled the support for
setting any scheduler. Since then, such old systems are not
supported, but rather than reverting that commit, let's change that to
the conditional support. That way any addition to the list of
schedulers can follow the same style so that we're consistent in the
future.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This new function can be used to check if e.g. name of XML
node don't contains forbidden chars like "/" or "\n".
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1357416
Rather than return a 0 or -1 and the *result string, return just the result
string to the caller. Alter all the callers to handle the different return.
As a side effect or result of this, it's much clearer that we cannot just
assign the returned string into the scsi_host wwnn, wwpn, and fabric_wwn
fields - rather we should fetch a temporary string, then as long as our
fetch was good, VIR_FREE what may have been there, and STEAL what we just got.
This fixes a memory leak in the virNodeDeviceCreateXML code path through
find_new_device and nodeDeviceLookupSCSIHostByWWN which will continually
call nodeDeviceSysfsGetSCSIHostCaps until the expected wwnn/wwpn is found
in the device object capabilities.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Not every system out there has syslog, that's why we check for it
in our configure script. However, in 640b58abdf while fixing
another issue, some variables and functions are called that are
defined only when syslog.h is present. But these function
calls/variables were not guarded by #ifdef-s.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This initially started as a fix of some debug printing in
virCgroupDetect. However it turned out that other places suffer
from the similar problem. While dealing with pids, esp. in cases
where we cannot use pid_t for ABI stability reasons, we often
chose an unsigned integer type. This makes no sense as pid_t is
signed.
Also, new syntax-check rule is introduced so we won't repeat this
mistake.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In both virLogParseOutput and virLogParseFilter, rather than returning
NULL, goto cleanup since it's possible that for each the first condition
passes, but the || condition doesn't and thus we leak memory.
Handling of outputs and filters has been changed in a way that splits
parsing and defining. Do the same thing for logging priority as well, this
however, doesn't need much of a preparation.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This is mainly virLogAddOutputTo* which were replaced by virLogNewOutputTo* and
the previously poorly named ones virLogParseAndDefine* functions. All of these
are unnecessary now, since all the original callers were transparently switched
to the new model of separate parsing and defining logic.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Similar to outputs, parser should do parsing only, thus the 'define' logic
is going to be stripped from virLogParseAndDefineFilters by replacing calls to
this method to virLogSetFilters instead.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since virLogParseAndDefineOutputs is going to be stripped from 'output defining'
logic, replace all relevant occurrences with virLogSetOutputs call to make the
change transparent to all original callers (daemons mostly).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This method will eventually replace virLogParseAndDefineFilters which
currently does both parsing and defining.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This API is the entry point to output modification of the logger. Currently,
everything is done by virLogParseAndDefineOutputs. Parsing and defining will be
split into two operations both handled by this method transparently.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Abstraction added over parsing a single filter. The method parses potentially a
set of logging filters, while adding each filter logging object to a
caller-provided array.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Another abstraction added on the top of parsing a single logging output. This
method takes and parses the whole set of outputs, adding each single output
that has already been parsed into a caller-provided array. If the user-supplied
string contained duplicate outputs, only the last occurrence is taken into
account (all the others are removed from the list), so we silently avoid
duplicate logs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Same as for outputs, introduce a new method, that is basically the same as
virLogParseAndDefineFilter with the difference that it does not define the
filter. It rather returns a newly created object that needs to be inserted into
a list and then defined separately.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Introduce a method to parse an individual logging output. The difference
compared to the virLogParseAndDefineOutput is that this method does not define
the output, instead it makes use of the virLogNewOutputTo* methods introduced
in the previous patch and just returns the virLogOutput object that has to be
added to a list of object which then can be defined as a whole via
virLogDefineOutputs. The idea remains still the same - split parsing and
defining of the logging primitives (outputs, filters).
Additionally, since virLogNewOutputTo* methods are now finally used,
ATTRIBUTE_UNUSED can be successfully removed from the methods' definitions,
since that was just to avoid compiler complaints about unused static functions.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Now that we're in the critical section, syslog connection can be re-opened
by issuing openlog, which is something that cannot be done beforehand, since
syslog keeps its file descriptor private and changing the tag earlier might
introduce a log inconsistency if something went wrong with preparing a new set
of logging outputs in order to replace the existing one.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Continuing with the effort to split output parsing and defining, these new
functions return a logging object reference instead of defining the output.
Eventually, these functions will replace the existing ones (virLogAddOutputTo*)
which will then be dropped.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Prepare a method that only defines a set of filters. It takes a list of
filters, preferably created by virLogParseFilters. The original set of filters
is reset and replaced by the new user-provided set of filters.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Prepare a method that only defines a set of outputs. It takes a list of
outputs, preferably created by virLogParseOutputs. The original set of outputs
is reset and replaced by the new user-provided set of outputs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Outputs are a bit trickier than filters, since the user(config)-specified
set of outputs can contain duplicates. That would lead to logging the same
message twice. For compatibility reasons, we cannot just error out and forbid
the daemon to start if we find duplicate outputs which do not make sense.
Instead, we could silently take into account only the last occurrence of the
duplicate output and remove all the previous ones, so that the logger will not
try to use them when it is looping over all of its registered outputs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In order to later split output parsing and output defining, introduce a new
function which will create a new virLogOutput object which the parser will
insert into a list with the list being eventually defined.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
There is really no reason why we could not keep journald's fd within the
journald output object the same way as we do for regular file-based outputs.
By doing this we later won't have to special case the journald-based output
(due to the fd being globally shared) when replacing the existing set of outputs
with a new one. Additionally, by making this change, we don't need the
virLogCloseJournald routine anymore, plain virLogCloseFd will suffice.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Right now virLogParse* functions are doing both parsing and defining of filters
and outputs which should be two separate operations. Since the naming is
apparently a bit poor this patch renames these functions to
virLogParseAndDefine* which eventually will be replaced by virLogSet*.
Additionally, virLogParse{Filter,Output} will be later (after the split) reused,
so that these functions do exactly what the their name suggests.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
During first stage of virlog.c refactor, commit 0b231195 forgot to remove the
macro definition along with its usage.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
If the event is already disabled, then don't bother with setting it
disabled again. Causes unnecessary error on systems that don't support
the feature anyway.
Provide the Steal API for any code paths that will desire to grab the
object array and then free it afterwards rather than relying to freeing
the whole chain from the reply.
Libvirt, on its own, shouldn't decide whether an expired lease should
stay in the custom leases database or not. It should rather rely on
the 'DEL' event from dnsmasq.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There is nothing Linux-specific in that function. Also since commit
8c3b5bf481 mingw build is broken due to
the fact that this function is not compiled in the library.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We will need this function shortly when implementing
nodeGetCPUStats in the test driver.
Signed-off-by: Tomáš Ryšavý <tom.rysavy.0@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Name it virNumaGetHostMemoryNodeset and return only NUMA nodes which
have memory installed. This is necessary as the kernel is not very happy
to set the memory cgroup setting for nodes which do not have any memory.
This would break vcpu hotplug with following message on such
configruation:
Invalid value '0,8' for 'cpuset.mems': Invalid argument
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1375268
Commit id 'a48c7141' altered how to determine if a volume was encrypted
by adding a peek at an offset into the file at a specific buffer location.
Unfortunately, all that was compared was the first "char" of the buffer
against the expect "int" value.
Restore the virReadBufInt32BE to get the complete field in order to
compare against the expected value from the qcow2EncryptionInfo or
qcow1EncryptionInfo "modeValue" field.
This restores the capability to create a volume with encryption, then
refresh the pool, and still find the encryption for the volume.
When a new filter is being defined, the return code is not handled properly,
thus triggering OOM error reporting routine (bug introduced by 51b2606f).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Commit id 'b00d7f29' shifted the opening of the /sys/devices/intel_cqm/type
file from event enable to perf event initialization. If the file did not
exist, then an error would be written to the domain log:
2016-09-06 20:51:21.677+0000: 7310: error : virFileReadAll:1360 : Failed to open file '/sys/devices/intel_cqm/type': No such file or directory
Since the error is now handled in virPerfEventEnable by checking if the
event_attr->attrType == 0 for CMT, MBML, and MBMT events - we can just
use the Quiet API in order to not log the error we're going to throw away.
Additionally, rather than using virReportSystemError, use virReportError
and VIR_ERR_ARGUMENT_UNSUPPORTED in order to signify that support isn't there
for that type of perf event - adjust the error message as well.
There is a possibility that qemu driver frees by unreferencing its
closeCallbacks pointer as it has the only reference to the object,
while in fact not all users of CloseCallbacks called thier
virCloseCallbacksUnset.
Backtrace is the following:
Thread #1:
0 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
1 in virCondWait (c=<optimized out>, m=<optimized out>)
at util/virthread.c:154
2 in virThreadPoolFree (pool=0x7f0810110b50)
at util/virthreadpool.c:266
3 in qemuStateCleanup () at qemu/qemu_driver.c:1116
4 in virStateCleanup () at libvirt.c:808
5 in main (argc=<optimized out>, argv=<optimized out>)
at libvirtd.c:1660
Thread #2:
0 in virClassIsDerivedFrom (klass=0xdeadbeef, parent=0x7f0837c694d0) at util/virobject.c:169
1 in virObjectIsClass (anyobj=anyobj@entry=0x7f08101d4760, klass=<optimized out>) at util/virobject.c:365
2 in virObjectLock (anyobj=0x7f08101d4760) at util/virobject.c:317
3 in virCloseCallbacksUnset (closeCallbacks=0x7f08101d4760, vm=vm@entry=0x7f08101d47b0, cb=cb@entry=0x7f081d078fc0 <qemuProcessAutoDestroy>) at util/virclosecallbacks.c:163
4 in qemuProcessAutoDestroyRemove (driver=driver@entry=0x7f081018be50, vm=vm@entry=0x7f08101d47b0) at qemu/qemu_process.c:6368
5 in qemuProcessStop (driver=driver@entry=0x7f081018be50, vm=vm@entry=0x7f08101d47b0, reason=reason@entry=VIR_DOMAIN_SHUTOFF_SHUTDOWN, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE, flags=flags@entry=0) at qemu/qemu_process.c:5854
6 in processMonitorEOFEvent (vm=0x7f08101d47b0, driver=0x7f081018be50) at qemu/qemu_driver.c:4585
7 qemuProcessEventHandler (data=<optimized out>, opaque=0x7f081018be50) at qemu/qemu_driver.c:4629
8 in virThreadPoolWorker (opaque=opaque@entry=0x7f0837c4f820) at util/virthreadpool.c:145
9 in virThreadHelper (data=<optimized out>) at util/virthread.c:206
10 in start_thread () from /lib64/libpthread.so.0
Let's reference CloseCallbacks object in virCloseCallbacksSet and
unreference in virCloseCallbacksUnset.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
In the latest glibc, major() and minor() functions are marked as
deprecated (glibc commit dbab6577):
CC util/libvirt_util_la-vircgroup.lo
util/vircgroup.c: In function 'virCgroupGetBlockDevString':
util/vircgroup.c:768:5: error: '__major_from_sys_types' is deprecated:
In the GNU C Library, `major' is defined by <sys/sysmacros.h>.
For historical compatibility, it is currently defined by
<sys/types.h> as well, but we plan to remove this soon.
To use `major', include <sys/sysmacros.h> directly.
If you did not intend to use a system-defined macro `major',
you should #undef it after including <sys/types.h>.
[-Werror=deprecated-declarations]
if (virAsprintf(&ret, "%d:%d ", major(sb.st_rdev), minor(sb.st_rdev)) < 0)
^~
In file included from /usr/include/features.h:397:0,
from /usr/include/bits/libc-header-start.h:33,
from /usr/include/stdio.h:28,
from ../gnulib/lib/stdio.h:43,
from util/vircgroup.c:26:
/usr/include/sys/sysmacros.h:87:1: note: declared here
__SYSMACROS_DEFINE_MAJOR (__SYSMACROS_FST_IMPL_TEMPL)
^
Moreover, in the glibc commit, there's suggestion to keep
ordering of including of header files as implemented here.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Current implementation uses the dev.cpu.0.freq sysctl that is
provided by the cpufreq(4) framework and returns the actual
CPU frequency. However, there are environments where it's not available,
e.g. when running nested in KVM. In this case fall back to hw.clockrate
that reports CPU frequency at the boot time.
Resolves (hopefully):
https://bugzilla.redhat.com/show_bug.cgi?id=1369964
Currently the QEMU processes inherit their core dump rlimit
from libvirtd, which is really suboptimal. This change allows
their limit to be directly controlled from qemu.conf instead.
With current perf framework, this patch adds support and documentation
for more perf events, including cache misses, cache references, cpu cycles,
and instructions.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Introduce a static attr table and refactor virPerfEventEnable() for
general purpose usage.
This patch creates a static table/matrix that converts the VIR_PERF_EVENT_*
events into their respective "attr.type" and "attr.config" so that
virPerfEventEnable doesn't have the switch the calling function passes
by value the 'type'.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
libvirt uses the new_id PCI sysfs interface to bind a PCI stub driver
to a PCI device. The new_id interface is known to be buggy and racey,
hence a more deterministic interface was introduced in the 3.12 kernel:
driver_override. For more details see
https://www.redhat.com/archives/libvir-list/2016-June/msg02124.html
For more details about the driver_override interface and examples of
its usage, see
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/pci/pci-driver.c?h=v3.12&id=782a985d7af26db39e86070d28f987cad21313c0
This patch adds support for the driver_override interface by
- adding new virPCIDevice{BindTo,UnbindFrom}StubWithOverride functions
that use the driver_override interface
- renames the existing virPCIDevice{BindTo,UnbindFrom}Stub functions
to virPCIDevice{BindTo,UnbindFrom}StubWithNewid to perserve existing
behavior on new_id interface
- changes virPCIDevice{BindTo,UnbindFrom}Stub function to call one of
the above depending on availability of driver_override
The patch includes a bit of duplicate code, but allows for easily
dropping the new_id code once support for older kernels is no
longer desired.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Finding an USB device from the vendor/device values will be needed
by libxl driver to convert from vendor/device to bus/dev addresses.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1367259
Crash occurs because 'secrets' is being dereferenced in call:
if (qemuDomainSecretSetup(conn, priv, secinfo, disk->info.alias,
VIR_SECRET_USAGE_TYPE_VOLUME, NULL,
&src->encryption->secrets[0]->seclookupdef,
true) < 0)
(gdb) p *src->encryption
$1 = {format = 2, nsecrets = 0, secrets = 0x0, encinfo = {cipher_size = 0,
cipher_name = 0x0, cipher_mode = 0x0, cipher_hash = 0x0, ivgen_name = 0x0,
ivgen_hash = 0x0}}
(gdb) bt
priv=priv@entry=0x7fffc03be160, disk=disk@entry=0x7fffb4002ae0)
at qemu/qemu_domain.c:1087
disk=0x7fffb4002ae0, vm=0x7fffc03a2580, driver=0x7fffc02ca390,
conn=0x7fffb00009a0) at qemu/qemu_hotplug.c:355
Upon entry to qemuDomainAttachVirtioDiskDevice, src->encryption points
at a valid 'secret' buffer w/ nsecrets == 1; however, the call to
qemuDomainDetermineDiskChain will call virStorageFileGetMetadata
and eventually virStorageFileGetMetadataInternal where the src->encryption
was overwritten when probing the volume.
Commit id 'a48c7141' added code to virStorageFileGetMetadataInternal
to determine if the disk/volume would use/need encryption and allocated
a meta->encryption. This overwrote an existing encryption buffer
already provided by the XML
This patch adds a check for meta->encryption already present before
just allocating and overwriting an existing buffer. It then checks the
existing encryption data to ensure the XML provided format for the
disk matches the expected format read from the disk and errors if there
is a mismatch.
The first argument should be const char ** instead of
char **, because this is a search function and as such it
doesn't, and shouldn't, alter the haystack in any way.
This change means we no longer have to cast arrays of
immutable strings to arrays of mutable strings; we still
have to do the opposite, though, but that's reasonable.
Usually, this variable is used to hold the return value for a
function of ours. Well, this is not the case. Its use does not
match our pattern and therefore it is very misleading. Drop it
and define an alternative @rc variable, but only in that single
block where it is needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This variable is very misleading. We use VIR_FORCE_CLOSE to set
it to -1 and returning it even though it does not refer to a FD
at all. It merely holds 0 or -1. Drop it completely. Also, at the
same time some corner cases are fixed too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1240439
In this function we create a macvtap device and open its tap
device. Possibly multiple times. Now the thing is, if opening the
tap device fails, that is virNetDevMacVLanTapOpen() returns a
negative value, we unroll all the changes BUT return 0 fooling
caller into thinking everything went okay.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit b3e4401dc6 introduced a check to ignore an error if the guest
is already terminated. However the check accidentally compared
error.code with VIR_ERR_ERROR, which is an error level, not an error
code. Because of this, almost every error got silently ignored.
Fixes: b3e4401dc6 ("systemd: don't report an error if the guest is
already terminated")
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
The virJSONValueArraySize() function return ssize_t (with
possibly returning -1 if the passed json is not an array).
Storing the return value into size_t is possibly dangerous then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1356436
According to RFC 3721 (https://www.ietf.org/rfc/rfc3721.txt), there are
two ways to "discover" targets in/for the iSCSI environment. Discovery
is the process which allows the initiator to find the targets to which
it has access and at least one address at which each target may be
accessed.
The method currently implemented in libvirt using the virISCSIScanTargets
API is known as "SendTargets" discovery. This method is more useful when
the target IP Address and TCP port information are available, e.g. in
libvirt terms the "portal". It returns a list of targets for the portal.
From that list, the target can be found. This operation can also fill an
iSCSI node table into which iSCSI logins may occur. Commit id '56057900'
altered that filling by adding the "--op nonpersistent" since it was
not necessarily desired to perform that for non libvirt related targets.
The second method is "Static Configuration". This method not only needs
the IP Address and TCP port (e.g. portal), but also the iSCSI target name.
In libvirt terms this would be the device path field from the iSCSI pool
<source> XML. This patch implements the second methodology using that
required device path as the targetname.
The current LUKS support has a "luks" volume type which has
a "luks" encryption format.
This partially makes sense if you consider the QEMU shorthand
syntax only requires you to specify a format=luks, and it'll
automagically uses "raw" as the next level driver. QEMU will
however let you override the "raw" with any other driver it
supports (vmdk, qcow, rbd, iscsi, etc, etc)
IOW the intention though is that the "luks" encryption format
is applied to all disk formats (whether raw, qcow2, rbd, gluster
or whatever). As such it doesn't make much sense for libvirt
to say the volume type is "luks" - we should be saying that it
is a "raw" file, but with "luks" encryption applied.
IOW, when creating a storage volume we should use this XML
<volume>
<name>demo.raw</name>
<capacity>5368709120</capacity>
<target>
<format type='raw'/>
<encryption format='luks'>
<secret type='passphrase' uuid='0a81f5b2-8403-7b23-c8d6-21ccd2f80d6f'/>
</encryption>
</target>
</volume>
and when configuring a guest disk we should use
<disk type='file' device='disk'>
<driver name='qemu' type='raw'/>
<source file='/home/berrange/VirtualMachines/demo.raw'/>
<target dev='sda' bus='scsi'/>
<encryption format='luks'>
<secret type='passphrase' uuid='0a81f5b2-8403-7b23-c8d6-21ccd2f80d6f'/>
</encryption>
</disk>
This commit thus removes the "luks" storage volume type added
in
commit 318ebb36f1
Author: John Ferlan <jferlan@redhat.com>
Date: Tue Jun 21 12:59:54 2016 -0400
util: Add 'luks' to the FileTypeInfo
The storage file probing code is modified so that it can probe
the actual encryption formats explicitly, rather than merely
probing existance of encryption and letting the storage driver
guess the format.
The rest of the code is then adapted to deal with
VIR_STORAGE_FILE_RAW w/ VIR_STORAGE_ENCRYPTION_FORMAT_LUKS
instead of just VIR_STORAGE_FILE_LUKS.
The commit mentioned above was included in libvirt v2.0.0.
So when querying volume XML this will be a change in behaviour
vs the 2.0.0 release - it'll report 'raw' instead of 'luks'
for the volume format, but still report 'luks' for encryption
format. I think this change is OK because the storage driver
did not include any support for creating volumes, nor starting
guets with luks volumes in v2.0.0 - that only since then.
Clearly if we change this we must do it before v2.1.0 though.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Refactor the virStorageFileMatchesNNN methods so that
they don't take a struct FileFormatInfo parameter, but
instead get the actual raw dat items they needs. This
will facilitate reuse in other contexts.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To allow richer definitions of disk sources add infrastructure that will
allow to register functionst generating a JSON object based definition.
This infrastructure will then convert the definition to the proper
command line syntax and use it in cases where it's necessary. This will
allow to keep legacy definitions for back-compat when possible and use
the new definitions for the configurations requiring them.
Add support for converting objects nested in arrays with a numbering
discriminator on the command line. This syntax is used for the
object-based specification of disk source properties.
Add a modular parser that will allow to parse 'json' backing definitions
that are supported by qemu. The initial implementation adds support for
the 'file' driver.
Due to the approach qemu took to implement the JSON backing strings it's
possible to specify them in two approaches.
The object approach:
json:{ "file" : { "driver":"file",
"filename":"/path/to/file"
}
}
And a partially flattened approach:
json:{"file.driver":"file"
"file.filename":"/path/to/file"
}
Both of the above are supported by qemu and by the code added in this
commit. The current implementation de-flattens the first level ('file.')
if possible and required. Other handling may be added later but
currently only one level was possible anyways.
Since commit c4bdff19, the path to the configuration file has been constructed
in the following manner:
- if no config filename was passed to virConfLoadConfigPath, libvirt.conf was
used as default
- otherwise the filename was concatenated with
"<config_dir>/libvirt/libvirt%s%s.conf" which in admin case resulted in
"libvirt-libvirt-admin.conf.conf". Obviously, this non-existent config led to
ignoring all user settings in libvirt-admin.conf. This patch requires the
config filename to be always provided as an argument with the concatenation
being simplified.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1357364
Signed-off-by: Erik Skultety <eskultet@redhat.com>
For use with memory hotplug virQEMUBuildCommandLineJSONRecurse attempted
to format JSON arrays as bitmap on the command line. Make the formatter
function configurable so that it can be reused with different syntaxes
of arrays such as numbered arrays for use with disk sources.
This patch extracts the code and adds a parameter for the function that
will allow to plug in different formatters.
Until now the JSON->commandline convertor was used only for objects
created by qemu. To allow reusing it with disk formatter we'll need to
escape ',' as usual in qemu commandlines.
Refactor the command line generator by adding a wrapper (with
documentation) that will handle the outermost object iteration.
This patch also renames the functions and tweaks the error message for
nested arrays to be more universal.
The new function is then reused to simplify qemucommandutiltest.
As we already test that the extraction of the backing store string works
well additional tests for the backing store string parser can be made
simpler.
Export virStorageSourceNewFromBackingAbsolute and use it to parse the
backing store strings, format them using virDomainDiskSourceFormat and
match them against expected XMLs.
The symbol being missing has been reported as causing build
failures on OS X. If it's not already defined, define it to
zero so that it won't have any effect.
Partially resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1301021
If the volume xml was looking to create a luks volume take the necessary
steps in order to make that happen.
The processing will be:
1. create a temporary file (virStorageBackendCreateQemuImgSecretPath)
1a. use the storage driver state dir path that uses the pool and
volume name as a base.
2. create a secret object (virStorageBackendCreateQemuImgSecretObject)
2a. use an alias combinding the volume name and "_luks0"
2b. add the file to the object
3. create/add luks options to the commandline (virQEMUBuildLuksOpts)
3a. at the very least a "key-secret=%s" using the secret object alias
3b. if found in the XML the various "cipher" and "ivgen" options
Signed-off-by: John Ferlan <jferlan@redhat.com>
virConfGetValueLLong() errors out if the value is too big to
fit into a long long integer, but claims the supported range
to be (0,LLONG_MAX) instead of (LLONG_MIN,LLONG_MAX).
When parsing numeric values, we always store them as unsigned
unless they're negative. We can use this fact to simplify the
logic by removing a bunch of unnecessary checks.
Commit 6381c89f8c changed virConfValue to store long long
integers instead of long integers; however, the temporary variable
used in virConfParseLong() was not updated accordingly, causing
trouble for 32-bit machines.
If size_t is the same size as long long, then we can skip
some of the range checks. This avoids triggering some
bogus compiler warning messages.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virConf 'l' field is a 'signed long long', so whenever
the 'type' field is VIR_CONF_ULONG, we should explicitly cast
'l' to a 'unsigned long long' before doing range checks.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This function tries to get a ssize_t value from a config file.
But before returning it, it checks whether the value would fit in
ssize_t and if not an error is printed out among with the range
for the ssize_t type. However, on some platforms SSIZE_MAX may
actually be a signed long type:
util/virconf.c: In function 'virConfGetValueSSizeT':
util/virconf.c:1268:9: error: format '%zd' expects argument of type 'signed size_t', but argument 9 has type 'long int' [-Werror=format=]
virReportError(VIR_ERR_INTERNAL_ERROR,
^
$ grep -r SSIZE_MAX /usr/include/
/usr/include/bits/posix1_lim.h:#ifndef SSIZE_MAX
/usr/include/bits/posix1_lim.h:# define SSIZE_MAX LONG_MAX
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
IPv6 RA always contains an implicit default route via
the link-local address of the source of RA. This forces
the guest to install a route via isolated network, which
may disturb the guest's networking in case of multiple interfaces.
More info in 013427e6e7.
The validity of this route is controlled by "default [route] lifetime"
field of RA. If the lifetime is set to 0 seconds, then no route
is installed by receiver.
dnsmasq 2.67+ supports "ra-param=<interface>,<RA interval>,<default
lifetime>" option. We pass "ra-param=*,0,0"
(here, RA_interval=0 means default) to disable default gateway in RA
for isolated networks.
At least with systemd v210, NOTIFY_SOCKET is abstact, e.g.
@/org/freedesktop/systemd1/notify. sendmsg() fails on such a socket
with "Connection refused". The unix(7) man page contains the following
details wrt abstract socket addresses
abstract: an abstract socket address is distinguished (from a
pathname socket) by the fact that sun_path[0] is a null byte
('\0'). The socket's address in this namespace is given by the
additional bytes in sun_path that are covered by the specified
length of the address structure. (Null bytes in the name have
no special significance.)
So we need to be more precise about the address length, setting it to
the sizeof sa_family_t + length of address copied to sun_path instead
of setting it to the sizeof the entire sockaddr_un struct.
Resolves: https://bugzilla.opensuse.org/show_bug.cgi?id=987668
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
In an unlikely event of execve() failing, the virCommandExec()
function does not report any error, even though checks that are
at the beginning of the function are verbose when failing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently many users of virConf APIs are defining the same
macros for calling virConfValue() and then doing type
checking. To remove this repeated code, add a set of
typesafe accessor methods.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the config file does not end with a \n, the parser will append
one. When re-allocating the array though, it is mistakenly assuming
that 'len' is the length including the trailing NUL, but it does
not. So we must add 2 to len, when reallocating, not 1.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When storage secret is parsed in virStorageEncryptionSecretParse(),
virSecretLookupParseSecret() which allocates some memory. This is
however never freed.
==21711== 134 bytes in 6 blocks are definitely lost in loss record 70 of 85
==21711== at 0x4C29F80: malloc (vg_replace_malloc.c:296)
==21711== by 0xBCA0356: xmlStrndup (in /usr/lib64/libxml2.so.2.9.4)
==21711== by 0xA9F432E: virXMLPropString (virxml.c:479)
==21711== by 0xA9D25B0: virSecretLookupParseSecret (virsecret.c:70)
==21711== by 0xA9D616E: virStorageEncryptionSecretParse (virstorageencryption.c:172)
==21711== by 0xA9D66B2: virStorageEncryptionParseXML (virstorageencryption.c:281)
==21711== by 0xA9D68DF: virStorageEncryptionParseNode (virstorageencryption.c:338)
==21711== by 0xAA12575: virDomainDiskDefParseXML (domain_conf.c:7606)
==21711== by 0xAA2CAC6: virDomainDefParseXML (domain_conf.c:16658)
==21711== by 0xAA2FC75: virDomainDefParseNode (domain_conf.c:17472)
==21711== by 0xAA2FAE4: virDomainDefParse (domain_conf.c:17419)
==21711== by 0xAA2FB72: virDomainDefParseFile (domain_conf.c:17443)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If we don't HAVE_LINUX_KVM_H, we can't query /dev/kvm to discover
the limits on the number of vCPUs, so we report an error and
return a negative value instead.
As there is an explicit constructor for the special case of empty
bitmaps, we should mention that the generic constructors rejects the
creation of empty bitmaps.
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Before the variable 'bits' was initialized with 0 (commit
3470cd860d), the following bug was
possible.
A function call with an empty bitmap leads to undefined
behavior. Because if 'bitmap->map_len == 0' 'unusedBits' will be <= 0
and 'sz == 1'. So the non global and non static variable 'bits' would
have never been set. Consequently the check 'bits == 0' results in
undefined behavior.
This patch clarifies the current version of the function by handling the
empty bitmap explicitly. Also, for an empty bitmap there is obviously no
bit set so we can just return -1 (indicating no bit set) right away. The
explicit check for 'bits == 0' after the loop is unnecessary because we
only get to this point if no set bit was found.
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
This is just a convenience method for discarding a list of filters instead of
using a 'for' loop everywhere. It is safe to pass -1 as the number of elements
in the list as well as passing NULL as list reference.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Provide a separate method to free a logging filter object. This will come handy
once a method to create an individual logging filter object is introduced.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This is just a convenience method for discarding a list of outputs instead of
using a 'for' loop everywhere. It is safe to pass -1 as the number of elements
in the list as well as passing NULL as list reference.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Provide a separate method to free a logging output object. This will come handy
once a method to create an individual logging output object is introduced.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Same as with outputs; since the operations will be further divided into smaller
tasks, creating a filter will become a separate operation that will return
a reference to a newly created filter.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Right now, we define outputs one after another. However, the correct flow
should be to define a set of outputs as a whole unit. Therefore each output
should be first created, placed into an array/list and the list will be
defined. Output creation should be a separate operation, so an output will be
returned by a reference. From that perspective, it makes perfect sense to
only store pointers to actual outputs.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In this particular case, reset is meant as clearing the whole list of
outputs/filters, not resetting it to a predefined default setting. Looking at
it from that perspective, returning the number of records removed doesn't help
the caller in any way (not that any of the callers would actually check for
it). Well, callers could detect an error from the number of successfully
removed records, but the only thing that can fail in virLogReset is force
closing a file descriptor in which case the error isn't propagated back to
virLogReset anyway. Conclusion: there is no practical use for having a return
type of 'int' rather than 'void' in this case.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This will apply to any IP address setting that uses
virNetDevIPInfoAddToDev() (which so far is only the guest-side of LXC
type='ethernet' interfaces).
(This patch had been pushed earlier in
commit cb20f989df, but was reverted in
commit cba06aea8d because it had been
accidentally pushed during the freeze for release 2.0.0)
The peer attribute is used to set the property of the same name in the
interface IP info:
<interface type='ethernet'>
...
<ip family='ipv4' address='192.168.122.5'
prefix='32' peer='192.168.122.6'/>
...
</interface>
Note that this element is used to set the IP information on the
*guest* side interface, not the host side interface - that will be
supported in an upcoming patch.
(This patch now has quite a history: it was originally pushed in
commit 690969af, which was subsequently reverted in commit 1d14b13f,
then reworked and pushed (along with a lot of other related/supporting
patches) in commit 93135abf1; however *that* commit had been
accidentally pushed during dev. freeze for release 2.0.0, so it was
again reverted in commit f6acf039f0).
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Laine Stump <laine@laine.org>
This patch takes the code out of
lxcContainerRenameAndEnableInterfaces() that adds all IP addresses and
IP routes to the interface, and puts it into a utility function
virNetDevIPInfoAddToDev() in virnetdevip.c so that it can be used by
anyone.
One small change in functionality -
lxcContainerRenameAndEnableInterfaces() previously would add all IP
addresses to the interface while it was still offline, then set the
interface online, and then add the routes. Because I don't want the
utility function to set the interface online, I've moved this up so
the interface is first set online, then IP addresses and routes are
added. This is the same order that the network service from
initscripts (in ifup-ether) does it, so it shouldn't pose any problem
(and hasn't, in the tests that I've run).
(This patch had been pushed earlier in commit
f1e0d0da11, but was reverted in commit
05eab47559 because it had been
accidentally pushed during the freeze for release 2.0.0)
In order to use more common code and set up for a future type, modify the
encryption secret to allow the "usage" attribute or the "uuid" attribute
to define the secret. The "usage" in the case of a volume secret would be
the path to the volume as dictated by the backwards compatibility brought
on by virStorageGenerateQcowEncryption where it set up the usage field as
the vol->target.path and didn't allow someone to provide it. This carries
into virSecretObjListFindByUsageLocked which takes the secret usage attribute
value from from the domain disk definition and compares it against the
usage type from the secret definition. Since none of the code dealing
with qcow/qcow2 encryption secrets uses usage for lookup, it's a mostly
cosmetic change. The real usage comes in a future path where the encryption
is expanded to be a luks volume and the secret will allow definition of
the usage field.
This code will make use of the virSecretLookup{Parse|Format}Secret common code.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Commit cf0568b0af moved a bunch of functions from virNetDev
to the more specific virNetDevIP; however, not all of the
existing uses were moved properly, causing build failures on
FreeBSD.
Complete the transition to the new names and drop the
obsolete declarations from the header file while at it.
Not including the header causes
util/virnetdevip.c:520:5: error:
unknown type name 'virCommandPtr'; did you mean 'virCondPtr'?
virCommandPtr cmd = NULL;
^~~~~~~~~~~~~
and plenty more similar failures when compiling on FreeBSD.
The peer attribute is used to set the property of the same name in the
interface IP info:
<interface type='ethernet'>
...
<ip family='ipv4' address='192.168.122.5'
prefix='32' peer='192.168.122.6'/>
...
</interface>
Note that this element is used to set the IP information on the
*guest* side interface, not the host side interface - that will be
supported in an upcoming patch.
(This is an updated *re*-commit of commit 690969af, which was
subsequently reverted in commit 1d14b13f).
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Laine Stump <laine@laine.org>
This patch takes the code out of
lxcContainerRenameAndEnableInterfaces() that adds all IP addresses and
IP routes to the interface, and puts it into a utility function
virNetDevIPInfoAddToDev() in virnetdevip.c so that it can be used by
anyone.
One small change in functionality -
lxcContainerRenameAndEnableInterfaces() previously would add all IP
addresses to the interface while it was still offline, then set the
interface online, and then add the routes. Because I don't want the
utility function to set the interface online, I've moved this up so
the interface is first set online, then IP addresses and routes are
added. This is the same order that the network service from
initscripts (in ifup-ether) does it, so it shouldn't pose any problem
(and hasn't, in the tests that I've run).
It makes more sense to have the logging at the lower level so other
callers can share the goodness.
While removing so much stuff from / touching so many lines in
lxcContainerRenameAndEnableInterfaces() (which used to have this
debug/error logging), label names were changed and it was updated to
use the now-more-common method of initializing ret to -1 (failure),
then setting to 0 right before the cleanup label.
There are currently two places in the domain where this combination is
used, and there is about to be another. This patch puts them together
for brevity and uniformity.
As with the newly-renamed virNetDevIPAddr and virNetDevIPRoute
objects, the new virNetDevIPInfo object will need to be accessed by a
utility function that calls low level Netlink functions (so we don't
want it to be in the conf directory) and will be called from multiple
hypervisor drivers (so it can't be in any hypervisor directory); the
most appropriate place is thus once again the util directory.
The parse and format functions are in conf/domain_conf.c because only
the domain XML (i.e. *not* the network XML) has this exact combination
of IP addresses plus routes. Note that virDomainNetIPInfoFormat() will
end up being the only caller to virDomainNetRoutesFormat() and
virDomainNetIPsFormat(), so it will just subsume those functions in a
later patch, but we can't do that until they are no longer called.
(It would have been nice to include the interface name within the
virNetDevIPInfo object (with a slight name change), but that can't
be done cleanly, because in each case the interface name is provided
in a different place in the XML relative to the routes and IP
addresses, so putting it in this object would actually make the code
more confused rather than simpler).
These functions all need to be called from a utility function that
must be located in the util directory, so we move them all into
util/virnetdevip.[ch] now that it exists.
Function and struct names were appropriately changed for the new
location, but all code is unchanged aside from motion and renaming.
This patch splits virnetdev.[ch] into multiple files, with the new
virnetdevip.[ch] containing all the functions related to setting and
retrieving IP-related info for a device (both addresses and routes).
Commit c9a641 (first appearred in 1.2.12) added support for setting
the guest-side IP address of veth devices in lxc domains.
Unfortunately, it hardcoded the assumption that the proper prefix for
any IP address with no explicit prefix in the config should be "24";
that is only correct for class C IPv4 addresses, but not for any other
IPv4 address, nor for any IPv6 address.
The good news is that there is already a function in libvirt that will
determine the proper default prefix for any IP address. This patch
replaces the use of the ill-fated VIR_SOCKET_ADDR_DEFAULT_PREFIX with
calls to virSocketAddrGetIPPrefix().
There are times when we don't have a netmask pointer to give to
virSocketAddrGetIPPrefix() (e.g. the IP addresses in domain interfaces
only have a prefix, no netmask), but it would have caused a segv if we
called it with NULL instead of a pointer to a netmask. This patch
qualifies the code that would use the netmask or address pointers to
check for NULL first.
I'm tired of mistyping this all the time, so let's do it the same all
the time (similar to how we changed all "Pci" to "PCI" awhile back).
(NB: I've left alone some things in the esx and vbox drivers because
I'm unable to compile them and they weren't obviously *not* a part of
some API. I also didn't change a couple of variables named,
e.g. "somethingIptables", because they were derived from the name of
the "iptables" command)
These had been declared in conf/device_conf.h, but then used in
util/virnetdev.c, meaning that we had to #include conf/device_conf.h
in virnetdev.c (which we have for a long time said shouldn't be done.
This caused a bigger problem when I tried to #include util/virnetdev.h
in a file in src/conf (which is allowed) - for some reason the
"device_conf.h: File not found" error.
The solution is to move the data types and functions used in util
sources from conf to util. Some names were adjusted during the move
("virInterface" --> "virNetDevIf", and "VIR_INTERFACE" -->
"VIR_NETDEV_IF")
virNetDevLinkDump should have been in virnetlink.c, but that file
didn't exist yet when the function was created. It didn't really
matter until now - I found that having virnetlink.h included by
virnetdev.h caused build problems when trying to #include virnetdev.h
in a .c file in src/conf (due to missing directory in -I). Rather than
fix that to further institutionalize the incorrect placement of this
one function, this patch moves the function.
Commit e81de04c switched virNetDevTapGetRealDeviceName() to use
virDirOpen() instead of opendir(), however it mistakenly dropped
DIR *dirp declaration, so restore that to fix build.
The version field historically has been a 4 byte data; however, an upcoming
new type will use a 2 byte version. So let's adjust for that now.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
In order to read 16 bits of data in the native format and convert add
the 16 bit macros to match existing 32 and 64 bit code.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This kvmGetMaxVCPUs() needs to be used at two different places
so move it to utils with appropriate name and mark it as private
global now.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
The directories we iterate over are unlikely to contain any entries
starting with a dot, other than '.' and '..' which is already skipped
by virDirRead.
Move to virsecret.c and rename to virSecretLookupParseSecret. Also convert
to usage xmlNodePtr and virXMLPropString rather than virXPathString.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the enum into a new src/util/virsecret.h, rename it to be
virSecretLookupType. Add a src/util/virsecret.h in order to perform
a couple of simple operations on the secret XML and virSecretLookupTypeDef
for clearing and copying.
This includes quite a bit of collateral damage, but the goal is to remove
the "virStorage*" and replace with the virSecretLookupType so that it's
easier to to add new lookups that aren't necessarily storage pool related.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since introduction of the DAC security driver we've documented that
seclabels with a leading + can be used with numerical uid. This would
not work though with the rest of libvirt if the uid was not actually
used in the system as we'd fail when trying to get a list of
supplementary groups for the given uid. Since a uid without entry in
/etc/passwd (or other user database) will not have any supplementary
groups we can treat the failure to obtain them as such.
This patch modifies virGetGroupList to not report the error for missing
users and makes it return an empty list or just the group specified in
@gid.
All callers will grant less permissions to a user in case of failure of
this function and thus this change is safe.
This will be used for the caller that needs to specify a separator.
Currently identical to virBitmapParse.
Also change one test case to use the new function.
Commit b3d069872c added peer address setting to the low level
virNetDevSetIPAddress() function, but ended up causing a segfault in
cases where the caller passed NULL for peer address.
Commit a3510e33d3 fixed the segfault, but managed to cause us to
skip setting the broadcast address when setting an interface's IP
address. The result is that the broadcast address is 0.0.0.0 for all
libvirt-created bridges (and interfaces in lxc containers with IP
addresses set by libvirt).
This was reported on the mailing list:
https://www.redhat.com/archives/libvir-list/2016-June/msg00027.html
but I was too busy to investigate at the time. I found it by accident
today while refactoring virNetDevSetIPAddress(). Since this regression
is present in the 1.3.5 release, I'm sending the bugfix as a separate
patch from my larger refactoring patchset.
In the auth config file, it is currently required to have
an entry for each hostname to connect to, eg
[auth-libvirt-prod1.example.com]
credentials=prod
This is inconvenient when there are large numbers of machines
all with the same credentials. Add support for a default
entry:
[auth-default]
credentials=prod
This function is plenty of ifdefs providing implementations for
Linux, *BSD and OS-X. However, if we are being build for any
other architecture, all that's left behind by preprocessor is
just a error reporting call and return of -1. In that case,
passed arguments are unused:
../../src/util/virhostcpu.c: In function 'virHostCPUGetInfo':
../../src/util/virhostcpu.c:966:33: error: unused parameter 'cpus' [-Werror=unused-parameter]
unsigned int *cpus,
^~~~
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The virQEMUDriverConfig object contains lists of
loader:nvram pairs to advertise firmwares supported by
by the driver, and qemu_conf.c contains code to populate
the lists, all of which is useful for other drivers too.
To avoid code duplication, introduce a virFirmware object
to encapsulate firmware details and switch the qemu driver
to use it.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
* Fix misspelt function name:
s/virHostCPUGetStatsFreebsd/virHostCPUGetStatsFreeBSD/
* Mark the first argument to virHostCPUGetInfo with ATTRIBUTE_UNUSED
as it's not actually used on non-Linux
Move all APIs with a virHostMEM name prefix out into new
util/virhostmem.h & util/virhostmem.c files
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move all APIs with a virHostCPU name prefix out into new
util/virhostcpu.h & util/virhostcpu.c files
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In preparation for moving all the CPU related APIs out of
the nodeinfo file, give them a virHostCPU name prefix.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In preparation for moving all the memory related APIs out of
the nodeinfo file, give them a virHostMem name prefix.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Nearly all the methods in the nodeinfo file are given a
'const char *sysfs_prefix' parameter to override the
default sysfs path (/sys/devices/system). Every single
caller passes in NULL for this, except one use in the
unit tests. Furthermore this parameter is totally
Linux-specific, when the APIs are intended to be cross
platform portable.
This removes the sysfs_prefix parameter and instead gives
a new method linuxNodeInfoSetSysFSSystemPath for use by
the test suite.
For two of the methods this hardcodes use of the constant
SYSFS_SYSTEM_PATH, since the test suite does not need to
override the path for thos methods.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The sd_notify method is used to tell systemd when libvirtd
has finished starting up. All it does is send a datagram
containing the string parameter to systemd on a UNIX socket
named in the NOTIFY_SOCKET environment variable. Rather than
pulling in the systemd libraries for this, just code the
notification directly in libvirt as this is a stable ABI
from systemd's POV which explicitly allows independant
implementations:
See "Reimplementable Independently" column in the
"$NOTIFY_SOCKET Daemon Notifications" row:
https://www.freedesktop.org/wiki/Software/systemd/InterfacePortabilityAndStabilityChart/
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1314881
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the module from qemu_command.c to a new module virqemu.c and
rename the API to virQEMUBuildObjectCommandline.
This API will then be shareable with qemu-img and the need to build
a security object for luks support.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When building using -Og, gcc sees that some variables can be used
uninitialized It can be debatable whether it is possible with our
codeflow, but functions should be self-contained and initializations are
always good. The return instead of goto is due to actualType being used
in the cleanup.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
So imagine the following. You connect read only to a daemon and
try to fetch stats for a shut off domain, e.g.:
virsh -r domstats $dom
but all of a sudden, virsh instead of printing the stats throws
the following error at you:
error: Disconnected from qemu:///system due to I/O error
error: End of file while reading data: Input/output error
The daemon crashed. This is its backtrace:
#0 0x00007fa43e3751a8 in virPerfEventIsEnabled (perf=0x0, type=VIR_PERF_EVENT_MBMT) at util/virperf.c:241
#1 0x00007fa424a9f042 in qemuDomainGetStatsPerf (driver=0x7fa3f4022a30, dom=0x7fa3f40e24c0, record=0x7fa41c000e20, maxparams=0x7fa4360b38d0, privflags=1) at qemu/qemu_driver.c:19110
#2 0x00007fa424a9f2e7 in qemuDomainGetStats (conn=0x7fa41c001b20, dom=0x7fa3f40e24c0, stats=127, record=0x7fa4360b3970, flags=1) at qemu/qemu_driver.c:19213
#3 0x00007fa424a9f672 in qemuConnectGetAllDomainStats (conn=0x7fa41c001b20, doms=0x7fa41c0017f0, ndoms=1, stats=127, retStats=0x7fa4360b3a50, flags=0) at qemu/qemu_driver.c:19303
#4 0x00007fa43e4e15f6 in virDomainListGetStats (doms=0x7fa41c0017f0, stats=0, retStats=0x7fa4360b3a50, flags=0) at libvirt-domain.c:11615
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f28d1a38700 (LWP 16154)]
0x00007f28da4fa1a8 in virPerfEventIsEnabled (perf=0x0, type=VIR_PERF_EVENT_MBMT) at util/virperf.c:241
241 return event->enabled;
Problem is, shut off domains don't have priv->perf allocated.
Therefore if in frame #1 qemuDomainGetStatsPerf() tries to check
if perf events are enabled, NULL is passed to
virPerfEventIsEnabled() which due to some incredible
implementation dereference it. Fix this by checking whether
passed object is not NULL.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This function is not used anywhere. Moreover, the code that would
use lives in virperf.c and therefore has access to the FD anyway.
Well, for instance virPerfReadEvent is doing just that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far, this function has just three callers. Two of them call
virNetDevSetupControl to create a socket that we can then
optionally use for ioctl() to fetch data. However, querying sysfs
is preferred. Therefore it doesn't make much sense to require
users to set up the socket if they don't even know it will be
used in favour of sysfs. We can set up the socket iff we need to.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Yet another one of those where signed int (or long int) is not
enough. And useless to as we're aiming at unsigned anyway.
../../src/util/virsocketaddr.c: In function 'virSocketAddrIsPrivate':
../../src/util/virsocketaddr.c:289:45: error: result of '192l << 24' requires 33 bits to represent, but 'long int' only has 32 bits [-Werror=shift-overflow=]
return ((val & 0xFFFF0000) == ((192L << 24) + (168 << 16)) ||
^~
../../src/util/virsocketaddr.c:290:45: error: result of '172l << 24' requires 33 bits to represent, but 'long int' only has 32 bits [-Werror=shift-overflow=]
(val & 0xFFF00000) == ((172L << 24) + (16 << 16)) ||
^~
cc1: all warnings being treated as errors
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Apparently, 1 << 31 is signed which in turn does not fit into
a signed integer variable:
../../include/libvirt/libvirt-domain.h:1881:57: error: result of '1 << 31' requires 33 bits to represent, but 'int' only has 32 bits [-Werror=shift-overflow=]
VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS = 1 << 31, /* enforce requested stats */
^~
cc1: all warnings being treated as errors
The solution is to make it an unsigned value. I've found only two
such occurrences in our code base.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit c8b1a83605 changed the function, making it
impossible for callers to be able to tell whether a
non-negative return value means "physical function
address found and parsed correctly" or "couldn't find
corresponding physical function".
The important difference between the two being that,
in the latter case, the returned pointer is NULL and
should never, ever be dereferenced.
In order to cope with these changes, the callers
have to be updated.
Move the logic from qemuDomainGenerateRandomKey into this new
function, altering the comments, variable names, and error messages
to keep things more generic.
NB: Although perhaps more reasonable to add soemthing to virrandom.c.
The virrandom.c was included in the setuid_rpc_client, so I chose
placement in vircrypto.
Introduce virCryptoHaveCipher and virCryptoEncryptData to handle
performing encryption.
virCryptoHaveCipher:
Boolean function to determine whether the requested cipher algorithm
is available. It's expected this API will be called prior to
virCryptoEncryptdata. It will return true/false.
virCryptoEncryptData:
Based on the requested cipher type, call the specific encryption
API to encrypt the data.
Currently the only algorithm support is the AES 256 CBC encryption.
Adjust tests for the API's
Seems recent versions of Coverity have (mostly) resolved the issue using
ternary operations in VIR_FREE (and now VIR_DISPOSE*) macros. So let's
just remove it and if necessary handle one off issues as the arise.
Rather than return 0/-1 and/or a pointer to some memory, adjust the
helper to just return the allocated structure or NULL on failure.
Adjust the callers in order to handle that
Signed-off-by: John Ferlan <jferlan@redhat.com>
If we get to the error: label and clear out the *virtual_functions[]
pointers and then return w/ error to the caller - the caller has it's
own cleanup of the same array in the out: label which is keyed off the
value of num_virt_fns, which wasn't reset to 0 in the called function
leading to a possible problem.
Just clear the value (found by Coverity)
Signed-off-by: John Ferlan <jferlan@redhat.com>
Some Intel processor families (e.g. the Intel Xeon processor E5 v3
family) introduced some RDT (Resource Director Technology) features
to monitor or control shared resource. Among these features, MBM
(Memory Bandwidth Monitoring), which is build on the CMT (Cache
Monitoring Technology) infrastructure, provides OS/VMM a way to
monitor bandwidth from one level of cache to another.
With current perf framework, this patch adds support to perf event
for MBM.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1331552
Instead of disabling auto-login of all scsi targets (even those
that do not "belong" to libvirt), use iscsiadm's "--op nonpersistent"
during discovery of iSCSI targets (e.g. "iscsiadm --mode discovery
--type sendtargets") in order to avoid the node database being altered
which led to the need for the "large hammer" approach taken by
commit id '3c12b654'.
This commit removes the virISCSITargetAutologin adjustment (eg. the setting
of node.startup to "manual"). The iscsiadm command has supported this mode
of operation as of commit id 'ad873767' to open-iscsi.
Utilize the exit status parameter for virCommandRunRegex in order to
check the return error from the 'iscsiadm --mode session' command.
Without this enabled, if there are no sessions running then virCommandRun
would have displayed an error such as:
2016-05-13 15:17:15.165+0000: 10920: error : virCommandWait:2553 :
internal error: Child process (iscsiadm --mode session)
unexpected exit status 21: iscsiadm: No active sessions.
It is possible that for certain paths (when probe is true) we only care
whether it's running or not to make certain decisions. Spitting out
the error for those paths is unnecessary.
If we do have a situation where probe = false and there's an error,
then display the error from iscsiadm if it's there.
Rather than have virCommandRun just spit out the error, allow callers
to decide to pass the exitstatus so the caller can make intelligent
decisions based on the error.
For a few cases where we handle secret information it's good to clear
the buffers containing sensitive data before freeing them.
Introduce VIR_DISPOSE, VIR_DISPOSE_N and VIR_DISPOSE_STRING that allow
simple clearing fo the buffers holding sensitive information on cleanup
paths.
Move some parts of virStorageFileRemoveLastPathComponent
into a separate function so they can be reused.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Both virGetLastError and virGetLastErrorMessage call virLastErrorObject method
that returns a thread-local error object. However, if a direct call to malloc
or pthread_setspecific (probably also due to malloc, since it sets ENOMEM)
fail, virLastErrorObject returns NULL which, although incorrectly interpreted
by virGetLastError as no error, still requires the caller to check for NULL
pointer. This isn't the case with virGetLastErrorMessage that also treated it
incorrectly as no error, but returned the literal "no error".
This patch tweaks the checks in the virGetLastErrorMessage function, so that
if virLastErrorObject failed, it returned "unknown error" which is equivalent
to the current approach with virGetLastError and if it returned NULL,
"unknown error" was set.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1318993
Commit id 'dd519a294' caused a regression cloning a volume into a
logical pool by removing just the 'allocation' adjustment during
storageVolCreateXMLFrom. Combined with the change to not require the
new volume input XML to have a capacity listed (commit id 'e3f1d2a8')
left the possibility that a zero allocation value (e.g., not provided)
would create a thin/sparse logical volume. When a thin lv becomes fully
populated, then LVM sets the partition 'inactive' and the subsequent
fdatasync() fails.
Add a new 'has_allocation' flag to be set at XML parse time to indicate
that allocation was provided. This is done so that if it's not provided
the create-from code uses the capacity value since we document that if
omitted, the volume will be fully allocated at time of creation.
For a logical backend, that creation time is 'createVol', while for a
file backend, creation doesn't set the size, but the 'createRaw' called
during buildVolFrom will decide whether the file is sparse or not based
on the provided capacity and allocation value.
For volume clones that provide different allocation and capacity values
to allow for sparse files, there is no change.
Usage of this keyword in front of function declaration that is exported via a
header file is unnecessary, since internally, this has been the default for most
compilers for quite some time.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
SRIOV VFs used in macvtap passthrough mode can take advantage of the
SRIOV card's transparent vlan tagging. All the code was there to set
the vlan tag, and it has been used for SRIOV VFs used for hostdev
interfaces for several years, but for some reason, the vlan tag for
macvtap passthrough devices was stubbed out with a -1.
This patch moves a bit of common validation down to a lower level
(virNetDevReplaceNetConfig()) so it is shared by hostdev and macvtap
modes, and updates the macvtap caller to actually send the vlan config
instead of -1.
Commit 0b36b0e9 broke polkit agent startup when attempting to fix a
coverity warning. Refactor it properly so that we don't need the 'cmd'
intermediate variable.
For disks sources described by a libvirt volume we don't need to do a
complicated check since virStorageTranslateDiskSourcePool already
correctly determines the actual disk type.
Replace the checks using a new accessor that does not open-code the
whole logic.
Fron c3bd0019c0 on instead of creating the following path for
cgroups:
/sys/fs/cgroupX/$name.libvirt-$driver
we generate rather more verbose one:
/sys/fs/cgroupX/$driver-$id-$name.libvirt-$driver
where $name is optional and included iff contains allowed chars.
See original commit for more reasoning. Now, problem with the
original commit is that we are unable to start any LXC domain
after it. Because when starting LXC container, the CGroup layout
is created by our lxc_controller process and then detected and
validated by libvirtd. The validation is done by trying to match
detected layout against all the possible patterns for cgroup
paths that we've ever had. And the commit in question forgot to
update this part of the code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
json_reformat uses two spaces for when indenting nested objects, let's
do the same. The result of virJSONValueToString will be exactly the same
as json_reformat would produce.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Our socket address format is in a rather non-standard format and that is
because sasl library requires the IP address and service to be delimited by a
semicolon. The string form is a completely internal matter, however once the
admin interfaces to retrieve client identity information are merged, we should
return the socket address string in a common format, e.g. format defined by
URI rfc-3986, i.e. the IP address and service are delimited by a colon and
in case of an IPv6 address, square brackets are added:
Examples:
127.0.0.1:1234
[::1]:1234
This patch changes our default format to the one described above, while adding
separate methods to request the non-standard SASL format using semicolon as a
delimiter.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Just like with server-related APIs, before any of client-based APIs can be
called, a reference to a client-side client object needs to be obtained. For
this purpose, a lookup method should exist. Apart from the client retrieval
logic, a new error code for non-existent client had to be added as well.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
We had both and the only difference was that the latter also included
information about multifunction setting. The problem with that was that
we couldn't use functions made for only one of the structs (e.g.
parsing). To consolidate those two structs, use the one in virpci.h,
include that in domain_conf.h and add the multifunction member in it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
$ echo -n 'log_level=1' > ~/.config/libvirt/libvirtd.conf
$ libvirtd --timeout=10
2014-10-10 10:30:56.394+0000: 6626: info : libvirt version: 1.1.3.6, package: 1.fc20 (Fedora Project, 2014-09-08-17:50:42, buildvm-05.phx2.fedoraproject.org)
2014-10-10 10:30:56.394+0000: 6626: error : main:1261 : Can't load config file: configuration file syntax error: /home/rjones/.config/libvirt/libvirtd.conf:1: expecting a value: /home/rjones/.config/libvirt/libvirtd.conf
Rather than try to fix this in the depths of the parser, just catch
the case when a config file doesn't end in a newline, and manually
append a newline to the content before parsing
https://bugzilla.redhat.com/show_bug.cgi?id=1151409
This an ubuntu/debian packaging convention. At one point it may have
been an actually different binary, but at least as of ubuntu precise
(the oldest supported ubuntu distro, released april 2012) kvm-img is
just a symlink to qemu-img for back compat.
I think it's safe to drop support for it
QEMU introduced the query-gic-capabilities QMP command
with commit 4468d4e0f383: use the command, if available,
to probe available GIC capabilities.
The information obtained is stored in a virQEMUCaps
instance, and will be later used to fill in a
virDomainCaps instance.
So in glibc-2.23 sys/sysmacros.h is no longer included from sys/types.h
and we don't build because of the usage of major/minor/makedev macros.
Autoconf already has AC_HEADER_MAJOR macro that check where exactly
these functions/macros are defined, so let's use that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since threadpool increments the current number of threads according to current
load, i.e. how many jobs are waiting in the queue. The count however, is
constrained by max and min limits of workers. The logic of this new API works
like this:
1) setting the minimum
a) When the limit is increased, depending on the current number of
threads, new threads are possibly spawned if the current number of
threads is less than the new minimum limit
b) Decreasing the minimum limit has no possible effect on the current
number of threads
2) setting the maximum
a) Icreasing the maximum limit has no immediate effect on the current
number of threads, it only allows the threadpool to spawn more
threads when new jobs, that would otherwise end up queued, arrive.
b) Decreasing the maximum limit may affect the current number of
threads, if the current number of threads is less than the new
maximum limit. Since there may be some ongoing time-consuming jobs
that would effectively block this API from killing any threads.
Therefore, this API is asynchronous with best-effort execution,
i.e. the necessary number of workers will be terminated once they
finish their previous job, unless other workers had already
terminated, decreasing the limit to the requested value.
3) setting priority workers
- both increase and decrease in count of these workers have an
immediate impact on the current number of workers, new ones will be
spawned or some of them get terminated respectively.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In order for the client to see all thread counts and limits, current total
and free worker count getters need to be introduced. Client might also be
interested in the job queue length, so provide a getter for that too. As with
the other getters, preparing for the admin interface, mutual exclusion is used
within all getters.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
So far, the values the affected getters retrieve are static, i.e. there's no
way of changing them during runtime. But admin interface will later enable
not only getting but changing them as well. So to prevent phenomenons like
torn reads or concurrent reads and writes of unaligned values, use mutual
exclusion when getting these values (writes do, understandably, use them
already).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When either creating a threadpool, or creating a new thread to accomplish a job
that had been placed into the jobqueue, every time thread-specific data need to
be allocated, threadpool needs to be (re)-allocated and thread count indicators
updated. Make the code clearer to read by compressing these operations into a
more complex one.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
virTypedParamsValidate currently uses an index based check to find
duplicate parameters. This check does not work. Consider the following
simple example:
We have only 2 keys
A (multiples allowed)
B (multiples NOT allowed)
We are given the following list of parameters to check:
A
A
B
If you work through the validation loop you will see that our last iteration
through the loop has i=2 and j=1. In this case, i > j and keys[j].value.i will
indicate that multiples are not allowed. Both conditionals are satisfied so
an incorrect error will be given: "parameter '%s' occurs multiple times"
This patch replaces the index based check with code that remembers
the name of the last parameter seen and only triggers the error case if
the current parameter name equals the last one. This works because the
list is sorted and duplicate parameters will be grouped together.
In reality, we hit this bug while using selective block migration to migrate
a guest with 5 disks. 5 was apparently just the right number to push i > j
and hit this bug.
virsh migrate --live guestname --copy-storage-all
--migrate-disks vdb,vdc,vdd,vde,vdf
qemu+ssh://dsthost/system
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com>
In a few places in libvirt we busy-wait for events, for example qemu
creating a monitor socket. This is problematic because:
- We need to choose a sufficiently small polling period so that
libvirt doesn't add unnecessary delays.
- We need to choose a sufficiently large polling period so that
the effect of busy-waiting doesn't affect the system.
The solution to this conflict is to use an exponential backoff.
This patch adds two functions to hide the details, and modifies a few
places where we currently busy-wait.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Refreshes meta-information such as allocation, capacity, format, etc.
Ploop volumes differ from other volume types. Path to volume is the path
to directory with image file root.hds and DiskDescriptor.xml.
https://openvz.org/Ploop/format
Due to this fact, operations of opening the volume have to be done once
again. get the information.
To decide whether the given volume is ploops one, it is necessary to check
the presence of root.hds and DiskDescriptor.xml files in volumes' directory.
Only in this case the volume can be manipulated as the ploops one.
Such strategy helps us to resolve problems that might occure, when we
upload some other volume type from ploop source.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
virSocketAddrFormat() wants a single pointer, not a double pointer.
Fixes the following compilation error on FreeBSD:
util/virnetdev.c:1448:72: error: incompatible pointer types passing
'virSocketAddr **' to parameter of type 'const virSocketAddr *';
remove & [-Werror,-Wincompatible-pointer-types]
if (VIR_SOCKET_ADDR_VALID(peer) && !(peerstr = virSocketAddrFormat(&peer)))
^~~~~
./util/virsocketaddr.h:92:48: note: passing argument to parameter 'addr' here
char *virSocketAddrFormat(const virSocketAddr *addr);
^
FreeBSD lacks ENODATA, and viruuid.c redefines it to EIO, but it's not
actually using it. On the other hand, we have virrandom.c that's using
ENODATA. So make this re-definition common by moving it to internal.h,
so all the current and possible future users don't need to care about
that.
Using the existing virUUIDGenerateRandomBytes, move API to virrandom.c
rename it to virRandomBytes and add it to libvirt_private.syms.
This will be used as a fallback for generating a domain master key.
This reverts commit ee4cfb5643.
Since we're still not persisting our bookkeeping lists across
daemon restarts, we might have lost some information
virPCIDeviceReattach() relies on, for example whether the
device needs to be unbound from the stub driver.
As a result, if the daemon has been restarted in the meantime,
the device might end up remaining bound to the stub driver even
after 'virsh nodedev-reattach' or similar has been called, with
no way of giving it back to the host short of messing with
sysfs behind libvirt's back.
Revert back to the previous behavior of always trying to bind
the device to the host driver, regardless of its status when it
was detached, until persistent bookkeeping lists have been
implemented.
Do I really need to explain why?
Well, if read() is interrupted int the middle of reading, we will
never read the rest (even though it's highly unlikely as we are
reading just 8 bytes).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have @flags we can support changing perf events just
in active or inactive configuration regardless of the other.
Previously, calling virDomainSetPerfEvents set events in both
active and inactive configuration at once. Even though we allow
users to set perf events that are to be enabled once domain is
started up. The virDomainGetPerfEvents API was flawed too. It
returned just runtime info.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In some cases it's impractical to use the regular APIs as the bitmap
size needs to be pre-declared. These new APIs allow to use bitmaps that
self expand.
The new code adds a property to the bitmap to track the allocation of
memory so that VIR_RESIZE_N can be used.
This patch implement a set of interfaces for perf event. Based on
these interfaces, we can implement internal driver API for perf,
and get the results of perf conuter you care about.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Message-id: 1459171833-26416-4-git-send-email-qiaowei.ren@intel.com
After the patches that added tracking of in-use macvtap names (commit
370608, first appearing in libvirt-1.3.2), if the function to allocate
a new macvtap device came to a device name created outside libvirt, it
would retry the same device name MACVLAN_MAX_ID (8191) times before
finally giving up in failure.
The problem was that virBitmapNextClearBit was always being called
with "0" rather than the value most recently checked (which would
increment each time through the loop), so it would always return the
same id (since we dutifully release that id after failing to create a
new device using it).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1321546
Signed-off-by: Laine Stump <laine@laine.org>
Instead of forcing the values for the unbind_from_stub, remove_slot
and reprobe properties, look up the actual device and use that when
calling virPCIDeviceReattach().
This ensures the device is restored to its original state after
reattach: for example, if it was not bound to any driver before
detach, it will not be bound forcefully during reattach.
We would be just fine looking up the information in pcidevs most
of the time; however, some corner cases would not be handled
properly, so look up the actual device instead.
After this patch, ownership of virPCIDevice instances is very easy
to keep track of: for each host PCI device, the only instance that
actually matters is the one inside one of the bookkeeping list.
Whenever some operation needs to be performed on a PCI device, the
actual device is looked up first; when this is not the case, a
comment explains the reason.
Unmanaged devices, as the name suggests, are not detached
automatically from the host by libvirt before being attached to a
guest: it's the user's responsability to detach them manually
beforehand. If that preliminary step has not been performed, the
attach operation can't complete successfully.
Instead of relying on the lower layers to error out with cryptic
messages such as
error: Failed to attach device from /tmp/hostdev.xml
error: Path '/dev/vfio/12' is not accessible: No such file or directory
prevent the situation altogether and provide the user with a more
useful error message.
Unmanaged devices are attached to guests in two steps: first,
the device is detached from the host and marked as inactive;
subsequently, it is marked as active and attached to the guest.
If the daemon is restarted between these two operations, we lose
track of the inactive device.
Steps 5 and 6 of virHostdevPreparePCIDevices() already subtly
take care of this situation, but some planned changes will make
it so that's no longer the case. Plus, explicit is always better
than implicit.
This allows setting the address in host and/or network order and makes
the naming consistent. Now you don't need to call [hn]to[nh]l()
functions as that is taken care of by these functions. Also, now
the *NetOrder take the address in network order, the other functions in
host order so the naming and usage is consistent. Some places were
having the address in network order and calling ntohl() just so the
original function can call htonl() again. This makes it nicer to read.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If we expose this information, which is one byte in every PCI config
file, we let all mgmt apps know whether the device itself is an endpoint
or not so it's easier for them to decide whether such device can be
passed through into a VM (endpoint) or not (*-bridge).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1317531
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The implementation is pretty straightforward. Moreover, because
of the nature of things, gethostbyname_r and gethostbyname2_r can
be implemented at the same time too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is a missing counterpart for virSocketAddrSetIPv4Addr()
and is going to be needed later in the tests.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This function is going to be used later in such context where the
argument makes no sense. Teach this function to cope with that
instead of the caller having to deal with passing some dummy
argument.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
These functions are going to be reused very shortly. So instead
of duplicating the code, lets move them into utils module.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We include the file in plenty of places. This is mostly due to
historical reasons. The only place that needs something from the
header file is storage_backend_fs which opens _PATH_MOUNTED. But
it gets the file included indirectly via mntent.h. At no other
place in our code we need _PATH_.*. Drop the include and
configure check then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Refactor series 0b231195 worked with virLogDestination type which, depending
on the compiler, might be (and probably will be) an unsigned data type.
However, virEnumFromString may return -1 in case of an error. So, when enum
happens to be unsigned, some compilers will naturally complain about foo:
'if (foo < 0)'
The problem with the original virLogParseOutputs method was that the way it
parsed the input, walking the string char by char and using absolute jumps
depending on the virLogDestination type, was rather complicated to read.
This patch utilizes virStringSplit method twice, first time to filter out any
spaces and split the input to individual log outputs and then for each
individual output to tokenize it by to the parts according to our
PRIORITY:DESTINATION?(:DATA) format. Also, to STREQLEN for matching destination
was replaced with virDestinationTypeFromString call.
In order to refactor the ugly virLogParseOutputs method, this is a neat way of
finding out whether the destination type (in the form of a string) user
provided is a valid one. As a bonus, if it turns out it is valid, we get the
actual enum which will later be passed to any of virLogAddOutput methods right
away.
These comments explain the difference between a virPCIDevice
instance used for lookups and an actual device instance; some
information is also provided for specific uses.
virHostdevGetPCIHostDeviceList() is similar but does not filter out
devices that are not in the active list; that said, we are looking
up the device in the active list just a few lines after anyway, so
we might as well just keep a single function around.
This also helps stress the fact the objects contained in pcidevs are
only for looking up the actual devices, which is something later
commits will make even more explicit.
We're in the hostdev module, so mgr is not an ambiguous name, and
in fact it's already used in some cases. Switch all the code over.
Take the chance to shorten declaration of
virHostdevIsPCINodeDeviceUsedData structures.
When we want to look up a device in a device list and we already
have the IDs from another source, we can simply use
virPCIDeviceListFindByIDs() instead of creating a temporary device
object.
If 'last_processed_hostdev_vf != -1' is false then, since the
loop counter 'i' starts at 0, 'i <= last_processed_hostdev_vf'
can't possibly be true and the loop body will never be executed.
However, since 'i' is unsigned and 'last_processed_hostdev_vf'
is signed, we can't just get rid of the check completely; what
we can do is move it outside of the loop to avoid checking its
value on every iteration and cluttering the actual loop
condition.
This serves the same purpose as VIR_ERR_NO_xxx where xxx is any object
that API can be called upon. Only this particular one is used for
daemon's servers.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The comment claimed that virPCIDeviceReattach() does not reattach
a device to the host driver; except it actually does, so the
comment is just confusing and we're better off removing it.
Replace the term "loop" with the more generic "step". This allows us
to be more flexible and eg. have a step that consists in a single
function call.
Don't include the number of steps in the first comment of the
function, so that we can add or remove steps without having to worry
about keeping that comment in sync.
For the same reason, remove the summary contained in that comment.
Clean up some weird vertical spacing while we're at it.
While trying to build with -Os I've encountered some build
failures.
util/vircommand.c: In function 'virCommandAddEnvFormat':
util/vircommand.c:1257:1: error: inlining failed in call to 'virCommandAddEnv': call is unlikely and code size would grow [-Werror=inline]
virCommandAddEnv(virCommandPtr cmd, char *env)
^
util/vircommand.c:1308:5: error: called from here [-Werror=inline]
virCommandAddEnv(cmd, env);
^
This function is big enough for the compiler to be not inlined.
This is the error message I'm seeing:
Then virDomainNumatuneNodeSpecified is exported and called from
other places. It shouldn't be inlined then.
In file included from network/bridge_driver_platform.h:30:0,
from network/bridge_driver_platform.c:26:
network/bridge_driver_linux.c: In function 'networkRemoveRoutingFirewallRules':
./conf/network_conf.h:350:1: error: inlining failed in call to 'virNetworkDefForwardIf.constprop': call is unlikely and code size would grow [-Werror=inline]
virNetworkDefForwardIf(const virNetworkDef *def, size_t n)
^
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
More fallout from changing to using virPolkitAgent and handling error
paths. Needed to clear the 'cmd' once stored and of course add the
virCommandFree(cmd) in the error: label.
In virPolkitAgentCreate neglected to initialize agent to NULL. If
there was an error in the pipe, then we jump to error and would have
an issue. Found by coverity.
qemuProcessSetupEmulator runs at a point in time where there is only
the qemu main thread. Use virCgroupAddTask to put just that one task
into the emulator cgroup. That patch makes virCgroupMoveTask and
virCgroupAddTaskStrController obsolete.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Introduce virPolkitAgentCreate and virPolkitAgentDestroy
virPolkitAgentCreate will run the polkit pkttyagent image as an asynchronous
command in order to handle the local agent authentication via stdin/stdout.
The code makes use of the pkttyagent --notify-fd mechanism to let it know
when the agent is successfully registered.
virPolkitAgentDestroy will close the command effectively reaping our
child process
When there isn't a ssh -X type session running and a user has not
been added to the libvirt group, attempts to run 'virsh -c qemu:///system'
commands from an otherwise unprivileged user will fail with rather
generic or opaque error message:
"error: authentication failed: no agent is available to authenticate"
This patch will adjust the error code and message to help reflect the
situation that the problem is the requested mechanism is UNAVAILABLE and
a slightly more descriptive error. The result on a failure then becomes:
"error: authentication unavailable: no polkit agent available to
authenticate action 'org.libvirt.unix.manage'"
A bit more history on this - at one time a failure generated the
following type message when running the 'pkcheck' as a subprocess:
"error: authentication failed: polkit\56retains_authorization_after_challenge=1
Authorization requires authentication but no agent is available."
but, a patch was generated to adjust the error message to help provide
more details about what failed. This was pushed as commit id '96a108c99'.
That patch prepended a "polkit: " to the output. It really didn't solve
the problem, but gave a hint.
After some time it was deemed using DBus API calls directly was a
better way to go (since pkcheck calls them anyway). So, commit id
'1b854c76' (more or less) copied the code from remoteDispatchAuthPolkit
and adjusted it. Then commit id 'c7542573' adjusted the remote.c
code to call the new API (virPolkitCheckAuth). Finally, commit id
'308c0c5a' altered the code to call DBus APIs directly. In doing
so, it reverted the failing error message to the generic message
that would have been received from DBus anyway.
Use virCgroupAddTaskController in virCgroupAddTask so we have one
single point where we add tasks to cgroups.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
The virHostdevIsVirtualFunction() was called exactly twice, and in
both cases the return value was saved to a temporary variable before
being checked. This would be okay if it improved readability, but in
this case is pretty pointless.
Get rid of the temporary variable and check the return value
directly; while at it, change the check from '<= 0' to '!= 1' to
align it with the way other similar *IsVirtualFunction() functions
are used thorough the code.
virNetDevIsVirtualFunction() returns 1 if the interface is a
virtual function, 0 if it isn't and -1 on error. This means that,
despite the name suggesting otherwise, using it as a predicate is
not correct.
Fix two callers that were doing so adding an explicit check on
the return value.
It may be useful in some cases to call TristateSwitch helper with TristateBool.
Document that enum values equivalency in the code.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
GIC v2 is the default, but checking against that specific version when
we want to know whether the default has been selected is potentially
error prone; using an alias instead makes it safer.
In cf113e8d we changed the declaration of
virCgroupAllowDevicePath() and virCgroupDenyDevicePath().
However, while updating the stub for non-cgroup platforms for the
former we forgot to update the latter too causing a build
failure.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The method will now return 0 on success and -1 on error, rather than number of
items which it iterated over before it returned back to the caller. Since the
only place where we actually check the number of elements iterated is in
virhashtest, return value of 0 and -1 can be a pretty accurate hint that it
iterated over all the items. However, if we really want to know the number of
items iterated over (like virhashtest does), a counter has to be provided
through opaque data to each iterator call. This patch adjusts return value of
virHashForEach, refactors the body, so it returns as soon as one of the
iterators fail and adjusts virhashtest to reflect these changes.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Our existing virHashForEach method iterates through all items disregarding the
fact, that some of the iterators might have actually failed. Errors are usually
dispatched through an error element in opaque data which then causes the
original caller of virHashForEach to return -1. In that case, virHashForEach
could return as soon as one of the iterators fail. This patch changes the
iterator return type and adjusts all of its instances accordingly, so the
actual refactor of virHashForEach method can be dealt with later.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When adding disk images to ACL we may call those functions on NFS
shares. In that case we might get an EACCES, which isn't really relevant
since NFS would not hold a block device. This patch adds a flag that
allows to stop reporting an error on EACCES to avoid spaming logs.
Currently there's no functional change.
Since commit 47e5b5ae virCgroupAllowDevice allows to pass -1 as either
the minor or major device number and it automatically uses '*' in place
of that. Reuse the new approach through the code and drop the duplicated
functions.
We currently blindly accept any numeric value as a GIC version, even
though only GIC v2 and GIC v3 actually exist; on the other hand, we
reject "host", which is a perfectly legitimate value for QEMU guests.
This new enumeration contains all GIC versions libvirt is aware of.
The existing log messages for this have several problems; there are
two lines of log when one will suffice, they duplicate the function
name in log message (when it's already included by VIR_DEBUG), they're
missing some useful bits, they get logged even when the call is a NOP.
This patch cleans up the problems with those existing logs, and also
adds a new VIR_INFO-level log down at the function that is actually
creating and sending the netlink message that logs *everything* going
into the netlink message (which turns out to be much more useful in
practice for me; I didn't want to eliminate the logs at the existing
location though, in case they are useful in some scenario I'm
unfamiliar with; anyway those logs are remaining at debug level, so it
shouldn't be a bother to anyone).
Apparently we are not the only ones with dumb free functions
because dbus_message_unref() does not accept NULL either. But if
I were to vote, this one is even more evil. Instead of returning
an error just like we do it immediately dereference any pointer
passed and thus crash you app. Well done DBus!
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f878ebda700 (LWP 31264)]
0x00007f87be4016e5 in ?? () from /usr/lib64/libdbus-1.so.3
(gdb) bt
#0 0x00007f87be4016e5 in ?? () from /usr/lib64/libdbus-1.so.3
#1 0x00007f87be3f004e in dbus_message_unref () from /usr/lib64/libdbus-1.so.3
#2 0x00007f87bf6ecf95 in virSystemdGetMachineNameByPID (pid=9849) at util/virsystemd.c:228
#3 0x00007f879761bd4d in qemuConnectCgroup (driver=0x7f87600a32a0, vm=0x7f87600c7550) at qemu/qemu_cgroup.c:909
#4 0x00007f87976386b7 in qemuProcessReconnect (opaque=0x7f87600db840) at qemu/qemu_process.c:3386
#5 0x00007f87bf6edfff in virThreadHelper (data=0x7f87600d5580) at util/virthread.c:206
#6 0x00007f87bb602334 in start_thread (arg=0x7f878ebda700) at pthread_create.c:333
#7 0x00007f87bb3481bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) frame 2
#2 0x00007f87bf6ecf95 in virSystemdGetMachineNameByPID (pid=9849) at util/virsystemd.c:228
228 dbus_message_unref(reply);
(gdb) p reply
$1 = (DBusMessage *) 0x0
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The virStringListLength function does not ever modify the passed
string list. It merely counts the items in it. Make sure that we
reflect this bit in the function header.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(crobinso: fix up spacing and squash in sheepdog bit suggested
by Andrea)
In the commit 7938b533 we've changed the function signature,
however forgot to update stump that's used on systems without
CGroups causing a build failure.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I've noticed that variable @reply is not initialized and if
something at the beginning of the function fails, e.g.
virDBusGetSystemBus(), the control jump straight to cleanup label
where dbus_message_unref() is then called over this uninitialized
variable.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I've noticed couple of warning in dmesg while debugging
something:
[ 9683.973754] HTB: quantum of class 10001 is big. Consider r2q change.
[ 9683.976460] HTB: quantum of class 10002 is big. Consider r2q change.
I've read the HTB documentation and linux kernel code to find out
what's wrong. Basically we need to pass another argument
"quantum" to our tc cmd line because the default computed by HTB
does not always work in which case the warning message is printed
out.
You can read more details here:
http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm#sharing
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So, systemd-machined has this philosophy that machine names are like
hostnames and hence should follow the same rules. But we always allowed
international characters in domain names. Thus we need to modify the
machine name we are passing to systemd.
In order to change some machine names that we will be passing to systemd,
we also need to call TerminateMachine at the end of a lifetime of a
domain. Even for domains that were started with older libvirt. That
can be achieved thanks to virSystemdGetMachineNameByPID(). And because
we can change machine names, we can get rid of the inconsistent and
pointless escaping of domain names when creating machine names.
So this patch modifies the naming in the following way. It creates the
name as <drivername>-<id>-<name> where invalid hostname characters are
stripped out of the name and if the resulting name is longer, it
truncates it to 64 characters. That way we can start domains we
couldn't start before. Well, at least on systemd.
To make it work all together, the machineName (which is needed only with
systemd) is saved in domain's private data. That way the generation is
moved to the driver and we don't need to pass various unnecessary
arguments to cgroup functions.
The only thing this complicates a bit is the scope generation when
validating a cgroup where we must check both old and new naming, so a
slight modification was needed there.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Same as for deserializer, this method might get handy for admin one day.
The major reason for this patch is to stay consistent with idea, i.e.
when deserializer can be shared, why not serializer as well. The only
problem to be solved was that the daemon side serializer uses a code
snippet which handles sparse arrays returned by some APIs as well as
removes any string parameters that can't be returned to older clients.
This patch makes of the new virTypedParameterRemote datatype introduced
by one of the pvious patches.
Since the method is static to remote_driver, it can't even be used by our
daemon. Other than that, it would be useful to be able to use it with admin as
well. This patch uses the new virTypedParameterRemote datatype introduced in
one of previous patches.
Currently, the deserializer is hardcoded into remote_driver which makes
it impossible for admin to use it. One way to achieve a shared implementation
(besides moving the code to another module) would be pass @ret_params_val as a
void pointer as opposed to the remote_typed_param pointer and add a new extra
argument specifying which of those two protocols is being used and typecast
the pointer at the function entry. An example from remote_protocol:
struct remote_typed_param_value {
int type;
union {
int i;
u_int ui;
int64_t l;
uint64_t ul;
double d;
int b;
remote_nonnull_string s;
} remote_typed_param_value_u;
};
typedef struct remote_typed_param_value remote_typed_param_value;
struct remote_typed_param {
remote_nonnull_string field;
remote_typed_param_value value;
};
That would leave us with a bunch of if-then-elses that needed to be used across
the method. This patch takes the other approach using the new datatype
introduced in one of earlier commits.
Both admin and remote protocols define their own types
(remote_typed_param vs admin_typed_param). Because of the naming convention,
admin typed params wouldn't be able to reuse the serialization/deserialization
methods, which are tailored for use by remote protocol, even if those method
were exported properly. In that case, introduce a new internal data type
structurally copying both admin and remote protocols which, eventually, would
allow serializer and deserializer to be used in a more generic way.
Use 'ret' for return variable name, clarify use of 'param_idx' and avoid
unnecessary 'success' label. No functional changes. Also document the
function.
The affected functions are:
virPCIDeviceGetManaged()
virPCIDeviceGetUnbindFromStub()
virPCIDeviceGetRemoveSlot()
virPCIDeviceGetReprobe()
Change their return type from unsigned int to bool: the corresponding
members in struct _virPCIDevice are defined as bool, and even the
corresponding virPCIDeviceSet*() functions take a bool value as input
so there's no point in these functions having unsigned int as return
type.
Suggested-by: John Ferlan <jferlan@redhat.com>
Unbinding a PCI device from the stub driver can require several steps,
and it can be useful for debugging to be able to trace which of these
steps are performed and which are skipped for each device.
The name is confusing, and there are just two uses: one is a test case,
and the other will be removed as part of an upcoming refactoring of
the hostdev code.
Commit 871e10f fixed a memory corruption error, but called strlen()
twice on the same string to do so. Even though the compiler is
probably smart enough to optimize the second call away, having a
single invocation makes the code slightly cleaner.
Suggested-by: Michal Privoznik <mprivozn@redhat.com>
In 370608b4c7 we have introduced two new internal APIs.
However, there are no stubs for build without macvtap. Therefore
build on systems lacking macvtap support (e.g. mingw or freebds)
fails when trying to link.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
libvirtd crashes on free()ing portData for an open vswitch port if that port
was deleted. To reproduce:
ovs-vsctl del-port vnet0
virsh migrate --live kvm1 qemu+ssh://dstHost/system
Error message:
libvirtd: *** Error in `/usr/sbin/libvirtd': free(): invalid pointer: 0x000003ff90001e20 ***
The problem is that virCommandRun can return an empty string in the event that
the port being queried does not exist. When this happens then we are
unconditionally overwriting a newline character at position strlen()-1. When
strlen is 0, we overwrite memory that does not belong to the string.
The fix: Only overwrite the newline if the string is not empty.
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
This patch creates two bitmaps, one for macvlan device names and one
for macvtap. The bitmap position is used to indicate that libvirt is
currently using a device with the name macvtap%d/macvlan%d, where %d
is the position in the bitmap. When requested to create a new
macvtap/macvlan device, libvirt will now look for the first clear bit
in the appropriate bitmap and derive the device name from that rather
than just starting at 0 and counting up until one works.
When libvirtd is restarted, the qemu driver code that reattaches to
active domains calls the appropriate function to "re-reserve" the
device names as it is scanning the status of running domains.
Note that it may seem strange that the retry counter now starts at
8191 instead of 5. This is because we now don't do a "pre-check" for
the existence of a device once we've reserved it in the bitmap - we
move straight to creating it; although very unlikely, it's possible
that someone has a running system where they have a large number of
network devices *created outside libvirt* named "macvtap%d" or
"macvlan%d" - such a setup would still allow creating more devices
with the old code, while a low retry max in the new code would cause a
failure. Since the objective of the retry max is just to prevent an
infinite loop, and it's highly unlikely to do more than 1 iteration
anyway, having a high max is a reasonable concession in order to
prevent lots of new failures.
In the following cases nl_recv() was returning the error "No buffer
space available":
* When switching CPUs to offline/online in a system more than 128 cpus
* When using virsh to destroy domain in a system with many interfaces
This patch sets the buffer size for all netlink sockets created by
libnl to 128K and turns on message peeking for nl_recv(). This
eliminates the "No buffer space available" errors seen in the cases
above, and also preempts other future errors the smaller buffers could
have caused.
Signed-off-by: Leno Hou <houqy@linux.vnet.ibm.com>
Signed-off-by: Laine Stump <laine@laine.org>
In dc576025c3 we renamed virCgroupIsolateMount function to
virCgroupBindMount. However, we forgot about one occurrence in
section of the code which provides stubs for platforms without
support for CGroups like *BSD for instance.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On the host when we start a container, it will be
placed in a cgroup path of
/machine.slice/machine-lxc\x2ddemo.scope
under /sys/fs/cgroup/*
Inside the containers' namespace we need to setup
/sys/fs/cgroup mounts, and currently will bind
mount /machine.slice/machine-lxc\x2ddemo.scope on
the host to appear as / in the container.
While this may sound nice, it confuses applications
dealing with cgroups, because /proc/$PID/cgroup
now does not match the directory in /sys/fs/cgroup
This particularly causes problems for systems and
will make it create repeated path components in
the cgroup for apps run in the container eg
/machine.slice/machine-lxc\x2ddemo.scope/machine.slice/machine-lxc\x2ddemo.scope/user.slice/user-0.slice/session-61.scope
This also causes any systemd service that uses
sd-notify to fail to start, because when systemd
receives the notification it won't be able to
identify the corresponding unit it came from.
In particular this break rabbitmq-server startup
Future kernels will provide proper cgroup namespacing
which will handle this problem, but until that time
we should not try to play games with hiding parent
cgroups.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
libvirt always resets the MAC address of the physdev used for macvtap
passthrough when the guest is finished with it. This was happening
prior to the 802.1Qb[gh] DISASSOCIATE command, and was quite often
failing, presumably because the driver wouldn't allow the MAC address
to be reset while the association was still active, with a log message
like this:
virNetDevSetMAC:168 : Cannot set interface MAC to 00:00:00:00:00:00 on 'eth13': Cannot assign requested address
This patch changes the order - we now do the 802.1Qb[gh] disassociate
and delete the macvtap interface first, then and reset the MAC
address.
if instanceId is NULL
When virNetDevVPortProfileGetStatus() was called with instanceId =
NULL (which is the case for all DISASSOCIATE requests in 802.1Qbh) it
would log the following error:
Could not find netlink response with expected parameters
even though the disassociate had been successfully completely. Then,
due to the fortunate coincidence of status having been initialized to
0 and then not changed when the "failure" was encountered, it would
still return a status of 0 (PORT_VDP_RESPONSE_SUCCESS), so the caller
would assume a successful operation.
This would result in a spurious log message though, and would fill in
LastErrorMessage, so that the API would return that error if it
happened during cleanup from some other error. That, in turn, would
lead to an incorrect supposition that the response to the port profile
disassociate was the cause of the failure.
During debugging, I noticed that the VF in question usually had *no
uuid* associated with it (big surprise)by the time the disassociate
completed, so the solution is *not* to send the previous instanceId
down.
This patch fixes virNetDevVPortProfileGetStatus() to only check the
VF's uuid in the status if it was given an instanceId to check against
when originally called. Otherwise it only checks that the particular
VF is present (it will be).
This does cause a slight difference in behavior - rather than
returning with status unchanged (and thus always 0) it will actually
get the IFLA_PORT_RESPONSE. This could lead to revelation of error
conditions we were previously ignoring. Or not. So far "not".
Some of the protocol files already include handing of the missing int
types such as xdr_uint64_t, some don't. To fix it everywhere, move out
of the appropriate defines to the utils/virxdrdefs.h file and include
it where needed.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
As cgroup implementation only works on Linux, it does not
make much sense to include sys/mount.h if other requirements are
not met, such as HAVE_MNTENT_H and HAVE_GETMNTENT_R.
Also, it fixes build on OpenBSD that requires to include sys/param.h
along with sys/mount.h.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Most of the changes to the list of active and inactive PCI devices
happen in virHostdev, where they are properly logged.
virPCIDeviceDetach() and virPCIDeviceReattach(), however, change the
inactive list as well, so they should be logging similar messages.
Instead of misusing a const string to hold up runtime allocated
data, introduce new variable @hoststr and obey const correctness.
==6879== 15 bytes in 1 blocks are definitely lost in loss record 68 of 1,064
==6879== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6879== by 0xA7DDF97: vasprintf (in /lib64/libc-2.21.so)
==6879== by 0x552BBC6: virVasprintfInternal (virstring.c:493)
==6879== by 0x552BCDB: virAsprintfInternal (virstring.c:514)
==6879== by 0x54FA44C: virLogHostnameString (virlog.c:468)
==6879== by 0x54FAB0F: virLogVMessage (virlog.c:645)
==6879== by 0x54FA680: virLogMessage (virlog.c:531)
==6879== by 0x54FBBF4: virLogParseOutputs (virlog.c:1130)
==6879== by 0x11CB4F: daemonSetupLogging (libvirtd.c:685)
==6879== by 0x11E137: main (libvirtd.c:1297)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Once @hostname is printed into @hoststr we don't need it anymore.
==6879== 5 bytes in 1 blocks are definitely lost in loss record 10 of 1,064
==6879== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6879== by 0xA7ED599: strdup (in /lib64/libc-2.21.so)
==6879== by 0x552C126: virStrdup (virstring.c:726)
==6879== by 0x553B13E: virGetHostnameImpl (virutil.c:720)
==6879== by 0x553B1BF: virGetHostnameQuiet (virutil.c:741)
==6879== by 0x54FA3FD: virLogHostnameString (virlog.c:462)
==6879== by 0x54FAB0F: virLogVMessage (virlog.c:645)
==6879== by 0x54FA680: virLogMessage (virlog.c:531)
==6879== by 0x54FBBF4: virLogParseOutputs (virlog.c:1130)
==6879== by 0x11CB4F: daemonSetupLogging (libvirtd.c:685)
==6879== by 0x11E137: main (libvirtd.c:1297)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If no port number was provided for a storage pool libvirt defaults to
port 6789; however, librbd/librados already default to 6789 when no port
number is provided.
In the future Ceph will switch to a new port for the Ceph monitors since
port 6789 is already assigned to a different application by IANA.
Port 6789 is assigned to SMC-HTTPS and Ceph now has port 3300 assigned as
the 'Ceph monitor' port.
In this case it is the best solution to not hardcode any port number into
libvirt and let librados handle the connection.
Only if a user specifies a different port number we pass it down to librados,
otherwise we leave it blank.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
merge
Due to debug logs like this:
virPCIGetDeviceAddressFromSysfsLink:2432 : Attempting to resolve device path from device link '/sys/class/net/eth1/device/virtfn6'
logStrToLong_ui:2369 : Converted '0000:07:00.7' to unsigned int 0
logStrToLong_ui:2369 : Converted '07:00.7' to unsigned int 7
logStrToLong_ui:2369 : Converted '00.7' to unsigned int 0
logStrToLong_ui:2369 : Converted '7' to unsigned int 7
virPCIGetDeviceAddressFromSysfs:1947 : virPCIDeviceAddress 0000:07:00.7
virPCIGetVirtualFunctions:2554 : Found virtual function 7
printed *once for each SR-IOV Virtual Function* of a Physical Function
each time libvirt retrieved the list of VFs (so if the system has 128
VFs, there would be 900 lines of log for each call), the debug logs on
any system with a large number of VFs was dominated by "information"
that was possibly useful for debugging when the code was being
written, but is now useless for debugging of any problem on a running
system, and only serves to obscure the real useful information. This
overkill has no place in production code, so this patch removes it.
The previous error message just indicated that the desired response
couldn't be found, this patch tells what was desired, as well as
listing out the entire table that had been in the netlink response, to
give some kind of idea why it failed.
I noticed in a log file that we had failed to set a MAC address. The
log said which interface we were trying to set, but didn't give the
offending MAC address, which could have been useful in determining the
source of the problem. This patch modifies all three places in the
code that set MAC addresses to report the failed MAC as well as
interface.
The manpage for sysconf() suggest including unistd.h as the
function is declared there. Even though we are not hitting any
compile issues currently, let's include the correct header file
instead of relying on some hidden include chain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We want to eventually factor out the code dealing with device detaching
and reattaching, so that we can share it and make sure it's called eg.
when 'virsh nodedev-detach' is used.
For that to happen, it's important that the lists of active and inactive
PCI devices are updated every time a device changes its state.
Instead of passing NULL as the last argument of virPCIDeviceDetach() and
virPCIDeviceReattach(), pass the proper list so that it can be updated.
This replaces the virPCIKnownStubs string array that was used
internally for stub driver validation.
Advantages:
* possible values are well-defined
* typos in driver names will be detected at compile time
* avoids having several copies of the same string around
* no error checking required when setting / getting value
The names used mirror those in the
virDomainHostdevSubsysPCIBackendType enumeration.
This internal function supports, in theory, binding to a different
stub driver than the one the PCI device has been configured to use.
In practice, it is only ever called like
virPCIDeviceBindToStub(dev, dev->stubDriver);
which makes its second parameter redundant. Get rid of it, along
with the extra string copy required to support it.
This function can be used to retrieve the current locked memory
limit for a process, so that the setting can be later restored.
Add a configure check for getrlimit(), which we now use.
The prlimit() function allows both getting and setting limits for
a process; expose the same functionality in our wrapper.
Add the const modifier for new_limit, in accordance with the
prototype for prlimit().
Instead of replicating the information (domain, bus, slot, function)
inside the virPCIDevice structure, use the already-existing
virPCIDeviceAddress structure.
For users of the module, this means that the object returned by
virPCIDeviceGetAddress() can no longer be NULL and must no longer
be freed by the caller.
virCgroupNewMachine used to add the pidleader to the newly created
machine cgroup. Do not do this implicit anymore.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Firstly, there's a bug (or typo) in the only place where we call
this function: @multiqueue is set whenever @tapfdSize is greater
than zero, while in fact the condition should have been 'greater
than one'.
Then, secondly, since the condition depends on just one
variable, that we are even passing down to the function, we can
move the condition into the function and drop useless argument.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Some older systems, e.g. RHEL-6 do not have IFF_MULTI_QUEUE flag
which we use to enable multiqueue feature. Therefore one gets the
following compile error there:
CC util/libvirt_util_la-virnetdevmacvlan.lo
util/virnetdevmacvlan.c: In function 'virNetDevMacVLanTapSetup':
util/virnetdevmacvlan.c:338: error: 'IFF_MULTI_QUEUE' undeclared (first use in this function)
util/virnetdevmacvlan.c:338: error: (Each undeclared identifier is reported only once
util/virnetdevmacvlan.c:338: error: for each function it appears in.)
make[3]: *** [util/libvirt_util_la-virnetdevmacvlan.lo] Error 1
So, whenever user wants us to enable the feature on such systems,
we will just throw a runtime error instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit id '56e2171c6' removed a variable from the argument list, but
neglected to update the ATTRIBUTE_NONNULL values, so when commit id
'08da97bfb' added a couple of arguments, the values were off.
Always return LLONG_MAX even on 32 bit systems. The limitation
originates from our use of "unsigned long" in several APIs. The internal
data type is unsigned long long. Make the test suite deterministic by
removing the architecture difference.
Flaw was introduced in 645881139b where
I've added a test that uses too large numbers.
For the multiqueue on macvtaps we are going to need to open
the device multiple times. Currently, this is not supported.
Rework the function, so that upper layers can be reworked too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Like we are doing for TUN/TAP devices, we should do the same for
macvtaps. Although, it's not as critical as in that case, we
should do it for the consistency.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For the multiqueue on macvtaps we are going to need to open
the device multiple times. Currently, this is not supported.
Rework the function, so that upper layers can be reworked too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For the multiqueue on macvtaps we are going to need to open
the device multiple times. Currently, this is not supported.
Rework the function, so that upper layers can be reworked too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There are few outdated things. Firstly, we don't need to undergo
the torture of fopen, fscanf and fclose just to get the interface
index when we have nice wrapper over that: virNetDevGetIndex.
Secondly, we don't need to have statically allocated buffer for
the path we are opening.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So yet again one of integer arguments that we use as a boolean.
Since the argument count of the function is unbearably long
enough, lets turn those booleans into flags.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On the very first log message we send to any output, we include
the libvirt version number and package string. In some bug reports
we have been given libvirtd.log files that came from a different
host than the corresponding /var/log/libvirt/qemu log files. So
extend the initial log message to include the hostname too.
eg on first log message we would now see:
$ libvirtd
2015-12-04 17:35:36.610+0000: 20917: info : libvirt version: 1.3.0
2015-12-04 17:35:36.610+0000: 20917: info : hostname: dhcp-1-180.lcy.redhat.com
2015-12-04 17:35:36.610+0000: 20917: error : qemuMonitorIO:687 : internal error: End of file from monitor
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The log file descriptor associated with the virRotatingFile
struct should be marked close-on-exec, as even when virtlogd
re-exec's itself it expect to open the log file fresh. It
does not need to preserve the logfile handles, only the network
client FDs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 0f7436ca54 "network: wait for DAD to finish for bridge IPv6 addresses"
results in:
CC util/libvirt_util_la-virnetdevmacvlan.lo
util/virnetdev.c: In function 'virNetDevParseDadStatus':
util/virnetdev.c:1319:188: error: cast increases required alignment of target type [-Werror=cast-align]
util/virnetdev.c:1332:41: error: cast increases required alignment of target type [-Werror=cast-align]
util/virnetdev.c:1334:92: error: cast increases required alignment of target type [-Werror=cast-align]
cc1: all warnings being treated as errors
on at least ARM platforms.
The three macros involved (NLMSG_NEXT, IFA_RTA and RTA_NEXT) all appear to
correctly take care of alignment, therefore suppress Wcast-align around their
uses.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Maxim Perevedentsev <mperevedentsev@virtuozzo.com>
Cc: Laine Stump <laine@laine.org>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Cc: Jim Fehlig <jfehlig@suse.com>
Our domain_conf.* files are big enough. Not only they contain XML
parsing code, but they served as a storage of all functions whose
name is virDomain prefixed. This is just wrong as it gathers not
related functions (and modules) into one big file which is then
harder to maintain. Split virDomainObjList module into a separate
file called virdomainobjlist.[ch].
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If for some reason there is an existing log file, that is larger then
max length of log file, we need to rollover that file immediately.
Trying to figure out how much data we could write will resolve in
overflow of unsigned variable 'towrite' and this leads to segfault.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
As we need to provide support for URI aliases in libvirt-admin as well, URI
alias matching needs to be internally visible. Since
virConnectOpenResolveURIAlias does have a compatible signature, it could be
easily reused by libvirt-admin. This patch moves URI alias matching to util,
renaming it accordingly.
virConnectGetConfig and virConnectGetConfigPath were static libvirt
methods, merely because there hasn't been any need for having them
internally exported yet. Since libvirt-admin also needs to reference
its config file, 'xGetConfig' should be exported.
Besides moving, this patch also renames the methods accordingly,
as they are libvirt config specific.
Machine name escaping follows the same rules as serice name escape,
except that '.' and '-' must not be escaped in machine names, due
to a bug in systemd-machined.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Copy the virtlockd codebase across to form the initial virlogd
code. Simple search & replace of s/lock/log/ and gut the remote
protocol & dispatcher. This gives us a daemon that starts up
and listens for connections, but does nothing with them.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add virRotatingFileReader and virRotatingFileWriter objects
which allow reading & writing from/to files with automation
rotation to N backup files when a size limit is reached. This
is useful for guest logging when a guaranteed finite size
limit is required. Use of external tools like logrotate is
inadequate since it leaves the possibility for guest to DOS
the host in between invokations of logrotate.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
According to the documentation, CreateMachine accepts only 7bit ASCII
characters in the machinename parameter, so let's make sure we can start
machines with unicode names with systemd. We already have a function
for that, we just forgot to use it.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1062943
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
A PCI device may have the capability to setup virtual functions (VFs)
but have them currently all disabled. Prior to this patch, if that was
the case the the node device XML for the device wouldn't report any
virtual_functions capability.
With this patch, if a file called "sriov_totalvfs" is found in the
device's sysfs directory, its contents will be interpreted as a
decimal number, and that value will be reported as "maxCount" in a
capability element of the device's XML, e.g.:
<capability type='virtual_functions' maxCount='7'/>
This will be reported regardless of whether or not any VFs are
currently enabled for the device.
NB: sriov_numvfs (the number of VFs currently active) is also
available in sysfs, but that value is implied by the number of items
in the list that is inside the capability element, so there is no
reason to explicitly provide it as an attribute.
sriov_totalvfs and sriov_numvfs are available in kernels at least as far
back as the 2.6.32 that is in RHEL6.7, but in the case that they
simply aren't there, libvirt will behave as it did prior to this patch
- no maxCount will be displayed, and the virtual_functions capability
will be absent from the device's XML when 0 VFs are enabled.
Introduce a new helper function "virDiskNameParse" which extends
virDiskNameToIndex but handling both disk index and partition index.
Also rework virDiskNameToIndex to be based on virDiskNameParse.
A test is also added for this function testing both valid and
invalid disk names.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
The LXC driver uses virSetUIDGID() to become UID/GID 0.
It passes an empty groups list to virSetUIDGID()
to get rid of all supplementary groups from the host side.
But virSetUIDGID() calls setgroups() only if the supplied list
is larger than 0.
This leads to a container root with unrelated supplementary groups.
In most cases this issue is unoticed as libvirtd runs as UID/GID 0
without any supplementary groups.
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch addresses BZ 1244895.
Adapt the sysfs TPM command cancel path for the TPM driver that
does not use a miscdevice anymore since Linux 4.0. Support old
and new paths and check their availability.
Add a mockup for the test cases to avoid the testing for
availability of the cancel path.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Use virNetDevSetupControl instead of open coding using socket(AF_LOCAL...)
and clearing virIfreq.
By using virNetDevSetupControl, the socket is then opened using
AF_PACKET which requires being privileged (effectively root) in
order to complete successfully. Since that's now a requirement,
then the ioctl(SIOCETHTOOL) should not fail with EPERM, thus it
is removed from the filtered listed of failure codes.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since the SIOCETHTOOL ioctl only works for privileged daemons, if called
when not root, then virNetDevGetFeatures will VIR_DEBUG a message and
return 0 as if the functions were not available for the architecture.
This effectively returns an empty bitmap indicating no features available.
Introduced by commit id 'c9027d8f4'
Signed-off-by: John Ferlan <jferlan@redhat.com>
In commit id 'c9027d8f4' when updating the posted patch to generate
a bitmap instead of an array of named feature bits, adjustment of
the args was missed
Recently reverted commit id '6f2a0198' showed a need to add extra
comments when dealing with filtering of potential "non-issues".
Scanning through upstream patch postings indicates early on the
reasons for the filtering of specific ioctl failures were provided;
however, when converted from causing an error to VIR_DEBUG's the
reasons were missing. A future read/change of the code incorrectly
assumed they could or should be removed.
This reverts commit 6f2a0198e9.
This commit removed error reporting from virNetDevSendEthtoolIoctl
pushing responsibility onto the callers. This is wrong, however,
since virNetDevSendEthtoolIoctl calls virNetDevSetupControl
which can still report errors. So as a result virNetDevSendEthtoolIoctl
may or may not report errors depending on which bit of it fails, and as
a result callers now overwrite some errors.
It also introduced a regression causing unprivileged libvirtd to
spew error messages to the console due to inability to query the
NIC features, an error which was previously ignored.
virNetDevSetupControlFull:148 : Cannot open network interface control socket: Operation not permitted
virNetDevFeatureAvailable:3062 : Cannot get device wlp3s0 flags: Operation not permitted
virNetDevSetupControlFull:148 : Cannot open network interface control socket: Operation not permitted
virNetDevFeatureAvailable:3062 : Cannot get device wlp3s0 flags: Operation not permitted
virNetDevSetupControlFull:148 : Cannot open network interface control socket: Operation not permitted
virNetDevFeatureAvailable:3062 : Cannot get device wlp3s0 flags: Operation not permitted
virNetDevSetupControlFull:148 : Cannot open network interface control socket: Operation not permitted
virNetDevFeatureAvailable:3062 : Cannot get device wlp3s0 flags: Operation not permitted
Looking back at the original posting I see no explanation of why
thsi refactoring was needed, so reverting the clearly broken
error reporting logic looks like the best option.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit id '0f7436ca' added virNetDevWaitDadFinish using ATTRIBUTE_NONNULL
for both arguments, although one is a non-null argument. A Coverity build
balks at that.
Rather than "if (virNetDevFeatureAvailable(ifname, &cmd))" change the
success criteria to "if (virNetDevFeatureAvailable(ifname, &cmd) == 1)".
The called helper returns -1 on failure, 0 on not found, and 1 on found.
Thus a failure was setting bits.
Introduced by commit ac3ed20 which changed the helper's return
values without adjusting its callers
Signed-off-by: John Ferlan <jferlan@redhat.com>
VIR_DEBUG and VIR_WARN will automatically add a new line to the message,
having "\n" at the end or at the beginning of the message results in
empty lines.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This was originally set to 5 seconds, but times of 5.5 to 7 seconds
were experienced. Since it's an arbitrary number intended to prevent
an infinite hang, having it a bit too high won't hurt anything, and 20
seconds looks to be adequate (i.e. I think/hope we don't need to make
it tunable in libvirtd.conf)
If DAD not finished in 5 seconds, user will get an
unknown error like this:
# virsh net-start ipv6
error: Failed to start network ipv6
error: An error occurred, but the cause is unknown
Call virReportError to set an error.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Build on non-Linux fails because the virNetDevWaitDadFinish() stub
has unused parameters. Fix by adding appropriate ATTRIBUTE_UNUSED
for these parameters.
Pushing under build-breaker rule.
commit db488c79 assumed that dnsmasq would complete IPv6 DAD before
daemonizing, but in reality it doesn't wait, which creates problems
when libvirt's bridge driver sets the matching "dummy tap device" to
IFF_DOWN prior to DAD completing.
This patch waits for DAD completion by periodically polling the kernel
using netlink to check whether there are any IPv6 addresses assigned
to bridge which have a 'tentative' state (if there are any in this
state, then DAD hasn't yet finished). After DAD is finished, execution
continues. To avoid an endless hang in case something was wrong with
the kernel's DAD, we wait a maximum of 5 seconds.
Commit id '1c24cfe9' added error messages for virNumaSetPagePoolSize;
however, virNumaGetHugePageInfo also uses virNumaGetHugePageInfoPath
in order to build the path, but it never checked upon return if
the built path exists which could lead to an error message as follows:
$ virsh freepages 0 1
error: Failed to open file
'/sys/devices/system/node/node0/hugepages/hugepages-1kB/free_hugepages':
No such file or directory
Rather than add the same message for the other two callers, adjust
the virNumaGetHugePageInfoPath in order not only build the path, but
also check if the built path exists. If the path does not exist,
then generate the error message and return failure.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Commit id '1c24cfe9' added new checks and error messaes for failure
scenarios. Let's adjust those error messages to after the call to
virNumaGetHugePageInfoPath in order to provide a more specific error
message depending on node and page_size
After this patch:
# virsh allocpages --pagesize 2047 --pagecount 1 --cellno 0
error: operation failed: page size 2047 is not available on node 0
# virsh allocpages --pagesize 2047 --pagecount 1
error: operation failed: page size 2047 is not available
Signed-off-by: Luyao Huang <lhuang@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1265114
Refactor helper virNumaGetHugePageInfoPath to handle returning a directory
path when passed a page_size of 0 and suffix == NULL into a new helper
virNumaGetHugePageInfoDir which will only be called when a directory
path is expected to be returned. This solves the issue where the helper
was called with page_size == 0 expecting a file path in return, but
instead got a directory path and failed in virFileReadAll with:
error : virFileReadAll:1358 : Failed to read file
'/sys/devices/system/node/node0/hugepages/': Is a directory
Since virNumaGetPages API expects to return a directory by passing
page_size == 0 and suffix == NULL, it will now call the new helper.
Callers to virNumaGetHugePageInfoPath expect to return a file path
which could then be used in the call to virFileReadAll.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
We have macros for both positive and negative string matching.
Therefore there is no need to use !STREQ or !STRNEQ. At the same
time as we are dropping this, new syntax-check rule is
introduced to make sure we won't introduce it again.
Signed-off-by: Ishmanpreet Kaur Khera <khera.ishman@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Event implementations need to be registered before a connection to the
Hypervisor is opened, otherwise event handling can be impaired (e.g.
delayed messages). This fact is referenced in an e-mail [1], but should
also be noted in the documentation of the registration functions.
[1] https://www.redhat.com/archives/libvirt-users/2014-April/msg00011.html
After a successful creation of a directory, if some other call results
in returning a failure, let's remove the directory we created to
prevent another round trip or confusion in the caller. In particular, this
function can be called during a storage backend buildVol, so in order
to ensure that caller doesn't need to distinguish between failed create
or some other failure after create, just remove the directory we created.
Signed-off-by: John Ferlan <jferlan@redhat.com>
After a successful creation of a file, if some other call results
in returning a failure, let's unlink the file we created to prevent
another round trip or confusion in the caller. In particular, this
function can be called during a storage backend buildVol, so in order
to ensure that caller doesn't need to distinguish between failed create
or some other failure after create, just remove the volume we created.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The internal representation of a JSON array counts the items in
size_t. However, for some reason, when asking for the count it's
reported as int. Firstly, we need the function to return a signed
type as it's returning -1 on an error. But, not every system has
integer the same size as size_t. Therefore, lets return ssize_t.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As it turns out the caller in this case expects a return < 0 for failure
and to get/use "errno" rather than using the negative of returned status.
Again different than the create path.
If someone "deleted" a file from the pool without using virsh vol-delete,
then the unlink/rmdir would return an error (-1) and set errno to ENOENT.
The caller checks errno for ENOENT when determining whether to throw an
error message indicating the failure. Without the change, the error
message is:
error: Failed to delete vol $vol
error: cannot unlink file '/$pathto/$vol': Success
This patch thus allows the fork path to follow the non-fork path
where unlink/rmdir return -1 and errno.
Unlike create options, if the file to be removed is already in the
pool, then the uid/gid will come from the pool. If it's the same as the
currently running process, then just do the unlink/rmdir directly
rather than going through the fork processing unnecessarily
Similar to commit id '35847860', it's possible to attempt to create
a 'netfs' directory in an NFS root-squash environment which will cause
the 'vol-delete' command to fail. It's also possible error paths from
the 'vol-create' would result in an error to remove a created directory
if the permissions were incorrect (and disallowed root access).
Thus rename the virFileUnlink to be virFileRemove to match the C API
functionality, adjust the code to following using rmdir or unlink
depending on the path type, and then use/call it for the VIR_STORAGE_VOL_DIR
Commit id 'f1f68ca33' added code to remove the directory paths for
auto-generated sockets, but that code could be called before the
paths were created resulting in generating error messages from
virFileDeleteTree indicating that the file doesn't exist.
Rather than "enforce" all callers to make the non-NULL and existence
checks, modify the virFileDeleteTree API to silently ignore NULL on
input and non-existent directory trees.
Commit 35847860f6 Added the virFileUnlink function, but failed to add
a version for mingw build, causing the following error:
Cannot export virFileUnlink: symbol not defined
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Coverity claims it could be possible to call virDBusTypeStackFree with
*stack == NULL and although the two API's that call it don't appear to
allow that - I suppose it's better to be safe than sorry
In virFileNBDDeviceFindUnused if virFileNBDDeviceIsBusy returns 0,
then both branches jumped to cleanup, so just use ignore_value
since the function returns NULL or some memory and the caller
handles the error.
Before libvirt sets the MAC address of the physdev (the physical
ethernet device) linked to a macvtap passthrough device, it always
saves the previous MAC address to restore when the guest is finished
(following a "leave nothing behind" policy). For a long time it
accomplished the save/restore with a combination of
ioctl(SIOCGIFHWADDR) and ioctl(SIOCSIFHWADDR), but in commit cbfe38c
(first in libvirt 1.2.15) this was changed to use netlink RTM_GETLINK
and RTM_SETLINK commands sent to the Physical Function (PF) of any
device that was detected to be a Virtual Function (VF).
We later found out that this caused problems with any devices using
the Cisco enic driver (e.g. vmfex cards) because the enic driver
hasn't implemented the function that is called to gather the
information in the IFLA_VFINFO_LIST attribute of RTM_GETLINK
(ndo_get_vf_config() for those keeping score), so we would never get
back a useful response.
In an ideal world, all drivers would implement all functions, but it
turns out that in this case we can work around this omission without
any bad side effects - since all macvtap passthrough <interface>
definitions pointing to a physdev that uses the enic driver *must*
have a <virtualport type='802.1Qbh'>, and since no other type of
ethernet devices use 802.1Qbh, libvirt can change its behavior in this
case to use the old-style. ioctl(SIOC[GS]IFHWADDR). That's what this
patch does.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1257004
These functions were made static as a part of commit cbfe38c since
they were no longer called from outside virnetdev.c. We once again
need to call them from another file, so this patch makes them once
again public.
In an NFS root-squashed environment the 'vol-delete' command will fail to
'unlink' the target volume since it was created under a different uid:gid.
This code continues the concepts introduced in virFileOpenForked and
virDirCreate[NoFork] with respect to running the unlink command under
the uid/gid of the child. Unlike the other two, don't retry on EACCES
(that's why we're here doing this now).
This will only be seen when debugging, but in order to help determine
whether a virFileOpenForceOwnerMode failed during an NFS root-squash
volume/file creation, add an error message from the child.
commit 09778e09 switched from using ioctl(SIOCBRDELBR) for bridge
device deletion to using a netlink RTM_DELLINK message, which is the
more modern way to delete a bridge (and also doesn't require the
bridge to be ~IFF_UP to succeed). However, although older kernels
(e.g. 2.6.32, in RHEL6/CentOS6) support deleting *some* link types
with RTM_NEWLINK, they don't support deleting bridges, and there is no
compile-time way to figure this out.
This patch moves the body of the SIOCBRDELBR version of
virNetDevBridgeDelete() into a static function, calls the new function
from the original, and also calls the new function from the
RTM_DELLINK version if the RTM_DELLINK message generates an EOPNOTSUPP
error. Since RTM_DELLINK is done from the subordinate function
virNetlinkDelLink, which is also called for other purposes (deleting a
macvtap interface), a function pointer called "fallback" has been
added to the arglist of virNetlinkDelLink() - if that arg != NULL, the
provided function will be called when (and only when) RTM_DELLINK
fails with EOPNOTSUPP.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1252780 (part 2)
commit fc7b23db switched from using ioctl(SIOCBRADDBR) for bridge
creation to using a netlink RTM_NEWLINK message with IFLA_INFO_KIND =
"bridge", which is the more modern way to create a bridge. However,
although older kernels (e.g. 2.6.32, in RHEL6/CentOS6) support
creating *some* link types with RTM_NEWLINK, they don't support
creating bridges, and there is no compile-time way to figure this out
(since the "type" isn't an enum, but rather a character string).
This patch moves the body of the SIOCBRADDBR version of
virNetDevBridgeCreate() into a static function, calls the new function
from the original, and also calls the new function from the
RTM_NEWLINK version if the RTM_NEWLINK message generates an EOPNOTSUPP
error.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1252780
So far, the virProcessSetNamespaces() takes an array of FDs that
it tries to set namespace on. However, in the very next commit
this array may be sparse, having some -1's in it. Teach the
function to cope with that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The ACS checks are meaningless when using the more modern VFIO driver
for device assignment since VFIO has its own more complete and exact
checks, but I didn't realize that when I added support for VFIO. This
patch eliminates the ACS check when preparing PCI devices for
assignment if VFIO is being used.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=1256486
Commit 89c509a0 added getters for cgroup block device I/O throttling,
however stub versions of these functions have not matching function
prototypes that result in compilation fail on platforms not supporting
cgroup.
Fix build by correcting prototypes of the stubbed functions.
Pushing under build-breaker rule.
This function translates device paths to "major:minor " string, and all
virCgroupSetBlkioDevice* functions are modified to use it. It's a
cleanup with no functional change.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
That function takes string list and returns first string in that list
that starts with the @prefix parameter with that prefix being skipped as
the caller knows what it starts with (also for easier manipulation in
future).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Fix inconsistency between function description and actual
parameter name in virConfGetValue/virConfSetValue.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Ever since commit e44b0269, 64-bit mingw compilation fails with:
../../src/util/virprocess.c: In function 'virProcessGetPids':
../../src/util/virprocess.c:628:50: error: passing argument 4 of 'virStrToLong_i' from incompatible pointer type [-Werror=incompatible-pointer-types]
if (virStrToLong_i(ent->d_name, NULL, 10, &tmp_pid) < 0)
^
In file included from ../../src/util/virprocess.c:59:0:
../../src/util/virstring.h:53:5: note: expected 'int *' but argument is of type 'pid_t * {aka long long int *}'
int virStrToLong_i(char const *s,
^
cc1: all warnings being treated as errors
Although mingw won't be using this function, it does compile the
file, and the fix is relatively simple.
* src/util/virprocess.c (virProcessGetPids): Don't assume pid_t
fits in int.
Signed-off-by: Eric Blake <eblake@redhat.com>
If this function fails, the error message is reported only in
some cases (e.g. OOM), but in some it's not (e.g. duplicate key).
This fact is painful and we should either not report error at all
or report the error in all possible cases. I vote for the latter.
Unfortunately, since the key may be an arbitrary value (not
necessarily a string) we can't report it in the error message.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In 9190f0b0 we've tried to fix an OOM. And boy, was that fix
successful. But back then, the hash table implementation worked
strictly over string keys, which is not the case anymore. Hash
table have this function keyCopy() which returns void *.
Therefore a local variable that is temporarily holding the
intermediate return value from that function should be void *
too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In order to share as much virsh' logic as possible with upcomming
virt-admin client we need to split virsh logic into virsh specific and
client generic features.
Since majority of virsh methods should be generic enough to be used by
other clients, it's much easier to rename virsh specific data to virshX
than doing this vice versa. It moved generic virsh commands (including info
and opts structures) to generic module vsh.c.
Besides renaming methods and structures, this patch also involves introduction
of a client specific control structure being referenced as private data in the
original control structure, introduction of a new global vsh Initializer,
which currently doesn't do much, but there is a potential for added
functionality in the future.
Lastly it introduced client hooks which are especially necessary during
client connecting phase.
This fixes the crash described here:
https://www.redhat.com/archives/libvir-list/2015-August/msg00162.html
In short, we were calling ioctl(SIOCETHTOOL) pointing to a too-short
object that was a local on the stack, resulting in the memory past the
end of the object being overwritten. This was because the struct used
by the ETHTOOL_GFEATURES command of SIOCETHTOOL ends with a 0-length
array, but we were telling ethtool that it could use 2 elements on the
array.
The fix is to allocate the necessary memory with VIR_ALLOC_VAR(),
including the extra length needed for a 2 element array at the end.
This is no functional change. It's just that later in the series we
will need to pass class_id as an integer.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There is no guarantee that an enum start it mapped onto a value
of zero. However, we are guaranteed that enum items are
consecutive integers. Moreover, it's a pity to define an enum to
avoid using magical constants but then using them anyway.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch modifies virSocketAddrGetRange() to function properly when
the containing network/prefix of the address range isn't known, for
example in the case of the NAT range of a virtual network (since it is
a range of addresses on the *host*, not within the network itself). We
then take advantage of this new functionality to validate the NAT
range of a virtual network.
Extra test cases are also added to verify that virSocketAddrGetRange()
works properly in both positive and negative cases when the network
pointer is NULL.
This is the *real* fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=985653
Commits 1e334a and 48e8b9 had earlier been pushed as fixes for that
bug, but I had neglected to read the report carefully, so instead of
fixing validation for the NAT range, I had fixed validation for the
DHCP range. sigh.
Qemu reports physical size 0 for block devices. As 15fa84acbb
changed the behavior of qemuDomainGetBlockInfo to just query the monitor
this created a regression since we didn't report the size correctly any
more.
This patch adds code to refresh the physical size of a block device by
opening it and seeking to the end and uses it both in
qemuDomainGetBlockInfo and also in qemuDomainGetStatsOneBlock that was
broken since it was introduced in this respect.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1250982
The commit 7e72de4 didn't consider the hotplug scenarios. The patch addresses
the hotplug case whereby if atleast one of the pci function is owned by a
guest, the hotplug of other functions/devices in the same iommu group to the
same guest goes through successfully.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
So far qemu-nbd is run even if the nbd kernel module isn't loaded. This
leads to errors when the user starts his lxc container while libvirt
could easily load the nbd module automatically.
Commit id 'ac3ed2085' causes 'virsh nodedev-list --cap net' to fail
on any system without SYSFS_INFINIBAND_DIR (/sys/class/infiniband).
Rather than assume it's there and fail on the attempt to open the
non-existent directory, check if it's there - if not, return
success and move on. Also fix caller to check < 0 upon return.
As reported by Suren Hajyan <shajyan@redhat.com> from run of unit tests
Commit ac3ed20 breaks build on FreeBSD with:
CC util/libvirt_util_la-virnetdev.lo
util/virnetdev.c:2967:1: error: unused function 'virNetDevRDMAFeature' [-Werror,-Wunused-function]
virNetDevRDMAFeature(const char *ifname,
^
So hide virNetDevRDMAFeature function under the #ifdef 'SIOCETHTOOL'
and 'HAVE_STRUCT_IFREQ' section.
Pushed under the build breaker rule.
The scope name, even according to our docs is
"machine-$DRIVER\x2d$VMNAME.scope" virSystemdMakeScopeName would use the
resource partition name instead of "machine-" if it was specified thus
creating invalid scope paths.
This makes libvirt drop cgroups for a VM that uses custom resource
partition upon reconnecting since the detected scope name would not
match the expected name generated by virSystemdMakeScopeName.
The error is exposed by the following log entry:
debug : virCgroupValidateMachineGroup:302 : Name 'machine-qemu\x2dtestvm.scope' for controller 'cpu' does not match 'testvm', 'testvm.libvirt-qemu' or 'machine-test-qemu\x2dtestvm.scope'
for a "/machine/test" resource and "testvm" vm.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1238570
Adding functionality to libvirt that will allow
it query the interface for the availability of RDMA and
tx-udp_tnl-segmentation Offloading NIC capabilities
Here is an example of the feature XML definition:
<device>
<name>net_eth4_90_e2_ba_5e_a5_45</name>
<path>/sys/devices/pci0000:00/0000:00:03.0/0000:08:00.1/net/eth4</path>
<parent>pci_0000_08_00_1</parent>
<capability type='net'>
<interface>eth4</interface>
<address>90:e2:ba:5e:a5:45</address>
<link speed='10000' state='up'/>
<feature name='rx'/>
<feature name='tx'/>
<feature name='sg'/>
<feature name='tso'/>
<feature name='gso'/>
<feature name='gro'/>
<feature name='rxvlan'/>
<feature name='txvlan'/>
<feature name='rxhash'/>
<feature name='rdma'/>
<feature name='txudptnl'/>
<capability type='80203'/>
</capability>
</device>
virDomainMigrateFinish* APIs were unfortunately designed to return the
pointer to the domain on destination and NULL on error. This looks OK in
normal cases but the same API is also called when we know migration
failed and thus we expect Finish to return NULL even if it actually did
all it was supposed to do without any error. The call is defined to
return nonnull domain pointer over RPC, which means returning NULL will
always result in an error being send. If this was not in fact an error,
the API itself wouldn't set anything to the thread local virError, which
makes the RPC layer come up with it's own "Library function returned
error but did not set virError" error.
This is quite confusing and also hard to detect by the caller. This
patch adds a special error code which can be used to check that Finish
successfully aborted migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This is a self-locking wrapper around virHashTable. Only a limited set
of APIs are implemented now (the ones which are used in the following
patch) as more can be added on demand.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Optimize the virBitmap to array-of-char bitmap conversion by skipping
trailing zero bytes.
This also fixes a regression when requesting iothread information from a
live VM since after commit 825df8c315 the
bitmap returned from virProcessGetAffinity is too big to be formatted
properly via RPC. A user would get the following error:
error: Unable to get domain IOThreads information
error: Unable to encode message payload
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1238589
Avoid a false positive since Coverity find a path in virResizeN which
could return 0 prior to the allocation of memory and thus flags a
possible NULL dereference. Instead allocate the output buffer based
on 'nparams' and only fill it partially if need be - shouldn't be too
much a waste of space. Quicker than multiple VIR_RESIZE_N calls or
two loops of STREQ's sandwiched around a single VIR_ALLOC_N using
'n' matches from a first loop to generate the 'n' addresses to return
Signed-off-by: John Ferlan <jferlan@redhat.com>
Convert virPCIDriverDir to return the buffer allocated (or not) and make the
appropriate check in the caller.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Convert virPCIDriverFile to return the buffer allocated (or not) and make the
appropriate check in the caller.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Convert virPCIFile to return the buffer allocated (or not) and make the
appropriate check in the caller.
Signed-off-by: John Ferlan <jferlan@redhat.com>
We already enable the parser option to detect invalid UTF-8, but
didn't test it. Also, JSON states that behavior of an object
with a duplicated key is undefined; we chose to reject it, but
were not testing it.
With the enhanced tests in place, we can simplify yajl2
initialization by relying on parser defaults being sane.
* src/util/virjson.c (virJSONValueFromString): Simplify.
* tests/jsontest.c (mymain): Test more bad usage.
Signed-off-by: Eric Blake <eblake@redhat.com>
Since older yajl ignores trailing garbage, a client can cause
problems by intentionally ending the wrapper array early. Since
we already track nesting, it's not too much harder to reject
invalid nesting pops.
* src/util/virjson. (_virJSONParser): Add field.
(virJSONValueFromString): Set witness.
(virJSONParserHandleEndArray): Use it to catch abuse.
* tests/jsontest.c (mymain): Test it.
Signed-off-by: Eric Blake <eblake@redhat.com>