The fix for CVE-2018-6764 introduced a potential deadlock scenario
that gets triggered by the NSS module when virGetHostname() calls
getaddrinfo to resolve the hostname:
#0 0x00007f6e714b57e7 in futex_wait
#1 futex_wait_simple
#2 __pthread_once_slow
#3 0x00007f6e71d16e7d in virOnce
#4 0x00007f6e71d0997c in virLogInitialize
#5 0x00007f6e71d0a09a in virLogVMessage
#6 0x00007f6e71d09ffd in virLogMessage
#7 0x00007f6e71d0db22 in virObjectNew
#8 0x00007f6e71d0dbf1 in virObjectLockableNew
#9 0x00007f6e71d0d3e5 in virMacMapNew
#10 0x00007f6e71cdc50a in findLease
#11 0x00007f6e71cdcc56 in _nss_libvirt_gethostbyname4_r
#12 0x00007f6e724631fc in gaih_inet
#13 0x00007f6e72464697 in __GI_getaddrinfo
#14 0x00007f6e71d19e81 in virGetHostnameImpl
#15 0x00007f6e71d1a057 in virGetHostnameQuiet
#16 0x00007f6e71d09936 in virLogOnceInit
#17 0x00007f6e71d09952 in virLogOnce
#18 0x00007f6e714b5829 in __pthread_once_slow
#19 0x00007f6e71d16e7d in virOnce
#20 0x00007f6e71d0997c in virLogInitialize
#21 0x00007f6e71d0a09a in virLogVMessage
#22 0x00007f6e71d09ffd in virLogMessage
#23 0x00007f6e71d0db22 in virObjectNew
#24 0x00007f6e71d0dbf1 in virObjectLockableNew
#25 0x00007f6e71d0d3e5 in virMacMapNew
#26 0x00007f6e71cdc50a in findLease
#27 0x00007f6e71cdc839 in _nss_libvirt_gethostbyname3_r
#28 0x00007f6e71cdc724 in _nss_libvirt_gethostbyname2_r
#29 0x00007f6e7248f72f in __gethostbyname2_r
#30 0x00007f6e7248f494 in gethostbyname2
#31 0x000056348c30c36d in hosts_keys
#32 0x000056348c30b7d2 in main
Fortunately the extra stuff virGetHostname does is totally irrelevant to
the needs of the logging code, so we can just inline a call to the
native hostname() syscall directly.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The floppy command formatting is special-cased since it does not
directly translate to a single '-device' argument.
Move the code from qemuBuildDiskDriveCommandLine to a new helper
function so that all the related code is together.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
We forgot to free alloced mem when failed to
dup ifname or macaddr.
Also use VIR_STEAL_PTR to simplify codes.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Some of function comments don't have the right named parameters
and others are not consistent with the description alignment.
This patch fixes this.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Several PCI controllers have the same options, and thus
can be handled together.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This is a hard error, and should be handled as such.
Introduced in 2461476022.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
The Win32 symbol export file format can't do wildcards, so none of
the 'xdr_*' symbols are exported from the libvirt DLL. This doesn't
matter generally since the RPC client is built into the DLL and we
don't build libvirtd on Win32. The virnetmessagetest, however, does
require xdr_virNetMessageError to be exported, so just do a hack for
that.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Commit id 'ce7ae55e' added support for the lockd admin socket, but
forgot to add the socket to the make and spec files for installation
purposes.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Commit id '85d45ff0' added support for the logd admin socket, but
forgot to add the socket to the make and spec files for installation
purposes.
NB: Includes breaking up the long %systemd_ lists across multiple lines
for ease of reading
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Loadable drivers must never depend on each other. Over time some usage
mistakenly crept in for the storage and network drivers, but now this is
eliminated the syntax-check rules can enforce this separation once more.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Undefined symbols are a bad thing in general because they can get
resolved in unexpected ways at runtime if multiple sources provide the
same symbol name. For example both glibc and libtirpc may provide XDR
symbols and we want to ensure that we resolve to libtirpc if that's what
we originally built against.
The toolchain maintainers thus strongly recommend that all applications
use the '-z defs' linker flag to prevent undefined symbols. This is
shortly becoming part of the default linker flags for RPMs. As an added
benefit this aligns Linux builds with Windows builds, where the linker
has never permitted undefined symbols.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Dynamic loadable modules all need a common set of linker flags
-module -avoid-version $(AM_LDFLAGS)
Bundle those up into a $(AM_LDFLAGS_MOD) to avoid repetition.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The dlopened modules we currently build all use various symbols from
libvirt.so, but don't actually link to it. They rely on the libvirtd
daemon re-exporting the libvirt.so symbols. This means that at the
time the modules are linked, they contain a huge number of undefined
symbols. It also means that these undefined symbols are not versioned,
so despite us providing a LIBVIRT_PRIVATE_XXXX version that
intentionally changes on every release, the loadable modules could
actually be loaded into any libvirtd regardless of version.
This change explicitly links all modules against libvirt.so so
that they don't rely on the re-export behave and can be fully resolved
at build time. This will give us a stronger guarantee modules will
actually be loadable at runtime and that we're using modules from the
matched build.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The storagePoolLookupByTargetPath() method in the storage driver is used
by the QEMU driver during block migration. If there's a valid use case
for this in the QEMU driver, then external apps likely have similar
needs. Exposing it in the public API removes the direct dependancy from
the QEMU driver to the storage driver.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The virStorageTranslateDiskSourcePool method modifies a virDomainDiskDef
to resolve any storage pool reference. For some reason this was added
into the storage driver code, despite working entirely in terms of the
public APIs. Move it into the domain conf file and rename it to match the
object it modifies.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The networkDnsmasqConfContents() method is only used by the test suite
and that's only built with WITH_NETWORK is set. So there is no longer
any reason to conditionalize the declaration of this method.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Currently the QEMU driver will call directly into the network driver
impl to modify resolve the atual type of NICs with type=network. It
has todo this before it has allocated the actual NIC. This introduces
a callback system to allow us to decouple the QEMU driver from the
network driver.
This is a short term step, as it ought to be possible to achieve the
same end goal by simply querying XML via the public network API. The
QEMU code in question though, has no virConnectPtr conveniently
available at this time.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The QEMU driver calls into the network driver to get the first IP
address of the network. This information is readily available via the
formal public API by fetching the XML doc and then parsing it.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Currently the QEMU driver will call directly into the network driver
impl to modify network device bandwidth for interfaces with
type=network. This introduces a callback system to allow us to decouple
the QEMU driver from the network driver.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Currently virt drivers will call directly into the network driver impl
to allocate domain interface devices where type=network. This introduces
a callback system to allow us to decouple the virt drivers from the
network driver.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Rather than static linking in various of the helper libraries to
libvirt_lxc, just link against the main libvirt.so. This is more memory
and time efficient because it will already be cached in memory and
sharable between processes.
CAPNG flags need adding because the LXC code directly calls various
libcapng APIs and no longer inherits the CAPNG flags via the statically
linked .a libs.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The libvirt_driver_remote.la static library is linked into the
libvirt.so dynamic library, providing both the generic RPC layer code
and the remote protocol client driver. The libvirtd daemon the itself
links to libvirt_driver_remote.la, in order to get access to the generic
RPC layer code and the XDR functions for the remote driver. This means
we get multiple copies of the same code in libvirtd, one direct and one
indirect via libvirt.so. The same mistake affects the lockd plugin.
The libvirtd daemon should instead just link aganist the generic RPC
layer code that's in libvirt.so. This is easily doable if we add exports
for the few symbols we've previously missed, and wildcard export xdr_*
to expose the auto-generated XDR marshallers.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The QEMU driver loadable module needs to be able to resolve all ELF
symbols it references against libvirt.so. Some of its symbols can only
be resolved against the storage_driver.so loadable module which creates
a hard dependancy between them. By moving the storage file backend
framework into the util directory, this gets included directly in the
libvirt.so library. The actual backend implementations are still done as
loadable modules, so this doesn't re-add deps on gluster libraries.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The storage driver backends are serving the public storage pools API,
while the storage file backends are serving the internal QEMU driver and
/ or libvirt utility code.
To prep for moving this storage file backend framework into the utility
code, split out the backend definitions.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
At later point it might not be possible or even safe to use getaddrinfo(). It
can in turn result in a load of NSS module.
Notably, on a LXC container startup we may find ourselves with the guest
filesystem already having replaced the host one. Loading a NSS module
from the guest tree would allow a malicous guest to escape the
confinement of its container environment because libvirt will not yet
have locked it down.
Refreshing the halted state can cause VM performance issues. Since
s390 is currently the only architecture with a known interest in
the halted state, we're avoiding to call QEMU on other platforms.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Since it may be possible that the state is unknown in some cases we
should store it as a tristate so that other code using it can determine
whether the state was updated.
Don't extract the halted state into a separate array, but rater access
the vcpu structures directly. We still need to call the vcpu helper to
retrieve the performance statistics though.
NUMA distances are part of guest ABI (guests can read it
directly!) and therefore as such shouldn't change throughout the
lifetime of domain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The virt-aa-helper fails to parse the xmls with the memory/cpu
hotplug features or user assigned aliases. Set the features in
xmlopt->config for the parsing to succeed.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
https://bugzilla.redhat.com/show_bug.cgi?id=916061
If the QEMU version running is new enough (based on the DUMP_COMPLETED
event), then we can add a 'detach' boolean to the dump-guest-memory
command in order to tell QEMU to run in a thread. This ensures that we
don't lock out other commands while the potentially long running dump
memory is completed.
This allows the usage of a qemuDumpWaitForCompletion which will wait
for the event while the qemuDomainGetJobInfoDumpStats can be used via
qemuDomainGetJobInfo in order to query QEMU to determine how far along
the job is.
Now that we have a true async job, we'll only set the dump_memory_only
flag only when @detach=false; otherwise, we note that the job is a
for stats dump this allows the opposite end for job info to determine
what to copy.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Add an API to allow fetching the memory only dump statistics
for a job via the qemuDomainGetJobInfo API.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Add the query-dump API's in order to allow the dump-guest-memory
to be used to monitor progress. This will use the dump stats
extraction helper to fill a return buffer.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Handle a DUMP_COMPLETED event processing the status, stats, and
error string. Use the @status in order to copy the error that
was generated whilst processing the @stats data. If an error was
provided by QEMU, then use that instead.
If there's no async job, we can just ignore the data.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
The event will be fired when the domain memory only dump completes.
Fill in a return buffer to store/pass along the dump statistics that
will be eventually shared by a query-dump command. Also pass along
the status of the filling and any possible error received.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Define the qemuMonitorDumpStats as a new job JobStatsType to handle
being able to get memory dump statistics. For now do nothing with
the new TYPE_MEMDUMP.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Add a TYPE_SAVEDUMP so that when coalescing stats for a save or
dump we don't needlessly try to get the mirror stats for a migration.
Other conditions can still use MIGRATION and SAVEDUMP interchangably
including usage of the @migStats field to fetch/store the data.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Convert the stats field in _qemuDomainJobInfo to be a union. This
will allow for the collection of various different types of stats
in the same field.
When starting the async job that will end up being used for stats,
set the @statsType value appropriately. The @mirrorStats are
special and are used with stats.mig in order to generate the
returned job stats for a migration.
Using the NONE should avoid the possibility that some random
async job would try to return stats for migration even though
a migration is not in progress.
For now a migration and a save job will use the same statsType
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Note the fact that the unused portion of the last element in the bitmap
needs to be cleared, since we use functions which process only full-size
elements and don't really deal with individual bits.
The function only reduces the size of the bitmap thus we can use the
appropriate shrinking function which also does not have any return
value.
Since virBitmapShrink now does not return any value callers need to be
fixed as well.
The virBitmap code uses VIR_RESIZE_N to do quadratic scaling, which
means that along with the number of requested map elements we also need
to keep the number of actually allocated elements for the scaling
algorithm to work properly.
The shrinking code did not fix 'map_alloc' thus virResizeN might
actually not expand the bitmap properly after called on a previously
shrunk bitmap.
'max_bit' is misleading as the value is set to the first invalid bit
as it's used as the number of bits in the bitmap. Rename it to a more
descriptive name.
Since one of the things in capabilities (info from resctrl updated with data
about caches) can be change on the system by remounting the /sys/fs/resctrl with
different options, the capabilities need to be refreshed. There is a better fix
in the works, but it's going to be way bigger than this (hence the XXX note
there), so for the time being let's workaround this. And in order not to slow
down the domain starting, only get the capabilities if there are any cachetunes.
Relates-to: https://bugzilla.redhat.com/show_bug.cgi?id=1540780
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Just in case someone re-mounted /sys/fs/resctrl with different mount
options (cdp), add a check here.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1540780
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Add and use qemuProcessEventFree for freeing qemuProcessEvents. This
is less error-prone as the compiler can help us make sure that for
every new enumeration value of qemuProcessEventType the
qemuProcessEventFree function has to be adapted.
All process*Event functions are *only* called by
qemuProcessHandleEvent and this function does the freeing by itself
with qemuProcessEventFree. This means that an explicit freeing of
processEvent->data is no longer required in each process*Event
handler.
The effectiveness of this change is also demonstrated by the fact that
it fixes a memory leak of the panic info data in
qemuProcessHandleGuestPanic.
Reported-by: Wang Dong <dongdwdw@linux.vnet.ibm.com>
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Use the return value of virObjectRef directly. This way, it's easier
for another reader to identify the reason why the additional reference
is required.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
No sense in calling ServiceToggle for all nservices during
ServiceDispose since ServerClose calls ServiceClose which
removes the IOCallback that's being toggled via ServiceToggle.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Add the DUMP_COMPLETED check to the capabilities. This is the
mechanism used to determine whether the dump-guest-memory command
can support the "-detach" option and thus be able to wait on the
event and allow for a query of the progress of the dump.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Extract out the parts of qemuDomainGetJobStatsInternal that get
the migration stats. We're about to add the ability to get just
dump information.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
commit 77a12987a4 changed the "virDomainChrSourceDef source" inside
virDomainChrDef to "virDomainChrSourceDefPtr source", and started
allocating source inside virDomainChrDefNew(), but vboxDumpSerial()
was allocating a virDomainChrDef with a simple VIR_ALLOC() (i.e. never
calling virDomainChrDefNew()), so source was never initialized,
leading to a SEGV any time a serial port was present. The same problem
was created in vboxDumpParallel().
This patch changes vboxDumpSerial() and vboxDumpParallel() to use
virDomainChrDefNew() instead of VIR_ALLOC(), and changes both of those
functions to return an error if virDomainChrDef() (or any other
allocation) fails.
This resolves: https://bugzilla.redhat.com/1536649
Remove the unnecessary check as since commit id '46a811db07' it is
not possible to add or alter a filter using the same name, but with
a different UUID.
NB: It's not required to provide a UUID for a filter by name, but
if one is provided, then it must match the existing. If not provided,
then one is generated during ParseXML processing.
Reviewed-by: Laine Stump <laine@laine.org>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Use a switch statement instead of if-else-if statements. Move the
command line building of the iothread attribute into the common path
as the SCSI controller attributes are already validated.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Move the SATA controller check from command line building to
controller def validation. This includes copying the SATA
skip check found in qemuBuildSkipController.
Move the qemuCaps checks over to qemuDomainControllerDefValidatePCI.
This requires two test updates in order to set the correct capability
bit for an xml2xml test as well as setting up the similar capability
for the pseries memlocktest.
Excluding the qemuCaps checks, move the remainder of the checks
that validate whether the PCI definition is valid or not into
qemuDomainControllerDefValidatePCI.
Similar to the checking the modelName vs. NAME_NONE, let's make the
ModelNameTypeToString check more generic too within the checking done
in controller validation (with the same ignore certain models.
NB: We need to keep the ModelNameTypeToString fetch in command line
validation since we use it, but at least we can assume it returns
something valid now.
Move the various modelName == NAME_NONE from the command line
generation into domain controller validation. Also rather than
have multiple cases with the same check, let's make the code
more generic, but also note that it was the modelName option
that caused the failure. We also have to be sure not to check
the PCI models that we don't care about.
For the remaining checks in command line building, we can use
the field name in the error message to be more specific about
what causes the failure.
We format the 'chassis' and 'port' properties on the QEMU command
line later on, so we should make sure they've been set.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Move PCI validation checks out of qemu_command into the proper
qemu_domain validation helper.
Since there's a lot to move, we'll start slow by replicating the
pcie-root and pci-root avoidance from qemuBuildSkipController and
the first switch found in qemuBuildControllerDevStr.
Move SCSI validation from qemu_command into qemu_domain.
Rename/reorder the args in qemuCheckSCSIControllerIOThreads
to match the caller as well as fixing up the comments to
remove the previously removed qemuCaps arg.
Modify the SCSI controller switch during command line building
to account for all virDomainControllerModelSCSI types rather
than using the default label.
Move the checks that various attributes are not set on any controller
other than SCSI controller using virtio-scsi model into the common
controller validate checks.
Some of the other functions depend on the fact that unused bits and longs are
always zero and it's less error-prone to clear it than fix the other functions.
It's enough to zero out one piece of the map since we're calling realloc() to
get rid of the rest (and updating map_len).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1540817
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The position of various parameters changes depending on the WITH_GNUTLS
macro.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Since we annotate the APIs are having non-NULL parameters, we can remove
the checks for NULL in the code too.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1461214
Since fec8f9c49a we try to use predictable file names for
'memory-backend-file' objects. But that made us provide full path
to qemu when hot plugging the object while previously we provided
merely a directory. But this makes qemu behave differently. If
qemu sees a path terminated with a directory it calls mkstemp()
and unlinks the file immediately. But if it sees full path it
just calls open(path, O_CREAT ..); and never unlinks the file.
Therefore it's up to libvirt to unlink the file and not leave it
behind.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In my first approach in 4b480d1076 I overlooked the comment in
qemuMigrationRunIncoming stating that during actual migration the
qemuMigrationRunIncoming does not wait until the migration is complete
but rather offloads that to the Finish phase of migration.
This means that during actual migration qemuProcessRefreshState was
called prior to qemu actually transferring the full state and thus the
queries did not get the correct information. The approach worked only
for restore, where we wait for the migration to finish during qemu
startup.
Fix the issue by calling qemuProcessRefreshState both from
qemuProcessStart if there's no incomming migration and from
qemuMigrationFinish so that the code actually works as expected.
In 2074ef6cd4 and c56cdf259 (and friends) we've added two
attributes to virtio NICs: rx_queue_size and tx_queue_size.
However, sysadmins might want to set these on per-host basis but
don't necessarily have an access to domain XML (e.g. because they
are generated by some other app). So let's expose them under
qemu.conf (the settings from domain XML still take precedence as
they are more specific ones).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Commit id 'bc444666f' added a check if the returned data
buffer had an error, but failed to adjust the event from
VIR_DOMAIN_BLOCK_JOB_COMPLETED to VIR_DOMAIN_BLOCK_JOB_FAILED
in order to propagate an error such as "File descriptor in bad
state" that may be returned from QEMU when both @offset and
@len are set to 0 such as is the case when performing an async
block job read on a read only filesystem.
Signed-off-by: Jie Wang <wangjie88@huawei.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Use the DEVICE_MISSING error code when helpers fail to find
the requested device. This makes it easier for consumers to
key off the error code rather than the error message.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Modify OPERATION_FAILED and INTERNAL_ERROR error codes to
use DEVICE_MISSING instead for failures associated with the
inability to find the device. This makes it easier for consumers
to key off the error code rather than the error message.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Add new error code to be able to allow consumers (such as Nova) to be
able to key of a specific error code rather than needing to search the
error message."
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Now that we can open connections to the secondary drivers on demand,
there is no need to pass a virConnectPtr into all the backend
functions.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Instead of passing around a virConnectPtr object, just open a connection
to the nodedev driver at time of use. Opening connections on demand will
be beneficial when the nodedev driver is in a separate daemon. It also
solves the problem that a number of callers just pass in a NULL
connection today which prevents nodedev lookup working at all.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Instead of passing around a virConnectPtr object, just open a connection
to the secret driver at time of use. Opening connections on demand will
be beneficial when the secret driver is in a separate daemon. It also
solves the problem that a number of callers just pass in a NULL
connection today which prevents secret lookup working at all.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Various parts of libvirt will want to open connections to secondary
drivers. The right URI to use will depend on the context, so rather than
duplicating that logic in various places, use some helper APIs. This
will also make it easier for us to later pre-open/cache connections to
avoid repeated opening & closing the same connectiong during autostart.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Allow the possibility of opening a connection to only the secret
driver, by defining secret:///system and secret:///session URIs
and registering a fake hypervisor driver that supports them.
The hypervisor drivers can now directly open a secret driver
connection at time of need, instead of having to pass around a
virConnectPtr through many functions. This will facilitate the later
change to support separate daemons for each driver.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Allow the possibility of opening a connection to only the nodedev
driver, by defining nodedev:///system and nodedev:///session URIs
and registering a fake hypervisor driver that supports them.
The hypervisor drivers can now directly open a nodedev driver
connection at time of need, instead of having to pass around a
virConnectPtr through many functions. This will facilitate the later
change to support separate daemons for each driver.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Allow the possibility of opening a connection to only the interface
driver, by defining interface:///system and interface:///session URIs
and registering a fake hypervisor driver that supports them.
The hypervisor drivers can now directly open a interface driver
connection at time of need, instead of having to pass around a
virConnectPtr through many functions. This will facilitate the later
change to support separate daemons for each driver.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Allow the possibility of opening a connection to only the storage
driver, by defining a nwfilter:///system URI and registering a fake
hypervisor driver that supports it.
The hypervisor drivers can now directly open a nwfilter driver
connection at time of need, instead of having to pass around a
virConnectPtr through many functions. This will facilitate the later
change to support separate daemons for each driver.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Allow the possibility of opening a connection to only the network
driver, by defining network:///system and network:///session URIs
and registering a fake hypervisor driver that supports them.
The hypervisor drivers can now directly open a network driver
connection at time of need, instead of having to pass around a
virConnectPtr through many functions. This will facilitate the later
change to support separate daemons for each driver.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
By convention the last thing in the driver.c files should be the driver
callback table and function to register it.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Allow the possibility of opening a connection to only the storage
driver, by defining storage:///system and storage:///session URIs
and registering a fake hypervisor driver that supports them.
The hypervisor drivers can now directly open a storage driver
connection at time of need, instead of having to pass around a
virConnectPtr through many functions. This will facilitate the later
change to support separate daemons for each driver.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
By convention the last thing in the driver.c files should be the driver
callback table and function to register it.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Some platforms/toolchains will complain about casting
sockaddr_storage to sockaddr_un because it breaks strict
aliasing rule
../../src/util/virutil.c: In function 'virGetUNIXSocketPath':
../../src/util/virutil.c:2005: error: dereferencing pointer 'un' does break strict-aliasing rules [-Wstrict-aliasing]
Change the code to use a union, in the same way that the
virsocketaddr.h header does.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Now that the controller model is updated during post parse callback,
this code no longer needs to fetch the model based on the capabilities
and can just return the model directly if the controller is found.
Removal of @qemuCaps cascades through various callers which are now
updated to not pass the capabilities.
Now that post parse processing handles setting the SCSI controller
model, there's no need to call qemuDomainGetSCSIControllerModel to
get the "default controller" when building the command line controller
string or when assigning the spaprvio address since the controller
model value will already be filled in.
When an implicit controller is added, the model is defined as -1
(IOW: undefined). So, if an implicit SCSI controller was added,
can set the model to the default value if the underlying hypervisor
supports it.
During post parse processing, let's force setting the controller
model to default value if not already set for defined controllers
(e.g. the non implicit ones).
If we're going to add a controller to the domain, let's set the
default SCSI model value if we cannot find another SCSI controller
already present.
NB: Requires updating the live output test data since the model
will now be formatted.
Rename and rework qemuDomainSetSCSIControllerModel since we're
really not setting the SCSI controller model. Instead the code
is either returning the existing SCSI controller model value, the
default value based on the capabilities, or -1 with the error set.
Rather than repeat multiple steps in order to find the SCSI
controller model, let's combine them into one helper that will
return either the model from the definition or the default
model based on the capabilities.
This patch adds an extra check/error that the controller
that's being found actually exists. This just clarifies that
the error was because the controller doesn't exist rather
than the more generic error that we were unable to determine
the model from qemuDomainSetSCSIControllerModel when a -1
was passed in and the capabilities were unable to find one.
As it turns out virDomainDeviceFindControllerModel was only ever
called for SCSI controllers using VIR_DOMAIN_CONTROLLER_TYPE_SCSI
as a parameter.
So rename to virDomainDeviceFindSCSIController and rather than
return a model, let's return a virDomainControllerDefPtr to let
the caller reference whatever it wants.
Rather than one function serving two purposes, let's split out the
else condition which is checking whether the model can be used
during command line building based on the capabilities.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When starting an LXC container, the /dev entries are created
under temp root (/var/run/libvirt/lxc/$name.dev), relabelled and
then the root is pivoted. However, when it comes to USB devices
which keep path to the device in the structure we need a way to
override the default /dev/usb/... path because we want to work
with the one under temp root. That's what @vroot argument is for
in virUSBDeviceNew. However, what is being passed there is:
vroot = /var/run/libvirt/lxc/lxc_0.dev/bus/usb
Therefore, constructed path is wrong:
dev->path = //var/run/libvirt/lxc/lxc_0.dev/bus/usb//dev/bus/usb/002/002
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
Add a virtlockd-admin-sock can serves the admin protocol for the virtlockd
daemon and define a virtlockd:///{system,session} URI scheme for
connecting to it.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a virtlogd-admin-sock can serves the admin protocol for the virtlogd
daemon and define a virtlogd:///{system,session} URI scheme for
connecting to it.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
With the current code it is neccessary to call
virNetDaemonNewPostExecRestart()
and then for each server that needs restarting you are supposed
to call
virNetDaemonAddSeverPostExecRestart()
This is fine if there's only ever one server, but as soon as you
have two servers it is impossible to use this design. The code
has no idea which servers were recorded in the JSON state doc,
nor in which order the hash table serialized its keys.
So this patch changes things so that we only call
virNetDaemonNewPostExecRestart()
passing in a callback, which is invoked once for each server
found int he JSON state doc.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
It is not possible to blindly call virNetDaemonGetServer()
because in a post-exec restart scenario, some servers may
not exist and this method will pollute the error logs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The server name and client data callbacks need to be non-NULL or the
system will crash at various times. This is particularly bad when some
of the crashes only occur post-exec restart.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virNetServer class is passing a pointer to itself to the
virNetServerClient as a 'void *' pointer. This is presumably due to fact
that the virnetserverclient.h file doesn't see the virNetServerPtr
typedef. The typedef is easily movable though, which lets us get
typesafe parameter passing, removing the confusion of passing two
distinct 'void *' pointers to one method.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When receiving multiple socket FDs from systemd, it is critical to know
what socket address each corresponds to so we can setup the right
protocols on each.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We don't have any per-client private data we need to persist, but the
RPC infrastructure requires that we provide the callbacks and serialize
an empty JSON object. This makes us future proof going forwards.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The admin server functionality is a generic concept that should be wired
up into all libvirt daemons, but is currently integrated with the
libvirtd code. Move it all into the src/admin directory to prepare for
broader reuse.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
During reconnect we need to reconstruct the paths of all cachetunes so that they
get cleaned up when the domain is stopped.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The virresctrl will use this as well and we need to have that info after restart
to properly clean up /sys/fs/resctrl.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Due to confusing naming the pointer to the mask got copied which must not
happen, so use UpdateMask instead of SetMask. That also means we can get
completely rid of SetMask.
Also don't clear the free bits since it is not used again (leftover from
previous versions).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Introduce virResctrlAllocCopyMasks() and use that to initially copy the default
group schemata to the allocation before reserving any parts of the cache. The
reason for this is that when new group is created the schemata will have unknown
data in it. If there was previously group with the same CLoS ID, it will have
the previous valies, if not it will have all bits set. And we need to set all
unspecified (in the XML) allocations to the same one as the default group.
Some non-Linux functions now need to be made public due to this change.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1289368
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
While the QEMU QAPI schema describes 'lun' as a number, the code dealing
with JSON strings does not strictly adhere to this schema and thus
formats the number back as a string. Use the new helper to retrieve both
possibilities.
Note that the formatting code is okay and qemu will accept it as an int.
Tweak also one of the test strings to verify that both formats work
with libvirt.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1540290
The helper is useful in cases when the JSON we have to parse may contain
one of the two due to historical reasons and the number value itself
would be stored as a string.
The <capabilities> feature has an attribute named 'policy', but the
error message mentioned the non-existing 'state' attribute instead.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Commit 000e950455 tried to fix improper bracketing when refreshing disk
volume stats for a backing volume. Unfortunately the condition is still
wrong as in cases as the backing store being inaccessible
storageBackendUpdateVolTargetInfo returns -2 if instructed to ignore
errors. The condition does not take this into account.
Dumping XML of a volume which has inacessible backing store would then
result into:
# virsh vol-dumpxml http.img --pool default
error: An error occurred, but the cause is unknown
Properly ignore -2 for backing volumes.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1540022
We are skipping non-directories under /sys/fs/resctrl/(info/) since those are not
interesting for us. However in tests it can sometimes happen that ent->d_type
is 0 instead of 4 (DT_DIR) for directories.
I've seen it fail on two machines. Different machines, different systems, I
cannot reproduce it even using the same setup. So one of the ways how to work
around this is call stat() on it. The other one is not checking if it is a
directory since we'll find out eventually when we want to read some files
underneath it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This wil be used in the future, but it makes sense for now as well. It makes
sure there is no mask leftover that would leak.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Pointed out during review on one or two places, but it actually appears in lot
more places. So let's be consistent.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
When working on the CAT series one of the changes was that the pointer got
allocated in another part of the code, even when resctrl was not available on
the host system. However this one particular place neglected that so it needs
to be fixed in order to get the proper error message when requesting
<cachetune/> on HW with no support for it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commits f83c7c88 and 6eb1f2b9 broke the build on FreeBSD and OSX because
of symbols being undefined for those platforms.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
After processing the processEvent->data for a qemuProcessEventHandler
callout, it's expected that the called processEvent->eventType helper
will perform the proper free on the data field. In this case it's
a qemuMonitorEventPanicInfoPtr.
Just like SRIOV, a PCI device is only capable of the mediated devices
framework when it's bound to the vendor native driver, thus if a driver
change occurs, e.g. vendor_native->vfio, we need to refresh some of the
device's capabilities to reflect the reality, mdev included.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Suggested-by: Wu Zongyong <cordius.wu@huawei.com>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Now that we have all the building blocks in place, switch the nodedev
driver to use the "new" virMediatedDeviceType type instead of the "old"
virNodeDevCapMdevType one.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
These are not necessary anymore, since these are going to be shadowed by
the helpers provided by util/virmdev.c module.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This is a replacement for the existing udevPCIGetMdevTypesCap which is
static to the udev backend. This simple helper constructs the sysfs path
from the device's base path for each mdev type and queries the
corresponding attributes of that type.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This should serve as a replacement for the existing udevFillMdevType
which is responsible for fetching the device type's attributes from the
sysfs interface. The problem with the existing solution is that it's
tied to the udev backend.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This is later going to replace the existing virNodeDevCapMdevType, since:
1) it's going to couple related stuff in a single module
2) util is supposed to contain helpers that are widely accessible across
the whole repository.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Most of them are static, however in case of PCI and SCSI_HOST devices,
the nested capabilities can change dynamically, e.g. due to a driver
change (from host_pci_driver -> vfio_pci).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Suggested-by: Wu Zongyong <cordius.wu@huawei.com>
Whether asking for a number of capabilities supported by a device or
listing them, it's handled essentially by a copy-paste code, so extract
the common stuff into this new helper which also updates all
capabilities just before touching them.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since we moved the helpers from nodedev driver to src/conf, the actual
'update' function using those helpers should be moved as well so that we
don't need to call back into the driver.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The capabilities are defined/parsed/formatted/queried from this module,
no reason for 'update' not being part of the module as well. This also
involves some module-specific prefix changes.
This patch also drops the node_device_linux_sysfs module from the repo
since:
a) it only contained the capability handlers we just moved
b) it's only linked with the driver (by design) and thus unreachable to
other modules
c) we touch sysfs across all the src/util modules so the module being
deleted hasn't been serving its original intention for some time already.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This patch drops the capability matching redundancy by simply converting
the string input to our internal types which are then in turn used for
the actual capability matching.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
We currently have 2 methods that do the capability matching. This should
be condensed to a single function and all the derivates should just call
into that using a proper type conversion.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
We currently have 2 methods that do the capability matching. This should
be condensed to a single function and all the derivates should just call
into that using a proper type conversion.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
There is no method to rename inactive domains for test driver.
After this patch, we can rename the domains using 'domrename'.
virsh# domrename test anothertest
Domain successfully renamed
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For vhost-user ports, Open vSwitch acts as the server and QEMU the client.
When OVS crashes or restarts, the QEMU process should be reconnected to
OVS.
Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The virNetSocketWriteSASL method has to encode the buffer it is given and then
write it to the underlying socket. This write is not guaranteed to send the
full amount of data that was encoded by SASL. We cache the SASL encoded data so
that on the next invocation of virNetSocketWriteSASL we carry on sending it.
The subtle problem is that the 'len' value passed into virNetSocketWriteSASL on
the 2nd call may be larger than the original value. So when we've completed
sending the SASL encoded data we previously cached, we must return the original
length we encoded, not the new length.
This flaw means we could potentially have been discarded queued data without
sending it. This would have exhibited itself as a libvirt client never receiving
the reply to a method it invokes, async events silently going missing, or worse
stream data silently getting dropped.
For this to be a problem libvirtd would have to be queued data to send to the
client, while at the same time the TCP socket send buffer is full (due to a very
slow client). This is quite unlikely so if this bug was ever triggered by a real
world user it would be almost impossible to reproduce or diagnose, if indeed it
was ever noticed at all.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
During migration, the lock process is paused in the perform phase
but not resumed if there is a subsequent failure, leaving the locked
resource unprotected.
The perform phase itself can fail, in which case the lock process
should be resumed before returning from perform. The finish phase
could also fail on the destination host, in which case the migration
is canceled in the confirm phase and the VM is resumed. The lock
process needs to be resumed there as well.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
We've been building up to this. This adds support for cputune/cachetune
settings for domains in the QEMU driver. The addition into
qemuProcessSetupVcpu() automatically adds support for hotplug. For hot-unplug
we need to remove the allocation only if all the vCPUs were unplugged. But
since the threads are left running, we can't really do much about it now.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1289368
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
More info in the documentation, this is basically the XML parsing/formatting
support, schemas, tests and documentation for the new cputune/cachetune element
that will get used by following patches.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
With this commit we finally have a way to read and manipulate basic resctrl
settings. Locking is done only on exposed functions that read/write from/to
resctrlfs. Not in functions that are exposed in virresctrlpriv.h as those are
only supposed to be used from tests.
More information about how resctrl works:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/x86/intel_rdt_ui.txt
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This will make the current functions obsolete and it will provide more
information to the virresctrl module so that it can be used later.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This wires up the previously added OEM strings XML schema to be able to
generate comamnd line args for QEMU. This requires QEMU >= 2.12 release
containing this patch:
commit 2d6dcbf93fb01b4a7f45a93d276d4d74b16392dd
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Sat Oct 28 21:51:36 2017 +0100
smbios: support setting OEM strings table
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The OEM strings table in SMBIOS allows the vendor to pass arbitrary
strings into the guest OS. This can be used as a way to pass data to an
application like cloud-init, or potentially as an alternative to the
kernel command line for OS installers where you can't modify the install
ISO image to change the kernel args.
As an example, consider if cloud-init and anaconda supported OEM strings
you could use something like
<oemStrings>
<entry>cloud-init:ds=nocloud-net;s=http://10.10.0.1:8000/</entry>
<entry>anaconda:method=http://dl.fedoraproject.org/pub/fedora/linux/releases/25/x86_64/os</entry>
</oemStrings>
use of a application specific prefix as illustrated above is
recommended, but not mandated, so that an app can reliably identify
which of the many OEM strings are targetted at it.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We can start qemu with a "cpu,+la57" to set 57-bit vitrual address
space. So VM can be aware that it need to enable 5-level paging.
Corresponding QEMU commits:
al57 6c7c3c21f95dd9af8a0691c0dd29b07247984122
We recently added a generic XHCI USB3 controller to QEMU, and libvirt
supports adding that controller rather than the NEC XHCI USB3
controller, but when auto-adding a USB controller to Q35 domains we
were still adding the vendor-specific NEC controller. This patch
changes to add the generic controller instead, if it's available in
the QEMU binary that will be used.
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Whenever a different kernel is booted, some capabilities related to KVM
(such as CPUID bits) may change. We need to refresh the cache to see the
changes.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
qemuDomainDefValidateVideo() (called from qemuDomainDefValidate()) is
just a loop performing various checks on each video device. Rather
than maintaining this separate function, just fold the validations
into qemuDomainDeviceDefValidateVideo(), which is called once for each
video device.
Commit 10c73bf1 fixed a bug that I had introduced back in commit
70249927 - if a vhost-scsi device had no manually assigned PCI
address, one wouldn't be assigned automatically. There was a slight
problem with the logic of the fix though - in the case of domains with
pcie-root (e.g. those with a q35 machinetype),
qemuDomainDeviceCalculatePCIConnectFlags() will attempt to determine
if the host-side PCI device is Express or legacy by examining sysfs
based on the host-side PCI address stored in
hostdev->source.subsys.u.pci.addr, but that part of the union is only
valid for PCI hostdevs, *not* for SCSI hostdevs. So we end up trying
to read sysfs for some probably-non-existent device, which fails, and
the function virPCIDeviceIsPCIExpress() returns failure (-1).
By coincidence, the return value is being examined as a boolean, and
since -1 is true, we still end up assigning the vhost-scsi device to
an Express slot, but that is just by chance (and could fail in the
case that the gibberish in the "hostside PCI address" was the address
of a real device that happened to be legacy PCI).
Since (according to Paolo Bonzini) vhost-scsi devices appear just like
virtio-scsi devices in the guest, they should follow the same rules as
virtio devices when deciding whether they should be placed in an
Express or a legacy slot. That's accomplished in this patch by
returning early with virtioFlags, rather than erroneously using
hostdev->source.subsys.u.pci.addr. It also adds a test case for PCIe
to assure it doesn't get broken in the future.
Commit 8708ca01c added virNetDevSwitchdevFeature() to check if a network
device has Switchdev capabilities. virNetDevSwitchdevFeature() attempts
to retrieve the PCI device associated with the network device, ignoring
non-PCI devices. It does so via the following call chain
virNetDevSwitchdevFeature()->virNetDevGetPCIDevice()->
virPCIGetDeviceAddressFromSysfsLink()
For non-PCI network devices (qeth, Xen vif, etc),
virPCIGetDeviceAddressFromSysfsLink() will report an error when
virPCIDeviceAddressParse() fails. virPCIDeviceAddressParse() also
logs an error. After commit 8708ca01c there are now two errors reported
for each non-PCI network device even though the errors are harmless.
To avoid the errors, introduce virNetDevIsPCIDevice() and use it in
virNetDevGetPCIDevice() before attempting to retrieve the associated
PCI device. virNetDevIsPCIDevice() uses the 'subsystem' property of the
device to determine if it is PCI. See the sysfs rules in kernel
documentation for more details
https://www.kernel.org/doc/html/latest/admin-guide/sysfs-rules.html
https://bugzilla.redhat.com/show_bug.cgi?id=1536461
This reverts commit aeda1b8c56.
Problem is that we need mon->lastError to be set because it's
used all over the place. Also, there's nothing wrong with
reporting error if one occurred. I mean, if there's a thread
executing an API and which currently is talking on monitor it
definitely wants the error reported.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When migrating a shutoff domain (i.e., offline migration), we have no
statistics to report and thus jobInfo will be NULL in
qemuMigrationFinish.
Broken by me in v3.10.0-183-ge8784e7868.
https://bugzilla.redhat.com/show_bug.cgi?id=1536351
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of EPYC with indirect branch prediction protection.
The only difference between EPYC and EPYC-IBPB is the added "ibpb"
feature.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
We read from QEMU until seeing a \r\n pair to indicate a completed reply
or event. To avoid memory denial-of-service though, we must have a size
limit on amount of data we buffer. 10 MB is large enough that it ought
to cope with normal QEMU replies, and small enough that we're not
consuming unreasonable mem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 7a931a4204 refactored the code and probably forgot to add
this line.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
This is a variant of Skylake-Server with indirect branch prediction
protection. The only difference between Skylake-Server and
Skylake-Server-IBRS is the added "spec-ctrl" feature.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of Skylake-Client with indirect branch prediction
protection. The only difference between Skylake-Client and
Skylake-Client-IBRS is the added "spec-ctrl" feature.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of Broadwell with indirect branch prediction
protection. The only difference between Broadwell and Broadwell-IBRS is
the added "spec-ctrl" feature.
The Broadwell-IBRS model in QEMU is a bit different since Broadwell got
several additional features since we added it in cpu_map.xml:
abm, arat, f16c, rdrand, vme, xsaveopt
Adding them only to the -IBRS variant would confuse our CPU detection
code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of Broadwell-noTSX with indirect branch prediction
protection. The only difference between Broadwell-noTSX and
Broadwell-noTSX-IBRS is the added "spec-ctrl" feature.
The Broadwell-noTSX-IBRS model in QEMU is a bit different since
Broadwell-noTSX got several additional features since we added it in
cpu_map.xml:
abm, arat, f16c, rdrand, vme, xsaveopt
Adding them only to the -IBRS variant would confuse our CPU detection
code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of Haswell with indirect branch prediction protection.
The only difference between Haswell and Haswell-IBRS is the added
"spec-ctrl" feature.
The Haswell-IBRS model in QEMU is a bit different since Haswell got
several additional features since we added it in cpu_map.xml:
arat, abm, f16c, rdrand, vme, xsaveopt
Adding them only to the -IBRS variant would confuse our CPU detection
code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of Haswell-noTSX with indirect branch prediction
protection. The only difference between Haswell-noTSX and
Haswell-noTSX-IBRS is the added "spec-ctrl" feature.
The Haswell-noTSX-IBRS model in QEMU is a bit different since
Haswell-noTSX got several additional features since we added it in
cpu_map.xml:
arat, abm, f16c, rdrand, vme, xsaveopt
Adding them only to the -IBRS variant would confuse our CPU detection
code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of IvyBridge with indirect branch prediction
protection. The only difference between IvyBridge and IvyBridge-IBRS is
the added "spec-ctrl" feature.
The IvyBridge-IBRS model in QEMU is a bit different since IvyBridge got
several additional features since we added it in cpu_map.xml:
arat, vme, xsaveopt
Adding them only to the -IBRS variant would confuse our CPU detection
code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of SandyBridge with indirect branch prediction
protection. The only difference between SandyBridge and SandyBridge-IBRS
is the added "spec-ctrl" feature.
The SandyBridge-IBRS model in QEMU is a bit different since SandyBridge
got several additional features since we added it in cpu_map.xml:
arat, vme, xsaveopt
Adding them only to the -IBRS variant would confuse our CPU detection
code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of Westmere with indirect branch prediction
protection. The only difference between Westmere and Westmere-IBRS is
the added "spec-ctrl" feature.
The Westmere-IBRS model in QEMU is a bit different since Westmere got
several additional features since we added it in cpu_map.xml:
arat, pclmuldq, vme
Adding them only to the -IBRS variant would confuse our CPU detection
code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
This is a variant of Nehalem with indirect branch prediction protection.
The only difference between Nehalem and Nehalem-IBRS is the added
"spec-ctrl" feature.
Thus the diff matches QEMU, but the new CPU model itself is different.
The QEMU's versions of both models contain "vme" feature, while this
feature is missing in libvirt's models. While we can't change the
existing Nehalem CPU model, we could add "vme" to Nehalem-IBRS to make
it similar to QEMU, but doing so would fool our CPU detecting code so
that any Nehalem CPU with "vme" feature would be detected as
Nehalem-IBRS CPU without spec-ctrl. Not adding "vme" to Nehalem-IBRS is
safe as QEMU will just provide the feature anyway, which matches what
happens with Nehalem (and new enough machine types).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Added in QEMU commits TBD and TBD.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Add a check if it's a iSCSI hostdev and if it's not then don't use the
union member 'iscsi'. The segmentation fault occured when accessing
secinfo->type, but this can vary from case to case.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Similar to commit @f44ec9c1, commit @500cbc06 introduced a new nested
'mdev_types' capability, however the mentioned commit didn't adjust
virNodeDeviceNumOfCaps and virNodeDeviceListCaps functions accordingly
to provide proper support for this capability.
After applying this patch the following python snippet returns the
expected results:
import libvirt
conn = libvirt.openReadOnly('qemu:///system')
devs = conn.listAllDevices()
for dev in devs:
if 'mdev_types' in dev.listCaps():
print dev.name(),dev.numOfCaps(),dev.listCaps()
Signed-off-by: Dan Zheng <dzheng@redhat.com>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Let's also parse the available processor frequency information on S390
so that it can be utilized by virsh sysinfo:
# virsh sysinfo
<sysinfo type='smbios'>
...
<processor>
<entry name='family'>2964</entry>
<entry name='manufacturer'>IBM/S390</entry>
<entry name='version'>00</entry>
<entry name='max_speed'>5000</entry>
<entry name='serial_number'>145F07</entry>
</processor>
...
</sysinfo>
Reviewed-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Qemu 2.11 allows case-insensitive specification of CPU models.
This patch fixes the resulting problems on (at least) POWER
arch machines so that Power8 and POWER8 are not different.
Signed-off-by: Scott Garfinkle <scottgar@linux.vnet.ibm.com>
Libvirt 3.7.0 and earlier libvirt reported a migration job as completed
immediately after QEMU finished sending migration data at which point
migration was not really complete yet. Commit v3.7.0-29-g3f2d6d829e
fixed this, but caused a regression in reporting statistics for
completed jobs which started reporting the job as still running. This
happened because the completed job statistics including the job status
are copied from the running job before we finally mark it as completed.
Let's make sure QEMU_DOMAIN_JOB_STATUS_COMPLETED is always set in the
completed job info even when the job has not finished yet.
https://bugzilla.redhat.com/show_bug.cgi?id=1523036
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
When reconnecting to a running domain with host-model CPU started by old
libvirt which did not store the actual CPU in the status XML, we need to
ignore the fallback attribute to make sure we can translate the detected
host CPU model to a model which is supported by the running QEMU.
https://bugzilla.redhat.com/show_bug.cgi?id=1532980
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>