Add monitor tooling for calling query-named-block-nodes. The monitor
returns the data as the raw JSON array that is returned from the
monitor.
Unfortunately the logic to extract the node names for a complete backing
chain will be so complex that I won't be able to extract any meaningful
subset of the data in the monitor code.
The code is currently simple, but if we later add node names, it will be
necessary to generate the names based on the node name. Add a helper so
that there's a central point to fix once we add self-generated node
names.
The event is fired when a given block backend node (identified by the
node name) experiences a write beyond the bound set via
block-set-write-threshold QMP command. This wires up the monitor code to
extract the data and allow us receiving the events and the capability.
When using thin provisioning, management tools need to resize the disk
in certain cases. To avoid having them to poll disk usage introduce an
event which will be fired when a given offset of the storage is written
by the hypervisor. Together with the API which will be added later, it
will allow registering thresholds for given storage backing volumes and
this event will then notify management if the threshold is exceeded.
The function has very specific semantics. Split out the part that parses
the backing store specification string into a separate helper so that it
can be reused later while keeping the wrapper with existing semantics.
Note that virStorageFileParseChainIndex is pretty well covered by the
test suite.
Along with video and VNC support, bhyve has introduced USB tablet
support as an input device. This tablet is exposed to a guest
as a device on an XHCI controller.
At present, tablet is the only supported device on the XHCI controller
in bhyve, so to make things simple, it's allowed to only have a
single XHCI controller with a single tablet device.
In detail, this commit:
- Introduces a new capability bit for XHCI support in bhyve
- Adds an XHCI controller and tabled support with 1:1 mapping
between them
- Adds a couple of unit tests
There are a number of functions in bhyve_capabilities.c that probe
hypervisor capabilities by executing the bhyve(1) binary with the
specific device arugment, checking error message (if any) and setting
proper capability bit. As those are extremely similar, move this logic
into a helper function and convert existing functions to use that.
* Extract filling bhyve capabilities from virBhyveDomainCapsBuild()
into a new function virBhyveDomainCapsFill() to make testing
easier by not having to mock firmware directory listing and
hypervisor capabilities probing
* Also, just presence of the firmware files is not sufficient
to enable os.loader.supported, hypervisor should support UEFI
boot too
* Add tests to domaincapstest for the main caps possible flows:
- when UEFI bootrom is supported
- when video (fbus) is supported
- neither of above is supported
qemuMigrationResetTLS() does not initialize 'ret' by default,
so when it jumps to 'cleanup' on error, the 'ret' variable will be
uninitialized, which clang complains about.
Set it to '-1' by default.
https://bugzilla.redhat.com/show_bug.cgi?id=1300769
If the migration flags indicate this migration will be using TLS,
then while we have connection in the Begin phase check and setup the
TLS environment that will be used by virMigrationRun during the Perform
phase for the source to configure TLS.
Processing adds an "-object tls-creds-x509,endpoint=client,..." and
possibly an "-object secret,..." to handle the passphrase response.
Then it sets the 'tls-creds' and possibly 'tls-hostname' migration
parameters.
The qemuMigrateCancel will clean up and reset the environment as it
was originally found.
Signed-off-by: John Ferlan <jferlan@redhat.com>
If the migration flags indicate this migration will be using TLS,
then set up the destination during the prepare phase once the target
domain has been started to add the TLS objects to perform the migration.
This will create at least an "-object tls-creds-x509,endpoint=server,..."
for TLS credentials and potentially an "-object secret,..." to handle the
passphrase response to access the TLS credentials. The alias/id used for
the TLS objects will contain "libvirt_migrate".
Once the objects are created, the code will set the "tls-creds" and
"tls-hostname" migration parameters to signify usage of TLS.
During the Finish phase we'll be sure to attempt to clear the
migration parameters and delete those objects (whether or not they
were created). We'll also perform the same reset during recovery
if we've reached FINISH3.
If the migration isn't using TLS, then be sure to check if the
migration parameters exist and clear them if so.
Add an asyncJob argument for add/delete TLS Objects. A future patch will
add/delete TLS objects from a migration which may have a job to join.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add the fields to support setting tls-creds and tls-hostname during
a migration (either source or target). Modify the query migration
function to check for the presence and set the field for future
consumers to determine which of 3 conditions is being met (NULL,
present and set to "", or present and sent to something). These
correspond to qemu commit id '4af245dc3' which added support to
default the value to "" and allow setting (or resetting) to ""
in order to disable. This reset option allows libvirt to properly
use the tls-creds and tls-hostname parameters.
Modify code paths that either allocate or use stack space in order
to call qemuMigrationParamsClear or qemuMigrationParamsFree for cleanup.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a new TLS X.509 certificate type - "migrate". This will handle the
creation of a TLS certificate capability (and possibly repository) to
be used for migrations. Similar to chardev's, credentials will be handled
via a libvirt secrets; however, unlike chardev's enablement and usage
will be via a CLI flag instead of a conf flag and a domain XML attribute.
The migrations using the *x509_verify flag require the client-cert.pem
and client-key.pem files to be present in the TLS directory - so let's
also be sure to note that in the qemu.conf file.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Fix typo in virNetDevPFGetVF() stub:
ATTRUBUTE_UNUSED -> ATTRIBUTE_UNUSED.
While here, use common indent style for arguments in
virNetDevGetVirtualFunctionIndex() stub.
If the variable store (<nvram>) file is raw qemu can't do a snapshot of
it and thus the snapshot fails. QEMU rejects such snapshot by a message
which would not be properly interpreted as an error by libvirt.
Additionally allowing to use a qcow2 variable store backing file would
solve this issue but then it would become eligible to become target of
the memory dump.
Offline internal snapshot would be incomplete too with either storage
format since libvirt does not handle the pflash file in this case.
Forbid such snapshot so that we can avoid problems.
commit 00d28a78 added a check to see if there were any IPv6 routes
added by RA (Router Advertisement) via an interface that had accept_ra
set to something other than "2". The check was being done
unconditionally, but it's only relevant if IPv6 forwarding is going to
be turned on, and that will only happen if the network has an IPv6
address.
Given an SRIOV PF netdev name (e.g. "enp2s0f0") and VF#, this new
function returns the netdev name of the referenced VF device
(e.g. "enp2s11f6"), or NULL if the device isn't bound to a net driver.
We will want to allow silent failure of virNetDevSetMAC() in the case
that the SIOSIFHWADDR ioctl fails with errno == EADDRNOTAVAIL. (Yes,
that is very specific, but we really *do* want a logged failure in all
other circumstances, and don't want to duplicate code in the caller
for the other possibilities).
This patch renames the 3 different virNetDevSetMAC() functions to
virNetDevSetMACInternal(), adding a 3rd arg called "quiet" and making
them static (because this extra control will only be needed within
virnetdev.c). A new global virNetDevSetMAC() is defined that calls
whichever of the three *Internal() functions gets compiled with quiet
= false. Callers in virnetdev.c that want to notice a failure with
errno == EADDRNOTAVAIL and retry with a different strategy rather than
immediately failing, can call virNetDevSetMACInternal(..., true).
This function unbinds a device from its driver, then immediately
rebinds it to its driver again. The code for this new function is just
the 2nd half of virPCIDeviceBindWithDriverOverride(), so that
function's 2nd half is replaced with a call to virPCIDeviceRebind().
...and cleanup the callers to report it when it *is* an error.
In many cases It's useful for virPCIGetNetName() to not log an error
and simply return a NULL pointer when the given device isn't bound to
a net driver (e.g. we're looking at a VF that is permanently bound to
vfio-pci). The existing code would silently return an error in this
case, which could eventually lead to the dreaded "An error occurred
but the cause is unknown" log message.
This patch changes virPCIGetNetName() to still return success if the
device simply isn't bound to a net driver, and adjusts all the callers
that require a non-null netname to check for that condition and log an
error when it happens.
vf in virNetDevMacVLanDeleteWithVPortProfile() is initialized to -1
and never set. It's not set for a good reason - because it doesn't
make sense during macvtap device setup to refer to a VF device as
"PF:VF#". This patch replaces the two uses of "vf" with "-1", and
removes the local variable, so that it's more clear we are always
calling the utility functions with vf set to -1.
This function is only called in two places, and the ifindex,
nltarget_kernel, and getPidFunc args are never used (and never will
be).
ifindex - we always know the name of the device, and never know the
ifindex - if we really did need the ifindex we would have to get it
from the name using virNetDevGetIndex(). In practice, we just send -1
to virNetDevSetVfConfig(), which doesn't bother to learn the real
ifindex (you only need a name *or* an ifindex for the netlink command
to succeed, not both).
nltarget_kernel - messages to set the config of an SRIOV VF will
always go to netlink in the kernel, not to another user process, so
this arg is always true (there are other uses of netlink messages
where the message might need to go to another user process, but never
in the case of RTM_SETLINK for SRIOV).
getPidFunc - this arg is only used if nltarget_kernel is false, and it
never is.
None of this has any functional effect, it just makes it easier to
follow what's happening when virNetDevSetVfConfig() is called.
virNetDevParseVfConfig() assumed that both the MAC address and VLAN
tag pointers were valid, so even if you only wanted one or the other,
you would need a variable to hold the returned value for both. This
patch checks each for a NULL pointer before filling it in.
The source code will check for NULL arguments for 'macvtap_macaddr' and
'vmuuid', so no need for the NONNULL in the prototypes. Following the stack
for both arguments to virNetDevVPortProfileOpSetLink also shows called
functions would handle a NULL value.
Additionally, modified the prototype to use the same 'macvtap_macaddr'
name as the source code for consistency.
Since the code checks 'mgr == NULL' anyway, no need for the prototype
to have the NONNULL arg check. Also add an error message to indicate what
the failure is so that there isn't a failed for some reason error.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The comparison code used STREQ_NULLABLE anyway for both 'drv_name' and
'dom_name', so no need. Add a NULLSTR on the 'dom_name' too.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The comparison code used STREQ_NULLABLE anyway for both 'drv_name' and
'dom_name', so no need. Add a NULLSTR on the 'dom_name' too.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The code checks and handles a NULL 'str', so just remove the NONNULL.
Update the error message to add the NULLSTR() around 'str' also.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The 'mon' argument validity is checked in the QEMU_CHECK_MONITOR for the
following functions, so they don't need the NONNULL on their prototype:
qemuMonitorUpdateVideoMemorySize
qemuMonitorUpdateVideoVram64Size
qemuMonitorGetAllBlockStatsInfo
qemuMonitorBlockStatsUpdateCapacity
Signed-off-by: John Ferlan <jferlan@redhat.com>
The prototype requires not passing a NULL in the parameter and the callers
all would fail far before this code would fail if 'vm' was NULL, so just
remove the check.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The prototype requires a NONNULL argument and the only caller passes in
a non-null parameter. Besides the "else if" condition would deref it anyway.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since the code checks and handles NULL parameters, remove the NONNULL
from the prototype.
Also fix the comment in the source to reference the right name.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since the code checks and handles a NULL 'node' before proceeding
there's no need for the prototype with the NONNULL(2).
Signed-off-by: John Ferlan <jferlan@redhat.com>
If a network is destroyed and restarted, or its bridge is changed, any
tap devices that had previously been connected to the bridge will no
longer be connected. As a first step in automating a reconnection of
all tap devices when this happens, this patch modifies
networkNotifyActualDevice() (which is called once for every
<interface> of every active domain whenever libvirtd is restarted) to
reconnect any tap devices that it finds disconnected.
With this patch in place, you will need to restart libvirtd to
reconnect all the taps to their proper bridges. A future patch will
add a callback that hypervisor drivers can register with the network
driver to that the network driver can trigger this behavior
automatically whenever a network is started.
This patch splits out the part of virNetDevTapCreateInBridgePort()
that would need to be re-done if an existing tap device had to be
re-attached to a bridge, and puts it into a separate function. This
can be used both when an existing domain interface config is updated
to change its connection, and also to re-attach to the "same" bridge
when a network has been stopped and restarted. So far it is used for
nothing.
This function provides the bridge/bond device that the given network
device is attached to. The return value is 0 or -1, and the master
device is a char** argument to the function - this is needed in order
to allow for a "success" return from a device that has no master.
The only reason that the ethtool features weren't being retrieved in
an unprivileged libvirtd was because they required ioctl(), and the
ioctl was using an AF_PACKET socket, which requires root. Now that we
are using AF_UNIX for ioctl(), this restriction can be removed.
The exact family of the socket created for the fd used by ioctl(7)
doesn't matter, it just needs to be a socket and not a file. But for
some reason when macvtap support was added, it used
AF_PACKET/SOCK_DGRAM sockets for its ioctls; we later used the same
AF_PACKET/SOCK_DGRAM socket for new ioctls we added, and eventually
modified the other pre-existing ioctl sockets (for creating/deleting
bridges) to also use AF_PACKET/SOCK_DGRAM (that code originally used
AF_UNIX/SOCK_STREAM).
The problem with using AF_PACKET (intended for sending/receiving "raw"
packets, i.e. packets that can be some protocol other than TCP or UDP)
is that it requires root privileges. This meant that none of the
ioctls in virnetdev.c or virnetdevip.c would work when running
libvirtd unprivileged.
This packet solves that problem by changing the family to AF_UNIX when
creating the socket used for any ioctl().
For some drivers the domain's machine type makes no sense. They
just don't use it. A great example is bhyve driver. Therefore it
makes very less sense to report machine in domain capabilities
XML.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When enabling IPv6 on all interfaces, we may get the host Router
Advertisement routes discarded. To avoid this, the user needs to set
accept_ra to 2 for the interfaces with such routes.
See https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
on this topic.
To avoid user mistakenly losing routes on their hosts, check
accept_ra values before enabling IPv6 forwarding. If a RA route is
detected, but neither the corresponding device nor global accept_ra
is set to 2, the network will fail to start.
virNetlinkCommand() processes only one response message, while some
netlink commands, like route dumping, need to process several.
Add virNetlinkDumpCommand() as a virNetlinkCommand() sister.
Allow to reuse as much as possible from virNetlinkCommand(). This
comment prepares for the introduction of virNetlinkDumpCommand()
only differing by how it handles the responses.
Commit id '85af0b8' added a 'timeout' as the 4th parameter to
qemuMonitorOpen, but neglected to update the ATTRIBUTE_NONNULL(4)
to be (5) for the cb parameter.
It was pointed out here:
https://bugzilla.redhat.com/show_bug.cgi?id=1331796#c4
that we shouldn't be adding a "no-resolv" to the dnsmasq.conf file for
a network if there isn't any <forwarder> element that specifies an IP
address but no qualifying domain. If there is such an element, it will
handle all DNS requests that weren't otherwise handled by one of the
forwarder entries with a matching domain attribute. If not, then DNS
requests that don't match the domain of any <forwarder> would not be
resolved if we added no-resolv.
So, only add "no-resolv" when there is at least one <forwarder>
element that specifies an IP address but no qualifying domain.
We reported error in caller virQEMUCapsCacheLookupByArch.
So the same error messages in qemuConnectGetDomainCapabilities
is useless.
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Calling virCPUUpdateLive on a domain with no guest CPU configuration
does not make sense. Especially when doing so would crash libvirtd.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Use "virStoragePoolObj" as a prefix for any external API in virstorageobj.
Also a couple of functions were local to virstorageobj.c, so remove their
external defs iin virstorageobj.h.
NB: The virStorageVolDef* API's won't change.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In an effort to be consistent with the source module, alter the function
prototypes to follow the similar style of source with the "type" on one
line followed by the function name and arguments on subsequent lines with
with argument getting it's own line.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Alter the format of the code to follow more recent style guidelines of
two empty lines between functions, function decls with "[static] type"
on one line followed by function name with arguments to functions each
on one line.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move all the StoragePoolObj related API's into their own module
virstorageobj from the storage_conf
Purely code motion at this point, plus adjustments to cleanly build
Signed-off-by: John Ferlan <jferlan@redhat.com>
The log and lock protocol don't have an extra handshake to close the
connection. Instead they just close the socket. Unfortunately that
resulted into a lot of spurious garbage logged to the system log files:
2017-03-17 14:00:09.730+0000: 4714: error : virNetSocketReadWire:1800 : End of file while reading data: Input/output error
or in the journal as:
Mar 13 16:19:33 xxxx virtlogd[32360]: End of file while reading data: Input/output error
Use the new facility in the netserverclient to suppress the IO error
report from the virNetSocket layer.
When starting a domain with custom guest CPU specification QEMU may add
or remove some CPU features. There are several reasons for this, e.g.,
QEMU/KVM does not support some requested features or the definition of
the requested CPU model in libvirt's cpu_map.xml differs from the one
QEMU is using. We can't really avoid this because CPU models are allowed
to change with machine types and libvirt doesn't know (and probably
doesn't even want to know) about such changes.
Thus when we want to make sure guest ABI doesn't change when a domain
gets migrated to another host, we need to update our live CPU definition
according to the CPU QEMU created. Once updated, we will change CPU
checking to VIR_CPU_CHECK_FULL to make sure the virtual CPU created
after migration exactly matches the one on the source.
https://bugzilla.redhat.com/show_bug.cgi?id=822148https://bugzilla.redhat.com/show_bug.cgi?id=824989
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuMonitorGetGuestCPU can now optionally create CPU data from
filtered-features in addition to feature-words.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The checks are now in a dedicated qemuProcessVerifyHypervFeatures
function.
In addition to moving the code this patch also fixes a few bugs: the
original code was leaking cpuFeature and the return value of
virCPUDataCheckFeature was not checked properly.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The attribute can be used to request a specific way of checking whether
the virtual CPU matches created by the hypervisor matches the
specification in domain XML.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The disk tuning group parameter is ignored by qemu if no other
throttling options are set. Reject such configuration, since the name
would not be honored after setting parameters via the live tuning API.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1433180
When checking capabilities for qemu we need to check whether subsets of
the disk throttling settings are supported. Extract the checks into a
separate functions as they will be reused in next patch.
While the code path that queries the monitor allocates a separate copy
of the 'group_name' string the path querying the config would not copy
it. The call to virTypedParameterAssign would then steal the pointer
(without clearing it) and the RPC layer freed it. Any subsequent call
resulted into a crash.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1433183
ioh3420 is emulated Intel hardware, so it always looked
quite out of place in aarch64/virt guests. Even for x86/q35
guests, the recently-introduced pcie-root-port is a better
choice because, unlike ioh3420, it doesn't require IO space
(a fairly constrained resource) to work.
If pcie-root-port is available in QEMU, use it; ioh3420 is
still used as fallback for when pcie-root-port is not
available.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1408808
QEMU 2.9 introduces the pcie-root-port device, which is
a generic version of the existing ioh3420 device.
Make the new device available to libvirt users.
If the SASL config does not have any mechanisms we currently
just report an empty list to the client which will then
fail to identify a usable mechanism. This is a server config
error, so we should fail immediately on the server side.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Linux still defaults to a 1024 open file handle limit. This causes
scalability problems for libvirtd / virtlockd / virtlogd on large
hosts which might want > 1024 guest to be running. In fact if each
guest needs > 1 FD, we can't even get to 500 guests. This is not
good enough when we see machines with 100's of physical cores and
TBs of RAM.
In comparison to other memory requirements of libvirtd & related
daemons, the resource usage associated with open file handles
is essentially line noise. It is thus reasonable to increase the
limits unconditionally for all installs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There were couple of reports on the list (e.g. [1]) that guests
with huge amounts of RAM are unable to start because libvirt
kills qemu in the initialization phase. The problem is that if
guest is configured to use hugepages kernel has to zero them all
out before handing over to qemu process. For instance, 402GiB
worth of 1GiB pages took around 105 seconds (~3.8GiB/s). Since we
do not want to make the timeout for connecting to monitor
configurable, we have to teach libvirt to count with this
fact. This commit implements "1s per each 1GiB of RAM" approach
as suggested here [2].
1: https://www.redhat.com/archives/libvir-list/2017-March/msg00373.html
2: https://www.redhat.com/archives/libvir-list/2017-March/msg00405.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While connecting to qemu monitor, the first thing we do is wait
for it to show up. However, we are doing it with some timeout to
avoid indefinite waits (e.g. when qemu doesn't create the monitor
socket at all). After beaa447a29 we are using exponential back
off timeout meaning, after the first connection attempt we wait
1ms, then 2ms, then 4 and so on. This allows us to bring down
wait time for small domains where qemu initializes quickly.
However, on the other end of this scale are some domains with
huge amounts of guest memory. Now imagine that we've gotten up to
wait time of 15 seconds. The next one is going to be 30 seconds,
and the one after that whole minute. Well, okay - with current
code we are not going to wait longer than 30 seconds in total,
but this is going to change in the next commit.
The exponential back off is usable only for first few iterations.
Then it needs to be caped (one second was chosen as the limit)
and switch to constant wait time.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Fix a "bug" in the storage pool test driver code which "assumed"
testStoragePoolObjSetDefaults should fill in the configFile for
both the Define/Create (persistent) and CreateXML (transient) pools
by just VIR_FREE()'ing it during CreateXML. Because the configFile
was filled in, during Destroy the pool wouldn't be free'd which
could cause issues for future patches which add tests to validate
vHBA creation for the storage pool using the same name.
Commit id 'bb74a7ffe' added a fairly non specific message when providing
only the <parent wwnn='xxx'/> or <parent wwpn='xxx'/> instead of providing
both wwnn and wwpn. This patch just modifies the message to be more specific
about which was missing.
Rather than returning true/false and having the caller check if the
vHBA was actually created, let's do that check within the CreateVport
function. That way the caller can faithfully assume success based
on a name start the thread looking for the LUNs. Prior to this change
it's possible that the vHBA wasn't really created (e.g if the call to
virVHBAGetHostByWWN returned NULL), we'd claim success, but in reality
there'd be no vHBA for the pool. This also fixes a second yet seen
issue that if the nodedev was present, but the parent by name wasn't
provided (perhaps parent by wwnn/wwpn or by fabric_name), then a failure
would be returned. For this path it shouldn't be an error - we should
just be happy that something else is managing the device and we don't
have to create/delete it.
The end result is that the createVport code can now just start the
refresh thread once it gets a non NULL name back.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the bulk of createVport and rename to virNodeDeviceCreateVport.
Remove the deleteVport entirely and replace with virNodeDeviceDeleteVport
Signed-off-by: John Ferlan <jferlan@redhat.com>
The function is actually in virutil.c, but prototyped in virfile.h.
This patch fixes that by renaming the function to virWaitForDevices,
adding the prototype in virutil.h and libvirt_private.syms, and then
changing the callers to use the new name.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the virStoragePoolSourceAdapter from storage_conf.h and rename
to virStorageAdapter.
Continue with code realignment for brevity and flow.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rework the code to use the new FCHost specific adapter structures.
Also rework the parameters to only pass what's need and leave logic in
the caller for the adapter type and the need to call the helpers.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rework the helpers/APIs to use the FCHost and SCSIHost adapter types.
Continue to realign the code for shorter lines.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rework the helpers/APIs to use the FCHost and SCSIHost adapter types.
Continue to realign the code for shorter lines.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than have lots of ugly inline code, create helpers to try and
make things more readable. While creating the helpers realign the code
as necessary.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than have lots of ugly inline code, create helpers to try and
make things more readable. While creating the helpers realign the code
as necessary.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than use virXPathString, pass along an virXPathNode and alter
the parsing to use virXMLPropString.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Split out the code that munges through the storage pool adapter into
helpers - it's about to be moved into it's own source file.
This is purely code motion at this point.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Commit id 'bb74a7ffe' added some new fields to search for a fchost by
parent wwnn/wwpn or parent_fabric_name, but neglected to validate that
the data within the fields was valid at parse time. This could lead to
eventual failure at run time, so rather than have the failure then, let's
validate now.
Signed-off-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1428209
Commit id 'bb74a7ffe' neglected to check that both the parent_wwnn
parent_wwpn are in the XML if one or the other is similar to how
the node device code checked (commit id '2b13361bc').
If only one is provided, the "default" is to use a vHBA capable
adapter (see commit id '78be2e8b'), so the vHBA could start, but
perhaps not on the expected adapter.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Some users might want to pass a blockdev or a chardev as a
backend for NVDIMM. In fact, this is expected to be the mostly
used configuration. Therefore libvirt should allow the device in
devices CGroup then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have APIs for relabel memdevs on hotplug, fill in the
missing implementation in qemu hotplug code.
The qemuSecurity wrappers might look like overkill for now,
because qemu namespace code does not deal with the nvdimms yet.
Nor does our cgroup code. But hey, there's cgroup_device_acl
variable in qemu.conf. If users add their /dev/pmem* device in
there, the device is allowed in cgroups and created in the
namespace so they can successfully passthrough it to the domain.
It doesn't look like overkill after all, does it?
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When domain is being started up, we ought to relabel the host
side of NVDIMM so qemu has access to it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When domain is being started up, we ought to relabel the host
side of NVDIMM so qemu has access to it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
For NVDIMM devices it is optionally possible to specify the size
of internal storage for namespaces. Namespaces are a feature that
allows users to partition the NVDIMM for different uses.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that NVDIMM has found its way into libvirt, users might want
to fine tune some settings for each module separately. One such
setting is 'share=on|off' for the memory-backend-file object.
This setting - just like its name suggest already - enables
sharing the nvdimm module with other applications. Under the hood
it controls whether qemu mmaps() the file as MAP_PRIVATE or
MAP_SHARED.
Yet again, we have such config knob in domain XML, but it's just
an attribute to numa <cell/>. This does not give fine enough
tuning on per-memdevice basis so we need to have the attribute
for each device too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So, majority of the code is just ready as-is. Well, with one
slight change: differentiate between dimm and nvdimm in places
like device alias generation, generating the command line and so
on.
Speaking of the command line, we also need to append 'nvdimm=on'
to the '-machine' argument so that the nvdimm feature is
advertised in the ACPI tables properly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
NVDIMM is new type of memory introduced into QEMU 2.6. The idea
is that we have a Non-Volatile memory module that keeps the data
persistent across domain reboots.
At the domain XML level, we already have some representation of
'dimm' modules. Long story short, NVDIMM will utilize the
existing <memory/> element that lives under <devices/> by adding
a new attribute 'nvdimm' to the existing @model and introduce a
new <path/> element for <source/> while reusing other fields. The
resulting XML would appear as:
<memory model='nvdimm'>
<source>
<path>/tmp/nvdimm</path>
</source>
<target>
<size unit='KiB'>523264</size>
<node>0</node>
</target>
<address type='dimm' slot='0'/>
</memory>
So far, this is just a XML parser/formatter extension. QEMU
driver implementation is in the next commit.
For more info on NVDIMM visit the following web page:
http://pmem.io/
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Frankly, this function is one big mess. A lot of arguments,
complicated behaviour. It's really surprising that arguments were
in random order (input and output arguments were mixed together),
the documentation was outdated, the description of return values
was bogus.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Even though this variable contains just values from an enum where
zero has the usual meaning, it's enum after all and we should
check it as such.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Introduce config file support for the bhyve driver. The only available
setting at present is 'firmware_dir' for specifying a directory with
UEFI firmware files.
One of the main reasons for introducing host-model CPU definition in a
domain capabilities XML was the inability to express disabled features
in a host capabilities XML. That is, when a host CPU is, e.g., Haswell
without x2apic support, host capabilities XML will have to report it as
Westmere + a bunch of additional features., but we really want to use
Haswell - x2apic when creating a host-model CPU.
Unfortunately, I somehow forgot to do the last step and the code would
just copy the CPU definition found in the host capabilities XML. This
changed recently for new QEMU versions which allow us to query host CPU,
but any slightly older QEMU will not benefit from any change I did. This
patch makes sure the right CPU model is filled in the domain
capabilities even with old QEMU.
The issue was reported in
https://bugzilla.redhat.com/show_bug.cgi?id=1426456
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function is now called virQEMUCapsProbeHostCPU. Both the refactoring
and the change of the name is done for consistency with a new function
which will be introduced in the following commit.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When creating host CPU definition usable with a given emulator, the CPU
should not be defined using an unsupported CPU model. The new @models
and @nmodels parameters can be used to limit CPU models which can be
used in the result.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The parameter can be used to request either VIR_CPU_TYPE_HOST (which has
been assumed so far) or VIR_CPU_TYPE_GUEST definition.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
cpuNodeData has always been followed by cpuDecode as no hypervisor
driver is really interested in raw CPUID data for a host CPU. Let's
create a new CPU driver API which returns virCPUDefPtr directly.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1431112
Yeah, that's right. A mount point doesn't have to be a directory.
It can be a file too. However, the code that tries to preserve
mount points under /dev for new namespace for qemu does not count
with that option.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
VIR_CONNECT_LIST_STORAGE_POOLS_VSTORAGE and
VIR_CONNECT_LIST_STORAGE_POOLS_ZFS were added to libvirt but the listing
API was not properly updated to use them.
https://bugzilla.redhat.com/show_bug.cgi?id=1431543
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
bhyve supports 'gop' video device that allows clients to connect
to VMs using VNC clients. This commit adds support for that to
the bhyve driver:
- Introducr 'gop' video device type
- Add capabilities probing for the 'fbuf' device that's
responsible for graphics
- Update command builder routines to let users configure
domain's VNC via gop graphics.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Extend domain capabilities XML with the information about
available UEFI firmware files. It searches in the location
that the sysutils/bhyve-firmware FreeBSD port installs
files to.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Allow to boot using UEFI rather than using an external boot loader
such as bhyveload or grub-bhyve.
Also, make LPC PCI-ISA bridge handling more flexible as now it's
needed not only for serial ports, but for bootrom as well.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Implement the BHACE_CAP_LPC_BOOTROM capability by checking the stderr
output of 'bhyve -l bootrom'. If the bootrom option is unsupported, this
will contain the following output:
bhyve: invalid lpc device configuration 'bootrom'
On newer bhyve versions that do support specifying a bootrom image, the
standard help will be printed.
Add a new test to fchosttest in order to test creation of our vHBA
via the Storage Pool logic. Unlike the real code, we cannot yet use
the virVHBA* API's because they (currently) traverse the file system
in order to get the parent vport capable scsi_host. Besides there's
no "real" NPIV device here - so we have to take some liberties, at
least for now.
Instead, we'll follow the node device tests partially in order to
create and destroy the vHBA with the test node devices.
Signed-off-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1430634
If a qemu process has died, we get EOF on its monitor. At this
point, since qemu process was the only one running in the
namespace kernel has already cleaned the namespace up. Any
attempt of ours to enter it has to fail.
This really happened in the bug linked above. We've tried to
attach a disk to qemu and while we were in the monitor talking to
qemu it just died. Therefore our code tried to do some roll back
(e.g. deny the device in cgroups again, restore labels, etc.).
However, during the roll back (esp. when restoring labels) we
still thought that domain has a namespace. So we used secdriver's
transactions. This failed as there is no namespace to enter.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If the delivery of the DEVICE_DELETED event for the vCPU being deleted
would time out, the code would not call 'qemuDomainResetDeviceRemoval'.
Since the waiting thread did not unregister itself prior to stopping the
waiting the monitor code would try to wake it up instead of dispatching
it to the event worker. As a result the unplug process would not be
completed and the definition would not be updated.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1428893https://bugzilla.redhat.com/show_bug.cgi?id=1427801
This reverts commit c96bd78e4e.
So our code is one big mess and we modify domain definition while
building qemu_command line and our hotplug code share only part
of the parsing and command line building code. Let's revert
that change because to fix it properly would require refactor and
move a lot of things.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1430275
We should skip <listen type='socket'/> only if the 'socket' path
is specified because if there is no 'socket' path we need to
keep that element in migratable XML.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1366088
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
When libvirtd is started we call qemuDomainRecheckInternalPaths
to detect whether a domain has VNC socket path generated by libvirt
based on option from qemu.conf. However if we are parsing status XML
for running domain the existing socket path can be generated also if
the config XML uses the new <listen type='socket'/> element without
specifying any socket.
The current code doesn't make difference how the socket was generated
and always marks it as "fromConfig". We need to store the
"autoGenerated" value in the status XML in order to preserve that
information.
The difference between "fromConfig" and "autoGenerated" is important
for migration, because if the socket is based on "fromConfig" we don't
print it into the migratable XML and we assume that user has properly
configured qemu.conf on both hosts. However if the socket is based
on "autoGenerated" it means that a new feature was used and therefore
we need to leave the socket in migratable XML to make sure that if
this feature is not supported on destination the migration will fail.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Split apart and rename qemuDomainGetChardevTLSObjects in order to make a
more generic API that can create the TLS JSON prop objects (secret and
tls-creds-x509) to be used to create the objects
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create a qemuDomainAddChardevTLSObjects which will encapsulate the
qemuDomainGetChardevTLSObjects and qemuDomainAddTLSObjects so that
the callers don't need to worry about the props.
Move the dev->type and haveTLS checks in to the Add function to avoid
an unnecessary call to qemuDomainAddTLSObjects
Signed-off-by: John Ferlan <jferlan@redhat.com>
Refactor the TLS object adding code to make two separate API's that will
handle the add/remove of the "secret" and "tls-creds-x509" objects including
the Enter/Exit monitor commands.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Since qemuDomainObjExitMonitor can also generate error messages,
let's move it inside any error message saving code on error paths
for various hotplug add activities.
Signed-off-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1379200
When we are restoring a domain from a saved image, or just
updating its XML in the saved image - we have to make sure that
the ABI guests sees will not change. We have a function for that
which reports errors. But for some reason if this function fails,
we call it again with slightly different argument. Therefore it
might happen that we overwrite the original error and leave user
with less helpful one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In an effort to be consistent with the source module, alter the function
prototypes to follow the similar style of source with the "type" on one
line followed by the function name and arguments on subsequent lines with
with argument getting it's own line.
Alter the format of the code to follow more recent style guidelines of
two empty lines between functions, function decls with "[static] type"
on one line followed by function name with arguments to functions each
on one line.
Move all the NWFilterObj API's into their own module virnwfilterobj
from the nwfilter_conf
Purely code motion at this point, plus adjustments to cleanly build.
Found by Coverity. Because there's an "if ((cur = strstr(base, "revision"))
!= NULL) {" followed by a "base = cur" coverity notes that 'base' could
then be NULL causing the return to the top of the "while ((tmp_base =
strstr(base, "processor")) != NULL) {" to have strstr deref a NULL 'base'
pointer because the setting of base at the bottom of the loop is unconditional.
Alter the code to set "base = cur" after processing each key. That will
"ensure" that base doesn't get set to NULL if both "cpu" and "revision"
do no follow a "processor".
While a /proc/cpuinfo file that has a "processor" key but with neither
a "cpu" nor a "revision" doesn't seem feasible, the code is written as if
it could happen, so we have to account for it.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Calls to virFileReadAll after a VIR_ALLOC that return NULL all show
a memory leak since 'ret' isn't virSysinfoDefFree'd and normal path
"return ret" doesn't free outbuf.
Reported by Coverity
Signed-off-by: John Ferlan <jferlan@redhat.com>
$ virsh vol-clone /tmp/test.iso new.iso
error: Failed to clone vol from test.iso
error: internal error: Child process (/bin/qemu-img convert -f iso -O iso /tmp/test.iso /tmp/new.iso) unexpected exit status 1: qemu-img: Could not open '/tmp/test.iso': Unknown driver 'iso'
Map iso->raw before sending the format value to qemu-img
https://bugzilla.redhat.com/show_bug.cgi?id=972784https://bugzilla.redhat.com/show_bug.cgi?id=1419395
Whole implementations along with helper totalling screens of code were
conditionally compiled. That made the code totally unreadable and
untestable. Rename functions to have the architecture in the name so
that all can be compiled at the same time and introduce header to allow
testing them all.
Proposed formal coding conventions encourage defining typedefs for
vir[Blah] and vir[Blah]Ptr separately from the associated struct named
_vir[Blah]:
typedef struct _virBlah virBlah;
typedef virBlah *virBlahPtr;
struct _virBlah {
...
};
At some point in the past, I had submitted several patches using a
more compact style that I prefer, and they were accepted:
typedef struct _virBlah {
...
} virBlah, *virBlahPtr;
Since these are by far a minority among all struct definitions, this
patch changes all those definitions to reflect the style prefered by
the proposal so that there is 100% consistency.
After the system has been booted, it should not change.
Cache the return value of virSystemdHasMachined.
Allow starting and terminating machines with just one
DBus call, instead of three, reducing the chance of
the call timing out.
Also introduce a small function for resetting the cache
to be used in tests.
Both virSystemdTerminateMachine and virSystemdCreateMachine
propagate the error to tell between a non-systemd system
and a hard error.
In virSystemdGetMachineNameByPID both are treated the same,
but an error is ignored by the callers.
Split out the checks into a separate function.
Make common helpers testNetworkObjFindByUUID and testStoragePoolObjFindByUUID
which will replace the repeated patter for each to find objects by UUID.
As a bonus, the error message processing will also provide the failed uuidstr
rather than a generic error message.
Rather than have multiple places using the same pattern to find
a network by name using virNetworkObjFindByName, create a common
helper which will provide a consistent error message as well.
Rather than have continued repeated sequences of :
testDriverLock()
xxx = vir*ObjFindByName()
testDriverUnlock()
if (xxx == NULL) {
virReportError
goto cleanup;
}
Make some common helpers which will use the pattern and make a single
reference using a single common error message.
Altered for Interfaces, Storage Pools, Storage Volumes, and Node Devices.
For each the common error message can now also indicate which 'name' was
not found. For Storage Volumes, the "new" error will be more specific
rather than just invalid argument.
In an effort to be consistent with the source module, alter the function
prototypes to follow the similar style of source with the "type" on one
line followed by the function name and arguments on subsequent lines with
with argument getting it's own line.
Alter the format of the code to follow more recent style guidelines of
two empty lines between functions, function decls with "[static] type"
on one line followed by function name with arguments to functions each
on one line.
Now that we have some qemuSecurity wrappers over
virSecurityManager APIs, lets make sure everybody sticks with
them. We have them for a reason and calling virSecurityManager
API directly instead of wrapper may lead into accidentally
labelling a file on the host instead of namespace.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The implementation matches virStringListFreeCount. The only difference
between the two functions is the ordering of their parameters.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Changes in commit id 'dec6d9df' caused a compilation failure on a RHEL6
CI build environment. So just replace 'system' with 'syscap' as a name.
cc1: warnings being treated as errors
../../src/conf/node_device_conf.c: In function 'virNodeDevCapSystemParseXML':
../../src/conf/node_device_conf.c:1415: error: declaration of 'system' shadows a global declaration [-Wshadow]
In an effort to be consistent with the source module, alter the function
prototypes to follow the similar style of source with the "type" on one
line followed by the function name and arguments on subsequent lines with
with argument getting it's own line.
Alter the format of the code to follow more recent style guidelines of
two empty lines between functions, function decls with "[static] type"
on one line followed by function name with arguments to functions each
on one line.
Move all the NodeDeviceObj API's into their own module virnodedeviceobj
from the node_device_conf
Purely code motion at this point, plus adjustments to cleanly build.
AArch64 kernels are technically capable of running armv7l binaries.
Though some vendors disable this feature during kernel build, we
need to allow it in LXC.
Signed-off-by: Matwey V. Kornilov <matwey.kornilov@gmail.com>
All Intel Haswell processors (except Xeon E7 v3 with stepping >= 4) have
TSX disabled by microcode update. As not all CPUs are guaranteed to be
patched with microcode updates we need to explicitly disable TSX on
affected CPUs to avoid its accidental usage.
https://bugzilla.redhat.com/show_bug.cgi?id=1406791
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The static CPU model expansion is designed to return only canonical
names of all CPU properties. To maintain backwards compatibility libvirt
is stuck with different spelling of some of the features, but we need to
use the full expansion to get the additional spellings. In addition to
returning all spelling variants for all properties the full expansion
will contain properties which are not guaranteed to be migration
compatible. Thus, we need to combine both expansions. First we need to
call the static expansion to limit the result to migratable properties.
Then we can use the result of the static expansion as an input to the
full expansion to get both canonical names and their aliases.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Querying "host" CPU model expansion only makes sense for KVM. QEMU 2.9.0
introduces a new "max" CPU model which can be used to ask QEMU what the
best CPU it can provide to a TCG domain is.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
While query-cpu-model-expansion returns only boolean features on s390,
but x86_64 reports some integer and string properties which we are
interested in.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The build system for libvirt correctly detects the location of blkid
using PKG_CONFIG_PATH environment variable. The file blkid.pc states
that the include flags should be: 'Cflags: -I${includedir}/blkid' but
libvirt searches for blkid.h inside ${includedir}/blkid/blkid, which is
wrong. Until now, the compilation for libvirt succeeded because of pure
luck, as it had -I/usr/include as a CFLAG. This issue was faced while
compiling libvirt on Ubuntu 16.04.2 with bare minimum dev packages and a
custom compiled blkid kept in a non-standard $prefix.
Signed-off-by: Nehal J Wani <nehaljw.kkd1@gmail.com>
virQEMUCapsHasPCIMultiBus() performs a version check on
the QEMU binary to figure out whether multiple buses are
supported, so to get the correct aliases assigned when
dealing with pSeries guests we need to spoof the version
accordingly in the test suite.
Due to the extra architecture-specific logic, it's already
necessary for users to call virQEMUCapsHasPCIMultiBus(),
so the capability itself is just a pointless distraction.
Our documentation states that the chardev logging file is truncated
unless append='on' is specified. QEMU also behaves the same way and
truncates the file unless we provide the argument. The new virlogd
implementation did not honor if the argument was missing and continued
to append to the file.
Truncate the file even when the 'append' attribute is not present to
behave the same with both implementations and adhere to the docs.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1420205
This function is calling public APIs (virNodeDeviceLookupByName
etc.). That requires the driver lock to be unlocked and locked
again. If we, however, replace the public APIs calls with the
internal calls (that public APIs call anyway), we can drop the
lock/unlock exercise.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The 'nodes' is overwritten after the first usage and possibly leaked
if any code in the first set of parsing goes to error.
Found by Coverity.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Arguably though, function returning only on success is a very
interesting, although quite impractical concept. Also, the errno isn't
and shouldn't be preserved in this case, since the errno can be directly
fed to the virReportSystemError.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
After eca76884ea in case of error in qemuDomainSetPrivatePaths()
in pretended start we jump to stop. I've changed this during
review from 'cleanup' which turned out to be correct. Well, sort
of. We can't call qemuProcessStop() as it decrements
driver->nactive and we did not increment it. However, it calls
virDomainObjRemoveTransientDef() which is basically the only
function we need to call. So call that function and goto cleanup;
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The API is useful for creating virCPUData in a hypervisor driver from
data we got by querying the hypervisor.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The API is useful for creating virCPUData in a hypervisor driver from
data we got by querying the hypervisor.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The API is useful for creating virCPUData in a hypervisor driver from
data we got by querying the hypervisor.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The CPU driver provides APIs to create and free virCPUDataPtr. Thus all
APIs exported from the driver should work with that rather than
requiring the caller to pass a pointer to an internal part of the
structure.
In other words
virCPUx86DataAddCPUID(cpudata, &cpuid)
is much better than the original
virCPUx86DataAddCPUID(&cpudata->data.x86, &cpuid)
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>