Currently the disk and chardev seclabels are validated immediately at
the time their data is parsed. This forces the parser to fill in the
top level secmodel at time of parsing which is an undesirable thing.
This validation conceptually should be done in the post-parse phase
instead.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Instead of using the virCapsPtr information, pass the driver specific
netprefix in the domain parser struct. This eliminates one more use of
virCapsPtr from the XML parsing/formatting code.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
To enable the virCapsPtr parameter to the post parse method to be
eliminated, the drivers must fetch the virCapsPtr from their own
driver via the opaque parameter, or use an alternative approach
to validate the parsed data.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The XML parser currently calls virCapabilitiesDomainDataLookup during
parsing to find the domain capabilities matching the triple
(virt type, os type, arch)
This is, however, bogus with the QEMU driver as it assumes that there
is an emulator known to the default driver capabilities that matches
this triple. It is entirely possible for the driver to be parsing an
XML file with a custom emulator path specified pointing to a binary
that doesn't exist in the default driver capabilities. This will,
for example be the case on a RHEL host which only installs the host
native emulator to /usr/bin. The user can have built a custom QEMU
for non-native arches into $HOME and wish to use that.
Aside from validation, this call is also used to fill in a machine type
for the guest if not otherwise specified. Again, this data may be
incorrect for the QEMU driver because it is not taking account of
the emulator binary that is referenced.
To start fixing this, move the validation to the post-parse callbacks
where more intelligent driver specific logic can be applied.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When parsing the guest XML we must fill in the default guest arch if it
is not already present because later parts of the parsing process need
this information.
If no arch is specified we lookup the first guest in the capabilities
data matching the os type and virt type. In most cases this will result
in picking the host architecture but there are some exceptions...
- The test driver is hardcoded to always use i686 arch
- The VMWare/ESX drivers will always place i686 guests ahead
of x86_64 guests in capabilities, so effectively they always
use i686
- The QEMU driver can potentially return any arch at all
depending on what combination of QEMU binaries are installed.
The domain XML hardware configurations are inherently architecture
specific in many places. As a result whomever/whatever created the
domain XML will have had a particular architecture in mind when
specifying the config. In pretty much any sensible case this arch
will have been the native host architecture. i686 on x86_64 is
the only sensible divergance because both these archs are
compatible from a domaain XML config POV.
IOW, although the QEMU driver can pick an almost arbitrary arch as its
default, in the real world no application or user is likely to be
relying on this default arch being anything other than native.
With all this in mind, it is reasonable to change the XML parser to
allow the default architecture to be passed via the domain XML options
struct. If no info is explicitly given then it is safe & sane to pick
the host native architecture as the default for the guest.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Moving their instance parameter to be the first one, and give consistent
ordering of other parameters across all functions. Ensure that the xml
options are passed into both functions in prep for future work.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Our normal practice is for the object type to be the name prefix, and
the object instance be the first parameter passed in.
Rename these to virDomainObjSave and virDomainDefSave moving their
primary parameter to be the first one. Ensure that the xml options
are passed into both functions in prep for future work.
Finally enforce checking of the return type and mark all parameters
as non-NULL.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Currently the virQEMUCapsPtr objects are just empty. Future patches are
going to expect them to contain real data. Start off by populating the
machine types and arch information.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
As part of a goal to eliminate the need to use virCapsPtr for anything
other than the virConnectGetCapabilies() API impl, cache the host arch
against the QEMU driver struct and use that field directly.
In the tests we move virArchFromHost() globally in testutils.c so that
every test runs with a fixed default architecture reported.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The functions for converting migration typed parameters to QEMU
migration parameters and back were only implemented for integer types.
This patch adds support for string parameters.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
With blockdev we need to refer to the nodename of the disk source image
as the source argument for the blockdev-mirror operation while still
keeping the old job name. With blockdev we must also persist the job in
qemu.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Separate out allocation of the virStorageSource corresponding to the
target NBD export of the migration.
As part of the splitout we allocate the export name explicitly as that
one must not change regardless whether blockdev is used or not to
provide compatibility.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The non-shared-storage migration tracks the storage source used
explicitly in the migration data so we must allow for processing of the
block job which has NULL mirror as the mirror will not be populated.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Now that the cleanup section does not exist remove the label.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qemuMigrationSrcNBDCopyCancelOne uses the block job data structure but
generated it's own job name rather than taking it from the block job
data.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
With -blockdev we must use the nodename as the export but we must keep
the name of the export as it was before to ensure compatiblity.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Declare the variable inside the loop with automatic clearing.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qemuDomainBlockJobSetSpeed was not converted to get the job name from
the block job data. This means that after enabling blockdev the API call
would fail as we wouldn't use the appropriate name.
https://bugzilla.redhat.com/show_bug.cgi?id=1780497
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Where appropriate replace the open coded call with the qemu wrapper
which already reports the error.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
With this patch users can cold unplug some sound devices.
use "virsh detach-device vm sound.xml --config" command.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jidong Xia <xiajidong@cmss.chinamobile.com>
This is a follow-up to patch series posted in
https://www.redhat.com/archives/libvir-list/2019-November/msg01180.html
It implements a suggestion made by Cole in
https://www.redhat.com/archives/libvir-list/2019-November/msg01207.html
and discussed in follow-up messages as there were no objections to the
change.
The aim is to make the code more readable by replacing nested branching
with a flat structure.
Signed-off-by: Pavel Mores <pmores@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In v5.8.0-rc1~122 we've removed the only use of @safename in
qemuMonitorTextLoadSnapshot(). What we are left with is an
declared but not initialized variable that is passed to
VIR_FREE().
Caught by libvirt-php test suite.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In v5.9.0-370-g8fa0374c5b I've tried to fix a bug by removing
some stale XATTRs in qemuProcessStop(). However, I forgot to
do nothing when the VIR_QEMU_PROCESS_STOP_NO_RELABEL flag was
specified.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Until now we only really aborted migration via qemuDomainAbortJob. This
will change with the upcoming addition of the backup job. Additionally
there were a bunch of if statements checking various aspects of the
current job.
To make it more obvious convert qemuDomainAbortJob to use a switch
statement and move the individual conditions to the appropriate job
type.
Every job type has now it's own case despite multiple job types just
plainly cancelling the job for clarity and future extension.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Following patch will refactor qemuDomainAbortJob to use a per-job-type
switch where we will need to abort a migration job in various branches.
Save some code duplication by introducing a helper.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
As part of a goal to eliminate Perl from libvirt build tools,
rewrite the pdwtags processing script in Python.
The original inline shell and perl code was completely
unintelligible. The new python code is a manual conversion
that attempts todo basically the same thing.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Introduced in c8007fdc5d, it should use 'greater than max' instead of
'equal or greater than max' for the condition of checking invalid scsi
unit.
Signed-off-by: Han Han <hhan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If we static link to libvirt_util.la then we can't override functions in
this file by simply implementing them in the test code. Any tests should
dynamic link to the main libvirt.la and ensure symbols are exported.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We clear some capabilities here so the lockouts need to be
re-evaluated.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Blockdev is required to do incremental backups properly. Add a helper
function for locking out capabilities and export it to allow re-doing
the processing if a different code path modifies capabilities.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Checking whether a qemu capability set right before clearing it without
any other logic doesn't make sense.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Do all post-processing of capabilities in qemuProcessPrepareQEMUCaps.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Move the post-processing of the QEMU_CAPS_CHARDEV_FD_PASS flag to the
new function.
The clearing of the capability is based on the presence of
VIR_QEMU_PROCESS_START_STANDALONE so we must also pass in the process
start flags.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Start aggregating all capability post-processing code in one place.
The comment was modified while moving it as it was mentioning floppies
which are no longer clearing the blockdev capability.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function is now used only in qemu_process.c so move it there and
name it 'qemuProcessPrepareQEMUCaps' which is more appropriate to what
it's doing.
The reworded comment now mentions that it will also post-process the
caps for VM startup.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The redetection was originally added in 43c01d3838 as a way to recover
from libvirtd upgrade from the time when we didn't persist the qemu
capabilities in the status XML. Also this the oldest supported qemu by
more than two years.
Even if somebody would have a running VM running at least qemu 1.5 with
such an old libvirt we certainly wouldn't do the right thing by
redetecting the capabilities and then trying to communicate with qemu.
For now it will be the best to just stop considering this scenario any
more and error out for such VM.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Commit c90fb5a828 added explicit use of the private copy of the qemu
capabilities to various places. The change to qemuProcessInit was bogus
though as at the point where we re-initiate the post parse callbacks
priv->qemuCaps is still NULL as we clear it after shutdown of the VM and
don't initiate it until a later point.
Using the value from priv->qemuCaps might mislead readers of the code
into thinking that something useful is being passed at that point so go
with an explicit NULL instead.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
qemuDomainGetJobInfo didn't always reset the return data in @info.
Thankfully this wouldn't be a problem as the RPC layer does it but we
should do it anyways.
Since we reset the struct we don't have to set the type to
VIR_DOMAIN_JOB_NONE as the value is 0.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
virDomainGetJobStats destroys the completed statistics on the first
read. Give the user possibility to keep them around if they wish so.
Add a flag VIR_DOMAIN_JOB_STATS_KEEP_COMPLETED which will read the stats
without destroying them.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
For managed save we can choose between various compression
methods. I randomly tested the 'xz' program on a 8 GB guest
and was surprised to have to wait > 50 minutes for it to
finish compressing, with 'xz' burning 100% cpu for the
entire time. Despite the impressive compression, this is
completely useless in the real world as it is far too long
to wait to save the VM.
The 'xz' binary defaults to '-6' optimization level which
aims for high compression, with moderate memory usage,
at the expense of speed.
This change switches it to use the '-3' optimization level
which is documented as being the one that optimizes speed
at expense of compression. Even with this, it will still
outperform all the other options in terms of compression
level. It is a little less than x4 faster than '-6' which
means it starts to be a viable choice to use 'xz' for
people who really want best compression.
The test results on a 1 GB, fairly freshly booted VM are
as follows
format | save | restore size
=======+=======+=============
raw | 05s | 1s | 428 MB
lzop | 05s | 3s | 160 MB
gzip | 29s | 5s | 118 MB
bz2 | 54s | 22s | 114 MB
xz | 4m37s | 13s | 86 MB
xz -3 | 1m20s | 12s | 95 MB
Based on this we can say
* For moderate compression with no noticable loss in speed
=> use lzop
* For high compression with moderate loss in speed
=> use gzip
* For best compression with significant loss in speed
=> use xz
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This is a very simple and straightforward implementation of the opposite
what buildPool does for the disk backend.
The background for this change comes from an existing test case in TCK
which does use the delete method for a pool of type disk, but it
truly could not have ever worked since the implementation simply
wasn't there for the pool of type disk.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Commit 4b58fdf280 which enabled block copy also for network
destinations needed to limit when the 'mirror' storage source is
initialized in cases when we e.g. don't have an appropriate backend.
Limiting it just to virStorageFileSupportsCreate is too restrictive as
for example we can't precreate block devices and thus wouldn't
initialize the 'mirror' but since it's a local source we'd try to
examine it. This would fail since it wouldn't be initialized.
Fix it by introducing a more granular check whether certain operations
are supported and fix the check interlocks.
https://bugzilla.redhat.com/show_bug.cgi?id=1778058
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We tolerate image format detection during block copy in very specific
circumstances, but the code didn't error out on failure of the format
detection.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The API XML files are generated files, so live in the build dir not the
source dir.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
In v5.9.0-273-g8ecab214de I've tried to fix a lock ordering
problem, but introduced a crasher. Problem is that because the
client lock is unlocked (in order to honour lock ordering) the
stream we are currently checking in daemonStreamFilter() might be
freed and thus stream->priv might not even exist when the control
get to virMutexLock() call.
To resolve this, grab an extra reference to the stream and handle
its cleanup should the refcounter reach zero after the deref.
If that's the case and we are the only ones holding a reference
to the stream, we MUST return a positive value to make
virNetServerClientDispatchRead() break its loop where it iterates
over filters. The problem is, if we did not do so, then
"filter = filter->next" line will read from a memory that was
just freed (freeing a stream also unregisters its filter).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In commit 2ccb5335dc I've refactored how we fill the typed parameters
for domain statistics. The commit introduced a regression in the
formating of stats for IOthreads by using the array index to label the
entries as it's common for all other types of statistics rather than
the iothread IDs used for iothreads.
Since only the design of iothread deviates from the common approach used
in all other statistic types this was not caught.
https://bugzilla.redhat.com/show_bug.cgi?id=1778014
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The original implementation used QEMU_ADD_COUNT_PARAM which added the
'count' suffix, but 'cnt' was documented. Fix the documentation to
conform with the original implementation.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The virTypedParamsFilter function doesn't mind params == NULL if nparams
is zero. And there's no need to check for params == NULL && nparams > 0
because this is checked higher in the stack.
In fact all the virCheckNonNull* checks in virTypedParamsFilter are
useless.
https://bugzilla.redhat.com/show_bug.cgi?id=1777094
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Now that we have a separate job type which will not trigger normal code
paths for terminating job we can remove the ad-hoc handling.
This possibly fixes the issue of a broken job inheriting the disk and
then finishing in which case we'd not detach the backing chain.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
To better track jobs we couldn't parse let's introduce a new job type
which will clarify semantics internally in few places.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
We will need to clear per-job type data when we will be marking a
blockjob as broken in the new way. Extract the code for future reuse.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Both failure to refresh and to dismiss the job are very unlikely but if
they happen there's not much we can do about the blockjob.
The concluded job handlers treat it as if the job failed if we don't
update the state to 'QEMU_BLOCKJOB_STATE_COMPLETED' which is probably
the safest thing to do here.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Otherwise it would get dropped later on as untracked despite us knowing
about it. Additionally since we cancelled it we must wait to dismiss it
which would not be possible if we unregister it. This also opened a
window for a race condition since the job state change event of the
just-cancelled job might be delivered prior to us unregistering the job
in which case everything would work properly.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Since we don't know what happened to the job we can't do much about it
but we can at least log that this happened.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
We must exit the monitor prior to refusing other work, otherwise the VM
object will become unusable.
This bug was introduced in commit v5.5.0-244-gc412383796 but thankfully
the code path was not excercised without QEMU_CAPS_BLOCKDEV.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Block jobs may be members of async jobs so it makes more sense to
refresh block job state after we do steps for async job recovery.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
qemu returns an error message in the job statistics even if the job was
cancelled to emphasize it was not successful. Libvirt didn't properly
transform it into QEMU_BLOCKJOB_STATE_CANCELLED though.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Commit ed56851f1b didn't wire up fetching of the statistics for the
job which are reported by 'query-jobs'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
The magic number is taken from the coreutils stat.c file since
there is no constant for it in normal system headers.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This reverts commit 421c9550f5
qemuDomainBlockPullCommon calls virDomainObjEndAPI internally so the
original commit made us shed two references of @vm instead of one
getting us into a premature free of @vm.
This is not a straight revert as qemuDomainBlockPull was modified
meanwhile. I've also added a warning comment that @vm is consumed.
https://bugzilla.redhat.com/show_bug.cgi?id=1777230
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
There are two daemons that wait for acquiring their pid files:
virtnetworkd and virtstoraged. This is undesirable as the idea
is to quit early if unable to acquire the pid file.
Fixes: v5.6.0-rc1~207.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
In the past the network driver was (mistakenly) being called for all
interfaces, not just those of type='network', and so it had a chance
to validate all interface configs after the actual type of the
interface was known.
But since the network driver has been more completely/properly
separated from qemu, the network driver isn't called during the
startup of any interfaces except those with type='network', so this
validation no longer takes place for, e.g. <interface type='bridge'>
(or direct, etc). This in turn meant that a config could erroneously
specify a vlan tag, or bandwidth settings, for a type of interface
that didn't support it, and the domain would start without complaint,
just silently ignoring those settings.
This patch moves those validation checks out of the network driver,
and into virDomainActualNetDefValidate() so they will be done for all
interfaces, not just type='network'.
https://bugzilla.redhat.com/1741121
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
<interface> devices (virDomainNetDef) are a bit different from other
types of devices in that their actual type may come from a network (in
the form of a port connection), and that doesn't happen until the
domain is started. This means that any validation of an <interface> at
parse time needs to be a bit liberal in what it accepts - when
type='network', you could think that something is/isn't allowed, but
once the domain is started and a port is created by the configured
network, the opposite might be true.
To solve this problem hypervisor drivers need to do an extra
validation step when the domain is being started. I recently (commit
3cff23f7, libvirt 5.7.0) added a function to peform such validation
for all interfaces to the QEMU driver -
qemuDomainValidateActualNetDef() - but while that function is a good
single point to call for the multiple places that need to "start" an
interface (domain startup, device hotplug, device update), it can't be
called by the other hypervisor drivers, since 1) it's in the QEMU
driver, and 2) it contains some checks specific to QEMU. For
validation that applies to network devices on *all* hypervisors, we
need yet another interface validation function that can be called by
any hypervisor driver (not just QEMU) right after its network port has
been created during domain startup or hotplug. This patch adds that
function - virDomainActualNetDefValidate(), in the conf directory,
and calls it in appropriate places in the QEMU, lxc, and libxl
drivers.
This new function is the place to put all network device validation
that 1) is hypervisor agnostic, and 2) can't be done until we know the
"actual type" of an interface.
There is no framework for validation at domain startup as there is for
post-parse validation, but I don't want to create a whole elaborate
system that will only be used by one type of device. For that reason,
I just made a single function that should be called directly from the
hypervisors, when they are initializing interfaces to start a domain,
right after conditionally allocating the network port (and regardless
of whether or not that was actually needed). In the case of the QEMU
driver, qemuDomainValidateActualNetDef() is already called in all the
appropriate places, so we can just call the new function from
there. In the case of the other hypervisors, we search for
virDomainNetAllocateActualDevice() (which is the hypervisor-agnostic
function that calls virNetworkPortCreateXML()), and add the call to our
new function right after that.
The new function itself could be plunked down into many places in the
code, but we already have 3 validation functions for network devices
in 2 different places (not counting any basic validation done in
virDomainNetDefParseXML() itself):
1) post-parse hypervisor-agnostic
(virDomainNetDefValidate() - domain_conf.c:6145)
2) post-parse hypervisor-specific
(qemuDomainDeviceDefValidateNetwork() - qemu_domain.c:5498)
3) domain-start hypervisor-specific
(qemuDomainValidateActualNetDef() - qemu_domain.c:5390)
I placed (3) right next to (2) when I added it, specifically to avoid
spreading validation all over the code. For the same reason, I decided
to put this new function right next to (1) - this way if someone needs
to add validation specific to qemu, they go to one location, and if
they need to add validation applying to everyone, they go to the
other. It looks a bit strange to have a public function in between a
bunch of statics, but I think it's better than the alternative of
further fragmentation. (I'm open to other ideas though, of course.)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
These all just return a scalar value, so there's no daisy-chained
fallout from changing them, and they can easily be combined in a
single patch.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This also isn't required (due to the vportprofile being stored in the
NetDef as a pointer rather than being directly contained), but it
seemed dishonest to not mark it as const (and thus permit users to
modify its contents)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
In this case, the virNetDevBandwidthPtr that is returned is not to a
region within the virDomainNetDef arg, but points elsewhere (the
NetDef has the pointer, not the entire object), so technically it's
not necessary to make the return value a const, but it's a bit
disingenuous to *not* do it.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This is needed if we want to call the function when the
virDomainNetDef* we have is a const.
Since virDomainNetGetActualVlan returns a pointer to memory that is
within the virDomainNetDefPtr arg, the returned pointer must also be
made const. This leads to a cascade of other virNetDevVlanPtr's that
must be changed to "const virNetDevVlan *".
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
This makes it easier to understand which interface's config caused the
error.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
The cpuModels member of _virQEMUCapsAccel struct is not a
virObject but regular struct with a free function defined:
qemuMonitorCPUDefsFree(). Use that when clearing parent structure
instead of virObjectUnref() to avoid a memleak:
==212322== 57,275 (48 direct, 57,227 indirect) bytes in 3 blocks are definitely lost in loss record 623 of 627
==212322== at 0x4838B86: calloc (vg_replace_malloc.c:762)
==212322== by 0x554A158: g_malloc0 (in /usr/lib64/libglib-2.0.so.0.6000.6)
==212322== by 0x17B14BF5: qemuMonitorCPUDefsNew (qemu_monitor.c:3587)
==212322== by 0x17B27BA7: qemuMonitorJSONGetCPUDefinitions (qemu_monitor_json.c:5616)
==212322== by 0x17B14B0B: qemuMonitorGetCPUDefinitions (qemu_monitor.c:3559)
==212322== by 0x17A6AFBB: virQEMUCapsFetchCPUDefinitions (qemu_capabilities.c:2571)
==212322== by 0x17A6B2CC: virQEMUCapsProbeQMPCPUDefinitions (qemu_capabilities.c:2629)
==212322== by 0x17A70C00: virQEMUCapsInitQMPMonitorTCG (qemu_capabilities.c:4769)
==212322== by 0x17A70DDF: virQEMUCapsInitQMPSingle (qemu_capabilities.c:4820)
==212322== by 0x17A70E99: virQEMUCapsInitQMP (qemu_capabilities.c:4848)
==212322== by 0x17A71044: virQEMUCapsNewForBinaryInternal (qemu_capabilities.c:4891)
==212322== by 0x17A7119C: virQEMUCapsNewData (qemu_capabilities.c:4923)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
On s390 machines host-passthrough and host-model CPUs result in the same
guest ABI (with QEMU new enough to be able to tell us what "host" CPU is
expanded to, which was implemented around 2.9.0). So instead of using
host-passthrough CPU when there's no CPU specified in a domain XML we
can safely use host-model and benefit from CPU compatibility checks
during migration, snapshot restore and similar operations.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The match attribute is only relevant for custom mode CPUs. Reporting
failure when match == 'minimum' regardless on CPU mode can cause
unexpected failures. We should only report the error for custom CPUs. In
fact, calling virCPUs390Update on a custom mode CPU should always report
an error as optional features are not supported on s390 either.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Most likely for historical reasons our CPU def formatting code is
happily adding useless <model fallback='allow'/> for host-model CPUs. We
can just drop it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Commit v0.8.4-66-g95ff6b18ec (9 years ago) changed the default value for
the cpu/@match attribute to 'exact' in a rather complicated way. It did
so only if <model> subelement was present and set -1 otherwise (which is
not expected to ever happen). Thus the following two equivalent XML
elements:
<cpu mode='host-model'/>
and
<cpu mode='host-model'>
<model/>
</cpu>
would be parsed differently. The former would end up with match == -1
while the latter would have match == 1 ('exact'). This is not a big deal
since the match attribute is ignored for host-model CPUs, but we can
simplify the code and make it a little bit saner anyway.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If a graphics device was added to XML that had no video device, libvirt
automatically added a video device which was always of type 'cirrus' on
x86_64, even if the underlying qemu didn't support cirrus.
This patch refines a bit the decision about the type of the video device.
Based on QEMU capabilities, cirrus is still preferred but only added if
QEMU supports it, otherwise VGA is used if supported by QEMU. There is now
no fallback as libvirt only aspires to generate a basic working config and
leaves anything more specific up to higher-level management tools.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Pavel Mores <pmores@redhat.com>
The default video device type selection algorithm we're about to deploy will
increase the amount of code dedicated to the task by amount enough to warrant
factoring the whole thing into its own function so as not to pollute the
caller qemuDomainDeviceVideoDefPostParse(). Do it now so that the actual
algorithm change later on is in a clean commit by itself and easy to review.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Pavel Mores <pmores@redhat.com>
This reverts commit f4db846c32.
This patch results in the following error when trying to start
essentially any VM with default network:
unsupported configuration: QOS must be defined for network 'default'
Coverity didn't see that the bandwidth == NULL it complained about in
virNetDevBandwidthPlug was already checked properly in
networkCheckBandwidth, thus causing networkPlugBandwidth to return 0
and finish before a call to virNetDevBandwidthPlug would have been even
made.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This previous commit introduced a simpler free callback for
hash data with only 1 arg, the value to free:
commit 49288fac96
Author: Peter Krempa <pkrempa@redhat.com>
Date: Wed Oct 9 15:26:37 2019 +0200
util: hash: Add possibility to use simpler data free function in virHash
It missed two functions in the hash table code which need
to call the alternate data free function, virHashRemoveEntry
and virHashRemoveSet.
After the previous patch though, there is no code that
makes functional use of the 2nd key arg in the data
free function. There is merely one log message that can
be dropped.
We can thus purge the current virHashDataFree callback
entirely, and rename virHashDataFreeSimple to replace
it.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The virChrdevHashEntryFree method uses the hash 'key'
as the name of the logfile it has to remove. By storing
a struct as the value which contains the stream and
the dev path, we can avoid relying on the hash key
when free'ing entries.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Now that all pieces are in place (hopefully) let's enable -blockdev.
We base the capability on presence of the fix for 'auto-read-only' on
files so that blockdev works properly, mandate that qemu supports
explicit SCSI id strings to avoid ABI regression and that the fix for
'savevm' is present so that internal snapshots work.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The 'savevm' HMP command didn't work properly with blockdev as it tried
to do snapshot of everything including the protocol nodes accessing
files which are not snapshottable. Qemu fixed this bug so now we need to
detect it to allow enabling blockdev.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The top level commands now can have 'feature' flags for fixes so add
support for querying those as well.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Initial implementation of 'auto-read-only' didn't reopen the backing
files when needed. For '-blockdev' to work we need to be able to tel
qemu to open a file read-only and change it during blockjobs as we label
backing chains with a sVirt label which does not allow writing. The
dynamic auto-read-only supports this as it reopens files when writing
is demanded.
Add a capability to detect that the posix file based backends support
the dynamic part.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The qemu driver will obey <backingStore> when we support blockdev.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Historically we've only supported the <backingStore> as an output-only
element for domain disks. The documentation states that it may become
supported on input. To allow management apps detectin once that happens
add a domain capability which will be asserted if the hypervisor driver
will be able to obey the <backingStore> as configured on input.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
If user starts a blockcommit or a blockcopy then we modify access
for qemu on both images and leave it like that until the job
terminates. So far so good. Problem is, if user instead of
terminating the job (where we would modify the access again so
that the state before the job is restored) calls destroy on the
domain or if qemu dies whilst executing the block job. In this
case we don't ever clear the access we granted at the beginning.
To fix this, maybe a bit harsh approach is used, but it works:
after all labels were restored (that is after
qemuSecurityRestoreAllLabel() was called), we iterate over each
disk in the domain and remove XATTRs from the whole backing chain
and also from any file the disk is being mirrored to.
This would have been done at the time of pivot, but it isn't
because user decided to kill the domain instead. If we don't do
this and leave some XATTRs behind the domain might be unable to
start.
Also, secdriver can't do this because it doesn't know if there is
any job running. It's outside of its scope - the hypervisor
driver is responsible for calling secdriver's APIs.
Moreover, this is safe to call because we don't remember labels
for any member of a backing chain except of the top layer. But
that one was restored in qemuSecurityRestoreAllLabel() call done
earlier. Therefore, not only we don't remember labels (and thus
this is basically a NOP for other images in the backing chain) it
is also safe to call this when no blockjob was started in the
first place, or if some parts of the backing chain are shared
with some other domains - this is NOP, unless a block job is
active at the time of domain destroy.
https://bugzilla.redhat.com/show_bug.cgi?id=1741456#c19
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
There are four places where we remove image XATTRs and in all of
them we have the same for() loop with the same body. Move it into
a separate function because I'm about to introduce fifth place
where the same needs to be done.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Install the convertor function which enables the internals that will use
-blockdev to make qemu open the firmware image and stop using -drive.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The old way to instantiate a pflash device via -drive was a hack since
it's a platform device.
The modern approach calls for configuring it via -machine and takes the
node name as an argument.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
As a first step we will build the blockdevs which will be supposed to
back the pflash drives when moving away from -drive.
This code is similar to the way we build the blockdevs for the disk, but
skips the copy-on-read layer and doesn't implement any legacy approach.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add a helper which will covert the PFLASH code file and variable file
into the virStorageSource objects stored in private data so that we can
use them with -blockdev while keeping the infrastructure to determine
the path to the loaders intact.
This is a temporary solution until we will want to do snapshots of the
pflash where we will be forced do track the full backing chain in the
XML.
In the meanwhile just convert it partially so that we can stop using
-drive.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
To allow converting the pflash drives to blockdev we will need a
virStorageSource to allow using our helpers. Temporarily prior to
coverting loader data to a virStorageSoruce add private data which will
house this.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Extract the old way to instantiate pflash devices to hold the firmware
via -drive to a separate function so that it can later be conditionally
disabled when -blockdev will be used.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Commit 5751a0b6b1 added a helper function
called virDomainCapsFeaturesInitUnsupported which initialized all domain
capability features as unsupported.
When adding a new feature this would initialize it as unsupported also
for hypervisor drivers which the original author possibly didn't intend
to modify. To prevent accidental wrong value being reported in such case
revert back to initializing individual features in the hypervisor
drivers themselves.
This is not a straight revert as additonal patches modified how we store
the capabilities.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Pre-Glib era which used malloc allowed the size of the client-side
buffers to be declared as 0, because malloc documents that it can either
return 0 or a unique pointer on 0 size allocations.
With glib this doesn't work anymore, because glib documents that for
such allocation requests NULL is always returned which results in an
error in our public API checks server-side.
This patch complements the fix in the RPC layer by explicitly erroring
out on the following combination of args used by our legacy APIs (their
moder equivalents don't suffer from this):
function(caller-allocated-array, size, ...) {
if (!caller-allocated-array && size > 0)
return error;
}
treating everything else as a valid input and potentially let that fail
on the server-side rather than client-side.
https://bugzilla.redhat.com/show_bug.cgi?id=1772842
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The qemu_domain_monitor_event_msg struct in qemu_protocol.x
defines event as a nonnull_string and qemuMonitorJSONIOProcessEvent
also errors out on a non-NULL event.
Drop the check to fix the build with static analysis.
This essentially reverts commit d343e8203d
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Shared memory devices need qemu to be able to access certain paths
either for the shared memory directly (mostly ivshmem-plain) or for a
socket (mostly ivshmem-doorbell).
Add logic to virt-aa-helper to render those apparmor rules based
on the domain configuration.
https://bugzilla.redhat.com/show_bug.cgi?id=1761645
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Acked-by: Jamie Strandboge <jamie@canonical.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
There are currently broken use cases, e.g. snapshotting more than one disk at
once like:
$ virsh snapshot-create-as --domain eoan --disk-only --atomic
--diskspec vda,snapshot=no --diskspec vdb,snapshot=no
--diskspec vdc,file=/test/disk1.snapshot1.qcow,snapshot=external
--diskspec vdd,file=/test/disk2.snapshot1.qcow,snapshot=external
The command above will iterate from qemuDomainSnapshotCreateDiskActive and
eventually add /test/disk1.snapshot1.qcow first (appears in the rules)
to then later add /test/disk2.snapshot1.qcow and while doing so throwing
away the former rule causing it to fail.
All other calls to (re)load_profile already use append=true when adding
rules append=false is only used when restoring rules [1].
Fix this by letting AppArmorSetSecurityImageLabel use append=true as well.
Since this is removing a (unintentional) trigger to revoke all rules
appended so far we agreed on review to do some tests, but in the tests
no rules came back on:
- hot-plug
- hot-unplug
- snapshotting
Bugs:
https://bugs.launchpad.net/libvirt/+bug/1845506https://bugzilla.redhat.com/show_bug.cgi?id=1746684
[1]: https://bugs.launchpad.net/libvirt/+bug/1845506/comments/13
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Acked-by: Jamie Strandboge <jamie@canonical.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
A lot of the code in AppArmorSetSecurityImageLabel is a duplicate of
what is in reload_profile, this refactors AppArmorSetSecurityImageLabel
to use reload_profile instead.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Acked-by: Jamie Strandboge <jamie@canonical.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
reload_profile calls get_profile_name for no particular gain, lets
remove that call. The string isn't used in that function later on
and not registered/passed anywhere.
It can only fail if it either can't allocate or if the
virDomainDefPtr would have no uuid set (which isn't allowed).
Thereby the only "check" it really provides is if it can allocate the
string to then free it again.
This was initially added in [1] when the code was still in
AppArmorRestoreSecurityImageLabel (later moved) and even back then had
no further effect than described above.
[1]: https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/security/security_apparmor.c;h=16de0f26f41689e0c50481120d9f8a59ba1f4073;hb=bbaecd6a8f15345bc822ab4b79eb0955986bb2fd#l487
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Acked-by: Jamie Strandboge <jamie@canonical.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
While only used internally from libvirt the options still are misleading
enough to cause issues every now and then.
Group modes, options and an adding extra file and extend the wording of
the latter which had the biggest lack of clarity.
Both add a file to the end of the rules, but one re-generates the
rules from XML and the other keeps the existing rules as-is not
considering the XML content.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Acked-by: Jamie Strandboge <jamie@canonical.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
When starting a domain without a CPU model specified in the domain XML,
QEMU will choose a default one. Which is fine unless the domain gets
migrated to another host because libvirt doesn't perform any CPU ABI
checks and the virtual CPU provided by QEMU on the destination host can
differ from the one on the source host.
With QEMU 4.2.0 we can probe for the default CPU model used by QEMU for
a particular machine type and store it in the domain XML. This way the
chosen CPU model is more visible to users and libvirt will make sure
the guest will see the exact same CPU after migration.
Architecture specific notes
- aarch64: We only set the default CPU for TCG domains as KVM requires
explicit "-cpu host" to work.
- ppc64: The default CPU for KVM is "host" thanks to some hacks in QEMU,
we will translate the default model to the model corresponding to the
host CPU ("POWER8" on a Power8 host, "POWER9" on Power9 host, etc.).
This is not a problem as the corresponding CPU model is in fact an
alias for "host". This is probably not ideal, but it's not wrong and
the default virtual CPU configured by libvirt is the same QEMU would
use. TCG uses various CPU models depending on machine type and its
version.
- s390x: The default CPU for KVM is "host" while TCG defaults to "qemu".
- x86_64: The default CPU model (qemu64) is not runnable on any host
with KVM, but QEMU just disables unavailable features and starts
happily.
https://bugzilla.redhat.com/show_bug.cgi?id=1598151https://bugzilla.redhat.com/show_bug.cgi?id=1598162
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
QEMU 4.2.0 will report default CPU types used by each machine type and
we will want to start using it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Almost all TCG query-machines replies match KVM. The only exceptions are
4.2.0 replies on s390x which differ in the reported default CPU type.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Some specifics of machine types may depend on the accelerator and thus
the data should be moved to virQEMUCapsAccel. The TCG machine types are
just copied from the ones probed for KVM to simplify the changes to
qemucapabilitiestest data files.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function copies machine type data from one QEMU caps structure to
another.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In preparation for making machine types dependent on the accelerator,
the <machine> elements are formatted between <cpu type='kvm'> and
<cpu type='tcg'>.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All the code for formatting machine type data was moved to a standalone
virQEMUCapsFormatMachines function.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
All the code for loading machine type data was moved to a standalone
virQEMUCapsLoadMachines function.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
To avoid duplicating code which selects the right virQEMUCapsAccel data
to be filled during probing.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It is a tiny wrapper around virQEMUCapsProbeQMPCPUDefinitions which will
soon get private parameters and thus it cannot be exposed outside
qemu_capabilities.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
And make it use virQEMUCapsGetAccel once rather than repeating the same
code in all functions called from virQEMUCapsFormatAccel.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
And make it use virQEMUCapsGetAccel once rather than repeating the same
code in all functions called from virQEMUCapsLoadAccel.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function can be used to get the pointer to all data which depend on
the accelerator.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This is container for capabilities data that depend on the accelerator.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The new functions are designed to load and format capabilities which
depend on the accelerator (host CPU expansion and CPU models).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We need to create a mapping between CPU model names and their
corresponding QOM types.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Both virDomainCapsCPUModelsAdd and virDomainCapsCPUModelsAddSteal are so
simple we can just squash the code in a single function.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We will need to keep some QEMU-specific data for each CPU model
supported by a QEMU binary. Instead of complicating the generic
virDomainCapsCPUModelsPtr, we can just directly store
qemuMonitorCPUDefsPtr returned by the capabilities probing code.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Most of the code moved to a new virQEMUCapsFetchCPUDefinitions function
and the existing virQEMUCapsFetchCPUModels just becomes a small wrapper
around virQEMUCapsFetchCPUDefinitions and virQEMUCapsCPUDefsToModels.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The functions return virDomainCapsCPUModelsPtr and thus they should be
called *CPUModels for consistency. Functions called *CPUDefinitions will
work on qemuMonitorCPUDefsPtr.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function translates qemuMonitorCPUDefsPtr (used by QEMU caps probing
code) into virDomainCapsCPUModelsPtr used by domain capabilities.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
While virDomainCapsCPUModel structure contains 'usable' field with
virDomainCapsCPUUsable type, the lower level structure specific to QEMU
driver used virTriStateBool for the same thing and we had to translate
between them.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Let's store qemuMonitorCPUDefInfo directly in the array of CPUs in
qemuMonitorCPUDefs rather then using an array of pointers.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
It is a container for a CPU models list (qemuMonitorCPUDefInfo) and a
number of elements in this list.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The function would return a valid virDomainCapsCPUModelsPtr with empty
CPU models list if query-cpu-definitions exists in QEMU, but returns
GenericError meaning it's not in fact implemented. This behaviour is a
bit strange especially after such virDomainCapsCPUModels structure is
stored in capabilities XML and parsed back, which will result in NULL
virDomainCapsCPUModelsPtr rather than a structure containing nothing.
Let's just keep virDomainCapsCPUModelsPtr NULL if the QMP command is not
implemented and change the return value to int so that callers can
easily check for failure or success.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Some callers of virQEMUCapsGetCPUDefinitions will need to filter the
returned list of CPU models. Let's add the filtering parameters directly
to virQEMUCapsGetCPUDefinitions to avoid copying the CPU models list
twice.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Rather than returning a direct pointer the list stored in qemuCaps the
function now creates a new copy of the CPU models list.
The main purpose of this seemingly useless change is to update callers
to free the result returned by virQEMUCapsGetCPUDefinitions because the
internals of this function will change significantly in the following
patches.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
As part of a goal to eliminate Perl from libvirt build tools,
rewrite the genpolkit.pl tool in Python.
This was a straight conversion, manually going line-by-line to
change the syntax from Perl to Python. Thus the overall structure
of the file and approach is the same.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
As part of a goal to eliminate Perl from libvirt build tools,
rewrite the check-aclrules.pl tool in Python.
This was a straight conversion, manually going line-by-line to
change the syntax from Perl to Python. Thus the overall structure
of the file and approach is the same.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
As part of a goal to eliminate Perl from libvirt build tools,
rewrite the check-driverimpls.pl tool in Python.
This was a straight conversion, manually going line-by-line to
change the syntax from Perl to Python. Thus the overall structure
of the file and approach is the same.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
As part of a goal to eliminate Perl from libvirt build tools,
rewrite the check-drivername.pl tool in Python.
This was mostly a straight conversion, manually going line-by-line
to change the syntax from Perl to Python. Thus the overall structure
of the file and approach is the same.
In testing though it was discovered the existing code was broken
since it hadn't been updated after driver.h was split into many
files. Since the old code is being thrown away, the fix was done
as part of the rewrite rather than split into a separate commit.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
As part of a goal to eliminate Perl from libvirt build tools,
rewrite the gensystemtap.pl tool in Python.
This was a straight conversion, manually going line-by-line to
change the syntax from Perl to Python. Thus the overall structure
of the file and approach is the same.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
As part of a goal to eliminate Perl from libvirt build tools,
rewrite the dtrace2systemtap.pl tool in Python.
This was a straight conversion, manually going line-by-line to
change the syntax from Perl to Python. Thus the overall structure
of the file and approach is the same.
The "--with-modules" flag was dropped because this functionality
is not implicitly always enabled.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
As part of a goal to eliminate Perl from libvirt build tools,
rewrite the check-symfile.pl tool in Python.
This was a straight conversion, manually going line-by-line to
change the syntax from Perl to Python. Thus the overall structure
of the file and approach is the same.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
As part of a goal to eliminate Perl from libvirt build tools,
rewrite the check-symsorting.pl tool in Python.
This was a straight conversion, manually going line-by-line to
change the syntax from Perl to Python. Thus the overall structure
of the file and approach is the same.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
As part of a goal to eliminate Perl from libvirt build tools,
rewrite the check-aclperms.pl tool in Python.
This was a straight conversion, manually going line-by-line to
change the syntax from Perl to Python. Thus the overall structure
of the file and approach is the same.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Use the new helper in qemuCheckpointDiscard rather than constructing the
array manually.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Another weird bug appeared concerning qemu namespaces. Basically
the problem is as follows:
1) Issue an API that causes libvirt to create a node in domain's
namespace, say /dev/nvme0n1 with 8:0 as major:minor (the API can
be attach-disk for instance). Or simply create the node from a
console by hand.
2) Detach the disk from qemu.
3) Do something that makes /dev/nvme0n1 change it's minor number.
4) Try to attach the disk again.
The problem is, in a few cases - like disk-detach - we don't
remove the corresponding /dev node from the mount namespace
(because it may be used by some other disk's backing chain). But
this creates a problem, because if the node changes its MAJ:MIN
numbers we don't propagate the change into the domain's
namespace. We do plain mknod() and ignore EEXIST which obviously
is not enough because it doesn't guarantee that the node has
updated MAJ:MIN pair.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1752978
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Last usage was removed by commit
<41f88886198e231285cc813f8c0687c8ec5c9488> and commit
<0f4d31720430b4e3735064cc0d8f88a1a438e154> forgot to drop include.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We replaced them by use of transaction to simplify possible failure
scenarios.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Delete/merge bitmaps when deleting checkpoints using a 'transaction' so
that we don't have to deal with halfway-failed scenarios and also fix
access to 'vm' while in the monitor lock.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
When dispatching a message read from client it is first passed
through registered filters. If one of the filters consumes the
message no further processing of the message is done. However,
the filter callbacks are called with the client object locked.
This breaks lock ordering in case of virStream filter, we always
acquire stream private data lock without the client object
locked. In other words, the daemonStreamFilter() does not follow
the lock ordering.
Signed-off-by: LanceLiu <liu.lance.89@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
If networkAllocatePort calls networkPlugBandwidth eventually the
port->bandwidth would be passed to virNetDevBandwidthPlug which
requires that the parameter is non-NULL. Coverity additionally
notes that since (!port->bandwidth) is checked earlier in the
networkAllocatePort method that the subsequent call to blindly
use if for a function that requires it needs to check.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
We go through the trouble of checking {old|new}Bandwidth[->in] and
storing the result in local @old_floor and @new_floor, but then
we don't use them. Instead we make derefs to the longer name. This
caused Coverity to note dereferencing newBandwidth->in without first
checking @newBandwidth like was done for new_floor could cause a
NULL dereference.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Since g_strdup_printf will abort, we know @newfile won't be NULL.
Found by Coverity
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The @def variable holds pointer to the domain defintion, but is
set only somewhere in the middle of the function. This is
suboptimal.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
This flag is not implied by g_mkstemp_full, only by g_mkstemp.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reported-by: Bjoern Walk <bwalk@linux.ibm.com>
Fixes: 4ac4773040
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
qemuDomainDefFormatBufInternal function wasn't testing whether the CPU
was actually defined in the XML and saving such a domain resulted in the
following backtrace:
0 in qemuDomainMakeCPUMigratable (cpu=0x0)
1 in qemuDomainDefFormatBufInternal()
2 in qemuDomainDefFormatXMLInternal()
3 in qemuDomainDefFormatLive()
4 in qemuDomainSaveInternal()
5 in qemuDomainSaveFlags()
6 in qemuDomainSave()
7 in virDomainSave()
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Commit <f136b83139c63f20de0df3285d9e82df2fb97bfc> reworked process
affinity setting but did not take cgroups into account which introduced
an issue when starting VM with custom cpuset.cpus for the whole machine
group.
If the machine group is limited to some pCPUs libvirt should not try to
set a VM to run on all pCPUs as it will result in permission denied when
writing to cpuset.cpus.
To fix this the affinity has to be set separately from cgroups cpuset.
Resolves: <https://bugzilla.redhat.com/show_bug.cgi?id=1746517>
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In functions implemented here we fill this attr union (type of
bpf_attr) and just pass it to syscall(2). Thing is that some of
the union members are type of __aligned_u64. This is not regular
uint64_t. This one is explicitly aligned to 8 bytes, while
uint64_t can be aligned to 4 bytes (on 32 bits). We've used
explicit typecast to uint64_t to shut compiler which would
otherwise complain of assigning a pointer into an integer. Well,
we have uintptr_t just for that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In virCgroupV2DevicesReallocMap() we are debug printing both
arguments passed to the function. However, the @size argument is
type of size_t but '%lu' is used to format it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There are some OSes which don't have syscall() nor
<sys/syscall.h>. We already check for the header file in
configure phase, so we just need to add check for
HAVE_SYS_SYSCALL_H to HAVE_DECL_BPF_PROG_QUERY.
While I'm at it, some header files we are including are not
needed, so their includes can be safely dropped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Ensure that both x and y are non-zero when resolution is specified for a
video device.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Since this function is now only called when an 'acceleration' element is
present in the xml, any failure to parse the element will be considered
an error.
Previously, we detected some types of errors, but we would only log an
error (virReportError()), but still return a partially-specified accel
object to the caller. This patch returns NULL for all parsing errors and
reports that error back up to the caller.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
The current code doesn't properly handle errors when parsing a video
device's resolution. We were returning a NULL structure for the case
where 'x' or 'y' were missing. But for the other error cases, we were
logging an error (virReportError()), but still returning an
under-specified structure. That under-specified structure was used by
the calling function rather than properly reporting an error.
This patch changes the parse function to return NULL on any parsing
error and changes the calling function to report an error when NULL is
returned.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Previously, we were passing the video "model" node to the "acceleration"
and "resolution" parsing functions and requiring them to iterate over
the children to discover and parse the appropriate node. It makes more
sense to move this responsibility up to the parent function and just
pass these functions the node that needs to be parsed.
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Call first virCgroupNew on the parent group virCgroupNewPartition if
it is available on before the creation of the child group. This
ensures that the creation of a first level group on the unified
architecture, as the check at virCgroupV2ParseControllersFile as the
parent file is there.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1760233
Signed-off-by: Miguel Ángel Arruga Vivas <rosen644835@gmail.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Enables hosting a pool on an existing zfs pool without affecting
other datasets there.
Specify dataset instead of pool as source to use.
Parent of dataset must exist for pool-build to succeed.
Beware that pool-delete destroys the source dataset and all children.
Solves: https://www.redhat.com/archives/libvirt-users/2017-April/msg00041.html
Signed-off-by: Gregor Kopka <gregor@kopka.net>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Glib implementation follows the ISO C99 standard so it's safe to replace
the gnulib implementation.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
We need to mock virCgroupV2DevicesAvailable() in order to remove any
dependency on kernel as BPF devices might not be available.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
So the issue here is that you can end up with configuration where
you have cgroup v1 and v2 enabled at the same time and the devices
controllers is enabled for cgroup v1.
In cgroup v2 there is no devices controller, the device access is
controlled using BPF and since it is not a cgroup controller both
of them can exists at the same time and both of them are applied while
resolving access to devices.
In order to avoid configuring both BPF and cgroup v1 devices we will
use BPF if possible and otherwise fallback to cgroup v1 devices.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If we want to deny all devices we just need to replace any existing
program with new program with empty map.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
If we want to allow all devices with all permissions we need to replace
any existing program that has any rule configured, otherwise we just
need to add new rule which will for example allow read access to all
devices.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In order to deny device we need to check if there is any entry in BPF
map and we need to load the current value from map if there is already
entry for that device. If both values are same we can remove that entry
but if they are different we need to update the entry because we don't
have to deny all access, but for example only write access.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
In order to allow device we need to create key and value which will be
used to update BPF map. virBPFUpdateElem() can override existing
entries in BPF map so we need to check if that entry exists in order to
track number of entries in our map.
This can add rule for specific device but major and minor can be both
-1 which follows the same behavior as in cgroup v1.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Device rules are stored in BPF map that is a hash type, this function
will create a key based on major and minor id of device.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
We need to close our FD that we have for BPF program and map in order
to let kernel remove all resources once the cgroup is removed as well.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function will be called for every virCgroup(Allow|Deny)* API in
order to prepare BPF program for guest. Since libvirtd can be restarted
at any point we will first try to detect existing progam, if there is
none we will create a new empty BPF program and lastly if we don't have
any space left in the existing BPF map we will create a new copy of the
BPF map with more space and attach a new program with that map into the
guest cgroup.
This solution allows us to start with reasonably small BPF map consuming
only small amount of memory and if needed we can easily extend the BPF
map if there is a lot of host devices used in guest or if user wants to
hot-plug a lot of devices once the guest is running.
Since there is no way how to reallocate existing BPF map we need to
create a new copy if we run out of space in current BPF map.
This overcomes all the limitations in BPF:
- map used in program has to be created before the program is loaded
into kernel
- once map is created you cannot change its size
- you cannot replace map in existing program
- you cannot use an array of maps because it can store FD to maps
of one specific size so we would not be able to use it to overcome
the second issue
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function creates new BPF program with new empty BPF map with the
default size and attaches it to the guest cgroup.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function will be called if libvirtd was restarted while some
domains were running. It will try to detect existing programs attached
to the guest cgroup.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This function loads the BPF prog with prepared map into kernel and
attaches it into guest cgroup. It can be also used to replace existing
program in the cgroup if we need to resize BPF map to store more rules
for devices. The old program will be closed and removed from kernel.
There are two possible ways how to create BPF program:
- One way is to write simple C-like code which can by compiled into
BPF object file which can be loaded into kernel using elfutils.
- The second way is to define macros which look like assembler
instructions and can be used directly to create BPF program that
can be directly loaded into kernel.
Since the program is not too complex we can use the second option.
If there is no program, all devices are allowed, if there is some
program it is executed and based on the exit status the access is
denied for 0 and allowed for 1.
Our program will follow these rules:
- first it will try to look for the specific key using major and
minor to see if there is any rule for that specific device
- if there is no specific rule it will try to look for any rule that
matches only major of the device
- if there is no match with major it will try the same but with
minor of the device
- as the last attempt it will try to look for rule for all devices
and if there is no match it will return 0 to deny that access
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>