Support using nftables to setup the firewall for each virtual network,
rather than iptables. The initial implementation of the nftables
backend creates (almost) exactly the same ruleset as the iptables
backend, determined by running the following commands on a host that
has an active virtual network:
iptables-save >iptables.txt
iptables-restore-translate -f iptables.txt
(and the similar ip6tables-save/ip6tables-restore-translate for an
IPv6 network). Correctness of the new backend was checked by comparing
the output of:
nft list ruleset
when the backend is set to iptables and when it is set to nftables.
This page was used as a guide:
https://wiki.nftables.org/wiki-nftables/index.php/Moving_from_iptables_to_nftables
The only differences between the rules created by the nftables backed
vs. the iptables backend (aside from a few inconsequential changes in
display order of some chains/options) are:
1) When we add nftables rules, rather than adding them in the
system-created "filter" and "nat" tables, we add them in a private
table (ie only we should be using it) created by us called "libvirt"
(the system-created "filter" and "nat" tables can't be used because
adding any rules to those tables directly with nft will cause failure
of any legacy application attempting to use iptables when it tries to
list the iptables rules (e.g. "iptables -S").
(NB: in nftables only a single table is required for both nat and
filter rules - the chains for each are differentiated by specifying
different "hook" locations for the toplevel chain of each)
2) Since the rules that were added to allow tftp/dns/dhcp traffic from
the guests to the host are unnecessary in the context of nftables,
those rules aren't added.
(Longer explanation: In the case of iptables, all rules were in a
single table, and it was always assumed that there would be some
"catch-all" REJECT rule added by "someone else" in the case that a
packet didn't match any specific rules, so libvirt added these
specific rules to ensure that, no matter what other rules were added
by any other subsystem, the guests would still have functional
tftp/dns/dhcp. For nftables though, the rules added by each subsystem
are in a separate table, and in order for traffic to be accepted, it
must be accepted by *all* tables, so just adding the specific rules to
libvirt's table doesn't help anything (as the default for the libvirt
table is ACCEPT anyway) and it just isn't practical/possible for
libvirt to find *all* other tables and add rules in all of them to
make sure the traffic is accepted. libvirt does this for firewalld (it
creates a "libvirt" zone that allows tftp/dns/dhcp, and adds all
virtual network bridges to that zone), however, so in that case no
extra work is required of the sysadmin.)
3) nftables doesn't support the "checksum mangle" rule (or any
equivalent functionality) that we have historically added to our
iptables rules, so the nftables rules we add have nothing related to
checksum mangling.
(NB: The result of (3) is that if you a) have a very old guest (RHEL5
era or earlier) and b) that guest is using a virtio-net network
device, and c) the virtio-net device is using vhost packet processing
(the default) then DHCP on the guest will fail. You can work around
this by adding <driver name='qemu'/> to the <interface> XML for the
guest).
There are certainly much better nftables rulesets that could be used
instead of those implemented here, and everything is in place to make
future changes to the rules that are used simple and free of surprises
(e.g. the rules that are added have coresponding "removal" commands
added to the network status so that we will always remove exactly the
rules that were previously added rather than trying to remove the
rules that "the current build of libvirt would have added" (which will
be incorrect the first time we run a libvirt with a newly modified
ruleset). For this initial implementation though, I wanted the
nftables rules to be as identical to the iptables rules as possible,
just to make it easier to verify that everything is working.
The backend can be manually chosen using the firewall_backend setting
in /etc/libvirt/network.conf. libvirtd/virtnetworkd will read this
setting when it starts; if there is no explicit setting, it will check
for availability of FIREWALL_BACKEND_DEFAULT_1 and then
FIREWALL_BACKEND_DEFAULT_2 (which are set at build time in
meson_options.txt or by adding -Dfirewall_backend_default_n=blah to
the meson commandline), and use the first backend that is available
(ie, that has the necessary programs installed). The standard
meson_options.txt is set to check for nftables first, and then
iptables.
Although it should be very safe to change the default backend from
iptables to nftables, that change is left for a later patch, to show
how the change in default can be undone if someone really needs to do
that.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Daniel P. Berrangé <berrange@redhat.com>
These functions are only ever used by the network driver, and are so
specific to the network driver's usage of iptables that they likely
won't ever be used elsewhere. The files are renamed to
network_iptables.[ch] to be more in line with driver-specific file
naming conventions.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This allows users to SSH into a domain with a VSOCK device:
ssh user@qemu/machineName
So far, only QEMU domains are supported AND qemu:///system is
looked for the first for 'machineName' followed by
qemu:///session. I took an inspiration from Systemd's ssh proxy
[1] [2].
To just work out of the box, it requires (yet unreleased) systemd
to be running inside the guest to set up a socket activated SSHD
on the VSOCK. Alternatively, users can set up the socket
activation themselves, or just run a socat that'll forward vsock
<-> TCP communication.
1: https://github.com/systemd/systemd/blob/main/src/ssh-generator/ssh-proxy.c
2: https://github.com/systemd/systemd/blob/main/src/ssh-generator/20-systemd-ssh-proxy.conf.in
Resolves: https://gitlab.com/libvirt/libvirt/-/issues/579
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The VPD parsing is fragile and depends on hardware vendor's adherance to
standards. Since libvirt only ever uses this data to report it in the
nodedev XML which ignores any errors there's no much point in having
error reporting which I've added recently.
Turn the errors into VIR_DEBUG statements in preparation for upcoming
patch which completely removes the expectation to report errors.
This effectively reverts commits dfc85658bd and f85a382a0e.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Currently, virSocketSendMsgWithFDs() reports two errors:
1) if CMSG_FIRSTHDR() fails,
2) if sendmsg() fails.
Well, the latter sets an errno, so caller can just use
virReportSystemError(). And the former - it is very unlikely to
fail because memory for whole control message was allocated just
a few lines above.
The motivation is to unify behavior of virSocketSendMsgWithFDs()
and virSocketSendFD() because the latter is just a subset of the
former (will be addressed later).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
enable VIR_DOMAIN_NET_TYPE_ETHERNET network support for ch guests.
Tested with following interface config:
<interface type='ethernet'>
<target dev='chtap0' managed="yes"/>
<model type='virtio'/>
<driver queues='2'/>
<interface>
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
virSocketSendMsgWithFDs method send fds along with payload using
SCM_RIGHTS. virSocketRecv method polls, receives and sends the response
to callers.
These methods are required to add network suppport in ch driver.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Move domain interface management methods from qemu to hypervisor. This
refactoring allows the domain management methods to be shared between CH and
qemu drivers.
This commit does not introduce any functional changes.
Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This will allow us to use it for nbdkit logging in upcoming commits.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Implement the loadFile and saveFile virFileCacheHandlers callbacks so
that nbdkit capabilities are cached perstistently across daemon
restarts. The format and implementation is modeled on the qemu
capabilities, but simplified slightly.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
At this moment it is not possible to launch a 'riscv64' domain if a CPU
definition is presented in the domain. For example, adding this CPU
definition:
<cpu mode='custom' match='exact' check='none'>
<model fallback='forbid'>rv64</model>
</cpu>
Will trigger the following error:
$ sudo ./run tools/virsh start riscv-virt1
error: Failed to start domain 'riscv-virt1'
error: this function is not supported by the connection driver:
cannot update guest CPU for riscv64 architecture
The error comes from virCPUUpdate(), via qemuProcessUpdateGuestCPU(),
and it's caused by the absence of the 'update' API in the existing
RISC-V driver.
Add an 'update' API impl to the RISC-V driver to allow for CPU
definitions to be declared in RISC-V domains. This API was copied from
the ARM driver (virCPUarmUpdate()) since it's a good enough
implementation to get us going.
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The aim of this new module is to contain code that's parsing ACPI
tables. For now, only parsing of IORT table is implemented (it's
ARM specific table). And since we only need to check whether the
table contains SMMU record, the code is very simplified.
I've followed the specification published here:
https://developer.arm.com/documentation/den0049/latest/
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Before, logs from deleted machines have been piling up, since there were
no garbage collection mechanism. Now, virtlogd can be configured to
periodically scan the log folder for orphan logs with no recent modifications
and delete it.
A single chain of recent and rotated logs is deleted in a single transaction.
Signed-off-by: Oleg Vasilev <oleg.vasilev@virtuozzo.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
This consists of (1) adding the necessary args to the qemu commandline
netdev option, and (2) starting a passt process prior to starting
qemu, and making sure that it is terminated when it's no longer
needed. Under normal circumstances, passt will terminate itself as
soon as qemu closes its socket, but in case of some error where qemu
is never started, or fails to startup completely, we need to terminate
passt manually.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that all code was refactored to use the new version we can remove
the old code.
For now the new close callbacks code has no error messages so
syntax-check forced me to remove the POTFILES entry for
virclosecallbacks.c
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
The following patches move job object as a member into the domain
object. Because of this, domain_conf (where the domain object is
defined) needs to import the file with the job object.
It makes sense to move jobs to the same level as the domain_conf:
into src/conf/
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
This patch moves qemuDomainObjBeginJobInternal() as
virDomainObjBeginJobInternal() into hypervisor in order to be
used by other hypervisors in the following patches.
Signed-off-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Remove the unused source code for the sheepdog storage backend.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Similar to the other drivers, virNetworkDriverState now has a
virObject-derived object called virNetworkDriverConfig which is used
for config items.
As a starting point, the directory paths used by the network driver
are moved there (again, parallelling what is done for other drivers).
Using items in virNetworkDriverConfig is (yes, again) similar to using
items in the other drivers' config - anything in the config object is
immutable (once initialized), so the state object only needs to be
locked while getting a reference to the config object, and then the
members of the config object can be safely used until the config
object is unrefed.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The domain post parse functions currently live in domain_conf.c
which thus grows always larger. Mimic what we've done for the
validation code and move the post parse code into a separate
file: domain_postparse.c.
I've started by moving every function with PostParse in its name
into the new file and then compile hunting for helper functions
only to move them as well.
In the end, I've moved virDomainDefPostParse symbol in
libvirt_private.syms into a new section. And while
virDomainDeviceDefPostParseOne() is made 'public' in
domain_postparse.h too, I'm not exporting it because it has no
caller outside src/conf/ and it's unlikely it ever will.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Add a method to parse a ccw device address from a string.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have dropped prefixes from the file, it no longer
needs to go through configure_file() and we can use it directly.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
There was no need to handle files for translation from build directory
but that will change with following patches where we will stop
generating source files into source directory.
In order to have them included for translation we have to prefix each
file with SRCDIR or BUILDDIR.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Commit <22d8e27ccd5faf48ee2bf288a1b9059aa7ffd28b> introduced our
syntax-check.mk file based on gnulib rules. However, the rule was
completely ignored as we don't have POTFILES.in file.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
There is no need to have the libvirt-admin.so library definition in the
src directory. In addition the library uses directly code from admin
sub-directory so move the remaining bits there as well.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
The getopt-posix module fixes a problem with optind being incorrectly
set after a failed option parse. It was also previously used to allow
the bhyve driver to access a private internal reentrant getopt impl.
None of this matters to libvirt code any more.
This partially reverts
commit b436a8ae5c
Author: Fabian Freyer <fabian.freyer@physik.tu-berlin.de>
Date: Thu Jun 9 00:50:35 2016 +0000
gnulib: add getopt module
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The xenapi driver has not seen any development since its initial
contribution 9 years ago. There have been no bug reports, no patches,
and no queries about the driver on the developer or user mailing lists.
Remove the driver from the libvirt sources.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
After the legacy xen driver was removed the libxl driver became
the only consumer of xenconfig. Move the few files in xenconfig
to the libxl driver and remove the directory.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Introduce a bunch of new virsh commands for managing checkpoints in
isolation. More commands are needed for performing incremental
backups, but these commands were easy to implement by modeling heavily
after virsh-snapshot.c. There is no need for checkpoint-revert or
checkpoint-current since those snapshot APIs have no checkpoint
counterpart. Similarly, it is not necessary to change which
checkpoint is current when redefining from XML, since until we
integrate checkpoints with snapshots, there is only a linear chain
(and you can deduce the current checkpoint by instead using
'checkpoint-list --leaves'). Other aspects of checkpoint-list are
also a bit simpler than the snapshot counterpart, in part because we
don't have to cater to back-compat to older API.
Upcoming patches will test these interfaces once the test driver
supports checkpoints.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Add a new file checkpoint_conf.c that performs the translation to and
from new XML describing a checkpoint. The code shares a common base
class with snapshots, since a checkpoint similarly represents the
domain state at a moment in time. Add some basic testing of round trip
XML handling through the new code.
Of note - this code intentionally differs from snapshots in that XML
schema validation is unconditional, rather than based on a public API
flag. We have many existing interfaces that still need to add a flag
for opt-in schema validation, but those interfaces have existing
clients that may not have been producing strictly-compliant XML, or we
may still uncover bugs where our RNG grammar is inconsistent with our
code (where omitting the opt-in flag allows existing apps to keep
working while waiting for an RNG patch). But since checkpoints are
brand-new, it's easier to ensure the code matches the schema by always
using the schema. If needed, a later patch could extend the API and
add a flag to turn on to request schema validation, rather than having
it forced (possibly just the validation of the <domain> sub-element
during REDEFINE) - but if a user encounters XML that looks like it
should be good but fails to validate with our RNG schema, they would
either have to upgrade to a new libvirt that adds the new flag, or
upgrade to a new libvirt that fixes the RNG schema, which implies
adding such a flag won't help much.
Also, the redefine flag requires the <domain> sub-element to be
present, rather than catering to historical back-compat to older
versions.
Signed-off-by: Eric Blake <eblake@redhat.com>
Introduce a bunch of new public APIs related to backup checkpoints.
Checkpoints are modeled heavily after virDomainSnapshotPtr (both
represent a point in time of the guest), although a snapshot exists
with the intent of rolling back to that state, while a checkpoint
exists to make it possible to create an incremental backup at a later
time. We may have a future hypervisor that can completely manage
checkpoints without libvirt metadata, but the first two planned
hypervisors (qemu and test) both always use libvirt for tracking
metadata relations between checkpoints, so for now, I've deferred
the counterpart of virDomainSnapshotHasMetadata for a separate
API addition at a later date if there is ever a need for it.
Note that until we allow snapshots and checkpoints to exist
simultaneously on the same domain (although the actual prevention of
this will be in a separate patch for the sake of an easier revert down
the road), that it is not possible to branch out to create more than
one checkpoint child to a given parent, although it may become
possible later when we revert to a snapshot that coincides with a
checkpoint. This also means that for now, the decision of which
checkpoint becomes the parent of a newly created one is the only
checkpoint with no child (so while there are APIs for dealing with a
current snapshot, we do not need those for checkpoints). We may end
up exposing a notion of a current checkpoint later, but it's easier to
add stuff when proven needed than to blindly support it now and wish
we hadn't exposed it.
The following map shows the API relations to snapshots, with new APIs
on the right:
Operate on a domain object to create/redefine a child:
virDomainSnapshotCreateXML virDomainCheckpointCreateXML
Operate on a child object for lifetime management:
virDomainSnapshotDelete virDomainCheckpointDelete
virDomainSnapshotFree virDomainCheckpointFree
virDomainSnapshotRef virDomainCheckpointRef
Operate on a child object to learn more about it:
virDomainSnapshotGetXMLDesc virDomainCheckpointGetXMLDesc
virDomainSnapshotGetConnect virDomainCheckpointGetConnect
virDomainSnapshotGetDomain virDomainCheckpointGetDomain
virDomainSnapshotGetName virDomainCheckpiontGetName
virDomainSnapshotGetParent virDomainCheckpiontGetParent
virDomainSnapshotHasMetadata (deferred for later)
virDomainSnapshotIsCurrent (no counterpart, see note above)
Operate on a domain object to list all children:
virDomainSnapshotNum (no counterparts, these are the old
virDomainSnapshotListNames racy interfaces)
virDomainSnapshotListAllSnapshots virDomainListAllCheckpoints
Operate on a child object to list descendents:
virDomainSnapshotNumChildren (no counterparts, these are the old
virDomainSnapshotListChildrenNames racy interfaces)
virDomainSnapshotListAllChildren virDomainCheckpointListAllChildren
Operate on a domain to locate a particular child:
virDomainSnapshotLookupByName virDomainCheckpointLookupByName
virDomainSnapshotCurrent (no counterpart, see note above)
virDomainHasCurrentSnapshot (no counterpart, old racy interface)
Operate on a snapshot to roll back to earlier state:
virDomainSnapshotRevert (no counterpart, instead checkpoints
are used in incremental backups via
XML to virDomainBackupBegin)
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Libvirtd has long had integration with avahi for advertising libvirtd
using mDNS when TCP/TLS listening is enabled. For a long time the
virt-manager application had support for auto-detecting libvirtds
on the local network using mDNS, but this was removed last year
commit fc8f8d5d7e3ba80a0771df19cf20e84a05ed2422
Author: Cole Robinson <crobinso@redhat.com>
Date: Sat Oct 6 20:55:31 2018 -0400
connect: Drop avahi support
Libvirtd can advertise itself over avahi. The feature is disabled by
default though and in practice I hear of no one actually using it
and frankly I don't think it's all that useful
The 'Open Connection' wizard has a disproportionate amount of code
devoted to this feature, but I don't think it's useful or worth
maintaining, so let's drop it
I've never heard of any other applications having support for using
mDNS to detect libvirtd instances. Though it is theoretically possible
something exists out there, it is clearly going to be a niche use case
in the virt ecosystem as a whole.
By removing avahi integration we can cut down the dependency chain for
the basic libvirtd install and reduce our code maint burden.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The only user is now in qemu_monitor_json.c to re-parse the command line
format into keyvalue pairs for use in QMP command construction.
Move and rename the functions.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
The driver is unmaintained, untested and severely broken for
quite some time now. Since nobody even reported any issue with it
let us drop it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Historically we have relied on autopoint/gettextize to install a
standard po/Makefile.in.in. There is very limited scope for customizing
this and it also causes a bunch of extra stuff to be pulled into
configure.ac which potentially clashes with gnulib. Writing make rules
for po file management is no more difficult than any other rules libvirt
has, so stop using autopoint/gettextize.
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>