When an nwfilter rule sets the parameter CTRL_IP_LEARNING to "dhcp",
this turns on the "dhcpsnoop" thread, which uses libpcap to monitor
traffic on the domain's tap device and extract the IP address from the
DHCP response.
If libpcap on the host is built with HAVE_TPACKET3 defined (to enable
support for TPACKET_V3), the dhcpsnoop code's initialization of the
libpcap socket would fail with the following error:
virNWFilterSnoopDHCPOpen:1134 : internal error: pcap_setfilter: can't remove kernel filter: Bad file descriptor
It turns out that this was because TPACKET_V3 requires a larger buffer
size than libvirt was setting (we were setting it to 128k). Changing
the buffer size to 256k eliminates the error, and the dhcpsnoop thread
once again works properly.
A fuller explanation of why TPACKET_V3 requires such a large buffer,
for future git spelunkers:
libpcap calls setsockopt(... SOL_PACKET, PACKET_RX_RING...) to setup a
ring buffer for receiving packets; two of the attributes sent to this
API are called tp_frame_size, and tp_frame_nr. If libpcap was built
with HAVE_TPACKET3 defined, tp_trame_size is set to MAXIMUM_SNAPLEN
(defined in libpcap sources as 262144) and tp_frame_nr is set to:
[the buffer size we set, i.e. PCAP_BUFFERSIZE i.e. 262144] / tp_frame_size.
So if PCAP_BUFFERSIZE < MAXIMUM_SNAPLEN, then tp_frame_nr (the number
of frames in the ring buffer) is 0, which is nonsensical. This same
value is later used as a multiplier to determine the size for a call
to malloc() (which would also fail).
(NB: if HAVE_TPACKET3 is *not* defined, then tp_frame_size is set to
the snaplen set by the user (in our case 576) plus a small amount to
account for ethernet headers, so 256k is far more than adequate)
Since the TPACKET_V3 code in libpcap actually reads multiple packets
into each frame, it's not a problem to have only a single frame
(especially when we are monitoring such infrequent traffic), so it's
okay to set this relatively small buffer size (in comparison to the
default, which is 2MB), which is important since every guest using
dhcp snooping in a nwfilter rule will hold 2 of these buffers for the
entire life of the guest.
Thanks to Christian Ehrhardt for discovering that buffer size was the
problem (this was not at all obvious from the error that was logged!)
Resolves: https://bugzilla.redhat.com/1547237
Fixes: https://bugs.launchpad.net/libvirt/+bug/1758037
Signed-off-by: Laine Stump <laine@laine.org>
Reviewed-by: Christian Ehrhardt <christian.ehrhardt@canonical.com> (V1)
Reviewed-by: John Ferlan <jferlan@redhat.com>
Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
(cherry picked from commit ce5aebeacd10a1c15cb3ee46a59c8b5ff235589e)
commit 77a12987a48 changed the "virDomainChrSourceDef source" inside
virDomainChrDef to "virDomainChrSourceDefPtr source", and started
allocating source inside virDomainChrDefNew(), but vboxDumpSerial()
was allocating a virDomainChrDef with a simple VIR_ALLOC() (i.e. never
calling virDomainChrDefNew()), so source was never initialized,
leading to a SEGV any time a serial port was present. The same problem
was created in vboxDumpParallel().
This patch changes vboxDumpSerial() and vboxDumpParallel() to use
virDomainChrDefNew() instead of VIR_ALLOC(), and changes both of those
functions to return an error if virDomainChrDef() (or any other
allocation) fails.
This resolves: https://bugzilla.redhat.com/1536649
(cherry picked from commit 9c27e464e3b4603cbe13c00787f4c89e5b1e7a68)
Signed-off-by: Laine Stump <laine@laine.org>
https://bugzilla.redhat.com/show_bug.cgi?id=1528502
So imagine you have /dev/blah symlink which points to /dev/sda.
You attach /dev/blah as disk to your domain. Libvirt correctly
creates the /dev/blah -> /dev/sda symlink in the qemu namespace.
However, then you detach the disk, change the symlink so that it
points to /dev/sdb and tries to attach the disk again. This time,
however, the attach fails (well, qemu attaches wrong disk)
because the code assumes that symlinks don't change. Well they
do.
This is inspired by test fix written by Eduardo Habkost.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
(cherry picked from commit db98e7f67ea0d7699410f514f01947cef5128a6c)
The fix for CVE-2018-6764 introduced a potential deadlock scenario
that gets triggered by the NSS module when virGetHostname() calls
getaddrinfo to resolve the hostname:
#0 0x00007f6e714b57e7 in futex_wait
#1 futex_wait_simple
#2 __pthread_once_slow
#3 0x00007f6e71d16e7d in virOnce
#4 0x00007f6e71d0997c in virLogInitialize
#5 0x00007f6e71d0a09a in virLogVMessage
#6 0x00007f6e71d09ffd in virLogMessage
#7 0x00007f6e71d0db22 in virObjectNew
#8 0x00007f6e71d0dbf1 in virObjectLockableNew
#9 0x00007f6e71d0d3e5 in virMacMapNew
#10 0x00007f6e71cdc50a in findLease
#11 0x00007f6e71cdcc56 in _nss_libvirt_gethostbyname4_r
#12 0x00007f6e724631fc in gaih_inet
#13 0x00007f6e72464697 in __GI_getaddrinfo
#14 0x00007f6e71d19e81 in virGetHostnameImpl
#15 0x00007f6e71d1a057 in virGetHostnameQuiet
#16 0x00007f6e71d09936 in virLogOnceInit
#17 0x00007f6e71d09952 in virLogOnce
#18 0x00007f6e714b5829 in __pthread_once_slow
#19 0x00007f6e71d16e7d in virOnce
#20 0x00007f6e71d0997c in virLogInitialize
#21 0x00007f6e71d0a09a in virLogVMessage
#22 0x00007f6e71d09ffd in virLogMessage
#23 0x00007f6e71d0db22 in virObjectNew
#24 0x00007f6e71d0dbf1 in virObjectLockableNew
#25 0x00007f6e71d0d3e5 in virMacMapNew
#26 0x00007f6e71cdc50a in findLease
#27 0x00007f6e71cdc839 in _nss_libvirt_gethostbyname3_r
#28 0x00007f6e71cdc724 in _nss_libvirt_gethostbyname2_r
#29 0x00007f6e7248f72f in __gethostbyname2_r
#30 0x00007f6e7248f494 in gethostbyname2
#31 0x000056348c30c36d in hosts_keys
#32 0x000056348c30b7d2 in main
Fortunately the extra stuff virGetHostname does is totally irrelevant to
the needs of the logging code, so we can just inline a call to the
native hostname() syscall directly.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit c2dc6698c88fb591639e542c8ecb0076c54f3dfb)
Broken by 759b4d1b0fe5f4d84d98b99153dfa7ac289dd167.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
(cherry picked from commit 6ce3acc129bfdbe7fd02bcb8bbe8af6d13903684)
At later point it might not be possible or even safe to use getaddrinfo(). It
can in turn result in a load of NSS module.
Notably, on a LXC container startup we may find ourselves with the guest
filesystem already having replaced the host one. Loading a NSS module
from the guest tree would allow a malicous guest to escape the
confinement of its container environment because libvirt will not yet
have locked it down.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
(cherry picked from commit 759b4d1b0fe5f4d84d98b99153dfa7ac289dd167)
The PROBE macro used in qemuMonitorIOProcess and the VIR_DEBUG message
in qemuMonitorJSONIOProcess create a lot of logging churn when debug
logging is enabled during monitor communication.
The messages logged from the PROBE macro are rather useless since they
are reporting the partial state of receiving the reply from qemu. The
actual full reply is still logged in qemuMonitorJSONIOProcessLine once
the full message is received.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
(cherry picked from commit f10bb3347b43d900ff361cda5fe1996782284991)
PROBE macro adds a logging entry, when used in places seeing a lot of
traffic this can cause a significant slowdown.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
(cherry picked from commit f06e488d5484031a76e7ed231c8fef8fa1181d2c)
TPM 2 does not implement sysfs files for cancellation of commands.
We therefore use /dev/null for the cancel path passed to QEMU.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Tested-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
(cherry picked from commit dfbb15b75433e520fb1b905c1c3e28753e53e4a5)
The default_tls_x509_verify (and related) parameters in qemu.conf
control whether the QEMU TLS servers request & verify certificates
from clients. This works as a simple access control system for
servers by requiring the CA to issue certs to permitted clients.
This use of client certificates is disabled by default, since it
requires extra work to issue client certificates.
Unfortunately the code was using this configuration parameter when
setting up both TLS clients and servers in QEMU. The result was that
TLS clients for character devices and disk devices had verification
turned off, meaning they would ignore errors while validating the
server certificate.
This allows for trivial MITM attacks between client and server,
as any certificate returned by the attacker will be accepted by
the client.
This is assigned CVE-2017-1000256 / LSN-2017-0002
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 441d3eb6d1be940a67ce45a286602a967601b157)
If you use the VDDK library to access virtual machines remotely, you
really need to know the Managed Object Reference ("moref") of the VM.
This must be passed each time you connect to the API.
For example nbdkit's VDDK plugin requires a moref to be passed to
mount up a VM's disk remotely:
nbdkit vddk user=root password=+/tmp/rootpw \
server=esxi.example.com thumbprint=xx:xx:xx:... \
vm=moref=2 \
file="[datastore1] Fedora/Fedora.vmdk"
Getting the moref is a huge pain. To get some idea of what it is, why
it is needed, and how much trouble it is to get it, see:
https://blogs.vmware.com/vsphere/2012/02/uniquely-identifying-virtual-machines-in-vsphere-and-vcloud-part-1-overview.htmlhttps://blogs.vmware.com/vsphere/2012/02/uniquely-identifying-virtual-machines-in-vsphere-and-vcloud-part-2-technical.html
However the moref is available conveniently in the internals of the
libvirt VMX driver. This patch exposes it as a custom XML element
using the same "vmware:" namespace which was previously used for the
datacenterpath (see libvirt commit 636a99058758a044).
It appears in the XML like this:
<domain type='vmware' xmlns:vmware='http://libvirt.org/schemas/domain/vmware/1.0'>
<name>Fedora</name>
...
<vmware:datacenterpath>ha-datacenter</vmware:datacenterpath>
<vmware:moref>2</vmware:moref>
</domain>
Note that the moref can appear as either a simple ID (for esx://
connections) or as a "vm-<ID>" (for vpx:// connections). It should be
treated by users as an opaque string.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
commit '96e55048' caused make check failure for virschematest:
1929) Checking ../docs/news.xml against ../news.rng ... libvirt: XML Util error : XML document failed to validate against schema: Unable to validate doc against /home/jferlan/git/libvirt.work/docs/schemas/../news.rng
Datatype element summary has child elements
Element summary failed to validate content
Datatype element summary has child elements
Element summary failed to validate content
^[[31m^[[1mFAILED^[[0m
That's because <code> elements don't appear to be allowed in the schema.
Rather than attempt to fix the schema, figured it was simpler to just
remove them and let the schema fix happen later.
Documents some changes that have slipped through the cracks
during the development cycle.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
This patch documents support for managedsave-dumpxml,
managedsave-define and managedsave-edit commands.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1487322
In ace45e67abbd I tried to fix a problem that we get the reply to
a D-Bus call while we were sleeping. In that case the callback
was never set. So I changed the code that the callback is called
directly in this case. However, I hadn't realized that since the
callback is called out of order it locks the virNetDaemon.
Exactly the very same virNetDaemon object that we are dealing
with right now and that we have locked already (in
virNetDaemonAddShutdownInhibition())
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We call qemuDomainGetMachineName on domain start. On first
start (after daemon start) pid is 0 and virSystemdGetMachineNameByPID
don't get called. But after domain shutting down pid became -1 so
on next start virSystemdGetMachineNameByPID is called and returned an error.
Error is ignored so it is not critical. But at least on my system
(systemd-219 with extra patches) systemd-machined is crashed on
this request.
This behaviour is triggered by eaf2c9f89.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1484230
When updating a virtio enabled vNIC and trying to change either
of rx_queue_size or tx_queue_size success is reported although no
operation is actually performed. Moreover, there's no way how to
change these on the fly. This is due to way we check for changes:
explicitly for each struct member. Therefore it's easy to miss
one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1437797
Rather than using refreshVol which essentially only updates the
allocation, capacity, and permissions for the volume, but not
the format which does get updated in a pool refresh - let's use
the same helper that pool refresh uses in order to update the
volume target.
Currently while parsing domain XML we clear the UNIX path if it matches
one of the auto-generated paths by libvirt. After that when the guest
is started new path is generated but the mode is also changed to "bind".
In the real-world use-case the mode should not change, it only happens
if a user provides a mode='connect' and path that matches one of the
auto-generated path or not provides a path at all.
Before *reconnect* feature was introduced there was no issue, but with
the new feature we need to make sure that it's used only with "connect"
mode, therefore we need to move the mode change into parsing in order
to have a proper error reported by validation code.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The test was introduced by 60135b22db6d.
The auto-generated path is removed by post-parse callback which
also changes the mode from "connect" to "bind" since the auto-generated
path makes sense only for "bind" mode.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
With gnutls 3.6.0, SHA1 is no longer accepted for certificate
signatures. We must usw SHA256 instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit e4cb8500810a changed the way ssh command line is created by
adding '--' before the hostname in order to fix a potential security
flaw. However it failed to modify the tests, so let's do that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Inspired by the recent GIT / Mercurial security flaws
(http://blog.recurity-labs.com/2017-08-10/scm-vulns),
consider someone/something manages to feed libvirt a bogus
URI such as:
virsh -c qemu+ssh://-oProxyCommand=gnome-calculator/system
In this case, the hosname "-oProxyCommand=gnome-calculator"
will get interpreted as an argument to ssh, not a hostname.
Fortunately, due to the set of args we have following the
hostname, SSH will then interpret our bit of shell script
that runs 'nc' on the remote host as a cipher name, which is
clearly invalid. This makes ssh exit during argv parsing and
so it never tries to run gnome-calculator.
We are lucky this time, but lets be more paranoid, by using
'--' to explicitly tell SSH when it has finished seeing
command line options. This forces it to interpret
"-oProxyCommand=gnome-calculator" as a hostname, and thus
see a fail from hostname lookup.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When recreating folders with namespaces, the directory type was not
being handled at all. It's not special, we probably just didn't know
that that can be used as a volume path as well. The code failed
gracefully, but we want to allow that so that we can use <disk
type='dir'> in domains again.
Partially-resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1443434
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Our backing probing code handles directory file types properly in
virStorageFileGetMetadataRecurse(), by that I mean it leaves them
alone. However its caller, the virStorageFileGetMetadata() resets the
type to raw before probing, without even checking the type. We need
to special-case TYPE_DIR in order to achieve desired results.
Also, in order to properly test this, we need to stop resetting format
of volumes in tests for TYPE_DIR (probably the reason why we didn't
catch that and why the test data didn't need to be modified).
Partially-resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1443434
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This commit adds qemu driver implementation to edit xml
configuration of managed save state file of a domain.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
This commit adds qemu driver implementation to get xml description
for managed save state domain.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
Similar to domainSaveImageDefineXML this commit adds domainManagedSaveDefineXML
API which allows to edit domain's managed save state xml configuration.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
Similar to domainSaveImageGetXMLDesc this commit adds domainManagedSaveGetXMLDesc
API which allows to get the xml of managed save state domain.
Signed-off-by: Kothapally Madhu Pavan <kmp@linux.vnet.ibm.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1476866
For some reason, we completely ignore <on_reboot/> setting for
domains. The implementation is simply not there. It never was.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This API is definitely modifying state of @vm. Therefore it
should grab a job.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
At some places we either already have synchronous job or we just
released it. Also, some APIs might want to use this code without
having to release their job. Anyway, the job acquire code is
moved out to qemuDomainRemoveInactiveJob so that
qemuDomainRemoveInactive does just what it promises.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: John Ferlan <jferlan@redhat.com>
We can now check for the error and not care about the return value as
it will be properly handled in virBufferContentAndReset() anyway.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The function is useful even without using the return value. And if
needed, the return value can be obtained by other calls as well. The
potential for clean-up can be seen in the following patch.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Otherwise longer domain names might generate paths that are too long
to be created. This follows what other parts of the code do as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1453194
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>