Libvirt provides a portable, long term stable C API for managing the virtualization technologies provided by many operating systems. It includes support for QEMU, KVM, Xen, LXC, bhyve, Virtuozzo, VMware vCenter and ESX, VMware Desktop, Hyper-V, VirtualBox and the POWER Hypervisor.
Go to file
Laine Stump 42ab0148dd network: add rule to nftables backend that zeroes checksum of DHCP responses
Many years ago (April 2010), soon after "vhost" in-kernel packet
processing was added to the virtio-net driver, people running RHEL5
virtual machines with a virtio-net interface connected via a libvirt
virtual network noticed that when vhost packet processing was enabled,
their VMs could no longer get an IP address via DHCP - the guest was
ignoring the DHCP response packets sent by the host.

(I've been informed by danpb that the same issue had been encountered,
and "fixed" even earlier than that, in 2006, with Xen as the
hypervisor.)

The "gory details" of the 2010 discussion are chronicled here:

  https://lists.isc.org/pipermail/dhcp-hackers/2010-April/001835.html

but basically it was because packet checksums weren't being fully
computed on the host side (because QEMU on the host and the NIC driver
in the guest had agreed between themselves to turn off checksums
because they were unnecessary due to the "link" between the two being
entirely in local memory rather than an error-prone physical cable),
but

1) a partial checksum was being put into the packets at some point by
   "someone"

2) the "don't use checksums" info was known by the guest kernel, which
   would properly ignore the "bad" checksum), and

3) the packets were being read by the dhclient application on the
   guest side with a "raw" socket (thus bypassing the guest kernel UDP
   processing that would have known the checksum was irrelevant and
   ignore it)),

The "fix" for this ended up being two-tiered:

1) The ISC DHCP package (which contains the aforementioned dhclient
program) made a fix to their dhclient code which caused it to accept
packets anyway even if they didn't have a proper checksum (NB: that's
not a full explanation, and possibly not accurate). This remedied the
problem for guests with an updated dhclient. Here is the code with the
fix to ISC DHCP:

  https://github.com/isc-projects/dhcp/blob/master/common/packet.c#L365

This eliminated the issue for any new/updated guests that had the
fixed dhclient, but it didn't solve the problem for existing/old guest
images that didn't/couldn't get their dhclient updated. This brings us
to:

2) iptables added a new "CHECKSUM" target and "--checksum-fill"
action:

  http://patchwork.ozlabs.org/patch/58525/

and libvirt added an iptables rule for each virtual network to match
DHCP response packets and perform --checksum-fill. This way by the
time dhclient on the guest read the raw packet, the checksum would be
corrected, and the packet would be accepted. This was pushed upstream
in libvirt commit v0.8.2-142-gfd5b15ff1a.

The word at the time from those more knowledgeable than me was that
the bad checksum problem was really specific to ISC's dhclient running
on Linux, and so once their fix was in use everywhere dhclient was
used, bad checksums would be a thing of the past and the
--checksum-fill iptables rules would no longer be needed (but would
otherwise be harmless if they were still there).

(Plot twist: the dhclient code in fix (1) above apparently is on a
Linux-only code path - this is very important later!)

Based on this information (and also due to the opinion that fixing it
by having iptables modify the packet checksum was really the wrong way
to permanently fix things, i.e. an "ugly hack"), the nftables
developers made the decision to not implement an equivalent to
--checksum-fill in nftables. As a result, when I wrote the nftables
firewall backend for libvirt virtual networks earlier this year, it
didn't add in any rule to "fix" broken UDP checksums (since there was
apparently no equivalent in nftables and, after all, that was fixed
somewhere else 14 years ago, right???)

But last week, when Rich Jones was doing routine testing using a Fedora
40 host (the first Fedora release to use the nftables backend of libvirt's
network driver by default) and a FreeBSD guest, for "some strange
reason", the FreeBSD guest was unable to get an IP address from DHCP!!

  https://www.spinics.net/linux/fedora/libvirt-users/msg14356.html

A few quick tests proved that it was the same old "bad checksum"
problem from 2010 come back to haunt us - it wasn't a Linux-only issue
after all.

Phil Sutter and Eric Garver (nftables people) pointed out that, while
nftables doesn't have an action that will *compute* the checksum of a
packet, it *does* have an action that will set the checksum to 0, and
suggested we try adding a "zero the checksum" rule for dhcp response
packets to our nftables ruleset. (Why? Because a checksum value of 0
in a IPv4 UDP packet is defined by RFC768 to mean "no checksum
generated", implying "checksum not needed").  It turns out that this
works - dhclient properly recognizes that a 0 checksum means "don't
bother with the checksum", and accepts the packet as valid.

So to once again fix this timeless bug, this patch adds such a
checksum zeroing rule to the nftables rules setup for each virtual
network.

This has been verified (on a Fedora 40 host) to fix DHCP with FreeBSD
and OpenBSD guests, while not breaking it for Fedora or Windows (10)
guests.

Fixes: b89c4991da
Reported-by: Rich Jones <rjones@redhat.com>
Fix-Suggested-by: Eric Garver <egarver@redhat.com>
Fix-Suggested-by: Phil Sutter <psutter@redhat.com>
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-10-25 12:00:52 -04:00
.ctags.d maint: Add support for .ctags.d 2019-05-31 17:54:28 +02:00
.github/workflows github: Update lockdown message when opening a PR 2024-05-15 12:31:23 +02:00
.gitlab/issue_templates gitlab: issue_template: Remove labelling commands 2022-06-01 12:27:10 +02:00
build-aux virshtest: Prepare for testing against output files 2024-04-02 14:24:30 +02:00
ci ci: Move definition of exit codes allowed to fail for cirrus jobs 2024-10-23 11:07:01 +02:00
docs domain_capabilities: Report CPU blockers 2024-10-24 15:53:51 +02:00
examples examples: Define _GNU_SOURCE for more examples 2024-02-07 18:01:03 +01:00
include docs: Add warning about using a cleared image with VIR_MIGRATE_PARAM_MIGRATE_DISKS_DETECT_ZEROES_ZEROES 2024-10-14 16:25:21 +02:00
po po: Refresh potfile for v10.9.0 2024-10-25 08:30:50 +02:00
scripts docs: Prohibit 'external' links within the webpage 2024-10-09 16:00:44 +02:00
src network: add rule to nftables backend that zeroes checksum of DHCP responses 2024-10-25 12:00:52 -04:00
subprojects Move src/keycodemapdb -> subprojects/keycodemapdb 2023-04-17 15:02:38 +02:00
tests network: add rule to nftables backend that zeroes checksum of DHCP responses 2024-10-25 12:00:52 -04:00
tools wireshark: drop gmodule.h include to avoid glib warnings 2024-10-16 15:47:52 +01:00
.ctags ctags: Generate tags for headers, i.e. function prototypes 2018-09-18 14:21:33 +02:00
.dir-locals.el
.editorconfig Add .editorconfig 2019-09-06 12:47:46 +02:00
.gitattributes Add .gitattributes file 2022-03-17 14:33:12 +01:00
.gitignore Revert ".gitignore: Ignore cscope and other *tags files" 2023-02-08 17:24:31 +01:00
.gitlab_pages_redirects docs: gitlab redirects: Drop '/libvirt' prefix for hosting the web through gitlab pages 2024-02-13 16:56:49 +01:00
.gitlab-ci.yml gitlab: add missing job inheritance for codestyle 2024-06-17 11:36:00 +01:00
.gitmodules Move src/keycodemapdb -> subprojects/keycodemapdb 2023-04-17 15:02:38 +02:00
.gitpublish gitpublish: Tweak prefix 2023-12-05 11:48:28 +01:00
.mailmap mailmap: consolidate my email addresses 2020-10-06 12:05:09 +02:00
AUTHORS.rst.in AUTHORS: change my (Nikolay Shirokovskiy) email 2022-04-06 11:00:53 +03:00
config.h configure: bump min required CLang to 6.0 / XCode 10.0 2022-01-17 10:44:29 +00:00
configmake.h.in meson: generate configmake.h 2020-08-03 09:26:48 +02:00
CONTRIBUTING.rst meson: adjust our documentation to mention meson instead of autoconf 2020-08-03 09:27:09 +02:00
COPYING
COPYING.LESSER maint: Remove control characters from LGPL license file 2015-09-25 09:16:24 +02:00
gitdm.config gitdm: add 'ibm' file 2019-10-18 17:32:52 +02:00
libvirt-admin.pc.in
libvirt-lxc.pc.in
libvirt-qemu.pc.in
libvirt.pc.in
libvirt.spec.in spec: Drop nwfilter dependency in libvirt-daemon-xen 2024-10-22 09:46:32 -06:00
meson_options.txt meson: options: drop yajl 2024-09-24 08:24:00 +02:00
meson.build Post-release version bump to 10.9.0 2024-10-01 11:06:24 +02:00
NEWS.rst NEWS: Report CPU model blockers in domain capabilities 2024-10-24 15:53:51 +02:00
README.rst docs: update docs pointing to old mailing list addrs 2023-10-31 10:04:27 +00:00
run.in run.in: Detect binaries in builddir properly 2024-06-04 14:39:00 +02:00

GitLab CI Build Status

CII Best Practices

Translation status

Libvirt API for virtualization

Libvirt provides a portable, long term stable C API for managing the virtualization technologies provided by many operating systems. It includes support for QEMU, KVM, Xen, LXC, bhyve, Virtuozzo, VMware vCenter and ESX, VMware Desktop, Hyper-V, VirtualBox and the POWER Hypervisor.

For some of these hypervisors, it provides a stateful management daemon which runs on the virtualization host allowing access to the API both by non-privileged local users and remote users.

Layered packages provide bindings of the libvirt C API into other languages including Python, Perl, PHP, Go, Java, OCaml, as well as mappings into object systems such as GObject, CIM and SNMP.

Further information about the libvirt project can be found on the website:

https://libvirt.org

License

The libvirt C API is distributed under the terms of GNU Lesser General Public License, version 2.1 (or later). Some parts of the code that are not part of the C library may have the more restrictive GNU General Public License, version 2.0 (or later). See the files COPYING.LESSER and COPYING for full license terms & conditions.

Installation

Instructions on building and installing libvirt can be found on the website:

https://libvirt.org/compiling.html

Contributing

The libvirt project welcomes contributions in many ways. For most components the best way to contribute is to send patches to the primary development mailing list. Further guidance on this can be found on the website:

https://libvirt.org/contribute.html

Contact

The libvirt project has two primary mailing lists:

Further details on contacting the project are available on the website:

https://libvirt.org/contact.html