Most of our code base uses space after comma but not before;
fix the remaining uses before adding a syntax check.
* src/network/bridge_driver.c: Consistently use commas.
* src/node_device/node_device_hal.c: Likewise.
* src/node_device/node_device_udev.c: Likewise.
* src/storage/storage_backend_rbd.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1020135
If networkAllocateActualDevice() had failed due to a pool of hostdev
or direct devices being depleted, the calling function could still
call networkReleaseActualDevice() as part of its cleanup, and that
function would then unconditionally decrement the connections count
for the network, even though it hadn't been incremented (due to
failure of allocate). This *was* necessary because the .actual member
of the netdef was allocated with a "lazy" algorithm, only being
created if there was a need to store data there (e.g. if a device was
allocated from a pool, or bandwidth was allocated for the device), so
there was no simple way for networkReleaseActualDevice() to tell if
something really had been allocated (i.e. if "connections++" had been
executed).
This patch changes networkAllocateDevice() to *always* allocate an
actual device for any netdef of type='network', even if it isn't
needed for any other reason. This has no ill effects anywhere else in
the code (except for using a small amount of memory), and
networkReleaseActualDevice() can then determine if there was a
previous successful allocate by checking for .actual != NULL (if not,
it skips the "connections--").
Currently, we ignore whether dnsmasqCapsRefresh succeeds or fails. We
shouldn't do that as we may generate wrong dnsmasq command line (what
is done just a few lines below).
Signed-off-by: Hongwei Bi <hwbi2008@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Packets sent by guests on virbrN, *or* by dnsmasq on the same, to
- 255.255.255.255/32 (netmask-independent local network broadcast
address), or to
- 224.0.0.0/24 (local subnetwork multicast range)
are never forwarded, hence it is not necessary to masquerade them.
In fact we must not masquerade them: translating their source addresses or
source ports (where applicable) may confuse receivers on virbrN.
One example is the DHCP client in OVMF (= UEFI firmware for virtual
machines):
http://thread.gmane.org/gmane.comp.bios.tianocore.devel/1506/focus=2640
It expects DHCP replies to arrive from remote source port 67. Even though
dnsmasq conforms to that, the destination address (255.255.255.255) and
the source address (eg. 192.168.122.1) in the reply allow the UDP
masquerading rule to match, which rewrites the source port to or above
1024. This prevents the DHCP client in OVMF from accepting the packet.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=709418
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Useful to set custom forwarders instead of using the contents of
/etc/resolv.conf. It helps me to setup dnsmasq as local nameserver to
resolve VM domain names from domain 0, when domain option is used.
Signed-off-by: Diego Woitasen <diego.woitasen@vhgroup.net>
Signed-off-by: Eric Blake <eblake@redhat.com>
Similarly to qemu_driver.c, we can join often repeating code of looking
up network into one function: networkObjFromNetwork.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This resolves the issue that prompted the filing of
https://bugzilla.redhat.com/show_bug.cgi?id=928638
(although the request there is for something much larger and more
general than this patch).
commit f3868259ca disabled the
forwarding to upstream DNS servers of unresolved DNS requests for
names that had no domain, but were just simple host names (no "."
character anywhere in the name). While this behavior is frowned upon
by DNS root servers (that's why it was changed in libvirt), it is
convenient in some cases, and since dnsmasq can be configured to allow
it, it must not be strictly forbidden.
This patch restores the old behavior, but since it is usually
undesirable, restoring it requires specification of a new option in
the network config. Adding the attribute "forwardPlainNames='yes'" to
the <dns> elemnt does the trick - when that attribute is added to a
network config, any simple hostnames that can't be resolved by the
network's dnsmasq instance will be forwarded to the DNS servers listed
in the host's /etc/resolv.conf for an attempt at resolution (just as
any FQDN would be forwarded).
When that attribute *isn't* specified, unresolved simple names will
*not* be forwarded to the upstream DNS server - this is the default
behavior.
* Move platform specific things (e.g. firewalling and route
collision checks) into bridge_driver_platform
* Create two platform specific implementations:
- bridge_driver_linux: Linux implementation using iptables,
it's actually the code moved from bridge_driver.c
- bridge_driver_nop: dumb implementation that does nothing
Signed-off-by: Eric Blake <eblake@redhat.com>
This is another cleanup before extracting platform-specific
parts from bridge_driver.
Rename struct network_driver to _virNetworkDriverState and
add appropriate typedefs: virNetworkDriverState and
virNetworkDriverStatePtr.
This will help us to avoid potential problems when moving
this struct to the .h file.
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
iptablesContext holds only 4 pairs of iptables
(table, chain) and there's no need to pass
it around.
This is a first step towards separating bridge_driver.c
in platform-specific parts.
If networkUnplugBandwidth is called on a network which has
no bandwidth defined, print a warning instead of crashing.
This can happen when destroying a domain with bandwidth if
bandwidth was removed from the network after the domain was
started.
https://bugzilla.redhat.com/show_bug.cgi?id=975359
Although SRIOV network cards support setting a vlan tag on their
virtual functions, and although setting this vlan tag via a <vlan>
element in a domain's <interface> works, setting a vlan tag for these
devices in a <network> definition, or in a network <portgroup>
definition is also supposed to work (and the comment that validates
<vlan> usage even says that!). However, the check to allow it only
checked for an openvswitch network, so attempts to add <vlan> to a
network of type='hostdev' would fail.
This fixes the problem reported in:
https://bugzilla.redhat.com/show_bug.cgi?id=972690
When checking for a collision of a new libvirt network's subnet with
any existing routes, we read all of /proc/net/route into memory, then
parse all the entries. The function that we use to read this file
requires a "maximum length" parameter, which had previously been set
to 64*1024. As each line in /proc/net/route is 128 bytes, this would
allow for a maximum of 512 entries in the routing table.
This patch increases that number to 128 * 100000, which allows for
100,000 routing table entries. This means that it's possible that 12MB
would be allocated, but that would only happen if there really were
100,000 route table entries on the system, it's only held for a very
short time.
Since there is no method of specifying and unlimited max (and that
would create a potential denial of service anyway) hopefully this
limit is large enough to accomodate everyone.
In order to learn libvirt multiqueue several things must be done:
1) The '/dev/net/tun' device needs to be opened multiple times with
IFF_MULTI_QUEUE flag passed to ioctl(fd, TUNSETIFF, &ifr);
2) Similarly, '/dev/vhost-net' must be opened as many times as in 1)
in order to keep 1:1 ratio recommended by qemu and kernel folks.
3) The command line construction code needs to switch from 'fd=X' to
'fds=X:Y:...:Z' and from 'vhostfd=X' to 'vhostfds=X:Y:...:Z'.
4) The monitor handling code needs to learn to pass multiple FDs.
network: static route support for <network>
This patch adds the <route> subelement of <network> to define a static
route. the address and prefix (or netmask) attribute identify the
destination network, and the gateway attribute specifies the next hop
address (which must be directly reachable from the containing
<network>) which is to receive the packets destined for
"address/(prefix|netmask)".
These attributes are translated into an "ip route add" command that is
executed when the network is started. The command used is of the
following form:
ip route add <address>/<prefix> via <gateway> \
dev <virbr-bridge> proto static metric <metric>
Tests are done to validate that the input data are correct. For
example, for a static route ip definition, the address must be a
network address and not a host address. Additional checks are added
to ensure that the specified gateway is directly reachable via this
network (i.e. that the gateway IP address is in the same subnet as one
of the IP's defined for the network).
prefix='0' is supported for both family='ipv4' address='0.0.0.0'
netmask='0.0.0.0' or prefix='0', and for family='ipv6' address='::',
prefix=0', although care should be taken to not override a desired
system default route.
Anytime an attempt is made to define a static route which *exactly*
duplicates an existing static route (for example, address=::,
prefix=0, metric=1), the following error message will be sent to
syslog:
RTNETLINK answers: File exists
This can be overridden by decreasing the metric value for the route
that should be preferred, or increasing the metric for the route that
shouldn't be preferred (and is thus in place only in anticipation that
the preferred route may be removed in the future). Caution should be
used when manipulating route metrics, especially for a default route.
Note: The use of the command-line interface should be replaced by
direct use of libnl so that error conditions can be handled better. But,
that is being left as an exercise for another day.
Signed-off-by: Gene Czarcinski <gene@czarc.net>
Signed-off-by: Laine Stump <laine@laine.org>
This should resolve https://bugzilla.redhat.com/show_bug.cgi?id=958907
Recent new addition of code to read/write active network state to the
NETWORK_STATE_DIR in the network driver broke startup for
qemu:///session. The network driver had several state file paths
hardcoded to /var, which could never possibly work in session mode.
This patch modifies *all* state files to use a variable string that is
set differently according to whether or not we're running
privileged. (It turns out that logDir was never used, so it's been
completely eliminated.)
There are very definitely other problems preventing dnsmasq and radvd
from running in non-privileged mode, but it's more consistent to have
the directories used by them be determined in the same fashion.
NB: I've noted before that the network driver is storing its state
(including dnsmasq and radvd state) in /var/lib, while qemu stores its
state in /var/run. It would probably have been better if the two
matched, but it's been this way for a long time, and changing it would
break running installations during an upgrade, so it's best to just
leave it as it is.
The call to virReportError conditionally switched between
two format strings, with different numbers of placeholders.
This meant the format string with no placeholders was not
protected by a "%s".
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
On the off-chance that creation of persistent configuration file would
fail when defining a network that is already started as transient, the
code would remove the transient data structure and thus the network.
This patch changes the code so that in such case, the network is again
marked as transient and left behind.
This isn't strictly speaking a bugfix, but I realized I'd gotten a bit
too verbose when I chose the names for
VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_*. This shortens them all a bit.
I remembered to document this bit, but somehow forgot to implement it.
This adds <driver name='kvm|vfio'/> as a subelement to the <forward>
element of a network (this puts it parallel to the match between
mode='hostdev' attribute in a network and type='hostdev' in an
<interface>).
Since it's already documented, only the parser, formatter, backend
driver recognition (it just translates/moves the flag into the
<interface> at the appropriate time), and a test case were needed.
(I used a separate enum for the values both because the original is
defined in domain_conf.h, which is unavailable from network_conf.h,
and because in the future it's possible that we may want to support
other non-hostdev oriented driver names in the network parser; this
makes sure that one can be expanded without the other).
There will soon be other items related to pci hostdevs that need to be
in the same part of the hostdevsubsys union as the pci address (which
is currently a single member called "pci". This patch replaces the
single member named pci with a struct named pci that contains a single
member named "addr".
Ensure that all drivers implementing public APIs use a
naming convention for their implementation that matches
the public API name.
eg for the public API virDomainCreate make sure QEMU
uses qemuDomainCreate and not qemuDomainStart
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
It will simplify later work if the sub-drivers have dedicated
APIs / field names. ie virNetworkDriver should have
virDrvNetworkOpen and virDrvNetworkClose methods
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that the driver struct field names match the public
API names. For an API virXXXX we must have a driver struct
field xXXXX. ie strip the leading 'vir' and lowercase any
leading uppercase letters.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Until now tranisent networks weren't really useful as libvirtd wasn't
able to remember them across restarts. This patch adds support for
loading status files of transient networks (that already were generated)
so that the status isn't lost.
This patch chops up virNetworkObjUpdateParseFile and turns it into
virNetworkLoadState and a few friends that will help us to load status
XMLs and refactors the functions that are loading the configs to use
them.
Detected by a simple Shell script:
for i in $(git ls-files -- '*.[ch]'); do
awk 'BEGIN {
fail=0
}
/# *include.*\.h/{
match($0, /["<][^">]*[">]/)
arr[substr($0, RSTART+1, RLENGTH-2)]++
}
END {
for (key in arr) {
if (arr[key] > 1) {
fail=1
printf("%d %s\n", arr[key], key)
}
}
if (fail == 1)
exit 1
}' $i
if test $? != 0; then
echo "Duplicate header(s) in $i"
fi
done;
A later patch will add the syntax-check to avoid duplicate
headers.
By current implementation, network inbound is required in order
to use 'floor' for guaranteeing minimal throughput. This is so,
because we want user to tell us the maximal throughput of the
network instead of finding out ourselves (and detect bogus values
in case of virtual interfaces). However, we are nowadays
requiring this only on documentation level. So if user starts a
domain with 'floor' set on one its interfaces, we silently ignore
the setting. We should error out instead.
This reverts commit 383ebc4694.
We decided the xml for this feature needed more thought to make sure
we are doing it the best way, in particular wrt option values that
have multiple items.
Originally, only a host name was used to associate a
DHCPv6 request with a specific IPv6 address. Further testing
demonstrates that this is an unreliable method and, instead,
a client-id or DUID needs to be used. According to DHCPv6
standards, this id can be a duid-LLT, duid-LL, or duid-UUID
even though dnsmasq will accept almost any text string.
Although validity checking of a specified string makes sure it is
hexadecimal notation with bytes separated by colons, there is no
rigorous check to make sure it meets the standard.
Documentation and schemas have been updated.
Signed-off-by: Gene Czarcinski <gene@czarc.net>
Signed-off-by: Laine Stump <laine@laine.org>
This patch adds support for a new <option>-Tag in the <dhcp> block of
network configs, based on a subset of the fifth proposal by Laine
Stump in the mailing list discussion at
https://www.redhat.com/archives/libvir-list/2012-November/msg01054.html.
Any such defined option will result in a dhcp-option=<number>,"<value>"
statement in the generated dnsmasq configuration file.
Currently, DHCP options can be specified by number only and there is
no whitelisting or blacklisting of option numbers, which should
probably be added.
Signed-off-by: Pieter Hollants <pieter@hollants.com>
Signed-off-by: Laine Stump <laine@laine.org>
We pass over the address/port start/end values many times so we put
them in structs.
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Laine Stump <laine@laine.org>
Let users set the port range to be used for forward mode NAT:
...
<forward mode='nat'>
<nat>
<port start='1024' end='65535'/>
</nat>
</forward>
...
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Laine Stump <laine@laine.org>
Support setting which public ip to use for NAT via attribute
address in subelement <nat> in <forward>:
...
<forward mode='nat'>
<address start='1.2.3.4' end='1.2.3.10'/>
</forward>
...
This will construct an iptables line using:
'-j SNAT --to-source <start>-<end>'
instead of:
'-j MASQUERADE'
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Laine Stump <laine@laine.org>
The conditional setting of cmdout in networkBuildDhcpDaemonCommandLine()
caused Coverity to complain that 'cmd' could be leaked if !cmdout. Since
the function is local and only called with cmdout being passed those checks
have been removed.
The fetch of 'ipdef' in networkRefreshDhcpDaemon() when the loop to fill
in ipv4def fails to find an ipv4 address with dhcp defined. The filled in
ipdef value was not used. Code was made unnecessary with commit it 2d5cd1.
The bandwidth plug and unplug functions were assuming that an
interface's bandwidth setting was always specified directly in the
domain's <interface> definition, but that's not necessarily true - it
could have been obtained from a <portgroup> definition in the network
definition. This patch fixes those functions to use
virDomainNetGetActualBandwidth(), which gets the bandwidth pointer
from iface->data.network.actual if it exists, otherwise returns
iface->bandwidth.
Remove extraneous check for 'netdef' when dereferencing for vlan.nTags.
Prior code would already check if netdef was NULL.
Coverity complained about a path where the 'vlan' was potentially valid,
but a prior checks may not have allocated 'iface->data.network.actual',
so like other paths it needs to be allocated on the fly.
Move the copying of vlan up earlier in networkAllocateActualDevice, so
that actual.type gets properly set.
Since the first assignment to vlan is redundant except in the case of
jumping immediately to validate from the start of the function,
eliminate its initial setting at the top of the function in favor of
calling the helper function virDomainNetGetActualVlan() (which doesn't
depend on the local vlan pointer being initialized) down at validate:
Signed-off-by: Laine Stump <laine@redhat.com>
If addition of rules in networkAddIptablesRules() failed the real error
was masked by error reported when trying to clean up the remaining
rules.
With this patch the original error message is saved and set back after
the removal is complete.
Commit 0211fd6e04 introduced regression
where newly defined networks were not made persistent.
This patch makes the network persistent on each successful definition.
This is yet another refinement to the fix for CVE-2012-3411:
https://bugzilla.redhat.com/show_bug.cgi?id=833033
It turns out that it would be very intrusive to correctly backport the
entire --bind-dynamic option to older dnsmasq versions
(e.g. dnsmasq-2.48 that is used on RHEL6.x and CentOS 6.x), but very
simple to patch those versions to just use SO_BINDTODEVICE on all
their listening sockets (SO_BINDTODEVICE also has the desired effect
of permitting only traffic that was received on the interface(s) where
dnsmasq was set to listen.)
This patch modifies the dnsmasq capabilities detection to detect the
string:
--bind-interfaces with SO_BINDTODEVICE
in the output of "dnsmasq --version", and in that case realize that
using the old --bind-interfaces option is just as safe as
--bind-dynamic (and therefore *not* forbid creation of networks that
use public IP address ranges).
If -bind-dynamic is available, it is still preferred over
--bind-interfaces.
Note that this patch does no harm in upstream, or in any distro's
downstream if it happens to end up there, but builds for distros that
have a new enough dnsmasq to support --bind-dynamic do *NOT* need to
specifically backport this patch; it's only required for distro
releases that have dnsmasq too old to have --bind-dynamic (and those
distros will need to add the SO_BINDTODEVICE patch to dnsmasq,
*including the extra string in the --version output*, as well.
Somehow I managed to push the changes to this file with improper
indentation. This patch just re-indents, reformats the comment lines,
and re-groups a couple of multi-line strings so that they fit within
80 columns. The resulting binary should be identical.
A forgotten "!" in recently-modified code at the top of
networkRefreshDaemon() meant an improper early return, which led to 1)
dnsmasq config files not being updated from the newly modified config,
and 2) dnsmasq not being sent a SIGHUP so that it could learn about
the changes to the config.
virNetworkDefGetIpByIndex() returns NULL if there are no ip objects of
the requested type, and if there are no IP elements, then dnsmasq
shouldn't be running, so we can return early. Otherwise we should
rewrite the config files and send a SIGHUP.
This patch resolves the problem reported in:
https://bugzilla.redhat.com/show_bug.cgi?id=886663
The source of the problem was the fix for CVE 2011-3411:
https://bugzilla.redhat.com/show_bug.cgi?id=833033
which was originally committed upstream in commit
753ff83a50. That commit improperly
removed the "--except-interface lo" from dnsmasq commandlines when
--bind-dynamic was used (based on comments in the latter bug).
It turns out that the problem reported in the CVE could be eliminated
without removing "--except-interface lo", and removing it actually
caused each instance of dnsmasq to listen on localhost on port 53,
which created a new problem:
If another instance of dnsmasq using "bind-interfaces" (instead of
"bind-dynamic") had already been started (or if another instance
started later used "bind-dynamic"), this wouldn't have any immediately
visible ill effects, but if you tried to start another dnsmasq
instance using "bind-interfaces" *after* starting any libvirt
networks, the new dnsmasq would fail to start, because there was
already another process listening on port 53.
(Subsequent to the CVE fix, another patch changed the network driver
to put dnsmasq options in a conf file rather than directly on the
dnsmasq commandline, but preserved the same options.)
This patch changes the network driver to *always* add
"except-interface=lo" to dnsmasq conf files, regardless of whether we use
bind-dynamic or bind-interfaces. This way no libvirt dnsmasq instances
are listening on localhost (and the CVE is still fixed).
The actual code change is miniscule, but must be propogated through all
of the test files as well.
I noticed that /var/lib/libvirt/dnsmasq/*.conf used the wrong word;
it was intended to match the wording in src/util/xml.c.
* src/network/bridge_driver.c (networkDnsmasqConfContents): Fix typo.
* tests/networkxml2confdata/*.conf: Update accordingly.
Currently, we are only keeping a inactive XML configuration
in status dir. This is no longer enough as we need to keep
this class_id attribute so we don't overwrite old entries
when the daemon restarts. However, since there has already
been release which has just <network/> as root element,
and we want to keep things compatible, detect that loaded
status file is older one, and don't scream about it.
Network should be notified if we plug in or unplug an
interface, so it can perform some action, e.g. set/unset
network part of QoS. However, we are doing this in very
early stage, so iface->ifname isn't filled in yet. So
whenever we want to report an error, we must use a different
identifier, e.g. the MAC address.
These classes can borrow unused bandwidth. Basically,
only egress qdsics can have classes, therefore we can
do this kind of traffic shaping only on host's outgoing,
that is domain's incoming traffic.
This patch changes how parameters are passed to dnsmasq. Instead of
being on the command line, the parameters are put into a file (one
parameter per line) and a commandline --conf-file= specifies the
location of the file. The file is located in the same directory as
the leases file.
Putting the dnsmasq parameters into a configuration file
allows them to be examined and more easily understood than
examining the command lines displayed by "ps ax". This is
especially true when a number of networks have been started.
When the use of dnsmasq was originally done, the required command line
was simple, but it has gotten more complicated over time and will
likely become even more complicated in the future.
Note: The test conf files have all been renamed .conf instead of
.argv, and tests/networkxml2xmlargvdata was moved to
tests/networkxml2xmlconfdata.
The DHCPv6 support includes IPV6 dhcp-range and dhcp-host for one
IPv6 subnetwork on one interface. This support will only work
if dnsmasq version >= 2.64; otherwise an error occurs if
dhcp-range or dhcp-host is specified for an IPv6 address.
Essentially, this change provides the same DHCP support for IPv6
that has been available for IPv4.
With dnsmasq >= 2.64, support for the RA service is also now provided
by dnsmasq (radvd is no longer used/started). (Although at least one
version of dnsmasq prior to 2.64 "supported" IPv6 Router
Advertisement, there were bugs (fixed in 2.64) that rendered it
unusable.)
Documentation and the network schema has been updated
to reflect the new support.
The attributes of a <network> element's <forward> element were
previously stored directly in the virNetworkDef object, but
virNetworkUpdateForward() needs to operate on a <forward> in
isolation, so this patchs pulls out all those attributes into a
separate virNetworkForwardDef struct (and shortens their names
appropriately). This new object is contained in the virNetworkDef, not
pointed to by it, so there is no extra memory management.
This patch makes no functional changes, it only changes, e.g.,
"nForwardIfs" to "forward.nifs".
Since there is only a single virNetworkDNSDef for any virNetworkDef,
and it's trivial to determine whether or not it contains any real
data, it's much simpler (and fits more uniformly with the parse
function calling sequence of the parsers for many other objects that
are subordinates of virNetworkDef) if virNetworkDef *contains* an
virNetworkDNSDef rather than pointing to one.
Since it is now just a part of another object rather than its own
object, it no longer makes sense to have a *Free() function, so that
is changed to a *Clear() function.
More importantly though, ParseXML and Clear functions are needed for
the individual items contained in a virNetworkDNSDef (srv, txt, and
host records), but none of them have a *Clear(), and only two of the
three had *ParseXML() functions (both of which used a non-uniform
arglist). Those problems are cleared up by this patch - it splits the
higher-level Clear function into separate functions for each of the
three, creates a parse for txt records, and cleans up the srv and host
parsers, so we now have all the utility functions necessary to
implement virNetworkDefUpdateDNS(Host|Srv|Txt).
This shortens the name of the structs for srv and txt, and their
instances in virNetworkDNSDef, to be more compact and uniform with the
naming of the dns host array. It also changes the type of ntxts, etc
from unsigned int to size_t, so that they can be used directly as args
to VIR_*_ELEMENT.
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=767057
It was possible to define a network with <forward mode='bridge'> that
had both a bridge device and a forward device defined. These two are
mutually exclusive by definition (if you are using a bridge device,
then this is a host bridge, and if you have a forward dev defined,
this is using macvtap). It was also possible to put <ip>, <dns>, and
<domain> elements in this definition, although those aren't supported
by the current driver (although it's conceivable that some other
driver might support that).
The items that are invalid by definition, are now checked in the XML
parser (since they will definitely *always* be wrong), and the others
are checked in networkValidate() in the network driver (since, as
mentioned, it's possible that some other network driver, or even this
one, could some day support setting those).
This patch adds the capability for virtual guests to do IPv6
communication via a virtual network interface with no IPv6 (gateway)
addresses specified. This capability has always been enabled by
default for IPv4, but disabled for IPv6 for security concerns, and
because it requires the ip6tables command to be operational (which
isn't the case on a system with the ipv6 module completely disabled).
This patch adds a new attribute "ipv6" at the toplevel of a <network>
object. If ipv6='yes', the extra ip6tables rules required to permite
inter-guest communications are added when the network is started. If
it is 'no', or not present, those rules will not be added; thus the
default behavior doesn't change, so there should be no compatibility
issues with any existing installations.
Note that virtual guests cannot communication with the virtualization
host via this interface, because the following kernel tunable has
been set:
net.ipv6.conf.<bridge_interface_name>.disable_ipv6 = 1
This assures that the bridge interface will not have an IPv6
link-local (fe80::) address.
To control this behavior so that it is not enabled by default, the parameter
ipv6='yes' on the <network> statement has been added.
Documentation related to this patch has been updated.
The network schema has also been updated.
Currently to deal with auto-shutdown libvirtd must periodically
poll all stateful drivers. Thus sucks because it requires
acquiring both the driver lock and locks on every single virtual
machine. Instead pass in a "inhibit" callback to virStateInitialize
which drivers can invoke whenever they want to inhibit shutdown
due to existance of active VMs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The only important state that should prevent libvirtd shutdown
is from running VMs. Networks, host devices, network filters
and storage pools are all long lived resources that have no
significant in-memory state. They should not block shutdown.
This bug resolves CVE-2012-3411, which is described in the following
bugzilla report:
https://bugzilla.redhat.com/show_bug.cgi?id=833033
The following report is specifically for libvirt on Fedora:
https://bugzilla.redhat.com/show_bug.cgi?id=874702
In short, a dnsmasq instance run with the intention of listening for
DHCP/DNS requests only on a libvirt virtual network (which is
constructed using a Linux host bridge) would also answer queries sent
from outside the virtualization host.
This patch takes advantage of a new dnsmasq option "--bind-dynamic",
which will cause the listening socket to be setup such that it will
only receive those requests that actually come in via the bridge
interface. In order for this behavior to actually occur, not only must
"--bind-interfaces" be replaced with "--bind-dynamic", but also all
"--listen-address" options must be replaced with a single
"--interface" option. Fully:
--bind-interfaces --except-interface lo --listen-address x.x.x.x ...
(with --listen-address possibly repeated) is replaced with:
--bind-dynamic --interface virbrX
Of course libvirt can't use this new option if the host's dnsmasq
doesn't have it, but we still want libvirt to function (because the
great majority of libvirt installations, which only have mode='nat'
networks using RFC1918 private address ranges (e.g. 192.168.122.0/24),
are immune to this vulnerability from anywhere beyond the local subnet
of the host), so we use the new dnsmasqCaps API to check if dnsmasq
supports the new option and, if not, we use the "old" option style
instead. In order to assure that this permissiveness doesn't lead to a
vulnerable system, we do check for non-private addresses in this case,
and refuse to start the network if both a) we are using the old-style
options, and b) the network has a publicly routable IP
address. Hopefully this will provide the proper balance of not being
disruptive to those not practically affected, and making sure that
those who *are* affected get their dnsmasq upgraded.
(--bind-dynamic was added to dnsmasq in upstream commit
54dd393f3938fc0c19088fbd319b95e37d81a2b0, which was included in
dnsmasq-2.63)
In order to optionally take advantage of new features in dnsmasq when
the host's version of dnsmasq supports them, but still be able to run
on hosts that don't support the new features, we need to be able to
detect the version of dnsmasq running on the host, and possibly
determine from the help output what options are in this dnsmasq.
This patch implements a greatly simplified version of the capabilities
code we already have for qemu. A dnsmasqCaps device can be created and
populated either from running a program on disk, reading a file with
the concatenated output of "dnsmasq --version; dnsmasq --help", or
examining a buffer in memory that contains the concatenated output of
those two commands. Simple functions to retrieve capabilities flags,
the version number, and the path of the binary are also included.
bridge_driver.c creates a single dnsmasqCaps object at driver startup,
and disposes of it at driver shutdown. Any time it must be used, the
dnsmasqCapsRefresh method is called - it checks the mtime of the
binary, and re-runs the checks if the binary has changed.
networkxml2argvtest.c creates 2 "artificial" dnsmasqCaps objects at
startup - one "restricted" (doesn't support --bind-dynamic) and one
"full" (does support --bind-dynamic). Some of the test cases use one
and some the other, to make sure both code pathes are tested.
The virStateInitialize method and several cgroups methods were
using an 'int privileged' parameter or similar for dual-state
values. These are better represented with the bool type.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
bridge_driver.h: silence gcc warnings:
statement with no effect [-Wunused-value]
unused variable 'net' [-Wunused-variable]
virdrivermoduletest.c: don't require network driver module
if it hasn't been built.
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When assigning the new persistent definition for a transient network
(thus making it persistent) the network needs to be marked persistent
before actually atempting to assign the definition.
Until now, the network undefine API was able to undefine only inactive
networks. The restriction doesn't make sense any more so this patch
implements changing networks to transient.
When a transient network was created some of the checks weren't run on
the definition allowing to start invalid networks.
This patch splits out code to the network validation function and
re-uses that code when creating transient networks.
The network driver didn't care about config files when a network was
destroyed, just when it was undefined leaving behind files for transient
networks.
This patch splits out the cleanup code to a helper function that handles
the cleanup if the inactive network object is being removed and re-uses
this code when getting rid of inactive networks.
The hosts file was created in the network definition function. This
patch moves the place the file is being created to the point where
dnsmasq is being started.
Three FORWARD chain rules are added and two INPUT chain rules
are added when a network is started but only the FORWARD chain
rules are removed when the network is destroyed.
This was found during testing of the fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=868483
networkValidate was supposed to check for the existence of multiple
portgroups and report an error if this was encountered. It did, but
there were two problems:
1) even though it logged an error, it still returned success, allowing
the operation to continue.
2) It could exit the portgroup checking loop early (or possibly not
even do it once) if a vlan tag was supplied in the base network config
or one of the portgroups.
This patch fixes networkValidate to return failure in addition to
logging the error, and also changes it to not exit the portgroup
checking loop early. The logic was a bit off in the checking for vlan
anyway, and it's intertwined with fixing the early loop exit, so I
fixed that as well. Now it correctly checks for combinations where a
<virtualport> is specified in the base network def and <vlan> is given
in a portgroup, as well as the opposite (<vlan> in base network def
and <virtualport> in portgroup), and ignores the case of a disallowed
vlan when using *no* portgroup if there is a default portgroup (since
in that case there is no way to not use any portgroup).