All of the functions in util/interface.c were returning 0 on success,
but some returned -1 on error, and some returned a positive value
(usually the value of errno, but sometimes just 1). Libvirt's standard
is to return < 0 on error (in the case of functions that need to
return errno, -errno is returned.
This patch modifies all functions in interface.c to consistently
return < 0 on error, and makes changes to callers of those functions
where necessary.
This is in response to bugzilla 664629
https://bugzilla.redhat.com/show_bug.cgi?id=664629
The patch below returns an appropriate error message if the chain of
nwfilters is found to contain unresolvable variables and therefore
cannot be instantiated.
Example: The following XMl added to a domain:
<interface type='bridge'>
<mac address='52:54:00:9f:80:45'/>
<source bridge='virbr0'/>
<model type='virtio'/>
<filterref filter='test'/>
</interface>
that references the following filter
<filter name='test' chain='root'>
<filterref filter='clean-traffic'/>
<filterref filter='allow-dhcp-server'/>
</filter>
now displays upon 'virsh start mydomain'
error: Failed to start domain mydomain
error: internal error Cannot instantiate filter due to unresolvable variable: DHCPSERVER
'DHPCSERVER' is contained in allow-dhcp-server.
Done as a separate commit to make backporting the next patch easier.
We are already using "intprops.h", but this makes it explicit.
* .gnulib: Update, for syntax-check fix.
* bootstrap.conf (gnulib_modules): Make intprops use explicit.
* src/locking/domain_lock.c (includes): Drop unused header.
* src/nwfilter/nwfilter_learnipaddr.c (includes): Use "", not <>,
for gnulib.
Seems reasonable to have all command wrappers in the same place
v2:
Dont move SetInherit
v3:
Comment spelling fix
Adjust WARN0 comment
Remove spurious #include movement
Don't include sys/types.h
Combine virExec enums
Signed-off-by: Cole Robinson <crobinso@redhat.com>
This patch reorders the locks for the nwfilter updates and the access
the nwfilter objects. In the case that the IP address learning thread
was instantiating filters while an update happened, the previous order
lead to a deadlock.
In most cases this affects flags parameters that are unsigned in the
public and driver API but signed in the XDR protocol. Switch the
XDR protocol to unsigned for those.
A counterexample is virNWFilterGetXMLDesc. Its flags parameter is signed
in the public API and XDR protocol, but unsigned in the driver API.
This patch enables filtering of gratuitous ARP packets using the following XML:
<rule action='accept' direction='in' priority='425'>
<arp gratuitous='true'/>
</rule>
The public API and RPC over-the-wire format have no flags argument,
so neither should the internal callback API. This simplifies the
RPC generator.
* src/driver.h (virDrvNWFilterDefineXML): Drop argument that does
not match public API.
* src/nwfilter/nwfilter_driver.c (nwfilterDefine): Likewise.
* src/libvirt.c (virNWFilterDefineXML): Likewise.
* daemon/remote_generator.pl: Drop special case.
These VIR_XXXX0 APIs make us confused, use the non-0-suffix APIs instead.
How do these coversions works? The magic is using the gcc extension of ##.
When __VA_ARGS__ is empty, "##" will swallow the "," in "fmt," to
avoid compile error.
example: origin after CPP
high_level_api("%d", a_int) low_level_api("%d", a_int)
high_level_api("a string") low_level_api("a string")
About 400 conversions.
8 special conversions:
VIR_XXXX0("") -> VIR_XXXX("msg") (avoid empty format) 2 conversions
VIR_XXXX0(string_literal_with_%) -> VIR_XXXX(%->%%) 0 conversions
VIR_XXXX0(non_string_literal) -> VIR_XXXX("%s", non_string_literal)
(for security) 6 conversions
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
This matches the public API and helps to get rid of some special
case code in the remote generator.
Rename driver API functions and XDR protocol structs.
No functional change included outside of the remote generator.
We already have virAsprintf, so picking a similar name helps for
seeing a similar purpose. Furthermore, the prefix V before printf
generally implies 'va_list', even though this variant was '...', and
the old name got in the way of adding a new va_list version.
global rename performed with:
$ git grep -l virBufferVSprintf \
| xargs -L1 sed -i 's/virBufferVSprintf/virBufferAsprintf/g'
then revert the changes in ChangeLog-old.
Call shutdown functions for all subcomponents in nwfilterDriverShutdown.
Make sure that this shutdown functions can safely be called multiple times
and independent from the actual subcomponents state.
gcc 4.6 warns when a variable is initialized but isn't used afterwards:
vmware/vmware_driver.c:449:18: warning: variable 'vmxPath' set but not used [-Wunused-but-set-variable]
This patch fixes these warnings. There are still 2 offending files:
- vbox_tmpl.c: the variable is used inside an #ifdef and is assigned several
times outside of #ifdef. Fixing the warning would have required wrapping
all the assignment inside #ifdef which hurts readability.
vbox/vbox_tmpl.c: In function 'vboxAttachDrives':
vbox/vbox_tmpl.c:3918:22: warning: variable 'accessMode' set but not used [-Wunused-but-set-variable]
- esx_vi_types.generated.c: the name implies it's generated code and I
didn't want to dive into the code generator
esx/esx_vi_types.generated.c: In function 'esxVI_FileQueryFlags_Free':
esx/esx_vi_types.generated.c:1203:3: warning: variable 'item' set but not used [-Wunused-but-set-variable]
This patch adds support for the evaluation of TCP flags in nwfilters.
It adds documentation to the web page and extends the tests as well.
Also, the nwfilter schema is extended.
The following are some example for rules using the tcp flags:
<rule action='accept' direction='in'>
<tcp state='NONE' flags='SYN/ALL' dsptportstart='80'/>
</rule>
<rule action='drop' direction='in'>
<tcp state='NONE' flags='SYN/ALL'/>
</rule>
Child processes don't always reach _exit(); if they die from a
signal, then any messages should still be accurate. Most users
either expect a 0 status (thankfully, if status==0, then
WIFEXITED(status) is true and WEXITSTATUS(status)==0 for all
known platforms) or were filtering on WIFEXITED before printing
a status, but a few were missing this check. Additionally,
nwfilter_ebiptables_driver was making an assumption that works
on Linux (where WEXITSTATUS shifts and WTERMSIG just masks)
but fails on other platforms (where WEXITSTATUS just masks and
WTERMSIG shifts).
* src/util/command.h (virCommandTranslateStatus): New helper.
* src/libvirt_private.syms (command.h): Export it.
* src/util/command.c (virCommandTranslateStatus): New function.
(virCommandWait): Use it to also diagnose status from signals.
* src/security/security_apparmor.c (load_profile): Likewise.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Likewise.
* src/util/util.c (virExecDaemonize, virRunWithHook)
(virFileOperation, virDirCreate): Likewise.
* daemon/remote.c (remoteDispatchAuthPolkit): Likewise.
* src/nwfilter/nwfilter_ebiptables_driver.c (ebiptablesExecCLI):
Likewise.
Relax the restriction that the hash table key must be a string
by allowing an arbitrary hash code generator + comparison func
to be provided
* util/hash.c, util/hash.h: Allow any pointer as a key
* internal.h: Include stdbool.h as standard.
* conf/domain_conf.c, conf/domain_conf.c,
conf/nwfilter_params.c, nwfilter/nwfilter_gentech_driver.c,
nwfilter/nwfilter_gentech_driver.h, nwfilter/nwfilter_learnipaddr.c,
qemu/qemu_command.c, qemu/qemu_driver.c,
qemu/qemu_process.c, uml/uml_driver.c,
xen/xm_internal.c: s/char */void */ in hash callbacks
Since the deallocator is passed into the constructor of
a hash table it is not desirable to pass it into each
function again. Remove it from all functions, but provide
a virHashSteal to allow a item to be removed from a hash
table without deleteing it.
* src/util/hash.c, src/util/hash.h: Remove deallocator
param from all functions. Add virHashSteal
* src/libvirt_private.syms: Add virHashSteal
* src/conf/domain_conf.c, src/conf/nwfilter_params.c,
src/nwfilter/nwfilter_learnipaddr.c,
src/qemu/qemu_command.c, src/xen/xm_internal.c: Update
for changed hash API
This patch adds the possibility to not just drop packets, but to also have them rejected where iptables at least sends an ICMP msg back to the originator. On ebtables this again maps into dropping packets since rejecting is not supported.
I am adding 'since 0.8.9' to the docs assuming this will be the next version of libvirt.
Now that the virHash handling functions call virReportOOMError by
themselves when needed, users of the virHash API no longer need to
do it by themselves. Since users of the virHash API were not
consistently calling virReportOOMError after memory failures from
the virHash code, this has the added benefit of making OOM
reporting from this code more consistent and reliable.
This patch reorders the connlimit and comment match extensions relative to the state match (-m state); connlimit being most useful if found after a -m state --state NEW and not before it.
When run non-root the nwfilter driver logs error messages about
being unable to find iptables/ebtables commands (they are in
/sbin which isn't in $PATH). The nwfilter driver can't ever work
as non-root, so simply skip it entirely thus avoiding the error
messages
* src/conf/nwfilter_conf.h, src/nwfilter/nwfilter_driver.c,
src/nwfilter/nwfilter_gentech_driver.c,
src/nwfilter/nwfilter_gentech_driver.h: Pass 'bool privileged'
flag down to final driver impl
* src/nwfilter/nwfilter_ebiptables_driver.c: Skip initialization
if not privileged
The public object is called NWFilter but the corresponding private
object is called NWFilterPool. I don't see compelling reasons for this
Pool suffix. One might argue that an NWFilter is a "pool" of rules, etc.
Remove the Pool suffix from NWFilterPool. No functional change included.
The IP address learning thread was causing a deadlock when it instantiated a filter while a filter update/change was ongoing. The reason for this was the ordering of locks due to the following calls
virNWFilterUnlockFilterUpdates()
virNWFilterPoolObjFindByName()
The below patch now puts the order of the locks in the above shown order when instantiating the filter from the IP address learning thread.
Rather than only cleaning any remaining ebtables rules, also clean those applied to iptables and ip6tables when detecting the IP address of an interface. Previous applied iptables rules may hinder DHCP packets.
* src/nwfilter/nwfilter_ebiptables_driver.c
(ebiptablesWriteToTempFile): Use /bin/sh.
(bash_cmd_path): Delete.
(ebiptablesDriverInit, ebiptablesDriverShutdown): No need to
search for bash.
(CMD_EXEC): Prefer $() over ``, since we can assume POSIX.
(iptablesSetupVirtInPost): Use portable 'test' syntax.
(iptablesLinkIPTablesBaseChain): Use POSIX $(()) syntax.
Using automated replacement with sed and editing I have now replaced all
occurrences of close() with VIR_(FORCE_)CLOSE() except for one, of
course. Some replacements were straight forward, others I needed to pay
attention. I hope I payed attention in all the right places... Please
have a look. This should have at least solved one more double-close
error.
The inet_pton and inet_ntop functions are obsolete, replaced
by getaddrinfo+getnameinfo with the AI_NUMERICHOST flag set.
These can be accessed via the virSocket APIs.
The bridge.c code had methods for fetching the IP address of
a bridge which used inet_ntop. Aside from the use of inet_ntop
these methods are broken, because a NIC can have multiple
addresses and this only returns one address. Since the methods
are never used, just remove them.
* src/conf/network_conf.c, src/nwfilter/nwfilter_learnipaddr.c:
Replace inet_pton and inet_ntop with virSocket APIs
* src/util/bridge.c, src/util/bridge.h: Remove unused methods
which called inet_ntop.
The getnameinfo() function is more flexible than inet_ntop()
avoiding the need to if/else the code based on socket family.
Also make it support UNIX socket addrs and allow inclusion
of a port (service) address. Finally do proper error reporting
via normal APIs.
* src/conf/domain_conf.c, src/nwfilter/nwfilter_ebiptables_driver.c,
src/qemu/qemu_conf.c: Fix error handling with virSocketFormat
* src/util/network.c: Rewrite virSocketFormat to use getnameinfo
and cope with UNIX socket addrs.
The nwIPAddress was simply a wrapper about virSocketAddr.
Just use the latter directly, removing all the extra field
de-references from code & helper APIs for parsing/formatting.
Also remove all the redundant casts from strong types to
void * and then immediately back to strong types.
* src/conf/nwfilter_conf.h: Remove nwIPAddress
* src/conf/nwfilter_conf.c, src/nwfilter/nwfilter_ebiptables_driver.c:
Update to use virSocketAddr and remove void * casts.
In the table built for traffic coming from the VM going to the host make the following changes:
- don't ACCEPT the packets but do a 'RETURN' and let the host-specific firewall rules in subsequent rules evaluate whether the traffic is allowed to enter
- use the '-m state' in the rules as everywhere else
The following filter transition from a filter allowing incoming TCP connections
<rule action='accept' direction='in' priority='401'>
<tcp/>
</rule>
<rule action='accept' direction='out' priority='500'>
<tcp/>
</rule>
to one that does not allow them
<rule action='drop' direction='in' priority='401'>
<tcp/>
</rule>
<rule action='accept' direction='out' priority='500'>
<tcp/>
</rule>
did previously not cut off existing (ssh) connections but only prevented newly initiated ones. The attached patch allows to cut off existing connections as well, thus enforcing what the filter is showing.
I had only tested with a configuration where the physical interface is connected to the bridge where the filters are applied. This patch now also solves a filtering problem where the physical interface is not connected to the bridge, but the bridge is given an IP address and the host routes between bridge and physical interface. Here the filters drop non-allowed traffic on the outgoing side on the host.
This is from a bug report and conversation on IRC where Soren reported that while a filter update is occurring on one or more VMs (due to a rule having been edited for example), a deadlock can occur when a VM referencing a filter is started.
The problem is caused by the two locking sequences of
qemu driver, qemu domain, filter # for the VM start operation
filter, qemu_driver, qemu_domain # for the filter update operation
that obviously don't lock in the same order. The problem is the 2nd lock sequence. Here the qemu_driver lock is being grabbed in qemu_driver:qemudVMFilterRebuild()
The following solution is based on the idea of trying to re-arrange the 2nd sequence of locks as follows:
qemu_driver, filter, qemu_driver, qemu_domain
and making the qemu driver recursively lockable so that a second lock can occur, this would then lead to the following net-locking sequence
qemu_driver, filter, qemu_domain
where the 2nd qemu_driver lock has been ( logically ) eliminated.
The 2nd part of the idea is that the sequence of locks (filter, qemu_domain) and (qemu_domain, filter) becomes interchangeable if all code paths where filter AND qemu_domain are locked have a preceding qemu_domain lock that basically blocks their concurrent execution
So, the following code paths exist towards qemu_driver:qemudVMFilterRebuild where we now want to put a qemu_driver lock in front of the filter lock.
-> nwfilterUndefine() [ locks the filter ]
-> virNWFilterTestUnassignDef()
-> virNWFilterTriggerVMFilterRebuild()
-> qemudVMFilterRebuild()
-> nwfilterDefine()
-> virNWFilterPoolAssignDef() [ locks the filter ]
-> virNWFilterTriggerVMFilterRebuild()
-> qemudVMFilterRebuild()
-> nwfilterDriverReload()
-> virNWFilterPoolLoadAllConfigs()
->virNWFilterPoolObjLoad()
-> virNWFilterPoolAssignDef() [ locks the filter ]
-> virNWFilterTriggerVMFilterRebuild()
-> qemudVMFilterRebuild()
-> nwfilterDriverStartup()
-> virNWFilterPoolLoadAllConfigs()
->virNWFilterPoolObjLoad()
-> virNWFilterPoolAssignDef() [ locks the filter ]
-> virNWFilterTriggerVMFilterRebuild()
-> qemudVMFilterRebuild()
Qemu is not the only driver using the nwfilter driver, but also the UML driver calls into it. Therefore qemuVMFilterRebuild() can be exchanged with umlVMFilterRebuild() along with the driver lock of qemu_driver that can now be a uml_driver. Further, since UML and Qemu domains can be running on the same machine, the triggering of a rebuild of the filter can touch both types of drivers and their domains.
In the patch below I am now extending each nwfilter callback driver with functions for locking and unlocking the (VM) driver (UML, QEMU) and introduce new functions for locking all registered callback drivers and unlocking them. Then I am distributing the lock-all-cbdrivers/unlock-all-cbdrivers call into the above call paths. The last shown callpath starting with nwfilterDriverStart() is problematic since it is initialize before the Qemu and UML drives are and thus a lock in the path would result in a NULL pointer attempted to be locked -- the call to virNWFilterTriggerVMFilterRebuild() is never called, so we never lock either the qemu_driver or the uml_driver in that path. Therefore, only the first 3 paths now receive calls to lock and unlock all callback drivers. Now that the locks are distributed where it matters I can remove the qemu_driver and uml_driver lock from qemudVMFilterRebuild() and umlVMFilterRebuild() and not requiring the recursive locks.
For now I want to put this out as an RFC patch. I have tested it by 'stretching' the critical section after the define/undefine functions each lock the filter so I can (easily) concurrently execute another VM operation (suspend,start). That code is in this patch and if you want you can de-activate it. It seems to work ok and operations are being blocked while the update is being done.
I still also want to verify the other assumption above that locking filter and qemu_domain always has a preceding qemu_driver lock.
In this patch I am extending the rule instantiator to create the state
match according to the state attribute in the XML. Only one iptables
rule in the incoming or outgoing direction will be created for a rule
in direction 'in' or 'out' respectively. A rule in direction 'inout' does
get iptables rules in both directions.
In this patch I am extending the rule instantiator to create the comment
node where supported, which is the case for iptables and ip6tables.
Since commands are written in the format
cmd='iptables ...-m comment --comment \"\" '
certain characters ('`) in the comment need to be escaped to
prevent comments from becoming commands themselves or cause other
forms of (bash) substitutions. I have tested this with various input and in
my tests the input made it straight into the comment. A test case for TCK
will be provided separately that tests this.