Even though we hit an error in client's IO loop, we still want to
process any pending data. So instead of reporting the error right away,
we can finish the current iteration and report the error once we're done
with it. Note that the error is stored in client->error by
virNetClientMarkClose so we don't need to worry about it being reset or
rewritten by any API we call in the meantime.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Whenever a connection was closed due to keepalive timeout, we would log
a warning but the interrupted API would return rather useless generic
error:
internal error: received hangup / error event on socket
Let's report a proper keepalive timeout error and make sure it is
propagated to all pending APIs. The error should be better now:
internal error: connection closed due to keepalive timeout
Based on an old patch from Martin Kletzander.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
By default, getaddrinfo() will return addresses for both
IPv4 and IPv6 if both protocols are enabled, and so the
RPC code will listen/connect to both protocols too. There
may be cases where it is desirable to restrict this to
just one of the two protocols, so add an 'int family'
parameter to all the TCP related APIs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As of bba93d40 all of our RPC objects are derived from
virObjectLockable. However, during rewrite some errors sneaked
in. For instance, the dispose functions to virNetClient and
virNetServerClient objects were not only freeing allocated
memory, but unlocking themselves. This is wrong. Object should
never disappear while locked.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Whenever client socket was marked as closed for some reason, it could've
been changed when really closing the connection. With this patch the
proper reason is kept since the first time it's marked as closed.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Currently, we use pthread_sigmask(SIG_BLOCK, ...) prior to calling
poll(). This is okay, as we don't want poll() to be interrupted.
However, then - immediately as we fall out from the poll() - we try to
restore the original sigmask - again using SIG_BLOCK. But as the man
page says, SIG_BLOCK adds signals to the signal mask:
SIG_BLOCK
The set of blocked signals is the union of the current set and the set argument.
Therefore, when restoring the original mask, we need to completely
overwrite the one we set earlier and hence we should be using:
SIG_SETMASK
The set of blocked signals is set to the argument set.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Any source file which calls the logging APIs now needs
to have a VIR_LOG_INIT("source.name") declaration at
the start of the file. This provides a static variable
of the virLogSource type.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The dtrace probe macros rely on the logging API. We can't make
the internal.h header include the virlog.h header though since
that'd be a circular include. Instead simply split the dtrace
probes into their own header file, since there's no compelling
reason for them to be in the main internal.h header.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit a1cbe4b5 added a check for spaces around assignments and this
patch extends it to checks for spaces around '=='. One exception is
virAssertCmpInt where comma after '==' is acceptable (since it is a
macro and '==' is its argument).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This patch enables the password authentication in the libssh2 connection
driver. There are a few benefits to this step:
1) Hosts with challenge response authentication will now be supported
with the libssh2 connection driver.
2) Credential for hosts can now be stored in the authentication
credential config file
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
http://www.uhv.edu/ac/newsletters/writing/grammartip2009.07.01.htm
(and several other sites) give hints that 'onto' is best used if
you can also add 'up' just before it and still make sense. In many
cases in the code base, we really want the two-word form, or even
a simplification to just 'on' or 'to'.
* docs/hacking.html.in: Use correct 'on to'.
* python/libvirt-override.c: Likewise.
* src/lxc/lxc_controller.c: Likewise.
* src/util/virpci.c: Likewise.
* daemon/THREADS.txt: Use simpler 'on'.
* docs/formatdomain.html.in: Better usage.
* docs/internals/rpc.html.in: Likewise.
* src/conf/domain_event.c: Likewise.
* src/rpc/virnetclient.c: Likewise.
* tests/qemumonitortestutils.c: Likewise.
* HACKING: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
Despite the comment stating virNetClientIncomingEvent handler should
never be called with either client->haveTheBuck or client->wantClose
set, there is a sequence of events that may lead to both booleans being
true when virNetClientIncomingEvent is called. However, when that
happens, we must not immediately close the socket as there are other
threads waiting for the buck and they would cause SIGSEGV once they are
woken up after the socket was closed. Another thing is we should clear
all remaining calls in the queue after closing the socket.
The situation that can lead to the crash involves three threads, one of
them running event loop and the other two calling libvirt APIs. The
event loop thread detects an event on client->sock and calls
virNetClientIncomingEvent handler. But before the handler gets a chance
to lock client, the other two threads (T1 and T2) start calling some
APIs. T1 gets the buck and detects EOF on client->sock while processing
its RPC call. Since T2 is waiting for its own call, T1 passes the buck
on to it and unlocks client. But before T2 gets the signal, the event
loop thread wakes up, does its job and closes client->sock. The crash
happens when T2 actually wakes up and tries to do its job using a closed
client->sock.
When converting to virObject, the probes on the 'Free' functions
were removed on the basis that there is a probe on virObjectFree
that suffices. This puts a burden on people writing probe scripts
to identify which object is being dispose. This adds back probes
in the 'Dispose' functions and updates the rpc monitor systemtap
example to use them
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When creating the virClass object for virNetClient, we specified
virObject as the parent instead of virObjectLockable
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently all classes must directly inherit from virObject.
This allows for arbitrarily deep hierarchy. There's not much
to this aside from chaining up the 'dispose' handlers from
each class & providing APIs to check types.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A number of bugs handling file descriptors received from the
server caused the FDs to be lost and leaked.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.
* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/; If/. If/
e5a1bee07 introduced a regression in Boxes: when Boxes is left idle
(it's still doing some libvirt calls in the background), the
libvirt connection gets closed after a few minutes. What happens is
that this code in virNetClientIOHandleOutput gets triggered:
if (!thecall)
return -1; /* Shouldn't happen, but you never know... */
and after the changes in e5a1bee07, this causes the libvirt connection
to be closed.
Upon further investigation, what happens is that
virNetClientIOHandleOutput is called from gvir_event_handle_dispatch
in libvirt-glib, which is triggered because the client fd became
writable. However, between the times gvir_event_handle_dispatch
is called, and the time the client lock is grabbed and
virNetClientIOHandleOutput is called, another thread runs and
completes the current call. 'thecall' is then NULL when the first
thread gets to run virNetClientIOHandleOutput.
After describing this situation on IRC, danpb suggested this:
11:37 < danpb> In that case I think the correct thing would be to change
'return -1' above to 'return 0' since that's not actually an
error - its a rare, but expected event
which is what this patch is doing. I've tested it against master
libvirt, and I didn't get disconnected in ~10 minutes while this
happens in less than 5 minutes without this patch.
Unfortunately libssh2 doesn't support all types of host keys that can be
saved in the known_hosts file. Also it does not report that parsing of
the file failed. This results into truncated known_hosts files where the
standard client stores keys also in other formats (eg.
ecdsa-sha2-nistp256).
This patch changes the default location of the known_hosts file into the
libvirt private configuration directory, where it will be only written
by the libssh2 layer itself. This prevents trashing user's known_host
file.
This patch adds a glue layer to enable using libssh2 code with the
network client code.
As in the original client implementation, shell code is sent to the
server to detect correct options for netcat and connect to libvirt's
unix socket.
Currently the virNetClientPtr constructor will always register
the async IO event handler and the keepalive objects. In the
case of the lock manager, there will be no event loop available
nor keepalive support required. Split this setup out of the
constructor and into separate methods.
The remote driver will enable async IO and keepalives, while
the LXC driver will only enable async IO
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
From man poll(2), poll does not set errno=EAGAIN on interrupt, however
it does set errno=EINTR. Have libvirt retry on the appropriate errno.
Under heavy load, a program of mine kept getting libvirt errors 'poll on
socket failed: Interrupted system call'. The signals were SIGCHLD from
processes forked by threads unrelated to those using libvirt.