If 2 threads call abort for example then one of them
will hang because client will send 2 abort messages and
server will reply only on first of them, the second will be
ignored. And on server reply client changes the state only
one of abort message to complete, the second will hang forever.
There are other similar issues.
We should complete all messages waiting reply if we got
error or expected abort/finish reply from server. Also if one
thread send finish and another abort one of them will win
the race and server will either abort or finish stream. If
stream is aborted then thread requested finishing should report
error. In order to archive this let's keep stream closing reason
in @closed field. If we receive VIR_NET_OK message for stream
then stream is finished if oldest (closest to queue end) message
in stream queue is finish message and stream is aborted if oldest
message is abort message. Otherwise it is protocol error.
By the way we need to fix case of receiving VIR_NET_CONTINUE
message. Now we take oldest message in queue and check if
this is dummy message. If one thread first sends abort and
second thread then receives data then oldest message is abort
message and second thread won't be notified when data arrives.
Let's find oldest dummy message instead.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Checking virNetClientStreamRaiseError without client lock
is racy which is fixed in [1] for example. Thus let's remove such checks
when we are sending message to server. And in other cases
(like virNetClientStreamRecvHole for example) let's move the check
into client stream code.
virNetClientStreamRecvPacket already have stream lock so we could
introduce another error checking function like virNetClientStreamRaiseErrorLocked
but as error is set when both client and stream lock are hold we
can remove locking from virNetClientStreamRaiseError because all
callers hold either client or stream lock.
Also let's split virNetClientStreamRaiseErrorLocked into checking
state function and checking message send status function. They are
same yet.
[1] 1b6a29c21: rpc: fix race on stream abort/finish and server side abort
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Stream server error is not propagated if thread does not have the buck.
In case we have the buck we are ok due to the code added in [1].
Let's check for stream error on all paths. Now we don't need
to raise error in virNetClientCallDispatchStream.
Old code reported error only if the first message in wait
queue awaits reply. It is odd as depends on wait queue
situation. For example if we have only TX
message in queue and in one iteration loop both send the
message and receive error then thread sending TX message did
not receive the error. Next if we have RX message (first)
and TX message (second) in queue and in one iteration
loop both send the TX message and receive error then
thread sending TX message received error. In short
it was inconsistent. Let's report error whenever
we received it and for every type of message as it makes
sense to report errors as early as possible.
[1] 16c6e2b41: Fix propagation of RPC errors from streams
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
In next patches we'll add stream state checks to this
function that applicable to all call paths. This is handy
place because we hold client lock here.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Stream abort/finish can hang because we can receive abort message
from server and yet sent abort/finish message to server. The latter
will not be answered ever because after server sends abort message
it forgets the stream and messages for unknown stream are simply ignored.
We check for stream error at the very beginning of remoteStreamFinish/remoteStreamAbort
but stream error can be set after the check in another thread operating
on stream. Let's check for stream error under client lock similar
to what's done in [1].
[1] 833b901cb: stream: Check for stream EOF
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Missing semicolon at the end of macros can confuse some analyzers
(like cppcheck <filename>). VIR_ONCE_GLOBAL_INIT is almost
exclusively called without an ending semicolon, but let's
standardize on using one like the other macros.
Add a dummy struct definition at the end of the macro, so
the compiler will require callers to add a semicolon.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Cole Robinson <crobinso@redhat.com>
if virNetClientNew finishes with error before sock is set
to client object then sock does not get unrefed. This is
unexpected by function clients like virNetClientNewUNIX.
Let's make sure sock gets unrefed on any error path.
Next some clients like virNetClientNewLibSSH2 try to unref
sock on virNetClientNew errors. This is not correct even
before this patch because in some cases virNetClientNew
unrefed sock on error path by itself. Let's give up
sock managment to virNetClientNew entirely.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
In many files there are header comments that contain an Author:
statement, supposedly reflecting who originally wrote the code.
In a large collaborative project like libvirt, any non-trivial
file will have been modified by a large number of different
contributors. IOW, the Author: comments are quickly out of date,
omitting people who have made significant contribitions.
In some places Author: lines have been added despite the person
merely being responsible for creating the file by moving existing
code out of another file. IOW, the Author: lines give an incorrect
record of authorship.
With this all in mind, the comments are useless as a means to identify
who to talk to about code in a particular file. Contributors will always
be better off using 'git log' and 'git blame' if they need to find the
author of a particular bit of code.
This commit thus deletes all Author: comments from the source and adds
a rule to prevent them reappearing.
The Copyright headers are similarly misleading and inaccurate, however,
we cannot delete these as they have legal meaning, despite being largely
inaccurate. In addition only the copyright holder is permitted to change
their respective copyright statement.
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Replace instances where we previously called virGetLastError just to
either get the code or to check if an error exists with
virGetLastErrorCode to avoid a validity pre-check.
Signed-off-by: Ramy Elkest <ramyelkest@gmail.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
So far we are repeating the following lines over and over:
if (!(virSomeObjectClass = virClassNew(virClassForObject(),
"virSomeObject",
sizeof(virSomeObject),
virSomeObjectDispose)))
return -1;
While this works, it is impossible to do some checking. Firstly,
the class name (the 2nd argument) doesn't match the name in the
code in all cases (the 3rd argument). Secondly, the current style
is needlessly verbose. This commit turns example into following:
if (!(VIR_CLASS_NEW(virSomeObject,
virClassForObject)))
return -1;
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Ensure all enum cases are listed in switch statements.
Reviewed-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Right-aligning backslashes when defining macros or using complex
commands in Makefiles looks cute, but as soon as any changes is
required to the code you end up with either distractingly broken
alignment or unnecessarily big diffs where most of the changes
are just pushing all backslashes a few characters to one side.
Generated using
$ git grep -El '[[:blank:]][[:blank:]]\\$' | \
grep -E '*\.([chx]|am|mk)$$' | \
while read f; do \
sed -Ei 's/[[:blank:]]*[[:blank:]]\\$/ \\/g' "$f"; \
done
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
The packet with passed FD has the following format:
--------------------------
| len | header | payload |
--------------------------
where "payload" has an additional count of FDs before the actual data:
------------------
| nfds | payload |
------------------
When the packet is received we parse the "header", which as a side
effect updates msg->bufferOffset to point to the beginning of "payload".
If the message call contains FDs, we need to also parse the count of
FDs, which also updates the msg->bufferOffset.
The issue here is that when we attempt to read the FDs data from the
socket and we receive EAGAIN we finish the reading and call poll()
to wait for the data the we need. When the data arrives we already have
the packet in our buffer so we read the "header" again but this time
we don't read the count of FDs because we already have it stored.
That means that the msg->bufferOffset is not updated to point to the
actual beginning of the payload data, but it points to the count of
FDs. After all FDs are processed we dispatch the message to process
it and decode the payload. Since the msg->bufferOffset points to wrong
data, we decode the wrong payload and the API call fails with
error messages:
Domain not found: no domain with matching uuid '67656e65-7269-6300-0c87-5003ca6941f2' ()
Broken by commit 133c511b52 which fixed a FD and memory leak.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This is a special type of stream packet, that is bidirectional
and contains information regarding how many bytes each side will
be skipping in the stream.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When we get a POLLHUP or VIR_EVENT_HANDLE_HANGUP event for a client, we
still want to read from the socket to process any accumulated data. But
doing so inevitably results in an error and a call to
virNetClientMarkClose before we get to processing the hangup event (and
another call to virNetClientMarkClose). However the close reason passed
to the second virNetClientMarkClose call is ignored because another one
was already set. We need to pass the correct close reason when marking
the socket to be closed for the first time.
https://bugzilla.redhat.com/show_bug.cgi?id=1373859
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In the RPC client event loop code, if poll() returns only a POLLHUP
or POLLERR status, then we end up reporting a bogus error message:
error: failed to connect to the hypervisor
error: An error occurred, but the cause is unknown
We do actually report an error, but we virNetClientMarkClose method
has already captured the error status before we report it, so the
real error gets thrown away. The key fix is to report the error
before calling virNetClientMarkClose(). In changing this, we also
split out reporting of POLLHUP vs POLLERR to make any future bugs
easier to diagnose.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When composing the path to the default known_hosts file (for the libssh
and libssh2 drivers), do not check whether the configuration directory
(determined by virGetUserConfigDirectory()) exists: both the drivers can
handle non-existing files, and are able to create them (and their
directories) in that case.
This adds a small behaviour change: before, the key for an unknown host,
and manually accepted, was saved only if the configuration directory
existed -- a bit incoherent behaviour though.
If any of them is specified for the libssh and libssh2 drivers, there is
no need to depend on checks based on other paths: in particular, a
specified path for known_hosts was ignored if the local config directory
could not be determined, and the path for keyfile was ignored if the
home could not be determined.
Instead, lazily determine and use these two paths only in case they are
needed.
Implement in virtNetClient and VirNetSocket the needed functions to
expose a new libssh transport, providing all the options that the
libssh2 transport supports.
Add a couple of helper functions to check whether one of the default
names of SSH keys (as documented in ssh-keygen(1)) exists, and use them
to specify a key for the libssh2 transport if none was passed.
This partially reverts commit 9b45c9f049.
It changed the default format of socket address from the one SASL
requires, but did not adjust all the callers.
It also removed the test coverage for it.
Revert most of the changes except the virSocketAddrFormatFull support
for URI-formatted strings.
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1345743 while
reverting the format used by virt-admin's client-info command from
the URI one to the SASL one.
https://bugzilla.redhat.com/show_bug.cgi?id=1345743
This removes the opencoded payload freeing in the client, to use
the shared virNetMessageClearPayload call. Two changes:
- ClearPayload sets nfds=0, which fixes a potential crash if
an error path called virNetMessageFree/Clear on the message
after fds was free'd
- We drop the inner loop VIR_FORCE_CLOSE... this may mean fds are
kept open a little bit longer if the call is blocking but in
practice I don't think it will have any effect
Our socket address format is in a rather non-standard format and that is
because sasl library requires the IP address and service to be delimited by a
semicolon. The string form is a completely internal matter, however once the
admin interfaces to retrieve client identity information are merged, we should
return the socket address string in a common format, e.g. format defined by
URI rfc-3986, i.e. the IP address and service are delimited by a colon and
in case of an IPv6 address, square brackets are added:
Examples:
127.0.0.1:1234
[::1]:1234
This patch changes our default format to the one described above, while adding
separate methods to request the non-standard SASL format using semicolon as a
delimiter.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
FD passing APIs like CreateXMLWithFiles or OpenGraphicsFD will leak
file descriptors. The user passes in an fd, which is dup()'d in
virNetClientProgramCall. The new fd is what is transfered to the
server virNetClientIOWriteMessage.
Once all the fds have been written though, the parent msg->fds list
is immediately free'd, so the individual fds are never closed.
This closes each FD as its send to the server, so all fds have been
closed by the time msg->fds is free'd.
https://bugzilla.redhat.com/show_bug.cgi?id=1159766
Even though we hit an error in client's IO loop, we still want to
process any pending data. So instead of reporting the error right away,
we can finish the current iteration and report the error once we're done
with it. Note that the error is stored in client->error by
virNetClientMarkClose so we don't need to worry about it being reset or
rewritten by any API we call in the meantime.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Whenever a connection was closed due to keepalive timeout, we would log
a warning but the interrupted API would return rather useless generic
error:
internal error: received hangup / error event on socket
Let's report a proper keepalive timeout error and make sure it is
propagated to all pending APIs. The error should be better now:
internal error: connection closed due to keepalive timeout
Based on an old patch from Martin Kletzander.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
By default, getaddrinfo() will return addresses for both
IPv4 and IPv6 if both protocols are enabled, and so the
RPC code will listen/connect to both protocols too. There
may be cases where it is desirable to restrict this to
just one of the two protocols, so add an 'int family'
parameter to all the TCP related APIs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As of bba93d40 all of our RPC objects are derived from
virObjectLockable. However, during rewrite some errors sneaked
in. For instance, the dispose functions to virNetClient and
virNetServerClient objects were not only freeing allocated
memory, but unlocking themselves. This is wrong. Object should
never disappear while locked.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Whenever client socket was marked as closed for some reason, it could've
been changed when really closing the connection. With this patch the
proper reason is kept since the first time it's marked as closed.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Currently, we use pthread_sigmask(SIG_BLOCK, ...) prior to calling
poll(). This is okay, as we don't want poll() to be interrupted.
However, then - immediately as we fall out from the poll() - we try to
restore the original sigmask - again using SIG_BLOCK. But as the man
page says, SIG_BLOCK adds signals to the signal mask:
SIG_BLOCK
The set of blocked signals is the union of the current set and the set argument.
Therefore, when restoring the original mask, we need to completely
overwrite the one we set earlier and hence we should be using:
SIG_SETMASK
The set of blocked signals is set to the argument set.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Any source file which calls the logging APIs now needs
to have a VIR_LOG_INIT("source.name") declaration at
the start of the file. This provides a static variable
of the virLogSource type.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The dtrace probe macros rely on the logging API. We can't make
the internal.h header include the virlog.h header though since
that'd be a circular include. Instead simply split the dtrace
probes into their own header file, since there's no compelling
reason for them to be in the main internal.h header.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit a1cbe4b5 added a check for spaces around assignments and this
patch extends it to checks for spaces around '=='. One exception is
virAssertCmpInt where comma after '==' is acceptable (since it is a
macro and '==' is its argument).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This patch enables the password authentication in the libssh2 connection
driver. There are a few benefits to this step:
1) Hosts with challenge response authentication will now be supported
with the libssh2 connection driver.
2) Credential for hosts can now be stored in the authentication
credential config file
Convert the type of loop iterators named 'i', 'j', k',
'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or
'unsigned int', also santizing 'ii', 'jj', 'kk' to use
the normal 'i', 'j', 'k' naming
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
http://www.uhv.edu/ac/newsletters/writing/grammartip2009.07.01.htm
(and several other sites) give hints that 'onto' is best used if
you can also add 'up' just before it and still make sense. In many
cases in the code base, we really want the two-word form, or even
a simplification to just 'on' or 'to'.
* docs/hacking.html.in: Use correct 'on to'.
* python/libvirt-override.c: Likewise.
* src/lxc/lxc_controller.c: Likewise.
* src/util/virpci.c: Likewise.
* daemon/THREADS.txt: Use simpler 'on'.
* docs/formatdomain.html.in: Better usage.
* docs/internals/rpc.html.in: Likewise.
* src/conf/domain_event.c: Likewise.
* src/rpc/virnetclient.c: Likewise.
* tests/qemumonitortestutils.c: Likewise.
* HACKING: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
Despite the comment stating virNetClientIncomingEvent handler should
never be called with either client->haveTheBuck or client->wantClose
set, there is a sequence of events that may lead to both booleans being
true when virNetClientIncomingEvent is called. However, when that
happens, we must not immediately close the socket as there are other
threads waiting for the buck and they would cause SIGSEGV once they are
woken up after the socket was closed. Another thing is we should clear
all remaining calls in the queue after closing the socket.
The situation that can lead to the crash involves three threads, one of
them running event loop and the other two calling libvirt APIs. The
event loop thread detects an event on client->sock and calls
virNetClientIncomingEvent handler. But before the handler gets a chance
to lock client, the other two threads (T1 and T2) start calling some
APIs. T1 gets the buck and detects EOF on client->sock while processing
its RPC call. Since T2 is waiting for its own call, T1 passes the buck
on to it and unlocks client. But before T2 gets the signal, the event
loop thread wakes up, does its job and closes client->sock. The crash
happens when T2 actually wakes up and tries to do its job using a closed
client->sock.
When converting to virObject, the probes on the 'Free' functions
were removed on the basis that there is a probe on virObjectFree
that suffices. This puts a burden on people writing probe scripts
to identify which object is being dispose. This adds back probes
in the 'Dispose' functions and updates the rpc monitor systemtap
example to use them
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>