After the virNetDaemonAddServerPostExec call in virtlogd we should have
netserver refcount set to 2. One goes to netdaemon servers hashtable
and one goes to virt{logd,lock} own reference to netserver. Let's add
the missing increment in virNetDaemonAddServerPostExec itself while
holding the daemon lock.
Since lockd defers management of the @srv object by the presence
in the hash table, virLockDaemonNewPostExecRestart must Unref the
alloc'd Ref on the @srv object done as part of virNetDaemonAddServerPostExec
and virNetServerNewPostExecRestart processing. The virNetDaemonGetServer
in lock_daemon main will also take a reference which is Unref'd during
main cleanup.
Commit id '252610f7d' used a hash table to store the @srv, but
didn't handle the virObjectUnref if virNetDaemonNew failed nor
did it use virObjectUnref once successfully placed into the table
which will now be managing it's lifetime (and would cause the
virObjectRef if successfully inserted into the table).
Right-aligning backslashes when defining macros or using complex
commands in Makefiles looks cute, but as soon as any changes is
required to the code you end up with either distractingly broken
alignment or unnecessarily big diffs where most of the changes
are just pushing all backslashes a few characters to one side.
Generated using
$ git grep -El '[[:blank:]][[:blank:]]\\$' | \
grep -E '*\.([chx]|am|mk)$$' | \
while read f; do \
sed -Ei 's/[[:blank:]]*[[:blank:]]\\$/ \\/g' "$f"; \
done
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
The assumption so far was an average of 4 disks per guest.
But some architectures, like s390x, still often use plenty of smaller disks.
To include those in the considerations an assumption of an average of 10
disks is more reasonable.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Seeing a log message saying 'flags=93' is ambiguous & confusing unless
you happen to know that libvirt always prints flags as hex. Change our
debug messages so that they always add a '0x' prefix when printing flags,
and '0' prefix when printing mode. A few other misc places gain a '0x'
prefix in error messages too.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There were a few places in our code where the following pattern in 'if'
condition occurred:
if ((foo = bar() < 0))
do something;
This patch adjusts the conditions to the expected format:
if ((foo = bar()) < 0)
do something;
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1488192
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Commit 94c465d0 refactored the logging setup phase but introduced an
issue, where the daemon ignores verbose mode when there are no outputs
defined and the default must be used. The problem is that the default
output was determined too early, thus ignoring the potential '--verbose'
option taking effect. This patch postpones the creation of the default
output to the very last moment when nothing else can change. Since the
default output is only created during the init phase, it's safe to leave
the pointer as NULL for a while, but it will be set eventually, thus not
affecting runtime.
Patch also adjusts both the other daemons.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1442947
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The default conf files, for example libvirtd.conf, virtlockd.conf, and
virtlogd.conf, should be located under the directory "/etc/libvirt" when
root as root, rather than "/etc". When run as non-root, the configuration
files should be located under "$XDG_CONFIG_HOME/libvirt/", rather than
"XDG_CONFIG_HOME".
Signed-off-by: Lily Zhu <lizhu@redhat.com>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The recently added sanlock_strerror function can be used to translate
sanlock's numeric errors into human readable strings.
https://bugzilla.redhat.com/show_bug.cgi?id=1409511
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The log and lock protocol don't have an extra handshake to close the
connection. Instead they just close the socket. Unfortunately that
resulted into a lot of spurious garbage logged to the system log files:
2017-03-17 14:00:09.730+0000: 4714: error : virNetSocketReadWire:1800 : End of file while reading data: Input/output error
or in the journal as:
Mar 13 16:19:33 xxxx virtlogd[32360]: End of file while reading data: Input/output error
Use the new facility in the netserverclient to suppress the IO error
report from the virNetSocket layer.
Linux still defaults to a 1024 open file handle limit. This causes
scalability problems for libvirtd / virtlockd / virtlogd on large
hosts which might want > 1024 guest to be running. In fact if each
guest needs > 1 FD, we can't even get to 500 guests. This is not
good enough when we see machines with 100's of physical cores and
TBs of RAM.
In comparison to other memory requirements of libvirtd & related
daemons, the resource usage associated with open file handles
is essentially line noise. It is thus reasonable to increase the
limits unconditionally for all installs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
After deploying virtlogd by default we identified a number of
mistakes in the systemd unit file. virtlockd's relationship
to libvirtd is the same as virtlogd, so we must apply the
same unit file fixes to virtlockd
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Now that virLog{Get,Set}DefaultOutput routines are introduced we can wire them
up to the daemon's logging initialization code. Also, change the order of
operations a bit so that we still strictly honor our precedence of settings:
cmdline > env > config now that outputs and filters are not appended anymore.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Along with an empty string, it should also be possible for users to pass
NULL to the public APIs which in turn would trigger a routine(future
work) responsible for defining an appropriate default logging output
given the current circumstances.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This initially started as a fix of some debug printing in
virCgroupDetect. However it turned out that other places suffer
from the similar problem. While dealing with pids, esp. in cases
where we cannot use pid_t for ABI stability reasons, we often
chose an unsigned integer type. This makes no sense as pid_t is
signed.
Also, new syntax-check rule is introduced so we won't repeat this
mistake.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Similar to outputs, parser should do parsing only, thus the 'define' logic
is going to be stripped from virLogParseAndDefineFilters by replacing calls to
this method to virLogSetFilters instead.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since virLogParseAndDefineOutputs is going to be stripped from 'output defining'
logic, replace all relevant occurrences with virLogSetOutputs call to make the
change transparent to all original callers (daemons mostly).
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Right now virLogParse* functions are doing both parsing and defining of filters
and outputs which should be two separate operations. Since the naming is
apparently a bit poor this patch renames these functions to
virLogParseAndDefine* which eventually will be replaced by virLogSet*.
Additionally, virLogParse{Filter,Output} will be later (after the split) reused,
so that these functions do exactly what the their name suggests.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1292984
Hold on to your hats, because this is gonna be wild.
In bd3e16a3 I've tried to expose sanlock io_timeout. What I had
not realized (because there is like no documentation for sanlock
at all) was very unusual way their APIs work. Basically, what we
do currently is:
sanlock_add_lockspace_timeout(&ls, io_timeout);
which adds a lockspace to sanlock daemon. One would expect that
io_timeout sets the io_timeout for it. Nah! That's where you are
completely off the tracks. It sets timeout for next lockspace you
will probably add later. Therefore:
sanlock_add_lockspace_timeout(&ls, io_timeout = 10);
/* adds new lockspace with default io_timeout */
sanlock_add_lockspace_timeout(&ls, io_timeout = 20);
/* adds new lockspace with io_timeout = 10 */
sanlock_add_lockspace_timeout(&ls, io_timeout = 40);
/* adds new lockspace with io_timeout = 20 */
And so on. You get the picture.
Fortunately, we don't allow setting io_timeout per domain or per
domain disk. So we just need to set the default used in the very
first step and hope for the best (as all the io_timeout-s used
later will have the same value).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently, we are checking for sanlock_add_lockspace_timeout
which is good for now. But in a subsequent patch we are going to
use sanlock_write_lockspace (which sets an initial value for io
timeout for sanlock). Now, there is no reason to check for both
functions in sanlock library as the sanlock_write_lockspace was
introduced in 2.7 release and the one we are currently checking
for in the 2.5 release. Therefore it is safe to assume presence
of sanlock_add_lockspace_timeout when sanlock_write_lockspace
is detected.
Moreover, the macro for conditional compilation is renamed to
HAVE_SANLOCK_IO_TIMEOUT (as it now encapsulates two functions).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Admin API needs a way of addressing specific clients. Unlike servers, which we
are happy to address by names both because its name reflects its purpose (to
some extent) and we only have two of them (so far), naming clients doesn't make
any sense, since a) each client is an anonymous, i.e. not recognized after a
disconnect followed by a reconnect, b) we can't predict what kind of requests
it's going to send to daemon, and c) the are loads of them comming and going,
so the only viable option is to use an ID which is of a reasonably wide data
type.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
After this commit, all man pages are generated using the same two
steps:
1. Process a source $command.pod file with pod2man(1) to obtain
a valid man page in $command.$section.in
2. Process $command.$section.in with sed(1) to obtain the final
man page in $command.$section
Since servers know their name, there is no need to supply such
information twice. Also defeats inconsistencies.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
At first I did not want to do this, but after trying to implement some
newer feaures in the admin API I realized we need that to make our lives
easier. On the other hand they are not saved redundantly and the
virNetServer objects are still kept in a hash table.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since the daemon can manage and add (at fresh start) multiple servers,
we also should be able to add them from a JSON state file in case of a
daemon restart, so post exec restart support for multiple servers is also
provided. Patch also updates virnetdaemontest accordingly.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Our existing virHashForEach method iterates through all items disregarding the
fact, that some of the iterators might have actually failed. Errors are usually
dispatched through an error element in opaque data which then causes the
original caller of virHashForEach to return -1. In that case, virHashForEach
could return as soon as one of the iterators fail. This patch changes the
iterator return type and adjusts all of its instances accordingly, so the
actual refactor of virHashForEach method can be dealt with later.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When building with gcc-5 (particularly gcc-5.3.0 now) and having pdwtags
installed (package dwarves) make check fails with the following error:
$ make lock_protocol-struct
GEN lock_protocol-struct
--- lock_protocol-structs 2016-01-13 15:04:59.318809607 +0100
+++ lock_protocol-struct-t3 2016-01-13 15:05:17.703501234 +0100
@@ -26,10 +26,6 @@
virLockSpaceProtocolNonNullString name;
u_int flags;
};
-enum virLockSpaceProtocolAcquireResourceFlags {
- VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED = 1,
- VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE = 2,
-};
struct virLockSpaceProtocolAcquireResourceArgs {
virLockSpaceProtocolNonNullString path;
virLockSpaceProtocolNonNullString name;
Makefile:10415: recipe for target 'lock_protocol-struct' failed
make: *** [lock_protocol-struct] Error 1
That happens because without any specific options gcc doesn't keep enum
information in the resulting binary object. I managed to isolate the
parameters of gcc that caused this issue to disappear, however I
remember that they influenced the resulting binaries quite a bit and
were definitely not something we would want to add as mandatory to the
build process.
So to deal with this cleanly, let's take that enum and separate it out
to its own header file. Since it is only used in the lockd driver and
the protocol, lock_driver_lockd.h feels like a suitable name.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit b22344f328 mistakenly reordered
Default-* lines. Thanks to that I noticed that we are very inconsistent
with our init scripts, so I took the liberty of synchronizing them,
updating them and making them all look shiny and new. So apart from
fixing the LSB requirements, I also fixed the ordering, specified
runlevels and fix the link to the reference specification.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1251190
So, if domain loses access to storage, sanlock tries to kill it
after some timeout. So far, the default is 80 seconds. But for
some scenarios this might not be enough. We should allow users to
adjust the timeout according to their needs.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Lets use wrapper functions virLockDaemonLock and
virLockDaemonUnlock instead of virMutexLock and virMutexUnlock.
This has no functional impact, but it's easier to read (at least
for me).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So we have this mechanism that on SIGUSR1 the virtlockd dumps its
internal state into a JSON file, reexec itself and the reloads
the internal state back. However, there's a bug in our
implementation:
(gdb) signal SIGUSR1
Continuing with signal SIGUSR1.
[Thread 0x7fd094f7b700 (LWP 10602) exited]
process 10600 is executing new program: /home/zippy/work/libvirt/libvirt.git/src/virtlockd
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fb28bc3c700 (LWP 14501)]
Program received signal SIGSEGV, Segmentation fault.
0x00007fb29133d530 in virExpandN (ptrptr=0x70, size=8, countptr=0x68, add=1, report=true, domcode=7, filename=0x7fb29138aeab "rpc/virnetserver.c", funcname=0x7fb29138b680 <__FUNCTION__.15821> "virNetServerAddProgram", linenr=661) at util/viralloc.c:288
288 if (*countptr + add < *countptr) {
(gdb) bt
#0 0x00007fb29133d530 in virExpandN (ptrptr=0x70, size=8, countptr=0x68, add=1, report=true, domcode=7, filename=0x7fb29138aeab "rpc/virnetserver.c", funcname=0x7fb29138b680 <__FUNCTION__.15821> "virNetServerAddProgram", linenr=661) at util/viralloc.c:288
#1 0x00007fb29132a267 in virNetServerAddProgram (srv=0x0, prog=0x7fb2915d08b0) at rpc/virnetserver.c:661
#2 0x00007fb29131f27f in main (argc=1, argv=0x7fff8f771298) at locking/lock_daemon.c:1445
Notice the NULL @srv passed to frame 2? Usually, the @srv
variable is initialized on fresh start. However, in case of
daemon reload, the code path that is responsible for initializing
the value was not triggered and therefore we crashed immediately.
Fix this by always setting the variable.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The internal representation of a JSON array counts the items in
size_t. However, for some reason, when asking for the count it's
reported as int. Firstly, we need the function to return a signed
type as it's returning -1 on an error. But, not every system has
integer the same size as size_t. Therefore, lets return ssize_t.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have virNetDaemon object holding all the data and being
capable of referencing multiple servers, having a duplicate reference to
a single server stored in virLockDaemon isn't necessary anymore. This
patch removes the above described element.
Since its introduction in 2011 (particularly in commit f4324e3292),
the option doesn't work. It just effectively disables all incoming
connections. That's because the client private data that contain the
'keepalive_supported' boolean, are initialized to zeroes so the bool is
false and the only other place where the bool is used is when checking
whether the client supports keepalive. Thus, according to the server,
no client supports keepalive.
Removing this instead of fixing it is better because a) apparently
nobody ever tried it since 2011 (4 years without one month) and b) we
cannot know whether the client supports keepalive until we get a ping or
pong keepalive packet. And that won't happen until after we dispatched
the ConnectOpen call.
Another two reasons would be c) the keepalive_required was tracked on
the server level, but keepalive_supported was in private data of the
client as well as the check that was made in the remote layer, thus
making all other instances of virNetServer miss this feature unless they
all implemented it for themselves and d) we can always add it back in
case there is a request and a use-case for it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Some hypervisors like Xen do not have PIDs associated with domains.
Relax the requirement for PID != 0 in the locking code so it can
be used by hypervisors that do not represent domains as a process
running on the host.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>