I've noticed that variable @reply is not initialized and if
something at the beginning of the function fails, e.g.
virDBusGetSystemBus(), the control jump straight to cleanup label
where dbus_message_unref() is then called over this uninitialized
variable.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I've noticed couple of warning in dmesg while debugging
something:
[ 9683.973754] HTB: quantum of class 10001 is big. Consider r2q change.
[ 9683.976460] HTB: quantum of class 10002 is big. Consider r2q change.
I've read the HTB documentation and linux kernel code to find out
what's wrong. Basically we need to pass another argument
"quantum" to our tc cmd line because the default computed by HTB
does not always work in which case the warning message is printed
out.
You can read more details here:
http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm#sharing
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So, systemd-machined has this philosophy that machine names are like
hostnames and hence should follow the same rules. But we always allowed
international characters in domain names. Thus we need to modify the
machine name we are passing to systemd.
In order to change some machine names that we will be passing to systemd,
we also need to call TerminateMachine at the end of a lifetime of a
domain. Even for domains that were started with older libvirt. That
can be achieved thanks to virSystemdGetMachineNameByPID(). And because
we can change machine names, we can get rid of the inconsistent and
pointless escaping of domain names when creating machine names.
So this patch modifies the naming in the following way. It creates the
name as <drivername>-<id>-<name> where invalid hostname characters are
stripped out of the name and if the resulting name is longer, it
truncates it to 64 characters. That way we can start domains we
couldn't start before. Well, at least on systemd.
To make it work all together, the machineName (which is needed only with
systemd) is saved in domain's private data. That way the generation is
moved to the driver and we don't need to pass various unnecessary
arguments to cgroup functions.
The only thing this complicates a bit is the scope generation when
validating a cgroup where we must check both old and new naming, so a
slight modification was needed there.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Same as for deserializer, this method might get handy for admin one day.
The major reason for this patch is to stay consistent with idea, i.e.
when deserializer can be shared, why not serializer as well. The only
problem to be solved was that the daemon side serializer uses a code
snippet which handles sparse arrays returned by some APIs as well as
removes any string parameters that can't be returned to older clients.
This patch makes of the new virTypedParameterRemote datatype introduced
by one of the pvious patches.
Since the method is static to remote_driver, it can't even be used by our
daemon. Other than that, it would be useful to be able to use it with admin as
well. This patch uses the new virTypedParameterRemote datatype introduced in
one of previous patches.
Currently, the deserializer is hardcoded into remote_driver which makes
it impossible for admin to use it. One way to achieve a shared implementation
(besides moving the code to another module) would be pass @ret_params_val as a
void pointer as opposed to the remote_typed_param pointer and add a new extra
argument specifying which of those two protocols is being used and typecast
the pointer at the function entry. An example from remote_protocol:
struct remote_typed_param_value {
int type;
union {
int i;
u_int ui;
int64_t l;
uint64_t ul;
double d;
int b;
remote_nonnull_string s;
} remote_typed_param_value_u;
};
typedef struct remote_typed_param_value remote_typed_param_value;
struct remote_typed_param {
remote_nonnull_string field;
remote_typed_param_value value;
};
That would leave us with a bunch of if-then-elses that needed to be used across
the method. This patch takes the other approach using the new datatype
introduced in one of earlier commits.
Both admin and remote protocols define their own types
(remote_typed_param vs admin_typed_param). Because of the naming convention,
admin typed params wouldn't be able to reuse the serialization/deserialization
methods, which are tailored for use by remote protocol, even if those method
were exported properly. In that case, introduce a new internal data type
structurally copying both admin and remote protocols which, eventually, would
allow serializer and deserializer to be used in a more generic way.
Use 'ret' for return variable name, clarify use of 'param_idx' and avoid
unnecessary 'success' label. No functional changes. Also document the
function.
The affected functions are:
virPCIDeviceGetManaged()
virPCIDeviceGetUnbindFromStub()
virPCIDeviceGetRemoveSlot()
virPCIDeviceGetReprobe()
Change their return type from unsigned int to bool: the corresponding
members in struct _virPCIDevice are defined as bool, and even the
corresponding virPCIDeviceSet*() functions take a bool value as input
so there's no point in these functions having unsigned int as return
type.
Suggested-by: John Ferlan <jferlan@redhat.com>
Unbinding a PCI device from the stub driver can require several steps,
and it can be useful for debugging to be able to trace which of these
steps are performed and which are skipped for each device.
The name is confusing, and there are just two uses: one is a test case,
and the other will be removed as part of an upcoming refactoring of
the hostdev code.
Commit 871e10f fixed a memory corruption error, but called strlen()
twice on the same string to do so. Even though the compiler is
probably smart enough to optimize the second call away, having a
single invocation makes the code slightly cleaner.
Suggested-by: Michal Privoznik <mprivozn@redhat.com>
In 370608b4c7 we have introduced two new internal APIs.
However, there are no stubs for build without macvtap. Therefore
build on systems lacking macvtap support (e.g. mingw or freebds)
fails when trying to link.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
libvirtd crashes on free()ing portData for an open vswitch port if that port
was deleted. To reproduce:
ovs-vsctl del-port vnet0
virsh migrate --live kvm1 qemu+ssh://dstHost/system
Error message:
libvirtd: *** Error in `/usr/sbin/libvirtd': free(): invalid pointer: 0x000003ff90001e20 ***
The problem is that virCommandRun can return an empty string in the event that
the port being queried does not exist. When this happens then we are
unconditionally overwriting a newline character at position strlen()-1. When
strlen is 0, we overwrite memory that does not belong to the string.
The fix: Only overwrite the newline if the string is not empty.
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
This patch creates two bitmaps, one for macvlan device names and one
for macvtap. The bitmap position is used to indicate that libvirt is
currently using a device with the name macvtap%d/macvlan%d, where %d
is the position in the bitmap. When requested to create a new
macvtap/macvlan device, libvirt will now look for the first clear bit
in the appropriate bitmap and derive the device name from that rather
than just starting at 0 and counting up until one works.
When libvirtd is restarted, the qemu driver code that reattaches to
active domains calls the appropriate function to "re-reserve" the
device names as it is scanning the status of running domains.
Note that it may seem strange that the retry counter now starts at
8191 instead of 5. This is because we now don't do a "pre-check" for
the existence of a device once we've reserved it in the bitmap - we
move straight to creating it; although very unlikely, it's possible
that someone has a running system where they have a large number of
network devices *created outside libvirt* named "macvtap%d" or
"macvlan%d" - such a setup would still allow creating more devices
with the old code, while a low retry max in the new code would cause a
failure. Since the objective of the retry max is just to prevent an
infinite loop, and it's highly unlikely to do more than 1 iteration
anyway, having a high max is a reasonable concession in order to
prevent lots of new failures.
In the following cases nl_recv() was returning the error "No buffer
space available":
* When switching CPUs to offline/online in a system more than 128 cpus
* When using virsh to destroy domain in a system with many interfaces
This patch sets the buffer size for all netlink sockets created by
libnl to 128K and turns on message peeking for nl_recv(). This
eliminates the "No buffer space available" errors seen in the cases
above, and also preempts other future errors the smaller buffers could
have caused.
Signed-off-by: Leno Hou <houqy@linux.vnet.ibm.com>
Signed-off-by: Laine Stump <laine@laine.org>
In dc576025c3 we renamed virCgroupIsolateMount function to
virCgroupBindMount. However, we forgot about one occurrence in
section of the code which provides stubs for platforms without
support for CGroups like *BSD for instance.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On the host when we start a container, it will be
placed in a cgroup path of
/machine.slice/machine-lxc\x2ddemo.scope
under /sys/fs/cgroup/*
Inside the containers' namespace we need to setup
/sys/fs/cgroup mounts, and currently will bind
mount /machine.slice/machine-lxc\x2ddemo.scope on
the host to appear as / in the container.
While this may sound nice, it confuses applications
dealing with cgroups, because /proc/$PID/cgroup
now does not match the directory in /sys/fs/cgroup
This particularly causes problems for systems and
will make it create repeated path components in
the cgroup for apps run in the container eg
/machine.slice/machine-lxc\x2ddemo.scope/machine.slice/machine-lxc\x2ddemo.scope/user.slice/user-0.slice/session-61.scope
This also causes any systemd service that uses
sd-notify to fail to start, because when systemd
receives the notification it won't be able to
identify the corresponding unit it came from.
In particular this break rabbitmq-server startup
Future kernels will provide proper cgroup namespacing
which will handle this problem, but until that time
we should not try to play games with hiding parent
cgroups.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
libvirt always resets the MAC address of the physdev used for macvtap
passthrough when the guest is finished with it. This was happening
prior to the 802.1Qb[gh] DISASSOCIATE command, and was quite often
failing, presumably because the driver wouldn't allow the MAC address
to be reset while the association was still active, with a log message
like this:
virNetDevSetMAC:168 : Cannot set interface MAC to 00:00:00:00:00:00 on 'eth13': Cannot assign requested address
This patch changes the order - we now do the 802.1Qb[gh] disassociate
and delete the macvtap interface first, then and reset the MAC
address.
if instanceId is NULL
When virNetDevVPortProfileGetStatus() was called with instanceId =
NULL (which is the case for all DISASSOCIATE requests in 802.1Qbh) it
would log the following error:
Could not find netlink response with expected parameters
even though the disassociate had been successfully completely. Then,
due to the fortunate coincidence of status having been initialized to
0 and then not changed when the "failure" was encountered, it would
still return a status of 0 (PORT_VDP_RESPONSE_SUCCESS), so the caller
would assume a successful operation.
This would result in a spurious log message though, and would fill in
LastErrorMessage, so that the API would return that error if it
happened during cleanup from some other error. That, in turn, would
lead to an incorrect supposition that the response to the port profile
disassociate was the cause of the failure.
During debugging, I noticed that the VF in question usually had *no
uuid* associated with it (big surprise)by the time the disassociate
completed, so the solution is *not* to send the previous instanceId
down.
This patch fixes virNetDevVPortProfileGetStatus() to only check the
VF's uuid in the status if it was given an instanceId to check against
when originally called. Otherwise it only checks that the particular
VF is present (it will be).
This does cause a slight difference in behavior - rather than
returning with status unchanged (and thus always 0) it will actually
get the IFLA_PORT_RESPONSE. This could lead to revelation of error
conditions we were previously ignoring. Or not. So far "not".
Some of the protocol files already include handing of the missing int
types such as xdr_uint64_t, some don't. To fix it everywhere, move out
of the appropriate defines to the utils/virxdrdefs.h file and include
it where needed.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
As cgroup implementation only works on Linux, it does not
make much sense to include sys/mount.h if other requirements are
not met, such as HAVE_MNTENT_H and HAVE_GETMNTENT_R.
Also, it fixes build on OpenBSD that requires to include sys/param.h
along with sys/mount.h.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
Most of the changes to the list of active and inactive PCI devices
happen in virHostdev, where they are properly logged.
virPCIDeviceDetach() and virPCIDeviceReattach(), however, change the
inactive list as well, so they should be logging similar messages.
Instead of misusing a const string to hold up runtime allocated
data, introduce new variable @hoststr and obey const correctness.
==6879== 15 bytes in 1 blocks are definitely lost in loss record 68 of 1,064
==6879== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6879== by 0xA7DDF97: vasprintf (in /lib64/libc-2.21.so)
==6879== by 0x552BBC6: virVasprintfInternal (virstring.c:493)
==6879== by 0x552BCDB: virAsprintfInternal (virstring.c:514)
==6879== by 0x54FA44C: virLogHostnameString (virlog.c:468)
==6879== by 0x54FAB0F: virLogVMessage (virlog.c:645)
==6879== by 0x54FA680: virLogMessage (virlog.c:531)
==6879== by 0x54FBBF4: virLogParseOutputs (virlog.c:1130)
==6879== by 0x11CB4F: daemonSetupLogging (libvirtd.c:685)
==6879== by 0x11E137: main (libvirtd.c:1297)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Once @hostname is printed into @hoststr we don't need it anymore.
==6879== 5 bytes in 1 blocks are definitely lost in loss record 10 of 1,064
==6879== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6879== by 0xA7ED599: strdup (in /lib64/libc-2.21.so)
==6879== by 0x552C126: virStrdup (virstring.c:726)
==6879== by 0x553B13E: virGetHostnameImpl (virutil.c:720)
==6879== by 0x553B1BF: virGetHostnameQuiet (virutil.c:741)
==6879== by 0x54FA3FD: virLogHostnameString (virlog.c:462)
==6879== by 0x54FAB0F: virLogVMessage (virlog.c:645)
==6879== by 0x54FA680: virLogMessage (virlog.c:531)
==6879== by 0x54FBBF4: virLogParseOutputs (virlog.c:1130)
==6879== by 0x11CB4F: daemonSetupLogging (libvirtd.c:685)
==6879== by 0x11E137: main (libvirtd.c:1297)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If no port number was provided for a storage pool libvirt defaults to
port 6789; however, librbd/librados already default to 6789 when no port
number is provided.
In the future Ceph will switch to a new port for the Ceph monitors since
port 6789 is already assigned to a different application by IANA.
Port 6789 is assigned to SMC-HTTPS and Ceph now has port 3300 assigned as
the 'Ceph monitor' port.
In this case it is the best solution to not hardcode any port number into
libvirt and let librados handle the connection.
Only if a user specifies a different port number we pass it down to librados,
otherwise we leave it blank.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
merge
Due to debug logs like this:
virPCIGetDeviceAddressFromSysfsLink:2432 : Attempting to resolve device path from device link '/sys/class/net/eth1/device/virtfn6'
logStrToLong_ui:2369 : Converted '0000:07:00.7' to unsigned int 0
logStrToLong_ui:2369 : Converted '07:00.7' to unsigned int 7
logStrToLong_ui:2369 : Converted '00.7' to unsigned int 0
logStrToLong_ui:2369 : Converted '7' to unsigned int 7
virPCIGetDeviceAddressFromSysfs:1947 : virPCIDeviceAddress 0000:07:00.7
virPCIGetVirtualFunctions:2554 : Found virtual function 7
printed *once for each SR-IOV Virtual Function* of a Physical Function
each time libvirt retrieved the list of VFs (so if the system has 128
VFs, there would be 900 lines of log for each call), the debug logs on
any system with a large number of VFs was dominated by "information"
that was possibly useful for debugging when the code was being
written, but is now useless for debugging of any problem on a running
system, and only serves to obscure the real useful information. This
overkill has no place in production code, so this patch removes it.
The previous error message just indicated that the desired response
couldn't be found, this patch tells what was desired, as well as
listing out the entire table that had been in the netlink response, to
give some kind of idea why it failed.
I noticed in a log file that we had failed to set a MAC address. The
log said which interface we were trying to set, but didn't give the
offending MAC address, which could have been useful in determining the
source of the problem. This patch modifies all three places in the
code that set MAC addresses to report the failed MAC as well as
interface.
The manpage for sysconf() suggest including unistd.h as the
function is declared there. Even though we are not hitting any
compile issues currently, let's include the correct header file
instead of relying on some hidden include chain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We want to eventually factor out the code dealing with device detaching
and reattaching, so that we can share it and make sure it's called eg.
when 'virsh nodedev-detach' is used.
For that to happen, it's important that the lists of active and inactive
PCI devices are updated every time a device changes its state.
Instead of passing NULL as the last argument of virPCIDeviceDetach() and
virPCIDeviceReattach(), pass the proper list so that it can be updated.
This replaces the virPCIKnownStubs string array that was used
internally for stub driver validation.
Advantages:
* possible values are well-defined
* typos in driver names will be detected at compile time
* avoids having several copies of the same string around
* no error checking required when setting / getting value
The names used mirror those in the
virDomainHostdevSubsysPCIBackendType enumeration.
This internal function supports, in theory, binding to a different
stub driver than the one the PCI device has been configured to use.
In practice, it is only ever called like
virPCIDeviceBindToStub(dev, dev->stubDriver);
which makes its second parameter redundant. Get rid of it, along
with the extra string copy required to support it.
This function can be used to retrieve the current locked memory
limit for a process, so that the setting can be later restored.
Add a configure check for getrlimit(), which we now use.
The prlimit() function allows both getting and setting limits for
a process; expose the same functionality in our wrapper.
Add the const modifier for new_limit, in accordance with the
prototype for prlimit().
Instead of replicating the information (domain, bus, slot, function)
inside the virPCIDevice structure, use the already-existing
virPCIDeviceAddress structure.
For users of the module, this means that the object returned by
virPCIDeviceGetAddress() can no longer be NULL and must no longer
be freed by the caller.
virCgroupNewMachine used to add the pidleader to the newly created
machine cgroup. Do not do this implicit anymore.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Firstly, there's a bug (or typo) in the only place where we call
this function: @multiqueue is set whenever @tapfdSize is greater
than zero, while in fact the condition should have been 'greater
than one'.
Then, secondly, since the condition depends on just one
variable, that we are even passing down to the function, we can
move the condition into the function and drop useless argument.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>