Introduce a new API to move tasks of one controller from a cgroup to another cgroup
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Introduce the function virCgroupForEmulator() to create sub directory
for simulator thread(include I/O thread, vhost-net thread)
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
When gcc atomic intrinsics are not available (such as on RHEL 5
with gcc 4.1.2), we were getting link errors due to multiple
definitions:
./.libs/libvirt_util.a(libvirt_util_la-virobject.o): In function `virAtomicIntXor':
/home/dummy/l,ibvirt/src/util/viratomoic.h:404: multiple definition of `virAtomicIntXor'
./.libs/libvirt_util.a(libvirt_util_la-viratomic.o):/home/dummy/libvirt/src/util/viratomic.h:404: first defined here
Solve this by conditionally marking the functions static (the
condition avoids falling foul of gcc warnings about unused
static function declarations).
* src/util/viratomic.h: When not using gcc intrinsics, use static
functions to avoid linker errors on duplicate functions.
We already skip out on building the LXC under RHEL 5, because the
kernel is too old (commits 4c18acf, 2dee896); but commit 9612e4b
moved some LXC-only code into common files, resulting in this
build failure:
util/virfile.c: In function 'virFileLoopDeviceAssociate':
util/virfile.c:580: error: 'LO_FLAGS_AUTOCLEAR' undeclared (first use in this function)
Unfortunately, the kernel folks only made it an enum, rather than
also a #define, so we have to modify configure.ac to record when
it is usable.
* configure.ac (with_lxc): Mark when LO_FLAGS_AUTOCLEAR was found.
* src/util/virfile.c (virFileLoopDeviceAssociate): Avoid
compilation when kernel is too old.
Fix possible double close in the child process after the fork in case
infd and outfd are equal, just like they are after being called from
virNetSocketNewConnectCommand.
* configure.ac, spec file: firewalld defaults to enabled if dbus is
available, otherwise is disabled. If --with_firewalld is explicitly
requested and dbus is not available, configure will fail.
* bridge_driver: add dbus filters to get the FirewallD1.Reloaded
signal and DBus.NameOwnerChanged on org.fedoraproject.FirewallD1.
When these are encountered, reload all the iptables reuls of all
libvirt's virtual networks (similar to what happens when libvirtd is
restarted).
* iptables, ebtables: use firewall-cmd's direct passthrough interface
when available, otherwise use iptables and ebtables commands. This
decision is made once the first time libvirt calls
iptables/ebtables, and that decision is maintained for the life of
libvirtd.
* Note that the nwfilter part of this patch was separated out into
another patch by Stefan in V2, so that needs to be revised and
re-reviewed as well.
================
All the configure.ac and specfile changes are unchanged from Thomas'
V3.
V3 re-ran "firewall-cmd --state" every time a new rule was added,
which was extremely inefficient. V4 uses VIR_ONCE_GLOBAL_INIT to set
up a one-time initialization function.
The VIR_ONCE_GLOBAL_INIT(x) macro references a static function called
vir(Ip|Eb)OnceInit(), which will then be called the first time that
the static function vir(Ip|Eb)TablesInitialize() is called (that
function is defined for you by the macro). This is
thread-safe, so there is no chance of any race.
IMPORTANT NOTE: I've left the VIR_DEBUG messages in these two init
functions (one for iptables, on for ebtables) as VIR_WARN so that I
don't have to turn on all the other debug message just to see
these. Even if this patch doesn't need any other modification, those
messages need to be changed to VIR_DEBUG before pushing.
This one-time initialization works well. However, I've encountered
problems with testing:
1) Whenever I have enabled the firewalld service, *all* attempts to
call firewall-cmd from within libvirtd end with firewall-cmd hanging
internally somewhere. This is *not* the case if firewall-cmd returns
non-0 in response to "firewall-cmd --state" (i.e. *that* command runs
and returns to libvirt successfully.)
2) If I start libvirtd while firewalld is stopped, then start
firewalld later, this triggers libvirtd to reload its iptables rules,
however it also spits out a *ton* of complaints about deletion failing
(I suppose because firewalld has nuked all of libvirt's rules). I
guess we need to suppress those messages (which is a more annoying
problem to fix than you might think, but that's another story).
3) I noticed a few times during this long line of errors that
firewalld made a complaint about "Resource Temporarily
unavailable. Having libvirtd access iptables commands directly at the
same time as firewalld is doing so is apparently problematic.
4) In general, I'm concerned about the "set it once and never change
it" method - if firewalld is disabled at libvirtd startup, causing
libvirtd to always use iptables/ebtables directly, this won't cause
*terrible* problems, but if libvirtd decides to use firewall-cmd and
firewalld is later disabled, libvirtd will not be able to recover.
This patch adds helper functions that enable us to use libssh2 in
conjunction with libvirt's virNetSockets for ssh transport instead of
spawning "ssh" client process.
This implemetation supports tunneled plaintext, keyboard-interactive,
private key, ssh agent based and null authentication. Libvirt's Auth
callback is used for interaction with the user. (Keyboard interactive
authentication, adding of host keys, private key passphrases). This
enables seamless integration into the application using libvirt. No
helpers as "ssh-askpass" are needed.
Reading and writing of OpenSSH style "known_hosts" files is supported.
Communication is done using SSH exec channel, where the user may specify
arbitrary command to be executed on the remote side and reads and writes
to/from stdin/out are sent through the ssh channel. Usage of stderr is
not (yet) supported.
The network pool should be able to keep track of both network device
names and PCI addresses, and return the appropriate one in the
actualDevice when networkAllocateActualDevice is called.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Move the functions the parse/format, and validate PCI addresses to
their own file so they can be conveniently used in other places
besides device_conf.c
Refactoring existing code without causing any functional changes to
prepare for new code.
This patch makes the code reusable.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Add the ability to support VLAN tags for Open vSwitch virtual port
types. To accomplish this, modify virNetDevOpenvswitchAddPort and
virNetDevTapCreateInBridgePort to take a virNetDevVlanPtr
argument. When adding the port to the OVS bridge, setup either a
single VLAN or a trunk port based on the configuration from the
virNetDevVlanPtr.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
When a network device that is a VF of an SR-IOV card was assigned to a
guest using <interface type='hostdev'>, only the MAC address was being
saved/restored, but the VLAN tag was left untouched. Up to now we
haven't actually used vlan tags on SR-IOV devices, so the guest would
have used whatever was set, and left it the same at the end.
The patch following this one will hook up the <vlan> element from the
interface config, so save/restore of the device state needs to also
include the vlan tag.
MAC address is being saved as a simple ASCII string in a file named
for the device under /var/run. The VLAN tag is now just added at the
end of that file, after a newline. It might be nicer if the file was
XML (in case it ever gets more complicated) but at the moment there's
nothing else on the horizon, and this makes backward compatibility
easier.
To allow for the possibility of vlan "trunks", which have more than
one vlan tag associated with them, we need a vlan struct. Since it
will be used by multiple files in src/util, src/conf, src/network, and
src/qemu, it must be defined in src/util. Unfortunately there isn't
currently a common file for simple netdev data definitions, so I
created a new file.
This caused compilation of virnetdevvportprofile.c to fail on systems
without IFLA support in netlink (these are netlink commands used to
configure the VF's of SR-IOV network devices).
It is desirable to be able to query the config params of
the thread pool, in order to save the server state. Add
virThreadPoolGetMinWorkers, virThreadPoolGetMaxWorkers
and virThreadPoolGetPriorityWorkers APIs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
While the QEMU monitor/agent do not want JSON strings pretty
printed, other parts of libvirt might. Instead of hardcoding
QEMU's desired behaviour in virJSONValueToString(), add a
boolean flag to control pretty printing
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Use of ldexp() requires -lm on some platforms; use gnulib to determine
this for our makefile. Also, optimize virRandomInt() for the case
of a power-of-two limit (actually rather common, given that Daniel
has a pending patch to replace virRandomBits(10) with code that will
default to virRandomInt(1024) on default SELinux settings).
* .gnulib: Update to latest, for ldexp.
* bootstrap.conf (gnulib_modules): Import ldexp.
* src/Makefile.am (libvirt_util_la_CFLAGS): Link with -lm when
needed.
* src/util/virrandom.c (virRandomInt): Optimize powers of 2.
This patch adds three utility functions that operate on
virNetDevVPortProfile objects.
* virNetDevVPortProfileCheckComplete() - verifies that all attributes
required for the type of the given virtport are specified.
* virNetDevVPortProfileCheckNoExtras() - verifies that there are no
attributes specified which are inappropriate for the type of the
given virtport.
* virNetDevVPortProfileMerge3() - merges 3 virtports into a single,
newly allocated virtport. If any attributes are specified in
more than one of the three sources, and do not exactly match,
an error is logged and the function fails.
These new functions depend on new fields in the virNetDevVPortProfile
object that keep track of whether or not each attribute was
specified. Since the higher level parse function doesn't yet set those
fields, these functions are not actually usable yet (but that's okay,
because they also aren't yet used - all of that functionality comes in
a later patch.)
Note that these three functions return 0 on success and -1 on
failure. This may seem odd for the first two Check functions, since
they could also easily return true/false, but since they actually log
an error when the requested condition isn't met (and should result in
a failure of the calling function), I thought 0/-1 was more
appropriate.
virNetDevVPortProfile has (had) a type field that can be set to one of
several values, and a union of several structs, one for each
type. When a domain's interface object is of type "network", the
domain config may not know beforehand which type of virtualport is
going to be provided in the actual device handed down from the network
driver at runtime, but may want to set some values in the virtualport
that may or may not be used, depending on the type. To support this
usage, this patch replaces the union of structs with toplevel fields
in the struct, making it possible for all of the fields to be set at
the same time.
Both of these functions returned void, but it's convenient for them to
return a const char* of the char* that is passed in. This was you can
call the function and use the result in the same expression/arg.
The current virRandomBits() API is only usable if the caller wants
a random number in the range [0, n-1) where n is a power of two.
This adds a virRandom() API which generates a double in the
range [0.0,1.0) with 48 bits of entropy. It then also adds a
virRandomInt(uint32_t max) API which generates an unsigned
in the range [0,@max)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
libvirt creates invalid commands if wrong locale is selected. For
example with locale that uses comma as a decimal point, JSON commands
created with decimal numbers are invalid because comma separates the
entries in JSON. Fortunately even when decimal point is affected,
thousands grouping is not, because for grouping to be enabled with
*printf, there has to be an apostrophe flag specified (and supported).
This patch adds specific internal function for converting doubles to
strings with C locale.
This patch introduces a new error code VIR_ERR_OPERATION_UNSUPPORTED to
mark error messages regarding operations that failed due to lack of
support on the hypervisor or other than libvirt issues.
The code is first used in reporting error if qemu does not support block
IO tuning variables yielding error message:
error: Unable to get block I/O throttle parameters
error: Operation not supported: block_io_throttle field
'total_bytes_sec' missing in qemu's output
instead of:
error: Unable to get block I/O throttle parameters
error: internal error cannot read total_bytes_sec
Otherwise, in locations like virobject.c where PROBE is used,
for certain configure options, the compiler warns:
util/virobject.c:110:1: error: 'intptr_t' undeclared (first use in this function)
As long as we are making this header always available, we can
clean up several other files.
* src/internal.h (includes): Pull in <stdint.h>.
* src/conf/nwfilter_conf.h: Rely on internal.h.
* src/storage/storage_backend.c: Likewise.
* src/storage/storage_backend.h: Likewise.
* src/util/cgroup.c: Likewise.
* src/util/sexpr.h: Likewise.
* src/util/virhashcode.h: Likewise.
* src/util/virnetdevvportprofile.h: Likewise.
* src/util/virnetlink.h: Likewise.
* src/util/virrandom.h: Likewise.
* src/vbox/vbox_driver.c: Likewise.
* src/xenapi/xenapi_driver.c: Likewise.
* src/xenapi/xenapi_utils.c: Likewise.
* src/xenapi/xenapi_utils.h: Likewise.
* src/xenxs/xenxs_private.h: Likewise.
* tests/storagebackendsheepdogtest.c: Likewise.
Both LVM volumes and SCSI LUNs have a globally unique
identifier associated with them. It is useful to be able
to query this identifier to then perform disk locking,
rather than try to figure out a stable pathname.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Prevents libvirt from treating RBD backing stores as files. Without this
patch, creating a domain with a qcow2 overlay on an RBD would fail.
This patch essentially extends 9c7c4a4fc5,
which allows nbd backing stores, to allow rbd backing stores.
From man poll(2), poll does not set errno=EAGAIN on interrupt, however
it does set errno=EINTR. Have libvirt retry on the appropriate errno.
Under heavy load, a program of mine kept getting libvirt errors 'poll on
socket failed: Interrupted system call'. The signals were SIGCHLD from
processes forked by threads unrelated to those using libvirt.
This patch is in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=818467
If a caller to virCommandRun doesn't ask for the exitstatus of the
program it's running, the virCommand functions assume that they should
log an error message and return failure if the exit code isn't
0. However, only the commandline and exit status are logged, while
potentially useful information sent by the program to stderr is
discarded.
Fortunately, virCommandRun is already checking if the caller had asked
for stderr to be saved and, if not, sets things up to save it in
*cmd->errbuf. This makes it fairly simple for virCommandWait to
include *cmd->errbuf in the error log (there are still other callers
that don't setup errbuf, and even virCommandRun won't set it up if the
command is being daemonized, so we have to check that it's non-zero).
This introduces a fairly basic reference counted virObject type
and an associated virClass type, that use atomic operations for
ref counting.
In a global initializer (recommended to be invoked using the
virOnceInit API), a virClass type must be allocated for each
object type. This requires a class name, a "dispose" callback
which will be invoked to free memory associated with the object's
fields, and the size in bytes of the object struct.
eg,
virClassPtr connclass = virClassNew("virConnect",
sizeof(virConnect),
virConnectDispose);
The struct for the object, must include 'virObject' as its
first member
eg
struct _virConnect {
virObject object;
virURIPtr uri;
};
The 'dispose' callback is only responsible for freeing
fields in the object, not the object itself. eg a suitable
impl for the above struct would be
void virConnectDispose(void *obj) {
virConnectPtr conn = obj;
virURIFree(conn->uri);
}
There is no need to reset fields to 'NULL' or '0' in the
dispose callback, since the entire object will be memset
to 0, and the klass pointer & magic integer fields will
be poisoned with 0xDEADBEEF before being free()d
When creating an instance of an object, one needs simply
pass the virClassPtr eg
virConnectPtr conn = virObjectNew(connclass);
if (!conn)
return NULL;
conn->uri = virURIParse("foo:///bar")
Object references can be manipulated with
virObjectRef(conn)
virObjectUnref(conn)
The latter returns a true value, if the object has been
freed (ie its ref count hit zero)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
All callers used the same initialization seed (well, the new
viratomictest forgot to look at getpid()); so we might as well
make this value automatic. And while it may feel like we are
giving up functionality, I documented how to get it back in the
unlikely case that you actually need to debug with a fixed
pseudo-random sequence. I left that crippled by default, so
that a stray environment variable doesn't cause a lack of
randomness to become a security issue.
* src/util/virrandom.c (virRandomInitialize): Rename...
(virRandomOnceInit): ...and make static, with one-shot call.
Document how to do fixed-seed debugging.
* src/util/virrandom.h (virRandomInitialize): Drop prototype.
* src/libvirt_private.syms (virrandom.h): Don't export it.
* src/libvirt.c (virInitialize): Adjust caller.
* src/lxc/lxc_controller.c (main): Likewise.
* src/security/virt-aa-helper.c (main): Likewise.
* src/util/iohelper.c (main): Likewise.
* tests/seclabeltest.c (main): Likewise.
* tests/testutils.c (virtTestMain): Likewise.
* tests/viratomictest.c (mymain): Likewise.
There are a few issues with the current virAtomic APIs
- They require use of a virAtomicInt struct instead of a plain
int type
- Several of the methods do not implement memory barriers
- The methods do not implement compiler re-ordering barriers
- There is no Win32 native impl
The GLib library has a nice LGPLv2+ licensed impl of atomic
ops that works with GCC, Win32, or pthreads.h that addresses
all these problems. The main downside to their code is that
the pthreads impl uses a single global mutex, instead of
a per-variable mutex. Given that it does have a Win32 impl
though, we don't expect anyone to seriously use the pthread.h
impl, so this downside is not significant.
* .gitignore: Ignore test case
* configure.ac: Check for which atomic ops impl to use
* src/Makefile.am: Add viratomic.c
* src/nwfilter/nwfilter_dhcpsnoop.c: Switch to new atomic
ops APIs and plain int datatype
* src/util/viratomic.h: inline impls of all atomic ops
for GCC, Win32 and pthreads
* src/util/viratomic.c: Global pthreads mutex for atomic
ops
* tests/viratomictest.c: Test validate to validate safety
of atomic ops.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove the use of a manually run virLogStartup and
virNodeSuspendInitialize methods. Instead make sure they
are automatically run using VIR_ONCE_GLOBAL_INIT
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add function virCommandNewVAList which is equivalent to the
virCommandNewArgList but with va_list instead of a variable number
of arguments.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Parallels Cloud Server is a cloud-ready virtualization
solution that allows users to simultaneously run multiple virtual
machines and containers on the same physical server.
More information can be found here: http://www.parallels.com/products/pcs/
Also beta version of Parallels Cloud Server can be downloaded there.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
This is a follow up patch of commit f9ce7dad6, it modifies all
the files which declare the copyright like "See COPYING.LIB for
the License of this software" to use the detailed/consistent one.
And deserts the outdated comments like:
* libvirt-qemu.h:
* Summary: qemu specific interfaces
* Description: Provides the interfaces of the libvirt library to handle
* qemu specific methods
*
* Copy: Copyright (C) 2010, 2012 Red Hat, Inc.
Uses the more compact style like:
* libvirt-qemu.h: Interfaces specific for QEMU/KVM driver
*
* Copyright (C) 2010, 2012 Red Hat, Inc.
Change the permissible minimum value of nodesuspend duration time
to 60 seconds. If option is less than the value, reports error.
Update virsh help and manpage the infomation.
No check for conn->uri being NULL in virAuthGetConfigFilePath (valid
state) made the client segfault. This happens for example with these
settings:
- no virtualbox driver installed (modifies conn->uri)
- no default URI set (VIRSH_DEFAULT_CONNECT_URI="",
LIBVIRT_DEFAULT_URI="", uri_default="")
- auth_sock_rw="sasl"
- virsh run as root
That are unfortunately the settings with fresh Fedora 17 installation
with VDSM.
The check ought to be enough as conn->uri being NULL is valid in later
code and is handled properly.
Per the FSF address could be changed from time to time, and GNU
recommends the following now: (http://www.gnu.org/licenses/gpl-howto.html)
You should have received a copy of the GNU General Public License
along with Foobar. If not, see <http://www.gnu.org/licenses/>.
This patch removes the explicit FSF address, and uses above instead
(of course, with inserting 'Lesser' before 'General').
Except a bunch of files for security driver, all others are changed
automatically, the copyright for securify files are not complete,
that's why to do it manually:
src/security/security_selinux.h
src/security/security_driver.h
src/security/security_selinux.c
src/security/security_apparmor.h
src/security/security_apparmor.c
src/security/security_driver.c
When reporting a system error (VIR_ERR_SYSTEM_ERROR) via
virReportSystemError, we should copy the errno value into
the 'int1' field of the virErrorPtr struct. This allows
callers to detect certain errno conditions & discard the
error
* src/util/virterror.c: Place errno value in int1 field
ensures that initialization will always take place when it is
needed, and guarantees it only occurs once. The problem is that
the code to setup a global initializer with proper error
propagation is tedious. This introduces VIR_ONCE_GLOBAL_INIT
macro to simplify this.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This removes nearly all the per-file error reporting macros
from the code in src/util/. A few custom macros remain for the
case, where the file needs to report errors with a variety of
different codes or parameters
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virnetdevtap.c and viruri.c files had two error report
messages which were not annotated with _(...)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Nearly every source file does something like
#define VIR_FROM_THIS VIR_FROM_FOO
#define virFooReportErorr(code, ...) \
virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \
__FUNCTION__, __LINE__, \
__VA_ARGS__)
This creates needless duplication and inconsistent error
reporting function names in each file. It is trivial to
just have virterror_internal.h provide a virReportError
macro that is equivalent
* src/util/virterror_internal.h: Define virReportError(code, ...)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce new members in the virMacAddr 'class'
- virMacAddrSet: set virMacAddr from a virMacAddr
- virMacAddrSetRaw: setting virMacAddr from raw 6 byte MAC address buffer
- virMacAddrGetRaw: writing virMacAddr into raw 6 byte MAC address buffer
- virMacAddrCmp: comparing two virMacAddr
- virMacAddrCmpRaw: comparing a virMacAddr with a raw 6 byte MAC address buffer
then replace raw MAC addresses by replacing
- 'unsigned char *' with virMacAddrPtr
- 'unsigned char ... [VIR_MAC_BUFLEN]' with virMacAddr
and introduce usage of above functions where necessary.
When building with --disable-debug, VIR_DEBUG expands to a nop.
But parameters to VIR_DEBUG can be variables that are passed only
to VIR_DEBUG. In the case the building system complains about unused
variables.
Instead of changing the existed virFileMakePath to accept mode
argument and modifying a pile of its uses, this patch introduces
virFileMakePathWithMode, and use it instead of mkdir() to create
the readline history dir.
While it is not currently used elsewhere in libvirt, the code
for finding a free loop device & associating a file with it
is not LXC specific. Move it into the viffile.{c,h} file where
potentially shared code is more commonly kept.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Hello,
This is a patch to fix vm's outbound traffic control problem.
Currently, vm's outbound traffic control by libvirt doesn't go well.
This problem was previously discussed at libvir-list ML, however
it seems that there isn't still any answer to the problem.
http://www.redhat.com/archives/libvir-list/2011-August/msg00333.html
I measured Guest(with virtio-net) to Host TCP throughput with the
command "netperf -H".
Here are the outbound QoS parameters and the results.
outbound average rate[kilobytes/s] : Guest to Host throughput[Mbit/s]
======================================================================
1024 (8Mbit/s) : 4.56
2048 (16Mbit/s) : 3.29
4096 (32Mbit/s) : 3.35
8192 (64Mbit/s) : 3.95
16384 (128Mbit/s) : 4.08
32768 (256Mbit/s) : 3.94
65536 (512Mbit/s) : 3.23
The outbound traffic goes down unreasonably and is even not controled.
The cause of this problem is too large mtu value in "tc filter" command run by
libvirt. The command uses burst value to set mtu and the burst is equal to
average rate value if it's not set. This value is too large. For example
if the average rate is set to 1024 kilobytes/s, the mtu value is set to 1024
kilobytes. That's too large compared to the size of network packets.
Here libvirt applies tc ingress filter to Host's vnet(tun) device.
Tc ingress filter is implemented with TBF(Token Buckets Filter) algorithm. TBF
uses mtu value to calculate the amount of token consumed by each packet. With too
large mtu value, the token consumption rate is set too large. This leads to
token starvation and deterioration of TCP throughput.
Then, should we use the default mtu value 2 kilobytes?
The anser is No, because Guest with virtio-net device uses 65536 bytes
as mtu to transmit packets to Host, and the tc filter with the default mtu
value 2k drops packets whose size is larger than 2k. So, the most packets
is droped and again leads to deterioration of TCP throughput.
The appropriate mtu value is 65536 bytes which is equal to the maximum value
of network interface device defined in <linux/netdevice.h>. The value is
not so large that it causes token starvation and not so small that it
drops most packets.
Therefore this patch set the mtu value to 64kb(== 65535 bytes).
Again, here are the outbound QoS parameters and the TCP throughput with
the libvirt patched.
outbound average rate[kilobytes/s] : Guest to Host throughput[Mbit/s]
======================================================================
1024 (8Mbit/s) : 8.22
2048 (16Mbit/s) : 16.42
4096 (32Mbit/s) : 32.93
8192 (64Mbit/s) : 66.85
16384 (128Mbit/s) : 133.88
32768 (256Mbit/s) : 271.01
65536 (512Mbit/s) : 547.32
The outbound traffic conforms to the given limit.
Thank you,
Signed-off-by: Eiichi Tsukata <eiichi.tsukata.xh@hitachi.com>
In order to retrieve some sysinfo data we need to parse /proc/sysinfo and
/proc/cpuinfo.
Signed-off-by: Thang Pham <thang.pham@us.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Some GNULIB headers (eg unistd.h) will often need to include
winsock2.h for various symbols. There is a rule that winsock2.h
must be included before windows.h. This means that any file
which does
#ifdef WIN32
#include <windows.h>
#endif
#include <unistd.h>
is potentially broken. A simple rule is that /all/ includes of
windows.h must be matched with a preceding include of winsock2.h
regardless of whether unistd.h is used currently
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A core use case of the hook scripts is to be able to do things
to a guest's network configuration. It is possible to hook into
the 'start' operation for a QEMU guest which runs just before
the guest is started. The TAP devices will exist at this point,
but the QEMU process will not. It can be desirable to have a
'started' hook too, which runs once QEMU has started.
If libvirtd is restarted it will re-populate firewall rules,
but there is no QEMU hook to trigger for existing domains.
This is solved with a 'reconnect' hook.
Finally, if attaching to an external QEMU process there needs
to be an 'attach' hook script.
This all also applies to the LXC driver
* docs/hooks.html.in: Document new operations
* src/util/hooks.c, src/util/hooks.c: Add 'started', 'reconnect'
and 'attach' operations for QEMU. Add 'prepare', 'started',
'release' and 'reconnect' operations for LXC
* src/lxc/lxc_driver.c: Add hooks for 'prepare', 'started',
'release' and 'reconnect' operations
* src/qemu/qemu_process.c: Add hooks for 'started', 'reconnect'
and 'reconnect' operations
Right now, the only way to get at the contents of a virBuffer is
to destroy it. But there are cases in my upcoming patches where
peeking at the contents makes life easier. I suppose this does
open up the potential for bad code to dereference a stale pointer,
by disregarding the docs that the return value is invalid on the
next virBuf operation, but such is life.
* src/util/buf.h (virBufferCurrentContent): New declaration.
* src/util/buf.c (virBufferCurrentContent): Implement it.
* src/libvirt_private.syms (buf.h): Export it.
* tests/virbuftest.c (testBufAutoIndent): Test it.
When libvirtd forks off a new child, the child then calls virLogReset(),
which ends up closing file descriptors used as log outputs. However, we
recently started logging closed file descriptors, which means we need to
lock logging mutex which was already locked by virLogReset(). We don't
really want to log anything when we are in the process of closing log
outputs.
There is a theoretical problem of an extreme bug where we can get
into deadlock due to command handshaking. Thanks to a pair of pipes,
we have a situation where the parent thinks the child reported an
error and is waiting for a message from the child to explain the
error; but at the same time the child thinks it reported success
and is waiting for the parent to acknowledge the success; so both
processes are now blocked.
Thankfully, I don't think this deadlock is possible without at
least one other bug in the code, but I did see exactly that sort
of situation prior to commit da831af - I saw a backtrace where a
double close bug in the parent caused the parent to read from the
wrong fd and assume the child failed, even though the child really
sent success.
This potential deadlock is not quite like commit 858c247 (a deadlock
due to multiple readers on one pipe preventing a write from completing),
although the solution is similar - always close unused pipe fds before
blocking, rather than after.
* src/util/command.c (virCommandHandshakeWait): Close unused fds
sooner.
It is possible to deadlock libvirt by having a domain with XML
longer than PIPE_BUF, and by writing a hook script that closes
stdin early. This is because libvirt was keeping a copy of the
child's stdin read fd open, which means the write fd in the
parent will never see EPIPE (remember, libvirt should always be
run with SIGPIPE ignored, so we should never get a SIGPIPE signal).
Since there is no error, libvirt blocks waiting for a write to
complete, even though the only reader is also libvirt. The
solution is to ensure that only the child can act as a reader
before the parent does any writes; and then dealing with the
fallout of dealing with EPIPE.
Thankfully, this is not a security hole - since the only way to
trigger the deadlock is to install a custom hook script, anyone
that already has privileges to install a hook script already has
privileges to do any number of other equally disruptive things
to libvirt; it would only be a security hole if an unprivileged
user could install a hook script to DoS a privileged user.
* src/util/command.c (virCommandRun): Close parent's copy of child
read fd earlier.
(virCommandProcessIO): Don't let EPIPE be fatal; the child may
be done parsing input.
* tests/commandhelper.c (main): Set up a SIGPIPE situation.
* tests/commandtest.c (test20): Trigger it.
* tests/commanddata/test20.log: New file.
EBADF errors are logged as warnings as they normally indicate a double
close bug. This patch also provides VIR_MASS_CLOSE helper to be user in
the only case of mass close after fork when EBADF should rather be
ignored.
KAMEZAWA Hiroyuki reported a nasty double-free bug when virCommand
is used to convert a string into input to a child command. The
problem is that the poll() loop of virCommandProcessIO would close()
the write end of the pipe in order to let the child see EOF, then
the caller virCommandRun() would also close the same fd number, with
the second close possibly nuking an fd opened by some other thread
in the meantime. This in turn can have all sorts of bad effects.
The bug has been present since the introduction of virCommand in
commit f16ad06f.
This is based on his first attempt at a patch, at
https://bugzilla.redhat.com/show_bug.cgi?id=823716
* src/util/command.c (_virCommand): Drop inpipe member.
(virCommandProcessIO): Add argument, to avoid closing caller's fd
without informing caller.
(virCommandRun, virCommandNewArgs): Adjust clients.
Currently, we are logging only one side of pipes we
create in virCommandRequireHandshake(); This is enough
in cases where pipe2() returns two consecutive FDs. However,
it is not guaranteed and it may return any FDs.
Therefore, it's wise to log the other ends as well.
To ensure consistent error reporting of invalid arguments,
provide a number of predefined helper methods & macros.
- An arg which must not be NULL:
virCheckNonNullArgReturn(argname, retvalue)
virCheckNonNullArgGoto(argname, label)
- An arg which must be NULL
virCheckNullArgGoto(argname, label)
- An arg which must be positive (ie 1 or greater)
virCheckPositiveArgGoto(argname, label)
- An arg which must not be 0
virCheckNonZeroArgGoto(argname, label)
- An arg which must be zero
virCheckZeroArgGoto(argname, label)
- An arg which must not be negative (ie 0 or greater)
virCheckNonNegativeArgGoto(argname, label)
* src/libvirt.c, src/libvirt-qemu.c,
src/nodeinfo.c, src/datatypes.c: Update to use
virCheckXXXX macros
* po/POTFILES.in: Add libvirt-qemu.c and virterror_internal.h
* src/internal.h: Define macros for checking invalid args
* src/util/virterror_internal.h: Define macros for reporting
invalid args
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add an impl of +virGetUserRuntimeDirectory, virGetUserCacheDirectory
virGetUserConfigDirectory and virGetUserDirectory for Win32 platform.
Also create stubs for non-Win32 platforms which lack getpwuid_r()
In adding these two helpers were added virFileIsAbsPath and
virFileSkipRoot, along with some macros VIR_FILE_DIR_SEPARATOR,
VIR_FILE_DIR_SEPARATOR_S, VIR_FILE_IS_DIR_SEPARATOR,
VIR_FILE_PATH_SEPARATOR, VIR_FILE_PATH_SEPARATOR_S
All this code was adapted from GLib2 under terms of LGPLv2+ license.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove the uid param from virGetUserConfigDirectory,
virGetUserCacheDirectory, virGetUserRuntimeDirectory,
and virGetUserDirectory
These functions were universally called with the
results of getuid() or geteuid(). To make it practical
to port to Win32, remove the uid parameter and hardcode
geteuid()
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a VIR_ERR_DOMAIN_LAST sentinel for virErrorDomain and
replace the virErrorDomainName function by a VIR_ENUM_IMPL
In the process the naming of error domains is sanitized
* src/util/virterror.c: Use VIR_ENUM_IMPL for converting
error domains to strings
* include/libvirt/virterror.h: Add VIR_ERR_DOMAIN_LAST
The libvirt_private.syms file exports virNetlinkEventServiceLocalPid
so there needs to be a no-op stub for Win32 to avoid linker errors
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
I'm tired of writing:
bool sep = false;
while (...) {
if (sep)
virBufferAddChar(buf, ',');
sep = true;
virBufferAdd(buf, str);
}
This makes it easier, allowing one to write:
while (...)
virBufferAsprintf(buf, "%s,", str);
virBufferTrim(buf, ",", -1);
to trim any remaining comma.
* src/util/buf.h (virBufferTrim): Declare.
* src/util/buf.c (virBufferTrim): New function.
* tests/virbuftest.c (testBufTrim): Test it.
We were being lazy - virnetlink.c was getting uint32_t as a
side-effect from glibc 2.14's <unistd.h>, but older glibc 2.11
does not provide uint32_t from <unistd.h>. In fact, POSIX states
that <unistd.h> need only provide intptr_t, not all of <stdint.h>,
so the bug really is ours. Reported by Jonathan Alescio.
* src/util/virnetlink.h: Include <stdint.h>.
This involves setting the cpuacct cgroup to a per-vcpu granularity,
as well as summing the each vcpu accounting into a common array.
Now that we are reading more than one cgroup file, we double-check
that cpus weren't hot-plugged between reads to invalidate our
summing.
Signed-off-by: Eric Blake <eblake@redhat.com>
Allow the logging APIs to be called with a va_list for format
args, instead of requiring var-args usage.
* src/util/logging.h, src/util/logging.c: Add virLogVMessage
The use of readlink() in lxc_container.c is intentional; we don't
want an absolute pathname there.
* src/util/cgroup.h (VIR_CGROUP_SYSFS_MOUNT): Indent properly.
* cfg.mk (exclude_file_name_regexp--sc_prohibit_readlink): Add
exemption.
Normal practice is for cgroups controllers to be mounted at
/sys/fs/cgroup. When setting up a container, /sys is mounted
with a new sysfs instance, thus we must re-mount all the
cgroups controllers. The complexity is that we must mount
them in the same layout as the host OS. ie if 'cpu' and 'cpuacct'
were mounted at the same location in the host we must preserve
this in the container. Also if any controllers are co-located
we must setup symlinks from the individual controller name to
the co-located mount-point
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Callers of virGetUser{Config,Runtime,Cache}Directory all
append further path component. We should not be
adding a trailing slash in the return path otherwise we
get paths containing '//'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Sometimes it is useful to see the callpath for log messages.
This change enhances the log filter syntax so that stack traces
can be show by setting '1:+NAME' instead of '1:NAME'.
This results in output like:
2012-05-09 14:18:45.136+0000: 13314: debug : virInitialize:414 : register drivers
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virInitialize+0xd6)[0x7f89188ebe86]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x431921]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3a21e21735]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x40a279]
2012-05-09 14:18:45.136+0000: 13314: debug : virRegisterDriver:775 : driver=0x7f8918d02760 name=Test
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virRegisterDriver+0x6b)[0x7f89188ec717]
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(+0x11b3ad)[0x7f891891e3ad]
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virInitialize+0xf3)[0x7f89188ebea3]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x431921]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3a21e21735]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x40a279]
* docs/logging.html.in: Document new syntax
* configure.ac: Check for execinfo.h
* src/util/logging.c, src/util/logging.h: Add support for
stack traces
* tests/testutils.c: Adapt to API change
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As defined in:
http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
This offers a number of advantages:
* Allows sharing a home directory between different machines, or
sessions (eg. using NFS)
* Cleanly separates cache, runtime (eg. sockets), or app data from
user settings
* Supports performing smart or selective migration of settings
between different OS versions
* Supports reseting settings without breaking things
* Makes it possible to clear cache data to make room when the disk
is filling up
* Allows us to write a robust and efficient backup solution
* Allows an admin flexibility to change where data and settings are stored
* Dramatically reduces the complexity and incoherence of the
system for administrators
Until now, the nl_pid of the source address of every message sent by
virNetlinkCommand has been set to the value of getpid(). Most of the
time this doesn't matter, and in the one case where it does
(communication with lldpad), it previously was the proper thing to do,
because the netlink event service (which listens on a netlink socket
for unsolicited messages from lldpad) coincidentally always happened
to bind with a local nl_pid == getpid().
With the fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
that particular nl_pid is now effectively a reserved value, so the
netlink event service will always bind to something else
(coincidentally "getpid() + (1 << 22)", but it really could be
anything). The result is that communication between lldpad and
libvirtd is broken (lldpad gets a "disconnected" error when it tries
to send a directed message).
The solution to this problem caused by a solution, is to query the
netlink event service's nlhandle for its "local_port", and send that
as the source nl_pid (but only when sending to lldpad, of course - in
other cases we maintain the old behavior of sending getpid()).
There are two cases where a message is being directed at lldpad - one
in virNetDevLinkDump, and one in virNetDevVPortProfileOpSetLink.
The case of virNetDevVPortProfileOpSetLink is simplest to explain -
only if !nltarget_kernel, i.e. the message isn't targetted for the
kernel, is the dst_pid set (by calling
virNetDevVPortProfileGetLldpadPid()), so only in that case do we call
virNetlinkEventServiceLocalPid() to set src_pid.
For virNetDevLinkDump, it's a bit more complicated. The call to
virNetDevVPortProfileGetLldpadPid() was effectively up one level (in
virNetDevVPortProfileOpCommon), although obscured by an unnecessary
passing of a function pointer. This patch removes the function
pointer, and calls virNetDevVPortProfileGetLldpadPid() directly in
virNetDevVPortProfileOpCommon - if it's doing this, it knows that it
should also call virNetlinkEventServiceLocalPid() to set src_pid too;
then it just passes src_pid and dst_pid down to
virNetDevLinkDump. Since (src_pid == 0 && dst_pid == 0) implies that
the kernel is the destination, there is no longer any need to send
nltarget_kernel as an arg to virNetDevLinkDump, so it's been removed.
The disparity between src_pid being int and dst_pid being uint32_t may
be a bit disconcerting to some, but I didn't want to complicate
virNetlinkEventServiceLocalPid() by having status returned separately
from the value.
This value will be needed to set the src_pid when sending netlink
messages to lldpad. It is part of the solution to:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
Note that libnl's port generation algorithm guarantees that the
nl_socket_get_local_port() will always be > 0 (since it is "getpid() +
(n << 22>" where n is always < 1024), so it is okay to cast the
uint32_t to int (thus allowing us to use -1 as an error sentinel).
Until now, virNetlinkCommand has assumed that the nl_pid in the source
address of outgoing netlink messages should always be the return value
of getpid(). In most cases it actually doesn't matter, but in the case
of communication with lldpad, lldpad saves this info and later uses it
to send netlink messages back to libvirt. A recent patch to fix Bug
816465 changed the order of the universe such that the netlink event
service socket is no longer bound with nl_pid == getpid(), so lldpad
could no longer send unsolicited messages to libvirtd. Adding src_pid
as an argument to virNetlinkCommand() is the first step in notifying
lldpad of the proper address of the netlink event service socket.
usbFindDevice():get usb device according to
idVendor, idProduct, bus, device
it is the exact match of the four parameters
usbFindDeviceByBus():get usb device according to bus, device
it returns only one usb device same as usbFindDevice
usbFindDeviceByVendor():get usb device according to idVendor,idProduct
it probably returns multiple usb devices.
usbDeviceSearch(): a helper function to do the actual search
These two functions are called from main() on all platforms, and
always return success on platforms that don't support libnl. They
still log an error message, though, which doesn't make sense - they
should just be NOPs on those platforms. (Per a suggestion during
review, I've turned the logs into debug messages rather than removing
them completely).
Error: STRING_NULL:
/libvirt/src/util/uuid.c:273:
string_null_argument: Function "getDMISystemUUID" does not terminate string "*dmiuuid".
/libvirt/src/util/uuid.c:241:
string_null_argument: Function "saferead" fills array "*uuid" with a non-terminated string.
/libvirt/src/util/util.c:101:
string_null_argument: Function "read" fills array "*buf" with a non-terminated string.
/libvirt/src/util/uuid.c:274:
string_null: Passing unterminated string "dmiuuid" to a function expecting a null-terminated string.
/libvirt/src/util/uuid.c:138:
var_assign_parm: Assigning: "cur" = "uuidstr". They now point to the same thing.
/libvirt/src/util/uuid.c:164:
string_null_sink_loop: Searching for null termination in an unterminated array "cur".
configure.ac: check for libnl-3 in addition to libnl-1
src/Makefile.am: link against libnl when needed
src/util/virnetlink.c:
support libnl3 api. To minimize impact on code flow, wrap the
differences under the virNetlink* namespace.
Unfortunately libnl3 moves netlink/msg.h to
/usr/include/libnl3/netlink/msg.h, so the LIBNL_CFLAGS need to be added
to a bunch of places where they weren't needed with libnl1.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Add function virJSONValueObjectKeysNumber, virJSONValueObjectGetKey
and virJSONValueObjectGetValue, which allow you to iterate over all
fields of json object: you can get number of fields and then get
name and value, stored in field with that name by index.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
In fact, the 'tapfd' is always NULL, the function 'virNetDevTapCreate()' hasn't
assign 'fd' to 'tapfd', when the function 'virNetDevSetMAC()' is failed then
goto 'error' label, finally, the VIR_FORCE_CLOSE() will deref a NULL 'tapfd'.
* util/virnetdevtap.c (virNetDevTapCreateInBridgePort): fix a NULL pointer derefing.
* How to reproduce?
$ cat > /tmp/net.xml <<EOF
<network>
<name>test</name>
<forward mode='nat'/>
<bridge name='br1' stp='off' delay='1' />
<mac address='00:00:00:00:00:00'/>
<ip address='192.168.100.1' netmask='255.255.255.0'>
<dhcp>
<range start='192.168.100.2' end='192.168.100.254' />
</dhcp>
</ip>
</network>
EOF
$ virsh net-define /tmp/net.xml
$ virsh net-start test
error: Failed to start network brTest
error: End of file while reading data: Input/output error
Signed-off-by: Alex Jia <ajia@redhat.com>
When libvirtd is started, we create "libvirt/qemu" directories under
hugetlbfs mount point. Only the "qemu" subdirectory is chowned to qemu
user and "libvirt" remains owned by root. If umask was too restrictive
when libvirtd started, qemu user may lose access to "qemu"
subdirectory. Let's explicitly grant search permissions to "libvirt"
directory for all users.
Below patch fixes the following coverity findings
Error: OVERRUN_STATIC:
/libvirt/src/qemu/qemu_command.c:152:
overrun-buffer-val: Overrunning static array "net->mac" of size 6 bytes by passing it as an argument to a function which indexes it at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:948:
access_dbuff_const: Calling "virNetDevMacVLanVPortProfileRegisterCallback" indexes array "macaddress" at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:773:
access_dbuff_const: Calling "memcpy" indexes array "macaddress" with index "16UL" at byte position 15.
Error: OVERRUN_STATIC:
/libvirt/src/qemu/qemu_migration.c:2744:
overrun-buffer-val: Overrunning static array "net->mac" of size 6 bytes by passing it as an argument to a function which indexes it at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:773:
access_dbuff_const: Calling "memcpy" indexes array "macaddress" with index "16UL" at byte position 15.
Error: OVERRUN_STATIC:
/libvirt/src/qemu/qemu_driver.c:435:
overrun-buffer-val: Overrunning static array "net->mac" of size 6 bytes by passing it as an argument to a function which indexes it at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:1036:
access_dbuff_const: Calling "virNetDevMacVLanVPortProfileRegisterCallback" indexes array "macaddress" at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:773:
access_dbuff_const: Calling "memcpy" indexes array "macaddress" with index "16UL" at byte position 15.
Some of the error messages in this function should have been
virReportSystemError (since they have an errno they want to log), but
were mistakenly written as netlinkError, which expects a libvirt error
code instead. The result was that when one of the errors was
encountered, "No error message provided" would be printed instead of
something meaningful (see
https://bugzilla.redhat.com/show_bug.cgi?id=816465 for an example).
This patch resolves https://bugzilla.redhat.com/show_bug.cgi?id=815270
The function virNetDevMacVLanVPortProfileRegisterCallback() takes an
arg "virtPortProfile", and was checking it for non-NULL before using
it. However, the prototype for
virNetDevMacVLanPortProfileRegisterCallback had marked that arg with
ATTRIBUTE_NONNULL(). Contrary to what one may think,
ATTRIBUTE_NONNULL() does not provide any guarantee that an arg marked
as such really is always non-null; the only effect to the code
generated by gcc, is that gcc *assumes* it is non-NULL; this results
in, for example, the check for a non-NULL value being optimized out.
(Unfortunately, this code removal only occurs when optimization is
enabled, and I am in the habit of doing local builds with optimization
off to ease debugging, so the bug didn't show up in my earlier local
testing).
In general, virPortProfile might always be NULL, so it shouldn't be
marked as ATTRIBUTE_NONNULL. One other function prototype made this
same error, so this patch fixes it as well.
Add 2 new functions to the virSocketAddr 'class':
- virSocketAddrEqual: tests whether two IP addresses and their ports are equal
- virSocketaddSetIPv4Addr: set a virSocketAddr given a 32 bit int
This patch improves the previously added virAtomicInt implementation
by using gcc-builtins if possible. The needed builtins are available
since GCC >= 4.1. At least the 4.0 docs don't mention them.
This patch introduces a new block job, useful for live storage
migration using pre-copy streaming. Justification for including
this under virDomainBlockRebase rather than adding a new command
includes: 1) there are now two possible block jobs in qemu, with
virDomainBlockRebase starting either type of command, and
virDomainBlockJobInfo and virDomainBlockJobAbort working to end
either type; 2) reusing this command allows distros to backport
this feature to the libvirt 0.9.10 API without a .so bump.
Note that a future patch may add a more powerful interface named
virDomainBlockJobCopy, dedicated to just the block copy job, in
order to expose even more options (such as setting an arbitrary
format type for the destination without having to probe it from a
pre-existing destination file); adding a new command for targetting
just block copy would be similar to how we already have
virDomainBlockPull for targetting just the block pull job.
Using a live VM with the backing chain:
base <- snap1 <- snap2
as the starting point, we have:
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY)
creates /path/to/copy with the same format as snap2, with no backing
file, so entire chain is copied and flattened
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
creates /path/to/copy as a raw file, so entire chain is copied and
flattened
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_SHALLOW)
creates /path/to/copy with the same format as snap2, but with snap1 as
a backing file, so only snap2 is copied.
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)
reuse existing /path/to/copy (must have empty contents, and format is
probed[*] from the metadata), and copy the full chain
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT|
VIR_DOMAIN_BLOCK_REBASE_SHALLOW)
reuse existing /path/to/copy (contents must be identical to snap1,
and format is probed[*] from the metadata), and copy only the contents
of snap2
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT|
VIR_DOMAIN_BLOCK_REBASE_SHALLOW|VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
reuse existing /path/to/copy (must be raw volume with contents
identical to snap1), and copy only the contents of snap2
Less useful combinations:
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_SHALLOW|
VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
fail if source is not raw, otherwise create /path/to/copy as raw and
the single file is copied (no chain involved)
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT|
VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
makes little sense: the destination must be raw but have no contents,
meaning that it is an empty file, so there is nothing to reuse
The other three flags are rejected without VIR_DOMAIN_BLOCK_COPY.
[*] Note that probing an existing file for its format can be a security
risk _if_ there is a possibility that the existing file is 'raw', in
which case the guest can manipulate the file to appear like some other
format. But, by virtue of the VIR_DOMAIN_BLOCK_REBASE_COPY_RAW flag,
it is possible to avoid probing of raw files, at which point, probing
of any remaining file type is no longer a security risk.
It would be nice if we could issue an event when pivoting from phase 1
to phase 2, but qemu hasn't implemented that, and we would have to poll
in order to synthesize it ourselves. Meanwhile, qemu will give us a
distinct job info and completion event when we either cancel or pivot
to end the job. Pivoting is accomplished via the new:
virDomainBlockJobAbort(dom, disk, VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)
Management applications can pre-create the copy with a relative
backing file name, and use the VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT
flag to have qemu reuse the metadata; if the management application
also copies the backing files to a new location, this can be used
to perform live storage migration of an entire backing chain.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_JOB_TYPE_COPY):
New block job type.
(virDomainBlockJobAbortFlags, virDomainBlockRebaseFlags): New enums.
* src/libvirt.c (virDomainBlockRebase): Document the new flags,
and implement general restrictions on flag combinations.
(virDomainBlockJobAbort): Document the new flag.
(virDomainSaveFlags, virDomainSnapshotCreateXML)
(virDomainRevertToSnapshot, virDomainDetachDeviceFlags): Document
restrictions.
* include/libvirt/virterror.h (VIR_ERR_BLOCK_COPY_ACTIVE): New
error.
* src/util/virterror.c (virErrorMsg): Define it.
virThreadSelf tries to access the virThreadPtr stored in TLS for the
current thread via TlsGetValue. When virThreadSelf is called on a thread
that was not created via virThreadCreate (e.g. the main thread) then
TlsGetValue returns NULL as TlsAlloc initializes TLS slots to NULL.
virThreadSelf can be called on the main thread via this call chain from
virsh
vshDeinit
virEventAddTimeout
virEventPollAddTimeout
virEventPollInterruptLocked
virThreadIsSelf
triggering a segfault as virThreadSelf unconditionally dereferences the
return value of TlsGetValue.
Fix this by making virThreadSelf check the TLS slot value for NULL and
setting the given virThreadPtr accordingly.
Reported by Marcel Müller.
Ensure we don't introduce any more lousy integer parsing in new
code, while avoiding a scrub-down of existing legacy code.
Note that we also need to enable sc_prohibit_atoi_atof (see cfg.mk
local-checks-to-skip) before we are bulletproof, but that also
entails scrubbing I'm not ready to do at the moment.
* src/util/util.c (virStrToLong_i, virStrToLong_ui)
(virStrToLong_l, virStrToLong_ul, virStrToLong_ll)
(virStrToLong_ull, virStrToDouble): Mark exemptions.
* src/util/virmacaddr.c (virMacAddrParse): Likewise.
* cfg.mk (sc_prohibit_strtol): New syntax check.
(exclude_file_name_regexp--sc_prohibit_strtol): Ignore files that
I'm not willing to fix yet.
(local-checks-to-skip): Re-enable sc_prohibit_atoi_atof.
https://bugzilla.redhat.com/show_bug.cgi?id=617711 reported that
even with my recent patched to allow <memory unit='G'>1</memory>,
people can still get away with trying <memory>1G</memory> and
silently get <memory unit='KiB'>1</memory> instead. While
virt-xml-validate catches the error, our C parser did not.
Not to mention that it's always fun to fix bugs while reducing
lines of code. :)
* src/conf/domain_conf.c (virDomainParseMemory): Check for parse error.
(virDomainDefParseXML): Avoid strtoll.
* src/conf/storage_conf.c (virStorageDefParsePerms): Likewise.
* src/util/xml.c (virXPathLongBase, virXPathULongBase)
(virXPathULongLong, virXPathLongLong): Likewise.
DBus connection. The HAL device code further requires that
the DBus connection is integrated with the event loop and
provides such glue logic itself.
The forthcoming FirewallD integration also requires a
dbus connection with event loop integration. Thus we need
to pull the current event loop glue out of the HAL driver.
Thus we create src/util/virdbus.{c,h} files. This contains
just one method virDBusGetSystemBus() which obtains a handle
to the single shared system bus instance, with event glue
automagically setup.
Currently upon a migration a callback is created when a 802.1qbg link
is set to PREASSOCIATE, this should not happen because this is a no-op
on most switches, and does not lead to an ASSOCIATE state. This patch
only creates callbacks when CREATE or RESTORE is requested. Migration
and libvirtd restart scenarios are already handled elsewhere.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
The linux-2.6.32 kernel header does not yet define IFLA_VF_MAX and others,
which breaks compiling a new libvirt on old systems like Debian Squeeze.
(I also have to add --without-macvtap --disable-werror --without-virtualport to
./configure to get it to compile.)
Signed-off-by: Philipp Hahn <hahn@univention.de>
This patch adds a netlink callback when migrating a VEPA enabled
virtual machine. It fixes a Bug where a VM would not request a port
association when it was cleared by lldpad.
This patch requires the latest git version of lldpad to work.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
Leak introduced in commit 0436d32. If we allocate an actions array,
but fail early enough to never consume it with the qemu monitor
transaction call, we leaked memory.
But our semantics of making the transaction command free the caller's
memory is awkward; avoiding the memory leak requires making every
intermediate function in the call chain check for error. It is much
easier to fix things so that the function that allocates also frees,
while the call chain leaves the caller's data intact. To do that,
I had to hack our JSON data structure to make it easy to protect a
portion of an arbitrary JSON tree from being freed.
* src/util/json.h (virJSONType): Name the enum.
(_virJSONValue): New field.
* src/util/json.c (virJSONValueFree): Use it to protect a portion
of an array.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONTransaction): Avoid
freeing caller's data.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive):
Free actions array on failure.
https://bugzilla.redhat.com/show_bug.cgi?id=808979
The leak is really in virProcessInfoGetAffinity, as shown in the
valgrind output given in the above bug report - it calls CPU_ALLOC(),
but then fails to call CPU_FREE().
This leak has existed in every version of libvirt since 0.7.5.
With latest gnulib we are checking even the lowest level functions
whether they check flags. Moreover, we are shadowing the real error
on system without TUNSETIFF support.
The code is splattered with a mix of
sizeof foo
sizeof (foo)
sizeof(foo)
Standardize on sizeof(foo) and add a syntax check rule to
enforce it
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A handful of places used %zd for format specifiers even
though the args was size_t, not ssize_t.
* src/remote/remote_driver.c, src/util/xml.c: s/%zd/%zu/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
An upstream gnulib bug[1] meant that some of our syntax checks
weren't being run. Fix up our offenders before we upgrade to
a newer gnulib.
[1] https://lists.gnu.org/archive/html/bug-gnulib/2012-03/msg00194.html
* src/util/virnetdevtap.c (virNetDevTapCreate): Use flags.
* tests/lxcxml2xmltest.c (mymain): Strip useless ().
When libvirtd is restarted, also restart the netlink event
message callbacks for existing VEPA connections and send
a message to lldpad for these existing links, so it learns
the new libvirtd pid.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
Return statements with parameter enclosed in parentheses were modified
and parentheses were removed. The whole change was scripted, here is how:
List of files was obtained using this command:
git grep -l -e '\<return\s*([^()]*\(([^()]*)[^()]*\)*)\s*;' | \
grep -e '\.[ch]$' -e '\.py$'
Found files were modified with this command:
sed -i -e \
's_^\(.*\<return\)\s*(\(\([^()]*([^()]*)[^()]*\)*\))\s*\(;.*$\)_\1 \2\4_' \
-e 's_^\(.*\<return\)\s*(\([^()]*\))\s*\(;.*$\)_\1 \2\3_'
Then checked for nonsense.
The whole command looks like this:
git grep -l -e '\<return\s*([^()]*\(([^()]*)[^()]*\)*)\s*;' | \
grep -e '\.[ch]$' -e '\.py$' | xargs sed -i -e \
's_^\(.*\<return\)\s*(\(\([^()]*([^()]*)[^()]*\)*\))\s*\(;.*$\)_\1 \2\4_' \
-e 's_^\(.*\<return\)\s*(\([^()]*\))\s*\(;.*$\)_\1 \2\3_'
When qparams support was dropped in commit bc1ff160, we forgot
to add tests to ensure that viruri can do the same round trip
handling of a URI. This round trip was broken, due to use
of the old 'query' field of xmlUriPtr, instead of the new
'query_raw'
Also, we forgot to report an OOM error.
* tests/viruritest.c (mymain): Add tests based on just-deleted
qparamtest.
(testURIParse): Allow difference in input and expected output.
* src/util/viruri.c (virURIFormat): Add missing error. Use
query_raw, instead of query for xmlUriPtr object.
Libvirt on x86 parses 'dmidecode' to gather characteristics of host
system. On PowerPC, this is now implemented by reading /proc/cpuinfo
NOTE: memory-DIMM information is not presently implemented.
Acked-by: Daniel Veillard <veillard@redhat.com>
Acked-by: Daniel P Berrange <berrange@redhat.com>
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
When SASL requests auth credentials, try to look them up in the
config file first. If any are found, remove them from the list
that the user is prompted for
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that the functions in virauth.h have names matching the file
prefix, by renaming virRequest{Username,Password} to
virAuthGet{Username,Password}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To follow latest naming conventions, rename src/util/authhelper.[ch]
to src/util/virauth.[ch].
* src/util/authhelper.[ch]: Rename to src/util/virauth.[ch]
* src/esx/esx_driver.c, src/hyperv/hyperv_driver.c,
src/phyp/phyp_driver.c, src/xenapi/xenapi_driver.c: Update
for renamed include files
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The '.ini' file format is a useful alternative to the existing
config file style, when you need to have config files which
are hashes of hashes. The 'virKeyFilePtr' object provides a
way to parse these file types.
* src/Makefile.am, src/util/virkeyfile.c,
src/util/virkeyfile.h: Add .ini file parser
* tests/Makefile.am, tests/virkeyfiletest.c: Test
basic parsing capabilities
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert drivers currently using the qparams APIs, to instead
use the virURIPtr query parameters directly.
* src/esx/esx_util.c, src/hyperv/hyperv_util.c,
src/remote/remote_driver.c, src/xenapi/xenapi_utils.c: Remove
use of qparams
* src/util/qparams.h, src/util/qparams.c: Delete
* src/Makefile.am, src/libvirt_private.syms: Remove qparams
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Avoid the need for each driver to parse query parameters itself
by storing them directly in the virURIPtr struct. The parsing
code is a copy of that from src/util/qparams.c The latter will
be removed in a later patch
* src/util/viruri.h: Add query params to virURIPtr
* src/util/viruri.c: Parse query parameters when creating virURIPtr
* tests/viruritest.c: Expand test to cover params
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of just typedef'ing the xmlURIPtr struct for virURIPtr,
use a custom libvirt struct. This allows us to fix various
problems with libxml2. This initially just fixes the query vs
query_raw handling problems.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The parameter in the virURIFormat impl mistakenly used the
xmlURIPtr type, instead of virURIPtr. Since they will soon
cease to be identical, this needs fixing
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since we defined a custom virURIPtr type, we should use a
virURIFree method instead of assuming it will always be
a typedef for xmlURIPtr
* src/util/viruri.c, src/util/viruri.h, src/libvirt_private.syms:
Add a virURIFree method
* src/datatypes.c, src/esx/esx_driver.c, src/libvirt.c,
src/qemu/qemu_migration.c, src/vmx/vmx.c, src/xen/xend_internal.c,
tests/viruritest.c: s/xmlFreeURI/virURIFree/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A few times libvirt users manually setting mac addresses have
complained of a networking failure that ends up being due to a multicast
mac address being used for a guest interface. This patch prevents that
by logging an error and failing if a multicast mac address is
encountered in each of the three following cases:
1) domain xml <interface> mac address.
2) network xml bridge mac address.
3) network xml dhcp/host mac address.
There are several other places where a mac address can be input that
aren't controlled in this manner because failure to do so has no
consequences (e.g., if the address will be used to search through
existing interfaces for a match).
The RNG has been updated to add multiMacAddr and uniMacAddr along with
the existing macAddr, and macAddr was switched to uniMacAddr where
appropriate.
This patch is in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=798467
If a guest's tap device is created using the same MAC address the
guest uses for its own network card (which connects to the tap
device), the Linux kernel will log the following message and traffic
will not pass:
kernel: vnet9: received packet with own address as source address
This patch disallows MAC addresses with a first byte of 0xFE, but only in
the case that the MAC address is used for a guest interface that's
connected by way of a standard tap device. (In other words, the
validation is done at runtime at the same place the MAC address is
modified for the tap device, rather than when mac address is parsed,
the idea being that it is then we know for sure the address will be
problematic.)
This patch fixes a NULL pointer check that was causing SegFault on
some specific configurations. It also reverts commit 59d0c9801c
that was checking for this value in one place.
Thanks to cgroups, providing user vs. system time of the overall
guest is easy to add to our existing API.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_CPU_STATS_USERTIME)
(VIR_DOMAIN_CPU_STATS_SYSTEMTIME): New constants.
* src/util/virtypedparam.h (virTypedParameterArrayValidate)
(virTypedParameterAssign): Enforce checking the result.
* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Fix offender.
(qemuDomainGetTotalcpuStats): Implement new parameters.
* tools/virsh.c (cmdCPUStats): Tweak output accordingly.
As documented in linux.git/Documentation/cgroups/cpuacct.txt,
cpuacct.stat returns user and system time in ticks (the same
unit used in times(2)). It would be a bit nicer if it were like
getrusage(2) and reported timeval contents, or like cpuacct.usage
and in nanoseconds, but we can't be picky.
* src/util/cgroup.h (virCgroupGetCpuacctStat): New function.
* src/util/cgroup.c (virCgroupGetCpuacctStat): Implement it.
(virCgroupGetValueStr): Allow for multi-line files.
* src/libvirt_private.syms (cgroup.h): Export it.
If there is a disk file with a comma in the name, QEmu expects a double
comma instead of a single one (e.g., the file "virtual,disk.img" needs
to be specified as "virtual,,disk.img" in QEmu's command line). This
patch fixes libvirt to work with that feature. Fix RHBZ #801036.
Based on an initial patch by Crístian Viana.
* src/util/buf.h (virBufferEscape): Alter signature.
* src/util/buf.c (virBufferEscape): Add parameter.
(virBufferEscapeSexpr): Fix caller.
* src/qemu/qemu_command.c (qemuBuildRBDString): Likewise. Also
escape commas in file names.
(qemuBuildDriveStr): Escape commas in file names.
* docs/schemas/basictypes.rng (absFilePath): Relax RNG to allow
commas in input file names.
* tests/qemuxml2argvdata/*-disk-drive-network-sheepdog.*: Update
test.
Signed-off-by: Eric Blake <eblake@redhat.com>
This is nearly identical to an earlier patch for virnetlink.c.
There are special stub versions of all public functions in this file
that are compiled when the platform isn't linux. Each of these
functions had an almost identical message, differing only in the
function name included in the message. Since log messages already
contain the function name, we can just define a const char* with the
common part of the string, and use that same string for all the log
messages.
If nothing else, this at least makes for less strings that need
translating...
There are several functions that call virNetlinkCommand, and they all
follow a common pattern, with three exit labels: err_exit (or
cleanup), malformed_resp, and buffer_too_small. All three of these
labels do their own cleanup and have their own return. However, the
malformed_resp label usually frees the same items as the
cleanup/err_exit label, and the buffer_too_small label just doesn't
free recvbuf (because it's known to always be NULL at the time we goto
buffer_too_small.
In order to simplify and standardize the code, I've made the following
changes to all of these functions:
1) err_exit is replaced with the more libvirt-ish "cleanup", which
makes sense because in all cases this code is also executed in the
case of success, so labelling it err_exit may be confusing.
2) rc is initialized to -1, and set to 0 just before the cleanup
label. Any code that currently sets rc = -1 is made to instead goto
cleanup.
3) malformed_resp and buffer_too_small just log their error and goto
cleanup. This gives us a single return path, and a single place to
free up resources.
4) In one instance, rather then logging an error immediately, a char*
msg was pointed to an error string, then goto cleanup (and cleanup
would log an error if msg != NULL). It takes no more lines of code
to just log the message as we encounter it.
This patch should have 0 functional effects.
There are special stub versions of all public functions in this file
that are compiled when either libnl isn't available or the platform
isn't linux. Each of these functions had two almost identical message,
differing only in the function name included in the message. Since log
messages already contain the function name, we can just define a const
char* with the common part of the string, and use that same string for
all the log messages.
Also, rather than doing #if defined ... #else ... #endif *inside the
error log macro invocation*, this patch does #if defined ... just
once, using it to decide which single string to define. This turns the
error log in each function from 6 lines, to 1 line.
This patch will allow OpenFlow controllers to identify which interface
belongs to a particular VM by using the Domain UUID.
ovs-vsctl get Interface vnet0 external_ids
{attached-mac="52:54:00:8C:55:2C", iface-id="83ce45d6-3639-096e-ab3c-21f66a05f7fa", iface-status=active, vm-id="142a90a7-0acc-ab92-511c-586f12da8851"}
V2 changes:
Replaced vm-uuid with vm-id. There was a discussion in Open vSwitch
mailinglist that we should stick with the same DB key postfixes for the
sake of consistency (e.g iface-id, vm-id ...).
The indentation on the final lines of the function was off by four
spaces, making me wonder for a second if there was something
missing. (There wasn't.)
If we need to virFork() to check assess() under different
UID+GID we need to translate returned status via WEXITSTATUS().
Otherwise, we may return values greater than 255 which is
obviously wrong.
Scaling an integer based on a suffix is something we plan on reusing
in several contexts: XML parsing, virsh CLI parsing, and possibly
elsewhere. Make it easy to reuse, as well as adding in support for
powers of 1000.
* src/util/util.h (virScaleInteger): New function.
* src/util/util.c (virScaleInteger): Implement it.
* src/libvirt_private.syms (util.h): Export it.
Overflow can be user-induced, so it deserves more than being called
an internal error. Note that in general, 32-bit platforms have
far more places to trigger this error (anywhere the public API
used 'unsigned long' but the other side of the connection is a
64-bit server); but some are possible on 64-bit platforms (where
the public API computes the product of two numbers).
* include/libvirt/virterror.h (VIR_ERR_OVERFLOW): New error.
* src/util/virterror.c (virErrorMsg): Translate it.
* src/libvirt.c (virDomainSetVcpusFlags, virDomainGetVcpuPinInfo)
(virDomainGetVcpus, virDomainGetCPUStats): Use it.
* daemon/remote.c (HYPER_TO_TYPE): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockResize): Likewise.
ATTRIBUTE_UNUSED was accidentally forgotten on one arg of a stub
function for functionality that's not present on non-linux
platforms. This causes a non-linux build with
--enable-compile-warnings=error to fail.
* For now, only "cpu_time" is supported.
* cpuacct cgroup is used for providing percpu cputime information.
* src/qemu/qemu.conf - take care of cpuacct cgroup.
* src/qemu/qemu_conf.c - take care of cpuacct cgroup.
* src/qemu/qemu_driver.c - added an interface
* src/util/cgroup.c/h - added interface for getting percpu cputime
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
This patch includes the following changes to virnetdevmacvlan.c and
virnetdevvportprofile.c:
- removes some netlink functions which are now available in
virnetdev.c
- Adds a vf argument to all port profile functions.
For 802.1Qbh devices, the port profile calls can use a vf argument if
passed by the caller. If the vf argument is -1 it will try to derive the vf
if the device passed is a virtual function.
For 802.1Qbg devices, This patch introduces a null check for the device
argument because during port profile assignment on a hostdev, this argument
can be null.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
This patch adds the following:
- functions to set and get vf configs
- Functions to replace and store vf configs (Only mac address is handled today.
But the functions can be easily extended for vlans and other vf configs)
- function to dump link dev info (This is moved from virnetdevvportprofile.c)
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
pciDeviceGetVirtualFunctionInfo returns pf netdevice name and virtual
function index for a given vf. This is just a wrapper around existing functions
to return vf's pf and vf_index with one api call
pciConfigAddressToSysfsfile returns the sysfile pci device link
from a 'struct pci_config_address'
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Commit 723d5c (added after the release of 0.9.10) adds a
NetlinkEventClient for each interface sent to
virNetDevMacVLanCreateWithVPortProfile. This should only be done if
the interface actually *has* a virtPortProfile, otherwise the event
handler would be a NOP. The bigger problem is that part of the setup
to create the NetlinkEventClient is to do a memcpy of virtPortProfile
- if it's NULL, this triggers a segv.
This patch just qualifies the code that adds the client - if
virtPortProfile is NULL, it's skipped.
With an additional new bool added to determine whether or not to
discourage the use of the supplied MAC address by the bridge itself,
virNetDevTapCreateInBridgePort had three booleans (well, 2 bools and
an int used as a bool) in the arg list, which made it increasingly
difficult to follow what was going on. This patch combines those three
into a single flags arg, which not only shortens the arg list, but
makes it more self-documenting.
When a tap device for a domain is created and attached to a bridge,
the first byte of the tap device MAC address is set to 0xFE, while the
rest is set to match the MAC address that will be presented to the
guest as its network device MAC address. Setting this high value in
the tap's MAC address discourages the bridge from using the tap
device's MAC address as the bridge's own MAC address (Linux bridges
always take on the lowest numbered MAC address of all attached devices
as their own).
In one case within libvirt, a tap device is created and attached to
the bridge with the intent that its MAC address be taken on by the
bridge as its own (this is used to assure that the bridge has a fixed
MAC address to prevent network outages created by the bridge MAC
address "flapping" as guests are started and stopped). In this case,
the first byte of the mac address is *not* altered to 0xFE.
In the current code, callers to virNetDevTapCreateInBridgePort each
make the MAC address modification themselves before calling, which
leads to code duplication, and also prevents lower level functions
from knowing the real MAC address being used by the guest. The problem
here is that openvswitch bridges must be informed about this MAC
address, or they will be unable to pass traffic to/from the guest.
This patch centralizes the location of the MAC address "0xFE fixup"
into virNetDevTapCreateInBridgePort(), meaning 1) callers of this
function no longer need the extra strange bit of code, and 2)
bitNetDevTapCreateBridgeInPort itself now is called with the guest's
unaltered MAC address, and can pass it on, unmodified, to
virNetDevOpenvswitchAddPort.
There is no other behavioral change created by this patch.
Nuke the last vestiges of printing pid_t values with the wrong
types, at least in code compiled on mingw64. There may be other
places, but for now they are only compiled on systems where the
existing %d doesn't trigger gcc warnings.
* src/rpc/virnetsocket.c (virNetSocketNew): Use %lld and casting,
rather than assuming any particular int type for pid_t.
* src/util/command.c (virCommandRunAsync, virPidWait)
(virPidAbort): Likewise.
(verify): Drop a now stale assertion.
No thanks to 64-bit windows, with 64-bit pid_t, we have to avoid
constructs like 'int pid'. Our API in libvirt-qemu cannot be
changed without breaking ABI; but then again, libvirt-qemu can
only be used on systems that support UNIX sockets, which rules
out Windows (even if qemu could be compiled there) - so for all
points on the call chain that interact with this API decision,
we require a different variable name to make it clear that we
audited the use for safety.
Adding a syntax-check rule only solves half the battle; anywhere
that uses printf on a pid_t still needs to be converted, but that
will be a separate patch.
* cfg.mk (sc_correct_id_types): New syntax check.
* src/libvirt-qemu.c (virDomainQemuAttach): Document why we didn't
use pid_t for pid, and validate for overflow.
* include/libvirt/libvirt-qemu.h (virDomainQemuAttach): Tweak name
for syntax check.
* src/vmware/vmware_conf.c (vmwareExtractPid): Likewise.
* src/driver.h (virDrvDomainQemuAttach): Likewise.
* tools/virsh.c (cmdQemuAttach): Likewise.
* src/remote/qemu_protocol.x (qemu_domain_attach_args): Likewise.
* src/qemu_protocol-structs (qemu_domain_attach_args): Likewise.
* src/util/cgroup.c (virCgroupPidCode, virCgroupKillInternal):
Likewise.
* src/qemu/qemu_command.c(qemuParseProcFileStrings): Likewise.
(qemuParseCommandLinePid): Use pid_t for pid.
* daemon/libvirtd.c (daemonForkIntoBackground): Likewise.
* src/conf/domain_conf.h (_virDomainObj): Likewise.
* src/probes.d (rpc_socket_new): Likewise.
* src/qemu/qemu_command.h (qemuParseCommandLinePid): Likewise.
* src/qemu/qemu_driver.c (qemudGetProcessInfo, qemuDomainAttach):
Likewise.
* src/qemu/qemu_process.c (qemuProcessAttach): Likewise.
* src/qemu/qemu_process.h (qemuProcessAttach): Likewise.
* src/uml/uml_driver.c (umlGetProcessInfo): Likewise.
* src/util/virnetdev.h (virNetDevSetNamespace): Likewise.
* src/util/virnetdev.c (virNetDevSetNamespace): Likewise.
* tests/testutils.c (virtTestCaptureProgramOutput): Likewise.
* src/conf/storage_conf.h (_virStoragePerms): Use mode_t, uid_t,
and gid_t rather than int.
* src/security/security_dac.c (virSecurityDACSetOwnership): Likewise.
* src/conf/storage_conf.c (virStorageDefParsePerms): Avoid
compiler warning.
Commit 7c90026 added #include "conf/domain_conf.h" to
util/virrandom.c. Fortunately it didn't actually use anything from
domain_conf.h, since as far as I'm aware, files in util aren't allowed
to reference anything in conf (although the opposite is allowed). So
this #include is unnecessary.
I verified it still compiles with the line removed, but have placed a
one day moratorium on me doing any "trivial rule" pushes, so will
wait for someone else to verify/ACK before pushing.
Add de-association handling for 802.1qbg (vepa) via lldpad
netlink messages. Also adds the possibility to perform an
association request without waiting for a confirmation.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
This code adds a netlink event interface to libvirt.
It is based upon the event_poll code and makes use of
it. An event is generated for each netlink message sent
to the libvirt pid.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
This patch changes behavior of virPidFileRead to enable passing NULL as
path to the binary the pid file should be checked against to skip this
check. This enables using this function for reading files that have same
semantics as pid files, but belong to unknown processes.
Function xmlParseURI does not remove square brackets around IPv6
address when parsing. One of the solutions is making wrappers around
functions working with xmlURI*. This assures that uri->server will be
always properly assigned and it doesn't have to be changed when used
on some new place in the code.
For this purpose, functions virParseURI and virSaveURI were
added. These function are wrappers around xmlParseURI and xmlSaveUri
respectively.
Also there is one new syntax check function to prohibit these functions
anywhere else.
File changes:
- src/util/viruri.h -- declaration
- src/util/viruri.c -- definition
- src/libvirt_private.syms -- symbol export
- src/Makefile.am -- added source and header files
- cfg.mk -- added sc_prohibit_xmlURI
- all others -- ID name and include fixes
[forwarding this here from RH bug #796732]
When creating a network (virsh net-create) with an erroneous XML
containing an empty <name> element, the error message is misleading:
error: Failed to create network from foo.xml
error: missing domain name information
It took me a bit of time to figure out that it was the *network* name
that was missing (I generate this xml and didn't look at it, first).
I realized that the same message is used for missing name when creating
a domain, network, or device node.
This patch adds VIR_MIGRATE_UNSAFE flag for migration APIs and new
VIR_ERR_MIGRATION_UNSAFE error code. The error code should be returned
whenever migrating a domain is considered unsafe (e.g., it's configured
in a way that does not ensure data integrity once it is migrated).
VIR_MIGRATE_UNSAFE flag may be used to force migration even though it
would normally be considered unsafe and forbidden.
AC_CHECK_PROG checks for program in given path. However, if it doesn't
exists, [variable] is set to [value-if-not-found]. We don't want this
to be the empty string in case of 'modprobe' and 'scrub' as we want to
fallback to runtime detection.
* src/util/virfile.h: the virFileWrapperFdFlags being defined as
a globa variable instead of a type ended up generating a duplicate
symbol error.
* AUTHORS: added Lincoln Myers
This patch allows libvirt to add interfaces to already
existing Open vSwitch bridges. The following syntax in
domain XML file can be used:
<interface type='bridge'>
<mac address='52:54:00:d0:3f:f2'/>
<source bridge='ovsbr'/>
<virtualport type='openvswitch'>
<parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'/>
</virtualport>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
or if libvirt should auto-generate the interfaceid use
following syntax:
<interface type='bridge'>
<mac address='52:54:00:d0:3f:f2'/>
<source bridge='ovsbr'/>
<virtualport type='openvswitch'>
</virtualport>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
It is also possible to pass an optional profileid. To do that
use following syntax:
<interface type='bridge'>
<source bridge='ovsbr'/>
<mac address='00:55:1a:65:a2:8d'/>
<virtualport type='openvswitch'>
<parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'
profileid='test-profile'/>
</virtualport>
</interface>
To create Open vSwitch bridge install Open vSwitch and
run the following command:
ovs-vsctl add-br ovsbr
The auto-generated WWN comply with the new addressing schema of WWN:
<quote>
the first nibble is either hex 5 or 6 followed by a 3-byte vendor
identifier and 36 bits for a vendor-specified serial number.
</quote>
We choose hex 5 for the first nibble. And for the 3-bytes vendor ID,
we uses the OUI according to underlying hypervisor type, (invoking
virConnectGetType to get the virt type). e.g. If virConnectGetType
returns "QEMU", we use Qumranet's OUI (00:1A:4A), if returns
ESX|VMWARE, we use VMWARE's OUI (00:05:69). Currently it only
supports qemu|xen|libxl|xenapi|hyperv|esx|vmware drivers. The last
36 bits are auto-generated.
Now that no one is relying on the return value being a pointer to
somewhere inside of the passed-in argument, we can simplify the
callers to simply return success or failure. Also wrap some long
lines and add some const-correctness.
* src/util/sysinfo.c (virSysinfoParseBIOS, virSysinfoParseSystem)
(virSysinfoParseProcessor, virSysinfoParseMemory): Change return.
(virSysinfoRead): Adjust caller.
virFileDirectFd was used for accessing files opened with O_DIRECT using
libvirt_iohelper. We will want to use the helper for accessing files
regardless on O_DIRECT and thus virFileDirectFd was generalized and
renamed to virFileWrapperFd.
dmidecode displays processor information, followed by BIOS, system and
memory-DIMM details.
Calls to virSysinfoParseBIOS(), virSysinfoParseSystem() would update
the buffer pointer 'base', so the processor information would be lost
before virSysinfoParseProcessor() was called. Sysinfo would therefore
not be able to display processor details -- It only described <bios>,
<system> and <memory_device> details.
This patch attempts to insulate sysinfo from ordering of dmidecode
output.
Before the fix:
---------------
virsh # sysinfo
<sysinfo type='smbios'>
<bios>
....
</bios>
<system>
....
</system>
<memory_device>
....
</memory_device>
After the fix:
-------------
virsh # sysinfo
<sysinfo type='smbios'>
<bios>
....
</bios>
<system>
....
</system>
<processor>
....
</processor>
<memory_device>
....
</memory_device>
gcc 4.7 complains:
util/virhashcode.c:49:17: error: always_inline function might not be inlinable [-Werror=attributes]
util/virhashcode.c:35:17: error: always_inline function might not be inlinable [-Werror=attributes]
Normal 'inline' is a hint that the compiler may ignore; the fact
that the function is static is good enough. We don't care if the
compiler decided not to inline after all.
* src/util/virhashcode.c (getblock, fmix): Relax attribute.
virFileOpenAs previously would only try opening a file as the current
user, or as a different user, but wouldn't try both methods in a
single call. This made it cumbersome to use as a replacement for
open(2). Additionally, it had a lot of historical baggage that led to
it being difficult to understand.
This patch refactors virFileOpenAs in the following ways:
* reorganize the code so that everything dealing with both the parent
and child sides of the "fork+setuid+setgid+open" method are in a
separate function. This makes the public function easier to understand.
* Allow a single call to virFileOpenAs() to first attempt the open as
the current user, and if that fails to automatically re-try after
doing fork+setuid (if deemed appropriate, i.e. errno indicates it
would now be successful, and the file is on a networkFS). This makes
it possible (in many, but possibly not all, cases) to drop-in
virFileOpenAs() as a replacement for open(2).
(NB: currently qemuOpenFile() calls virFileOpenAs() twice, once
without forking, then again with forking. That unfortunately can't
be changed without at least some discussion of the ramifications,
because the requested file permissions are different in each case,
which is something that a single call to virFileOpenAs() can't deal
with.)
* Add a flag so that any fchown() of the file to a different uid:gid
is explicitly requested when the function is called, rather than it
being implied by the presence of the O_CREAT flag. This just makes
for less subtle surprises to consumers. (Commit
b1643dc15c added the check for O_CREAT
before forcing ownership. This patch just makes that restriction
more explicit.)
* If either the uid or gid is specified as "-1", virFileOpenAs will
interpret this to mean "the current [gu]id".
All current consumers of virFileOpenAs should retain their present
behavior (after a few minor changes to their setup code and
arguments).
Rename the src/util/netlink files to src/util/virnetlink to
better fit the naming scheme. Also rename nlComm to virNetlinkCommand.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
Sometimes, its easier to run children with 2>&1 in shell notation,
and just deal with stdout and stderr interleaved. This was already
possible for fd handling; extend it to also work when doing string
capture of a child process.
* docs/internals/command.html.in: Document this.
* src/util/command.c (virCommandSetErrorBuffer): Likewise.
(virCommandRun, virExecWithHook): Implement it.
* tests/commandtest.c (test14): Test it.
* daemon/remote.c (remoteDispatchAuthPolkit): Use new command
feature.