Through conversation with Kumar L Srikanth-B22348, I found
that the function of getting memory usage (e.g., virsh dominfo)
doesn't work for lxc with ns subsystem of cgroup enabled.
This is because of features of ns and memory subsystems.
Ns creates child cgroup on every process fork and as a result
processes in a container are not assigned in a cgroup for
domain (e.g., libvirt/lxc/test1/). For example, libvirt_lxc
and init (or somewhat specified in XML) are assigned into
libvirt/lxc/test1/8839/ and libvirt/lxc/test1/8839/8849/,
respectively. On the other hand, memory subsystem accounts
memory usage within a group of processes by default, i.e.,
it does not take any child (and descendant) groups into
account. With the two features, virsh dominfo which just
checks memory usage of a cgroup for domain always returns
zero because the cgroup has no process.
Setting memory.use_hierarchy of a group allows to account
(and limit) memory usage of every descendant groups of the group.
By setting it of a cgroup for domain, we can get proper memory
usage of lxc with ns subsystem enabled. (To be exact, the
setting is required only when memory and ns subsystems are
enabled at the same time, e.g., mount -t cgroup none /cgroup.)
As same as normal directories, a cgroup cannot be removed if it
contains sub groups. This patch changes virCgroupRemove to remove
all descendant groups (subdirectories) of a target group before
removing the target group.
The handling is required when we run lxc with ns subsystem of cgroup.
Ns subsystem automatically creates child cgroups on every process
forks, but unfortunately the groups are not removed on process exits,
so we have to remove them by ourselves.
With this patch, such child (and descendant) groups are surely removed
at lxc shutdown, i.e., lxcVmCleanup which calls virCgroupRemove.
http://bugzilla.redhat.com/601143, part 1 - document existing
behavior. Ever since Mar 2010 (commit ced154cb), the use of
'attach-disk' or 'attach-device' to change cdrom/floppy media has been
documented but deprecated, but the replacement to use 'update-device'
was not documented.
* tools/virsh.c (cmdAttachInterface, cmdAttachDisk): Fix bad error
message.
* tools/virsh.pod (attach-device, attach-disk): Refer to
update-device for cdrom and floppy behavior.
(update-device): Add documentation.
add iptables rules to allow TFTP from the virtual network if <tftp>
element is defined in the network definition.
Fedora bz#580215
* src/network/bridge_driver.c: open UDP port 69 for TFTP traffic if
tftproot is defined
We already use the '-nodefaults' command line arg with QEMU to stop
it adding any default devices to guests. Unfortunately, QEMU will
load global config files from /etc/qemu that may also add default
devices. These aren't blocked by '-nodefaults', so we need to also
add the '-nodefconfig' arg to prevent that.
Unfortunately these global config files are also used to define
custom CPU models. So in blocking global hardware device addition
we also block definitions of new CPU models. Libvirt doesn't know
about these custom CPU models though, so it would never make use
of them anyway. Thus blocking them via -nodefconfig isn't a show
stopping problem. We would need to expand libvirt's own CPU model
XML database to support these instead.
* src/qemu/qemu_conf.c: Add '-nodefconfig' if available
* tests/qemuxml2argvdata/: Add '-nodefconfig' to all data files which
have '-nodefaults' present
The current code pattern requires that callers of qemuMonitorClose
check for the return value == 0, and if so, set priv->mon = NULL
and release the reference held on the associated virDomainObjPtr
The change d84bb6d6a3 violated that
requirement, meaning that priv->mon never gets set to NULL, and
a reference count is leaked on virDomainObjPtr.
This design was a bad one, so remove the need to check the return
valueof qemuMonitorClose(). Instead allow registration of a
callback that's invoked just when the last reference on qemuMonitorPtr
is released.
Finally there was a potential reference leak in qemuConnectMonitor
in the failure path.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add a destroy
callback invoked from qemuMonitorFree
* src/qemu/qemu_driver.c: Use the destroy callback to release the
reference on virDomainObjPtr when the monitor is freed. Fix other
potential reference count leak in connecting to monitor
Before issuing monitor commands it is neccessary to check whether
the guest is still running. Most places use virDomainIsActive()
correctly, but a few relied on 'priv->mon != NULL'. In theory
these should be equivalent, but the release of the last reference
count on priv->mon can be delayed a small amount of time until
the event handler is finally deregistered. A further ref counting
bug also means that priv->mon might be never released. In such a
case, code could mistakenly issue a monitor command and wait for
a response that will never arrive, effectively leaving the QEMU
driver waiting on virCondWait() forever..
To protect against these possibilities, make sure all code uses
virDomainIsActive(), not 'priv->mon != NULL'
* src/qemu/qemu_driver.c: Replace 'priv->mon != NULL' with
calls to 'priv->mon != NULL'()
If there is no driver for a URI we report
"no hypervisor driver available"
This is bad because not all virt drivers are hypervisors (ie container
based virt).
If there is no driver support for an API we report
"this function is not supported by the hypervisor"
This is bad for the same reason, and additionally because it is
also used for the network, interface & storage drivers.
* src/util/virterror.c: Improve error messages
Running virsh while having /var/lib/libvirt/libvirt-guests file open
makes SELinux emit messages about preventing virsh from reading that
file. Since virsh doesn't really want to read anything, it's better to
run it with /dev/null on stdin to prevent those messages.
Following Daniel Berrange's multiple helpful suggestions for improving
this patch and introducing another driver interface, I now wrote the
below patch where the nwfilter driver registers the functions to
instantiate and teardown the nwfilters with a function in
conf/domain_nwfilter.c called virDomainConfNWFilterRegister. Previous
helper functions that were called from qemu_driver.c and qemu_conf.c
were move into conf/domain_nwfilter.h with slight renaming done for
consistency. Those functions now call the function expored by
domain_nwfilter.c, which in turn call the functions of the new driver
interface, if available.
- Fix documentation for virGetStorageVol: it has 'key' argument instead
of 'uuid'.
- Remove TODO comment from virReleaseStorageVol: we use volume key as an
identifier instead of UUID.
- Print human-readable UUID string in debug message in virReleaseSecret.
Per-connection hashes for domains, networks, storage pools and network
filter pools were indexed by names which was not the best choice. UUIDs
are better identifiers, so lets use them.
According to docs/formatdomain.html.in, "The boot element can be
repeated multiple times to setup a priority list of boot devices to try
in turn." The Relax-NG schema required / allowed exactly one entry.
Signed-off-by: Philipp Hahn <hahn@univention.de>
If VM startup fails early enough (can't find a referenced USB device),
libvirtd will crash trying to clear the VNC port bit, since port = 0,
which overflows us out of the bitmap bounds.
Fix this by being more defensive in the bitmap operations, and only
clearing a previously set VNC port.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Probably a copy-paste-bug in python/libvirt-override-api.xml:
virStorageVolGetInfo() extracts information about a "storage volume",
not the "storage pool" as virStoragePoolGetInfo() does.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Followup to https://bugzilla.redhat.com/show_bug.cgi?id=599091,
commit 20206a4b, to reduce disk waste in padding.
* src/qemu/qemu_monitor.h (QEMU_MONITOR_MIGRATE_TO_FILE_BS): Drop
back to 4k.
(QEMU_MONITOR_MIGRATE_TO_FILE_TRANSFER_SIZE): New macro.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Update comment.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextMigrateToFile): Use
two invocations of dd to output non-aligned large blocks.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMigrateToFile):
Likewise.
This patch adds an optional XML attribute to a nwfilter rule to give the user control over whether the rule is supposed to be using the iptables state match or not. A rule may now look like shown in the XML below with the statematch attribute either having value '0' or 'false' (case-insensitive).
[...]
<rule action='accept' direction='in' statematch='false'>
<tcp srcmacaddr='1:2:3:4:5:6'
srcipaddr='10.1.2.3' srcipmask='32'
dscp='33'
srcportstart='20' srcportend='21'
dstportstart='100' dstportend='1111'/>
</rule>
[...]
I am also extending the nwfilter schema and add this attribute to a test case.
This patch adds the persistence status (yes/no) to the output of the virsh
dominfo and pool-info commands. This patch also adds the autostart status
to the output of the virsh pool-info command.
Red Hat BZ for this:
https://bugzilla.redhat.com/show_bug.cgi?id=603696
Use virBuffer* API to conditionally keep the portion of the command
line specific to HMC, so that IVM can work.
Signed-off-by: Eric Blake <eblake@redhat.com>
This patch works around a recent extension of the netlink driver I had made use of when building the netlink messages. Unfortunately older kernels don't accept IFLA_IFNAME + name of interface as a replacement for the interface's index, so this patch now gets the interface index ifindex if it's not provided (ifindex <= 0).
Presently the vol-key command only supports being provided with
a volume path.
This patch adds support for providing it with a pool and volume
identifier pair as well.
virsh # vol-key --pool <pool-name-or-uuid> <vol-name-or-path>
Justin Clift reported a problem with adding virStoragePoolIsPersistent
to virsh's pool-info command, resulting in a strange problem. Here's
an example:
virsh # pool-create-as images_dir3 dir - - - - "/home/images2"
Pool images_dir3 created
virsh # pool-info images_dir3
Name: images_dir3
UUID: 90301885-94eb-4ca7-14c2-f30b25a29a36
State: running
Capacity: 395.20 GB
Allocation: 30.88 GB
Available: 364.33 GB
virsh # pool-destroy images_dir3
Pool images_dir3 destroyed
At this point the images_dir3 pool should be gone (because it was
transient) and we should be able to create a new pool with the same name:
virsh # pool-create-as images_dir3 dir - - - - "/home/images2"
Pool images_dir3 created
virsh # pool-info images_dir3
Name: images_dir3
UUID: 90301885-94eb-4ca7-14c2-f30b25a29a36
error: Storage pool not found
The new pool got the same UUID as the first one, but we didn't specify
one. libvirt should have picked a random UUID, but it didn't.
It turned out that virStoragePoolIsPersistent leaks a reference to the
storage pool object (actually remoteDispatchStoragePoolIsPersistent does).
As a result, pool-destroy doesn't remove the virStoragePool for the
"images_dir3" pool from the virConnectPtr's storagePools hash on libvirtd's
side. Then the second pool-create-as get's the stale virStoragePool object
associated with the "images_dir3" name. But this object has the old UUID.
This commit ensures that all get_nonnull_* and make_nonnull_* calls for
libvirt objects are matched properly with vir*Free calls. This fixes the
reference leaks and the reported problem.
All remoteDispatch*IsActive and remoteDispatch*IsPersistent functions were
affected. But also remoteDispatchDomainMigrateFinish2 was affected in the
success path. I wonder why that didn't surface earlier. Probably because
domainMigrateFinish2 is executed on the destination host and in the common
case this connection is opened especially for the migration and gets closed
after the migration is done. So there was no chance to run into a problem
because of the leaked reference.
Make 'start --paused' mirror 'create --paused'.
* tools/virsh.c (cmdStart): Use new virDomainCreateWithFlags API
when needed.
* tools/virsh.pod (start): Document --paused.
Match earlier change for qemu pause support with virDomainCreateXML.
* src/qemu/qemu_driver.c (qemudDomainObjStart): Add parameter; all
callers changed.
(qemudDomainStartWithFlags): Implement flag support.
Define the wire format for the new virDomainCreateWithFlags
API, and implement client and server side of marshaling code.
* daemon/remote.c (remoteDispatchDomainCreateWithFlags): Add
server side dispatch for virDomainCreateWithFlags.
* src/remote/remote_driver.c (remoteDomainCreateWithFlags)
(remote_driver): Client side serialization.
* src/remote/remote_protocol.x
(remote_domain_create_with_flags_args)
(remote_domain_create_with_flags_ret)
(REMOTE_PROC_DOMAIN_CREATE_WITH_FLAGS): Define wire format.
* daemon/remote_dispatch_args.h: Regenerate.
* daemon/remote_dispatch_prototypes.h: Likewise.
* daemon/remote_dispatch_table.h: Likewise.
* src/remote/remote_protocol.c: Likewise.
* src/remote/remote_protocol.h: Likewise.
* src/remote_protocol-structs: Likewise.
Daniel's patch works with gcc and CFLAGS containing -O (the
autoconf default), but fails with non-gcc or with other
CFLAGS (such as -g), since c-ctype.h declares c_isdigit as
a macro only for certain compilation settings.
* src/Makefile.am (libvirt_parthelper_LDFLAGS): Add gnulib
library, for when c_isdigit is not a macro.
* src/storage/parthelper.c (main): Avoid out-of-bounds
dereference, noticed by Jim Meyering.
Disks with a trailing digit in their path (eg /dev/loop0 or
/dev/dm0) have an extra 'p' appended before the partition
number (eg, to form /dev/loop0p1 not /dev/loop01). Fix the
partition lookup to append this extra 'p' when required
* src/storage/parthelper.c: Add a 'p' before partition
number if required
Otherwise, a malicious packet could cause a DoS via spurious
out-of-memory failure.
* src/uml/uml_driver.c (umlMonitorCommand): Validate that incoming
data is reliable before using it to allocate/dereference memory.
Don't report bogus errno on short read.
Reported by Jim Meyering.
This patch adds two new parameters to the vol-create-as command:
--backing-vol <volume-name-or-key-or-path>
--backing-vol-format <format-of-backing-vol>
virsh # vol-create-as guest_images_lvm snapvol1 5G --backing-vol \
rhel6vm1lun1
Vol snapvol1 created
virsh # vol-create-as image_dir qcow2snap2 5G --format qcow2 \
--backing-vol imagevol1.qcow2 \
--backing-vol-format qcow2
Vol qcow2snap2 created
Additionally, the virsh man page update fixes incorrect snapshot
parameters that were included in my prior bulk volume command patch.
On Fedora 13 with sufficient mingw32-* packages installed, running
./autobuild.sh failed to cross-compile to mingw because
mingw32-pthreads installed a broken <pthread.h>. With that
issue fixed, the build still failed due to use of O_SYNC.
Meanwhile, recent .spec.in changes got out of sync.
* bootstrap.conf (gnulib_modules): Add fcntl-h, for O_SYNC.
* .gnulib: Update to latest, to work around buggy pthreads-win32
library.
* bootstrap: Import latest from gnulib.
* mingw32-libvirt.spec.in: Distribute new file.