The use of 'stat' in nodeDeviceVportCreateDelete is only to check
if the file exists or not, it's a bit overkill, and safe to replace
with the wrapper of access(2) (virFileExists).
"open_wwn_file" in node_device_linux_sysfs.c is redundant, on one
hand it duplicates work of virFileReadAll, on the other hand, it's
waste to use a function for it, as there is no other users of it.
So I don't see why the file opening work cannot be done in
"read_wwn_linux".
"read_wwn_linux" can be abstracted as an util function. As what all
it does is to read the sysfs entry.
So this patch removes "open_wwn_file", and abstract "read_wwn_linux"
as an util function "virReadFCHost" (a more general name, because
after changes, it can read each of the fc_host entry now).
* src/util/virutil.h: (Declare virReadFCHost)
* src/util/virutil.c: (Implement virReadFCHost)
* src/node_device/node_device_linux_sysfs.c: (Remove open_wwn_file,
and read_wwn_linux)
src/node_device/node_device_driver.h: (Remove the declaration of
read_wwn_linux, and the related macros)
src/libvirt_private.syms: (Export virReadFCHost)
VIR_CONNECT_LIST_NODE_DEVICES_CAP_FC_HOST to filter the FC HBA,
and VIR_CONNECT_LIST_NODE_DEVICES_CAP_VPORTS to filter the FC HBA
which supports vport.
Guess it was created for the fc_host and vports_ops capabilities
purpose, but there is enum virNodeDevScsiHostCapFlags for them,
and enum virNodeDevHBACapType is unused, and actually both
VIR_ENUM_DECL and VIR_ENUM_IMPL use the wrong enum name
"virNodeDevHBACap".
For a root filesystem with type=file or type=block, the LXC
container was forgetting to actually mount it, before doing
the pivot root step.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the lxc controller sets up the devpts instance on
$rootfsdef->src, but this only works if $rootfsdef is using
type=mount. To support type=block or type=file for the root
filesystem, we must use /var/lib/libvirt/lxc/$NAME.devpts
for the temporary devpts mount in the controller
Instead of using /var/lib/libvirt/lxc/$NAME for the FUSE
filesystem, use /var/lib/libvirt/lxc/$NAME.fuse. This allows
room for other temporary mounts in the same directory
Some of the LXC callbacks did not lock the virDomainObjPtr
instance. This caused transient errors like
error: Failed to start domain busy-mount
error: cannot rename file '/var/run/libvirt/lxc/busy-mount.xml.new' as '/var/run/libvirt/lxc/busy-mount.xml': No such file or directory
as 2 threads tried to update the status file concurrently
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virDomainGetRootFilesystem was only returning filesystems
with type=mount. This is bogus - any type of filesystem is
valid as the root, if dst=/.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Noticed that parsing bond interface XML containing the miimon element
fails
<interface type="bond" name="bond0">
...
<bond mode="active-backup">
<miimon freq="100" carrier="netif"/>
...
</bond>
</interface>
This configuration does not contain the optional updelay and downdelay
attributes, but parsing will fail due to returning the result of
virXPathULong (a -1 when the attribute doesn't exist) from
virInterfaceDefParseBond after examining the updelay attribute.
While fixing this bug, cleanup the function to use virXPathInt instead
of virXPathULong, and store the result directly instead of using a tmp
variable. Using virXPathInt actually fixes a potential silent
truncation bug noted by Eric Blake.
Also, there is no cleanup in the error label. Remove the label,
returning failure where failure occurs and success if the end of the
function is reached.
The 'nodeset' variable was never initialized, causing a later
VIR_FREE(nodeset) to free uninitialized memory.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This does nothing more than adding the new device and capability.
The device is present since QEMU 1.2.0.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A better way to do this would be to use a configuration file like
[iscsi "target-name"]
user = name
password = pwd
and pass it via -readconfig. This would remove the username and password
from the "ps" output. For now, however, keep this solution.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Only sheepdog actually required it in the code, and we can use 7000 as the
default---the same value that QEMU uses for the simple "sheepdog:VOLUME"
syntax. With this change, the schema can be fixed to allow no port.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
libiscsi provides a userspace iSCSI initiator.
The main advantage over the kernel initiator is that it is very
easy to provide different initiator names for VMs on the same host.
Thus libiscsi supports usage of persistent reservations in the VM,
which otherwise would only be possible with NPIV.
libiscsi uses "iscsi" as the scheme, not "iscsi+tcp". We can change
this in the tests (while remaining backwards-compatible manner, because
QEMU uses TCP as the default transport for both Gluster and NBD).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some code mistakenly called virIdentityOnceInit directly
instead of virIdentityInitialize(). This meant that one-time
initializer was run many times with predictably bad results.
The VIR_ERR_NO_SUPPORT error code is reserved for cases where an
API is not implemented in a driver. It definitely should not be
used when an API execution fails due to unsupported operation.
The recent commit moved some of the use of libnuma out of the
driver code, and into src/util/. It did not, however, update
libvirt_util.la to link against libnuma. This caused linkage
failure with virt-aa-helper, since nothing else caused libnuma
to be pulled onto the linker command line.
The fix removes all reference to NUMACTL_LIBS/CFLAGS from the
various modules in src/Makefile.am and just adds them to the
libvirt_util.la module, which everything else depends on.
Technically a build-breaker fix, but wanted to wait for feedback
on this
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We should record the new disk src in the shared disk table for
updating disk (CD-ROM or Floppy) API. Fortunately, we only allow
to update the disk source now, otherwise we might also want to
set the unpriv_sgio setting.
This plumbs in the XML description of iSCSI shares. The next patches
will add support for the libiscsi userspace initiator.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that VCPU number are removed from qemu_monitor_text.c
(commit cc78d7ba), VCPU string checking also should be removed.
Report-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
but libvirt is built with --with-selinux. In this case getpeercon
returns ENOPROTOOPT so don't return an error in that case but simply
don't set seccon.
The virNetSocket & virIdentity classes accidentally got some
conditionals using HAVE_SELINUX instead of WITH_SELINUX.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Intend to reduce the redundant code,use virNumaSetupMemoryPolicy
to replace virLXCControllerSetupNUMAPolicy and
qemuProcessInitNumaMemoryPolicy.
This patch also moves the numa related codes to the
file virnuma.c and virnuma.h
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Allow lxc using the advisory nodeset from querying numad,
this means if user doesn't specify the numa nodes that
the lxc domain should assign to, libvirt will automatically
bind the lxc domain to the advisory nodeset which queried from
numad.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
qemuGetNumadAdvice will be used by LXC driver, rename
it to virNumaGetAutoPlacementAdvice and move it to virnuma.c
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
The "dtb" option sets the filename for the device tree.
If without this option support, "-dtb file" will be converted into
<qemu:commandline> in domain XML file.
For example, '-dtb /media/ram/test.dtb' will be converted into
<qemu:commandline>
<qemu:arg value='-dtb'/>
<qemu:arg value='/media/ram/test.dtb'/>
</qemu:commandline>
This is not very friendly.
This patchset add special <dtb> tag like <kernel> and <initrd>
which is easier for user to write domain XML file.
<os>
<type arch='ppc' machine='ppce500v2'>hvm</type>
<kernel>/media/ram/uImage</kernel>
<initrd>/media/ram/ramdisk</initrd>
<dtb>/media/ram/test.dtb</dtb>
<cmdline>root=/dev/ram rw console=ttyS0,115200</cmdline>
</os>
Signed-off-by: Eric Blake <eblake@redhat.com>
When building with --without-libvirtd and udev support is detected we
will fail to build with the following error:
node_device/node_device_udev.c:1608:37: error: unknown type name
'virStateInhibitCallback'
virStorageBackendRBDRefreshPool() first allocates an array big enough
to hold 1024 names, then calls rbd_list(), which returns ERANGE if the
array isn't big enough. When that happens, the VIR_ALLOC_N is called
again with a larger size. Unfortunately, the original array isn't
freed before allocating a new one.
The LXC controller is closing loop devices as soon as the
container has started. This is fine if the loop device
was setup as a mounted filesystem, but if we're just passing
through the loop device as a disk, nothing else is keeping
it open. Thus we must keep the loop device FDs open for as
long the libvirt_lxc process is running.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the LXC controller creates the cgroup, configures the
resources and adds the task all in one go. This is not sufficiently
flexible for the forthcoming NBD integration. We need to make sure
the NBD process gets into the right cgroup immediately, but we can
not have limits (in particular the device ACL) applied at the point
where we start qemu-nbd. So create a virLXCCgroupCreate method
which creates the cgroup and adds the current task to be called
early, and leave virLXCCgroupSetup to only do resource config.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When dispatching an RPC API call, setup the current identity to
hold the identity of the network client associated with the
RPC message being dispatched. The setting is thread-local, so
only affects the API call in this thread
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add APIs which allow creation of a virIdentity from the info
associated with a virNetServerClientPtr instance. This is done
based on the results of client authentication processes like
TLS, x509, SASL, SO_PEERCRED
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If no user identity is available, some operations may wish to
use the system identity. ie the identity of the current process
itself. Add an API to get such an identity.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To allow any internal API to get the current identity, add APIs
to associate a virIdentityPtr with the current thread, via a
thread local
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce a local object virIdentity for managing security
attributes used to form a client application's identity.
Instances of this object are intended to be used as if they
were immutable, once created & populated with attributes
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A socket object has various pieces of security data associated
with it, such as the SELinux context, the SASL username and
the x509 distinguished name. Add new APIs to virNetServerClient
and related modules to access this data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 82d5fe5437
qemu: check backing chains even when cgroup is omitted
added backing file checks just before the code that removes optional
disks if they are not present. However, the backing chain code fails in
case the disk file does not exist, which makes qemuProcessStart fail
regardless on configured startupPolicy.
Note that startupPolicy implementation is still wrong after this patch
since it only check the first file in a possible chain. It should rather
check the complete backing chain. But this is an existing limitation
that can be solved later. After all, startupPolicy is most useful for
CDROM images and they won't make use of backing files in most cases.
QEMU 1.3 and newer support an alternative URI-based syntax to specify
the location of an NBD server. Libvirt can keep on using the old
syntax in general, but only the URI syntax supports IPv6 addresses.
The URI syntax also supports relative paths to Unix sockets. These
should never be used but aren't explicitly blocked either by the parser,
so support it just in case.
The URI syntax is intentionally compatible with Gluster's, and the
code can be reused.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
This reuses the XML format that was introduced for Gluster.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
These are supported by nbd-server and by the NBD server that QEMU
embeds for live image access.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Move the code to an external function, and structure it to prepare
the addition of new features in the next few patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
We've already scrubbed for comparisons of 'uid_t == -1' (which fail
on platforms where uid_t is a u16), but another one snuck in.
* src/util/virutil.c (virSetUIDGIDWithCaps): Correct uid comparison.
* cfg.mk (sc_prohibit_risky_id_promotion): New rule.
QEMU added -drive in 2007, and NBD in 2008. Both appeared first in
release 0.10.0. Thus the code to support network disks without -drive
is dead, and in fact it incorrectly escapes commas. Drop it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When getting CPUs' information, it assumes that CPU indexes
are not contiguous. But for ppc64 platform, CPU indexes are not
contiguous because SMT is needed to be disabled, so CPU information
is not right on ppc64 and vpuinfo, vcpupin can't work corretly.
This patch is to remove the assumption to be compatible with ppc64.
Test:
4 vcpus are assigned to one VM and execute vcpuinfo command.
Without patch: There is only one vcpu informaion can be listed.
With patch: All vcpus' information can be listed correctly.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
This patch adds auditing of resources used by Virtio RNG devices. Only
resources on the local filesystems are audited.
The audit logs look like:
For the 'random' backend:
type=VIRT_RESOURCE msg=audit(1363099126.643:31): pid=995252 uid=0 auid=4294967295 ses=4294967295 msg='virt=kvm resrc=rng reason=start vm="qcow-test" uuid=118733ed-b658-3e22-a2cb-4fe5cb3ddf79 old-rng="?" new-rng="/dev/random": exe="/home/pipo/libvirt/daemon/.libs/libvirtd" hostname=? addr=? terminal=pts/0 res=success'
For local character device source:
type=VIRT_RESOURCE msg=audit(1363100164.240:96): pid=995252 uid=0 auid=4294967295 ses=4294967295 msg='virt=kvm resrc=rng reason=start vm="qcow-test" uuid=118733ed-b658-3e22-a2cb-4fe5cb3ddf79 old-rng="?" new-rng="/tmp/unix.sock": exe="/home/pipo/libvirt/daemon/.libs/libvirtd" hostname=? addr=? terminal=pts/0 res=success'
Newer versions of QEMU support virtio-scsi and virtio-rng devices
on the virtio-s390 and ccw buses. Adding capability detection,
address assignment and command line generation for that.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
QEMU_CAPS_VIRTIO_SCSI_PCI implies that virtio-scsi is only supported
for the PCI bus, which is not the case. Remove the _PCI suffix.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
My commit 7a2e845a86 (and its
prerequisites) managed to effectively ignore the
clear_emulator_capabilities setting in qemu.conf (visible in the code
as the VIR_EXEC_CLEAR_CAPS flag when qemu is being exec'ed), with the
result that the capabilities are always cleared regardless of the
qemu.conf setting. This patch fixes it by passing the flag through to
virSetUIDGIDWithCaps(), which uses it to decide whether or not to
clear existing capabilities before adding in those that were
requested.
Note that the existing capabilities are *always* cleared if the new
process is going to run as non-root, since the whole point of running
non-root is to have the capabilities removed (it's still possible to
maintain individual capabilities as needed using the capBits argument
though).
Multi-head QXL support is so useful that distros have started to
backport it to qemu earlier than 1.2. After discussion with
Alon Levy, we determined that the existence of the qxl-vga.surfaces
property is a reliable indicator of whether '-device qxl-vga' works,
or whether we have to stick to the older '-vga qxl'. I'm leaving
in the existing check for QEMU_CAPS_DEVICE_VIDEO_PRIMARY tied to
qemu 1.2 and newer (in case qemu is built without qxl support),
but for those distros that backport qxl, this additional capability
check will allow the correct command line for both RHEL 6.3 (which
lacks the feature) and RHEL 6.4 (where qemu still claims to be
version 0.12.2.x, but has backported multi-head qxl).
* src/qemu/qemu_capabilities.c (virQEMUCapsObjectPropsQxlVga): New
property test.
(virQEMUCapsExtractDeviceStr): Probe for backport of new
capability to qemu earlier than 1.2.
* tests/qemuhelpdata/qemu-kvm-1.2.0-device: Update test.
* tests/qemuhelpdata/qemu-1.2.0-device: Likewise.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel62-beta-device:
Likewise.
The src/lxc/lxc_*_dispatch.h files only had deps on the
RPC generator script & the XDR definition file. So when
the Makefile.am args passed to the generator were change,
the disaptch code was not re-generated. This caused a
build failure
CC libvirt_lxc-lxc_controller.o
lxc/lxc_controller.c: In function 'virLXCControllerSetupServer':
lxc/lxc_controller.c:718:47: error: 'virLXCMonitorProcs' undeclared (first use in this function)
lxc/lxc_controller.c:718:47: note: each undeclared identifier is reported only once for each function it appears in
lxc/lxc_controller.c:719:47: error: 'virLXCMonitorNProcs' undeclared (first use in this function)
make[3]: *** [libvirt_lxc-lxc_controller.o] Error 1
For added fun, the generated files were not listed in
CLEANFILES, so only a 'git clean -f' would fix the build
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The naming used in the RPC protocols for the LXC monitor and
lock daemon confused the script used to generate systemtap
helper functions. Rename the LXC monitor protocol symbols to
reduce confusion. Adapt the gensystemtap.pl script to cope
with the LXC monitor / lock daemon naming conversions.
This has no functional impact on RPC wire protocol, since
names are only used in the C layer
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When converting to virObject, the probes on the 'Free' functions
were removed on the basis that there is a probe on virObjectFree
that suffices. This puts a burden on people writing probe scripts
to identify which object is being dispose. This adds back probes
in the 'Dispose' functions and updates the rpc monitor systemtap
example to use them
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Normally libvirtd should run with a SELinux label
system_u:system_r:virtd_t:s0-s0:c0.c1023
If a user manually runs libvirtd though, it is sometimes
possible to get into a situation where it is running
system_u:system_r:init_t:s0
The SELinux security driver isn't expecting this and can't
parse the security label since it lacks the ':c0.c1023' part
causing it to complain
internal error Cannot parse sensitivity level in s0
This updates the parser to cope with this, so if no category
is present, libvirtd will hardcode the equivalent of c0.c1023.
Now this won't work if SELinux is in Enforcing mode, but that's
not an issue, because the user can only get into this problem
if in Permissive mode. This means they can now start VMs in
Permissive mode without hitting that parsing error
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Pull the code which parses the current process MCS range
out of virSecuritySELinuxMCSFind and into a new method
virSecuritySELinuxMCSGetProcessRange.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The body of the loop in virSecuritySELinuxMCSFind would
directly 'return NULL' on OOM, instead of jumping to the
cleanup label. This caused a leak of several local vars.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If an LXC domain failed to start because of a bogus SELinux
label, virLXCProcessStart would call VIR_CLOSE(0) by mistake.
This is because the code which initializes the member of the
ttyFDs array to -1 got moved too far away from the place where
the array is first allocated.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When opening a stream to a device which is a TTY, that device
may become the controlling TTY of libvirtd, if libvirtd was
daemonized. This in turn means when the other end of the stream
closes, libvirtd gets SIGHUP, causing it to reload its config.
Prevent this by forcing O_NOCTTY on all streams that are opened
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Qemu's implementation of virtio RNG supports rate limiting of the
entropy used. This patch exposes the option to tune this functionality.
This patch is based on qemu commit 904d6f588063fb5ad2b61998acdf1e73fb4
The rate limiting is exported in the XML as:
<devices>
...
<rng model='virtio'>
<rate bytes='123' period='1234'/>
<backend model='random'/>
</rng>
...
In debug mode, the bug failed to start vm
error: Failed to start domain rhel5u9
error: internal error Out of space while reading console log output:
...
We didn't yet expose the virtio device attach and detach functionality
for s390 domains as the device hotplug was very limited with the old
virtio-s390 bus. With the CCW bus there's full hotplug support for
virtio devices in QEMU, so we are adding this to libvirt too.
Since the virtio hotplug isn't limited to PCI anymore, we change the
function names from xxxPCIyyy to xxxVirtioyyy, where we handle all
three virtio bus types.
Signed-off-by: J.B. Joret <jb@linux.vnet.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
This commit adds the QEMU driver support for CCW addresses. The
current QEMU only allows virtio devices to be attached to the
CCW bus. We named the new capability indicating that support
QEMU_CAPS_VIRTIO_CCW accordingly.
The fact that CCW devices can only be assigned to domains with a
machine type of s390-ccw-virtio requires a few extra checks for
machine type in qemu_command.c on top of querying
QEMU_CAPS_VIRTIO_{CCW|S390}.
The majority of the new functions deals with CCW address generation
and management.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Add necessary handling code for the new s390 CCW address type to
virDomainDeviceInfo. Further, introduce memory management, XML
parsing, output formatting and range validation for the new
virDomainDeviceCCWAddress type.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
In some startup failure modes, the fuse thread may get itself
wedged. This will cause the entire libvirt_lxc process to
hang trying to the join the thread. There is no compelling
reason to wait for the thread to exit if the whole process
is exiting, so just daemonize the fuse thread instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A number of symbols are only present when GNUTLS is enabled.
Thus we must use a separate libvirt_gnutls.syms file for them
instead of libvirt_private.syms
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virDomainLxcEnterNamespace method mistakenly uses
virCheckFlags, which returns immediately instead of
virCheckFlagsGoto which jumps to the error cleanup
patch where there is a virDispatchError call
The virDomainGetSecurityLabel method is currently (mistakenly)
showing the label of the libvirt_lxc process:
...snip...
Security model: selinux
Security DOI: 0
Security label: system_u:system_r:virtd_t:s0-s0:c0.c1023 (permissive)
when it should be showing the init process label
...snip...
Security model: selinux
Security DOI: 0
Security label: system_u:system_r:svirt_t:s0:c724,c995 (permissive)
Add a new virDomainLxcEnterSecurityLabel() function as a
counterpart to virDomainLxcEnterNamespaces(), which can
change the current calling process to have a new security
context. This call runs client side, not in libvirtd
so we can't use the security driver infrastructure.
When entering a namespace, the process spawned from virsh
will default to running with the security label of virsh.
The actual desired behaviour is to run with the security
label of the container most of the time. So this changes
virsh lxc-enter-namespace command to invoke the
virDomainLxcEnterSecurityLabel method.
The current behaviour is:
LABEL PID TTY TIME CMD
system_u:system_r:svirt_lxc_net_t:s0:c0.c1023 1 pts/0 00:00:00 systemd
system_u:system_r:svirt_lxc_net_t:s0:c0.c1023 3 pts/1 00:00:00 sh
system_u:system_r:svirt_lxc_net_t:s0:c0.c1023 24 ? 00:00:00 systemd-journal
system_u:system_r:svirt_lxc_net_t:s0:c0.c1023 29 ? 00:00:00 dhclient
staff_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 47 ? 00:00:00 ps
Note the ps command is running as unconfined_t, After this patch,
The new behaviour is this:
virsh -c lxc:/// lxc-enter-namespace dan -- /bin/ps -eZ
LABEL PID TTY TIME CMD
system_u:system_r:svirt_lxc_net_t:s0:c0.c1023 1 pts/0 00:00:00 systemd
system_u:system_r:svirt_lxc_net_t:s0:c0.c1023 3 pts/1 00:00:00 sh
system_u:system_r:svirt_lxc_net_t:s0:c0.c1023 24 ? 00:00:00 systemd-journal
system_u:system_r:svirt_lxc_net_t:s0:c0.c1023 32 ? 00:00:00 dhclient
system_u:system_r:svirt_lxc_net_t:s0:c0.c1023 38 ? 00:00:00 ps
The '--noseclabel' flag can be used to skip security labelling.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
With our recent patch (1715c83b5f) we thrive to get the correct
number of maximal VCPUs. However, we are using a constant from
linux/kvm.h which may be not defined in every distro. Hence, we
should guard usage of the constant with ifdef preprocessor
directive. This was introduced in kernel:
commit 8c3ba334f8588e1d5099f8602cf01897720e0eca
Author: Sasha Levin <levinsasha928@gmail.com>
Date: Mon Jul 18 17:17:15 2011 +0300
KVM: x86: Raise the hard VCPU count limit
The patch raises the hard limit of VCPU count to 254.
This will allow developers to easily work on scalability
and will allow users to test high VCPU setups easily without
patching the kernel.
To prevent possible issues with current setups, KVM_CAP_NR_VCPUS
now returns the recommended VCPU limit (which is still 64) - this
should be a safe value for everybody, while a new KVM_CAP_MAX_VCPUS
returns the hard limit which is now 254.
$ git desc 8c3ba334f
v3.1-rc7-48-g8c3ba33
The virCaps structure gathered a ton of irrelevant data over time that.
The original reason is that it was propagated to the XML parser
functions.
This patch aims to create a new data structure virDomainXMLConf that
will contain immutable data that are used by the XML parser. This will
allow two things we need:
1) Get rid of the stuff from virCaps
2) Allow us to add callbacks to check and add driver specific stuff
after domain XML is parsed.
This first attempt removes pointers to private data allocation functions
to this new structure and update all callers and function that require
them.
Currently the server determines whether authentication of clients
is complete, by checking whether an identity is set. This patch
removes that lame hack and replaces it with an explicit method
for changing the client auth code
* daemon/remote.c: Update for new APis
* src/libvirt_private.syms, src/rpc/virnetserverclient.c,
src/rpc/virnetserverclient.h: Remove virNetServerClientGetIdentity
and virNetServerClientSetIdentity, adding a new method
virNetServerClientSetAuth.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a virThreadCancel function. This functional is inherently
dangerous and not something we want to use in general, but
integration with SELinux requires that we provide this stub.
We leave out any Win32 impl to discourage further use and
because obviously SELinux isn't enabled on Win32
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When setting up disks with loop devices for LXC, one of the
switch cases was missing a 'break' causing it to fallthrough
to an error condition.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
At least one caller may call qemuSharedDiskEntryFree with NULL as the
first argument. Let's make the function similar to other *Free functions
and do nothing in such case.
otherwise we crash with
#0 virUSBDeviceListFind (list=0x0, dev=dev@entry=0x8193d70) at util/virusb.c:526
#1 0xb1a4995b in virLXCPrepareHostdevUSBDevices (driver=driver@entry=0x815d9a0, name=0x815dbf8 "debian-700267", list=list@entry=0x81d8f08) at lxc/lxc_hostdev.c:88
#2 0xb1a49fce in virLXCPrepareHostUSBDevices (def=0x8193af8, driver=0x815d9a0) at lxc/lxc_hostdev.c:261
#3 virLXCPrepareHostDevices (driver=driver@entry=0x815d9a0, def=0x8193af8) at lxc/lxc_hostdev.c:328
#4 0xb1a4c5b1 in virLXCProcessStart (conn=0x817d3f8, driver=driver@entry=0x815d9a0, vm=vm@entry=0x8190908, autoDestroy=autoDestroy@entry=false, reason=reason@entry=VIR_DOMAIN_RUNNING_BOOTED)
at lxc/lxc_process.c:1068
#5 0xb1a57e00 in lxcDomainStartWithFlags (dom=dom@entry=0x815e460, flags=flags@entry=0) at lxc/lxc_driver.c:1014
#6 0xb1a57fc3 in lxcDomainStart (dom=0x815e460) at lxc/lxc_driver.c:1046
#7 0xb79c8375 in virDomainCreate (domain=domain@entry=0x815e460) at libvirt.c:8450
#8 0x08078959 in remoteDispatchDomainCreate (args=0x81920a0, rerr=0xb65c21d0, client=0xb0d00490, server=<optimized out>, msg=<optimized out>) at remote_dispatch.h:1066
#9 remoteDispatchDomainCreateHelper (server=0x80c4928, client=0xb0d00490, msg=0xb0d005b0, rerr=0xb65c21d0, args=0x81920a0, ret=0x815d208) at remote_dispatch.h:1044
#10 0xb7a36901 in virNetServerProgramDispatchCall (msg=0xb0d005b0, client=0xb0d00490, server=0x80c4928, prog=0x80c6438) at rpc/virnetserverprogram.c:432
#11 virNetServerProgramDispatch (prog=0x80c6438, server=server@entry=0x80c4928, client=0xb0d00490, msg=0xb0d005b0) at rpc/virnetserverprogram.c:305
#12 0xb7a300a7 in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x80c4928) at rpc/virnetserver.c:162
#13 virNetServerHandleJob (jobOpaque=0xb0d00510, opaque=0x80c4928) at rpc/virnetserver.c:183
#14 0xb7924f98 in virThreadPoolWorker (opaque=opaque@entry=0x80a94b0) at util/virthreadpool.c:144
#15 0xb7924515 in virThreadHelper (data=0x80a9440) at util/virthreadpthread.c:161
#16 0xb7887c39 in start_thread (arg=0xb65c2b70) at pthread_create.c:304
#17 0xb77eb78e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
when adding a domain with a usb device. This is Debian bug
http://bugs.debian.org/700267
By current implementation, network inbound is required in order
to use 'floor' for guaranteeing minimal throughput. This is so,
because we want user to tell us the maximal throughput of the
network instead of finding out ourselves (and detect bogus values
in case of virtual interfaces). However, we are nowadays
requiring this only on documentation level. So if user starts a
domain with 'floor' set on one its interfaces, we silently ignore
the setting. We should error out instead.
'virsh capabilities' will now include a new <memory> element
per <cell> of the topology, as in:
<topology>
<cells num='2'>
<cell id='0'>
<memory unit='KiB'>12572412</memory>
<cpus num='12'>
...
</cell>
Signed-off-by: Eric Blake <eblake@redhat.com>
This fixes the build on Debian Wheezy which otherwise fails with:
CC libvirt_driver_lxc_impl_la-lxc_process.lo
lxc/lxc_process.c: In function 'virLXCProcessGetNsInode':
lxc/lxc_process.c:648:5: error: implicit declaration of function 'stat' [-Werror=implicit-function-declaration]
lxc/lxc_process.c:648:5: error: nested extern declaration of 'stat' [-Werror=nested-externs]
cc1: all warnings being treated as errors
When there are two concurrent threads, we may dereference a NULL
pointer, even though it has been checked before:
1. Thread1: starts executing qemuDomainBlockStatsFlags() with nparams != 0.
It finds given disk and successfully pass check for disk->info.alias
not being NULL.
2. Thread2: starts executing qemuDomainDetachDeviceFlags() on the very same
disk as Thread1 is working on.
3. Thread1: gets to qemuDomainObjBeginJob() where it sets a job on a
domain.
4. Thread2: also tries to set a job. However, we are not guaranteed which
thread wins. So assume it's Thread2 who can continue.
5. Thread2: does the actual detach and frees disk->info.alias
6. Thread2: quits the job
7. Thread1: now successfully acquires the job, and accesses a NULL pointer.
Rename AppArmorSetImageFDLabel to AppArmorSetFDLabel which could
be used as a common function for *ALL* fd relabelling in Linux.
In apparmor profile for specific vm with uuid cdbebdfa-1d6d-65c3-be0f-fd74b978a773
Path: /etc/apparmor.d/libvirt/libvirt-cdbebdfa-1d6d-65c3-be0f-fd74b978a773.files
The last line is for the tapfd relabelling.
# DO NOT EDIT THIS FILE DIRECTLY. IT IS MANAGED BY LIBVIRT.
"/var/log/libvirt/**/rhel6qcow2.log" w,
"/var/lib/libvirt/**/rhel6qcow2.monitor" rw,
"/var/run/libvirt/**/rhel6qcow2.pid" rwk,
"/run/libvirt/**/rhel6qcow2.pid" rwk,
"/var/run/libvirt/**/*.tunnelmigrate.dest.rhel6qcow2" rw,
"/run/libvirt/**/*.tunnelmigrate.dest.rhel6qcow2" rw,
"/var/lib/libvirt/images/rhel6u3qcow2.img" rw,
"/dev/tap45" rw,
By using a loopback device, disks backed by plain files can
be made available to LXC containers. We make no attempt to
auto-detect format if <driver type="raw"/> is not set,
instead we unconditionally treat that as meaning raw. This
is to avoid the security issues inherent with format
auto-detection
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current QEMU code for skipping log messages only skips over
'debug' message, switch to virLogProbablyLogMessage to make sure
it skips over all of them
Currently we rely on a VIR_ERROR message being logged by the
virRaiseError function to report LXC startup errors. This gives
the right message, but is rather ugly and can be truncated
if lots of log messages are written. Change the LXC controller
to explicitly print any virErrorPtr message to stderr. Then
change the driver to skip over anything that looks like a log
message.
The result is that this
error: Failed to start domain busy
error: internal error guest failed to start: 2013-03-04 19:46:42.846+0000: 1734: info : libvirt version: 1.0.2
2013-03-04 19:46:42.846+0000: 1734: error : virFileLoopDeviceAssociate:600 : Unable to open /root/disk.raw: No such file or directory
changes to
error: Failed to start domain busy
error: internal error guest failed to start: Unable to open /root/disk.raw: No such file or directory
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When reading log output from QEMU/LXC we need to skip over any
libvirt log messages. Currently the QEMU driver checks for a
fixed string, but this is better done with a regex. Add a method
virLogProbablyLogMessage to do a regex check
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In the LXC container startup code when switching stdio
streams, we call VIR_FORCE_CLOSE on all FDs. This triggers
a huge number of warnings, but we don't see them because
stdio is closed at this point. strace() however shows them
which can confuse people debugging the code. Switch to
VIR_MASS_CLOSE to avoid this
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virNetDevSetupControlFull function was protected by a
conditional on SIOCBRADDBR, which is bogus since it does
not use that symbol. Update the conditionals around all
callers to do stricter checks to ensure we always build
succesfully
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The RHEL4 vintage header files do not define GET_VLAN_VID_CMD.
Conditionally define it in our source, since the kernel can
raise a runtime error if it isn't supported
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The loop.h on RHEL4 is broken and cannot be imported. We already
detect this in configure as a side-effect of looking for whether
LO_FLAGS_AUTOCLEAR is available. We protected the impl with
HAVE_DECL_LO_FLAGS_AUTOCLEAR, but not the header import
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To avoid a clash with daemon() libc API, rename the
'daemon' param in the header file to 'binary'. The
source file already uses the name 'binary'.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
On RHEL-4 vintage one of the header files is polluted causing a
clash between the clone() syscall and the 'clone' parameter in
a libvirt driver API
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit 0df3e89 only touched the header, but the .c file had the
same shadowing potential.
* src/util/viralloc.c (virDeleteElementsN): s/remove/toremove/ to
match the header.
Code that validates the whitelist for the RNG device filename
didn't account for fact that filename may be NULL. This led
to a NULL reference crash. This wasn't caught since the test
suite was not covering this XML syntax
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Resolves the following valgrind error from qemuxml2argvtest:
==20393== 5 bytes in 1 blocks are definitely lost in loss record 2 of 60
==20393== at 0x4A0883C: malloc (vg_replace_malloc.c:270)
==20393== by 0x38D690A167: __vasprintf_chk (in /usr/lib64/libc-2.16.so)
==20393== by 0x4CB0D97: virVasprintf (stdio2.h:210)
==20393== by 0x4CB0E53: virAsprintf (virutil.c:2017)
==20393== by 0x428DC5: qemuAssignDeviceAliases (qemu_command.c:791)
==20393== by 0x41DF93: testCompareXMLToArgvHelper (qemuxml2argvtest.c:151)
==20393== by 0x41F53F: virtTestRun (testutils.c:157)
==20393== by 0x41DA9B: mymain (qemuxml2argvtest.c:885)
==20393== by 0x41FB7A: virtTestMain (testutils.c:719)
==20393== by 0x38D6821A04: (below main) (in /usr/lib64/libc-2.16.so)
==20393==
From qemu_command.c/line 791:
if (def->rng) {
if (virAsprintf(&def->rng->info.alias, "rng%d", 0) < 0)
goto no_memory;
}
This patch plugs two memory leaks, removes some useless and confusing
constructs and renames renames "cleanup" label as "error" since it is
only used for error path rather then being common for both success and
error paths.
1. The virObjectLock() call was unconditional, but Unlock was conditional
on vm being valid. Removed the check
2. A call to virDomainEventNewFromObj() isn't guaranteed to return an
event - that check needs to be made prior to libxlDomainEventQueue()
of the event. Did not add libxlDriverLock/Unlock around the call since
some callers already have lock taken
3. Need to initialize fd = -1 in libxlDoDomainSave() since we can jump
to cleanup before it's set.
4. Missing break;'s in libxlDomainModifyDeviceFlags() for case
LIBXL_DEVICE_UPDATE. The default: case would report an error
A value which is equal to a integer maximum such as LLONG_MAX is
a valid integer value.
The patch fix the following error:
1, virsh memtune vm --swap-hard-limit -1
2, virsh start vm
In debug mode, it shows error like:
virScaleInteger:1813 : numerical overflow:\
value too large: 9007199254740991KiB
This patch adds proper error reporting if parsing of cputune parameters
fails due to incorrect values provided by the user. Previously no errors
were reported in such a case and the failure was silently ignored.
Make the iterator function usable in the next patches. Also refactor
some parts to avoid strcmp if not necessary.
This commit tweaks and shadows the change that was done in commit
babe7dada0 and was needed after the
support for multiple console devices was added. Historically the first
<console> element is alias for the <serial> device.
There is some controversy[1] on the qemu list on whether qemu should
have ever allowed arbitrary file name passthrough, or whether it
should be restricted to JUST /dev/random and /dev/hwrng. It is
always easier to add support for additional filenames than it is
to remove support for something once released, so this patch
restricts libvirt 1.0.3 (where the virtio-random backend was first
supported) to just the two uncontroversial names, letting us defer
to a later date any decision on whether supporting arbitrary files
makes sense. Additionally, since qemu 1.4 does NOT support
/dev/fdset/nnn fd passthrough for the backend, limiting to just
two known names means that we don't get tempted to try fd
passthrough where it won't work.
[1]https://lists.gnu.org/archive/html/qemu-devel/2013-03/threads.html#00023
* src/conf/domain_conf.c (virDomainRNGDefParseXML): Only allow
/dev/random and /dev/hwrng.
* docs/schemas/domaincommon.rng: Flag invalid files.
* docs/formatdomain.html.in (elementsRng): Document this.
* tests/qemuxml2argvdata/qemuxml2argv-virtio-rng-random.args:
Update test to match.
* tests/qemuxml2argvdata/qemuxml2argv-virtio-rng-random.xml:
Likewise.
19c6ad9a (qemu: Refactor qemuDomainSetMemoryParameters) introduced
a new macro, VIR_GET_LIMIT_PARAMETER(PARAM, VALUE). But if statement
in the macro is not correct and so set_XXXX flags are set to false
in the wrong. As a result, libvirt ignores all memtune parameters.
This patch fixes the conditional expression to work correctly.
Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
BZ:https://bugzilla.redhat.com/show_bug.cgi?id=912021
Without error handler set, virDefaultErrorFunc will be called, the
error message is prefixed with "libvir:". It become a little better
by using prefix "libvirt:" when working with upper application.
For example:
1, stop libvirtd daemon
2, run virt-top.
libvir: XML-RPC error : Failed to connect \
socket to '/var/run/libvirt/libvirt-sock-ro': \
No such file or directory
libvirt: VIR_ERR_SYSTEM_ERROR: VIR_FROM_RPC: \
Failed to connect socket to '/var/run/libvirt/libvirt-sock-ro': \
No such file or directory
Commit f506a4c1 changed virSetUIDGID() to be a noop
when uid/gid are -1, while it used to be a noop when
they are <= 0.
The changes in this commit broke creating new VMs in GNOME Boxes
as qemuDomainCheckDiskPresence gets called during domain creation/startup,
which in turn calls virFileAccessibleAs which fails after calling
virSetUIDGID(0, 0) (Boxes uses session libvirtd). virSetUIDGID is called with
(0, 0) as these are the default user/group values in virQEMUDriverConfig
for session libvirtd.
This commit changes virQEMUDriverConfigNew to use -1 as the unpriviledged
uid/gid. I've also looked at the various places where cfg->user is used,
and they all seem to handle -1 correctly.
Currently, after we removed the qemu driver lock, it may happen
that two or more threads will start up a machine with macvlan and
race over virNetDevMacVLanCreateWithVPortProfile(). However,
there's a racy section in which we are generating a sequence of
possible device names and detecting if they exits. If we found
one which doesn't we try to create a device with that name.
However, the other thread is doing just the same. Assume it will
succeed and we must therefore fail. If this happens more than 5
times (which in massive parallel startup surely will) we return
-1 without any error reported. This patch is a simple hack to
both of these problems. It introduces a mutex, so only one thread
will enter the section, and if it runs out of possibilities,
error is reported. Moreover, the number of retries is raised to 20.
This reverts the hack done in
commit 568a6cda27
Author: Jiri Denemark <jdenemar@redhat.com>
Date: Fri Feb 15 15:11:47 2013 +0100
qemu: Avoid deadlock in autodestroy
since we now have a fix which avoids the deadlock scenario
entirely
There is a lock ordering problem in the QEMU close callback
APIs.
When starting a guest we have a lock on the VM. We then
set a autodestroy callback, which acquires a lock on the
close callbacks.
When running auto-destroy, we obtain a lock on the close
callbacks, then run each callbacks - which obtains a lock
on the VM.
This causes deadlock if anyone tries to start a VM, while
autodestroy is taking place.
The fix is to do autodestroy in 2 phases. First obtain
all the callbacks and remove them from the list under
the close callback lock. Then invoke each callback
from outside the close callback lock.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When the auto-destroy callback runs it is supposed to return
NULL if the virDomainObjPtr is no longer valid. It was not
doing this for transient guests, so we tried to virObjectUnlock
a mutex which had been freed. This often led to a crash.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
qemuProcessStart expects to be run with a job already set and every
caller except for qemuMigrationPrepareAny use it correctly. This bug can
be observed in libvirtd logs during incoming migration as
warning : qemuDomainObjEnterMonitorInternal:979 : This thread seems
to be the async job owner; entering monitor without asking for a
nested job is dangerous
With the apparmor security driver enabled, qemu instances fail
to start
# grep ^security_driver /etc/libvirt/qemu.conf
security_driver = "apparmor"
# virsh start test-kvm
error: Failed to start domain test-kvm
error: internal error security label already defined for VM
The model field of virSecurityLabelDef object is always populated
by virDomainDefGetSecurityLabelDef(), so remove the check for a
NULL model when verifying if a label is already defined for the
instance.
Checking for a NULL model and populating it later in
AppArmorGenSecurityLabel() has been left in the code to be
consistent with virSecuritySELinuxGenSecurityLabel().
As pointed out in
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1034661
The sentence
"The function of PCI device addresses must less than 8"
does not quite make sense. Update that to read
"The function of PCI device addresses must be less than 8"
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Currently, qemuDomainShutdownFlags() chooses the agent method of
shutdown whenever the agent is configured. However, this
assumption is not enough as the guest agent may be unresponsive
at the moment. So unless guest agent method has been explicitly
requested, we should fall back to the ACPI method.
The unitialized local variable qemuVersion can cause an random value
to be returned for the hypervisor version, observable with virsh version.
Introduced by commit b46f7f4a0b
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
disk->src is still used for disks->hosts->name, do not free it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The QEMU driver has a list of devices nodes that are whitelisted
for all guests. The kernel has recently started returning an
error if you try to whitelist a device which does not exist.
This causes a warning in libvirt logs and an audit error for
any missing devices. eg
2013-02-27 16:08:26.515+0000: 29625: warning : virDomainAuditCgroup:451 : success=no virt=kvm resrc=cgroup reason=allow vm="vm031714" uuid=9d8f1de0-44f4-a0b1-7d50-e41ee6cd897b cgroup="/sys/fs/cgroup/devices/libvirt/qemu/vm031714/" class=path path=/dev/kqemu rdev=? acl=rw
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The code for putting the emulator threads in a separate cgroup
would spam the logs with warnings
2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 3
2013-02-27 16:08:26.731+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 4
2013-02-27 16:08:26.732+0000: 29624: warning : virCgroupMoveTask:887 : no vm cgroup in controller 6
This is because it has only created child cgroups for 3 of the
controllers, but was trying to move the processes from all the
controllers. The fix is to only try to move threads in the
controllers we actually created. Also remove the warning and
make it return a hard error to avoid such lazy callers in the
future.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virQEMUCloseCallbacksRunOne method was passing a uuid string
to virDomainObjListFindByUUID, when it actually expected to get
a raw uuid buffer. This was not caught by the compiler because
the method was using a 'void *uuid' instead of first casting
it to the expected type.
This regression was accidentally caused by refactoring in
commit 568a6cda27
Author: Jiri Denemark <jdenemar@redhat.com>
Date: Fri Feb 15 15:11:47 2013 +0100
qemu: Avoid deadlock in autodestroy
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=896092 mentions that
qemu 1.4 and earlier only accept a simple start-stop range for
the cpu=... argument of -numa. Libvirt would attempt to use
-numa cpu=1,3 for a disjoint range, which did not work as intended.
Upstream qemu will be adding a new syntax for disjoint cpu ranges
in 1.5; but the design for that syntax is still under discussion
at the time of this patch. So for libvirt 1.0.3, it is safest to
just reject attempts to build an invalid qemu command line; in the
future, we can add a capability bit and translate to the final
accepted design for selecting a disjoint cpu range in numa.
* src/qemu/qemu_command.c (qemuBuildNumaArgStr): Reject disjoint
ranges.
This reverts commit 383ebc4694.
We decided the xml for this feature needed more thought to make sure
we are doing it the best way, in particular wrt option values that
have multiple items.
These two flags in fact are mutually exclusive. Requesting them both
doesn't make any sense regardless of hypervisor driver. Hence, we have
to make it within libvirt.c file instead of fixing it in each driver.
Eugene Marcotte reported that if gcrypt-devel (a prereq of
gnutls-devel) is not present, then compilation fails due to
an unconditional use of <gcrypt.h>.
* src/libvirt.c (includes): Properly guard use of gcrypt.h.
This reverts commit 0bbbd42c30.
The design for this feature is not complete, and may change the
name of the 'schid' attribute. Revert requested by Viktor Mihajlovski.
The parallels storage driver declared some loop variables
inside the for(;;). This is not allowed by libvirt coding
standards
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This change tried to fix a crash with changing CDROM media but
failed to actually do so
commit d0172d2b1b
Author: Osier Yang <jyang@redhat.com>
Date: Tue Feb 19 20:27:45 2013 +0800
qemu: Remove the shared disk entry if the operation is ejecting or updating
It was still accessing disk->src, when the entire 'disk' object
has been free'd already. Even if it weren't free'd, accessing
the 'src' value of virDomainDiskDef is not allowed without
first validating disk->type is file or block. Just remove the
broken code entirely.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
VIR_ERR_NO_CONNECT already contains "no connection driver available".
This patch changes:
no connection driver available for No connection for URI hello
to:
no connection driver available for hello
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=851413
Currently we call virSetDeviceUnprivSGIO with val == 0 if a block device
has an sgio attribute. But for sgio='filtered', we know that a
kernel with no unpriv_sgio support will always behave as the user
wanted. In this case, there is no need to call the function and
report a (bogus) error.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If virCondInit fails (okay, so that's unlikely), then we end up
attempting a virObjectUnlock() on the cleanup path, even though
we don't hold a lock. This is not guaranteed to be safe. While
at it, I noticed a couple places where we were referencing mon->fd
outside locks.
* src/qemu/qemu_monitor.c (qemuMonitorOpenInternal): Minimize lock
duration. mon->watch doesn't need clean up on error.
(qemuMonitorGetBlockExtent, qemuMonitorBlockResize): Don't
dereference fd outside of lock.
I built without yajl support, and noticed a strange failure message
in qemumonitorjsontest:
2013-02-22 16:12:37.503+0000: 19812: error : virJSONValueToString:1119 : internal error No JSON parser implementation is available
2013-02-22 16:12:37.503+0000: 19812: error : qemuMonitorJSONCommandWithFd:253 : out of memory
While a later patch will fix the test to skip when json is not present,
this patch avoids overriding the more useful error message from
virJSONValueToString returning NULL.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCommandWithFd):
Don't override message.
(qemuMonitorJSONCheckError): Don't print NULL.
* src/qemu/qemu_agent.c (qemuAgentCommand): Don't override message.
(qemuAgentCheckError): Don't print NULL.
(qemuAgentArbitraryCommand): Properly fail on OOM.
The new TypedParam helper APIs allow to simplify this function
significantly.
This patch integrates the fix in 75e5bec97b
by correctly ordering the setting functions instead of reordering the
parameters.
The bridge device was showing the vnet devices created for the domains
as connected to the bridge. libvirt should only show host devices when
trying to get the interface definition rather than the domain devices as
well.
uid_t and gid_t are opaque types, ranging from s32 to u32 to u64.
Explicitly cast the magic -1 to the appropriate type.
Signed-off-by: Philipp Hahn <hahn@univention.de>
uid_t and gid_t are opaque types, ranging from s32 to u32 to u64.
Explicitly cast them to unsigned int for printing.
Signed-off-by: Philipp Hahn <hahn@univention.de>
The uid_t|gid_t values are explicitly casted to "unsigned long", but the
printf() still used "%d", which is for signed values.
Change the format to "%u".
Signed-off-by: Philipp Hahn <hahn@univention.de>
This patch adds a new capability bit QEMU_CAPS_OBJECT_RNG_EGD and code
to support the egd backend for the VirtIO RNG device.
The device is added by 3 qemu command line options:
-chardev socket,id=charrng0,host=1.2.3.4,port=1234 (communication
backend)
-object rng-egd,chardev=charrng0,id=rng0 (RNG protocol client)
-device virtio-rng-pci,rng=rng0,bus=pci.0,addr=0x4 (the RNG device)
This patch implements support for the virtio-rng-pci device and the
rng-random backend in qemu.
Two capabilities bits are added to track support for those:
QEMU_CAPS_DEVICE_VIRTIO_RNG - for the device support and
QEMU_CAPS_OBJECT_RNG_RANDOM - for the backend support.
qemu is invoked with these additional parameters if the device is
enabled:
-object rng-random,id=rng0,filename=/test/phile (to add the backend)
-device virtio-rng-pci,rng=rng0,bus=pci.0,addr=0x4 (to add the device)
This patch adds basic configuration support for the RNG device
supporting the virtio model with the "random" and "egd" backend types as
described in the schema in the previous patch.
This patch adds a fake switch statement to force the compiler to warn
after a new device type was added. This should remind the contributor to
add the new device also to this iterator function.
Originally, only a host name was used to associate a
DHCPv6 request with a specific IPv6 address. Further testing
demonstrates that this is an unreliable method and, instead,
a client-id or DUID needs to be used. According to DHCPv6
standards, this id can be a duid-LLT, duid-LL, or duid-UUID
even though dnsmasq will accept almost any text string.
Although validity checking of a specified string makes sure it is
hexadecimal notation with bytes separated by colons, there is no
rigorous check to make sure it meets the standard.
Documentation and schemas have been updated.
Signed-off-by: Gene Czarcinski <gene@czarc.net>
Signed-off-by: Laine Stump <laine@laine.org>
For really old qemu-img binaries which do not support specifying
the format of the backing file, display a DEBUG message instead of
INFO that this can't be done.
This function does the source part of NBD magic. It
invokes drive-mirror on each non shared and RW disk with
a source and wait till the mirroring process completes.
When it does we can proceed with migration.
Currently, an active waiting is done: every 500ms libvirt
asks qemu if block-job is finished or not. However, once
the job finishes, qemu doesn't report its progress so we
can only assume if the job finished successfully or not.
The better solution would be to listen to the event which
is sent as soon as the job finishes. The event does
contain the result of job.
We need to start NBD server and feed it with all non-<shared/>,
RW and source-full disks. Moreover, with new virPortAllocator we
must ensure the borrowed port for NBD server will be returned if
either migration completes or qemu process is torn down.
This will be used with new migration scheme.
This patch creates basically just monitor stub
functions. Wiring them into something useful
is done in later patches.
This will be used with new migration scheme.
This patch creates basically just monitor stub
functions. Wiring them into something useful
is done in later patches.
This migration cookie is meant for two purposes. The first is to be sent
in begin phase from source to destination to let it know we support new
implementation of VIR_MIGRATE_NON_SHARED_{DISK,INC} so destination can
start NBD server. Then, the second purpose is, destination can let us
know, on which port the NBD server is running.
This patch adds support for a new <option>-Tag in the <dhcp> block of
network configs, based on a subset of the fifth proposal by Laine
Stump in the mailing list discussion at
https://www.redhat.com/archives/libvir-list/2012-November/msg01054.html.
Any such defined option will result in a dhcp-option=<number>,"<value>"
statement in the generated dnsmasq configuration file.
Currently, DHCP options can be specified by number only and there is
no whitelisting or blacklisting of option numbers, which should
probably be added.
Signed-off-by: Pieter Hollants <pieter@hollants.com>
Signed-off-by: Laine Stump <laine@laine.org>
The bfree and blocks fields are supposed to be in units of frsize. We were
calculating capacity correctly using those units, but the available
calculation was using bsize instead. Most file systems report these as the
same value specifically because many programs are buggy, but that is no
reason to rely on that behavior, or to behave inconsistently.
This bug has been present since e266ded (2008) and aa296e6c, when the code
was originally introduced (the latter via cut and paste).
Signed-off-by: Sage Weil <sage@newdream.net>
On FreeBSD, I got a 'make check' failure:
GEN check-symsorting
Symbol block at ./libvirt_atomic.syms:4: viratomic.h not found
* src/Makefile.am (SYM_FILES): New define.
(check-symsorting): Check on all symfiles, even when not used.
* src/libvirt_atomic.syms: Fix offender.
As a side effect, this also fixes reporting disk migration process.
It was added to memory migration progress, which was wrong. Disk
progress has dedicated fields in virDomainJobInfo structure.
remoteDeserializeTypedParameters can now be called with either
preallocated params array (size of which is announced by nparams) or it
can allocate params array according to the number of parameters received
from the server.
Refactored the interface device type identification to make it more
clear about the operations. Add support for udev devtype to detect
VLANs on Linux 3.7 and newer. Move VLAN detection based on device
name to fallback case.
Mechanical move to break up udevIfaceGetIfaceDef() into different
helpers for each of the interface types to hopefully make the code
easier to follow. This moves the vlan code to
udevIfaceGetIfaceDefVlan().
Based on feedback from Laine Stump, improve a number of the error
handling cases to report the issue to the user instead of not generating
data or giving vague errors. Added the bridge device name to every error
message as well to make it clear which bridge failed.
Mechanical move to break up udevIfaceGetIfaceDef() into different
helpers for each of the interface types to hopefully make the code
easier to follow. This moves the bridge code to
udevIfaceGetIfaceDefBridge().
https://bugzilla.redhat.com/show_bug.cgi?id=896685 points out
a regression caused by commit 38c4a9c - libvirt only labels
the backing chain if the backing chain cache is populated, but
the code to populate the cache was only conditionally performed
if cgroup labeling was necessary.
* src/qemu/qemu_cgroup.c (qemuSetupCgroup): Hoist cache setup...
* src/qemu/qemu_process.c (qemuProcessStart): ...earlier into
caller, where it is now unconditional.
To avoid having to hold the qemu driver lock while iterating through
close callbacks and calling them. This fixes a real deadlock when a
domain which is being migrated from another host gets autodestoyed as a
result of broken connection to the other host.
The max value of number of cpus to compute(id) should not
be equal or greater than max cpu number.
The bug ocurrs when id value is equal to max cpu number which
leads to the off-by-one error in the following for loop.
# virsh cpu-stats guest --start 1
error: Failed to virDomainGetCPUStats()
error: internal error cpuacct parse error
Don't allow interval to be > MAX_INT/1000 in virKeepAliveStart()
Guard against possible overflow in virKeepAliveTimeout() by setting the
timeout to be MAX_INT/1000 since the math following will multiply it by 1000.
Currently, if lzop decompression binary produces a warning, it
doesn't exit with zero status but 2 instead. Terrifying, but
true. However, warnings may be ignored using '--ignore-warn'
command line argument. Moreover, in which case, the exit status
will be zero.
For both AttachDevice and UpdateDevice APIs, if the disk device
is 'cdrom' or 'floppy', the operations could be ejecting, updating,
and inserting. For either ejecting or updating, the shared disk
entry of the original disk src has to be removed, because it's
not useful anymore.
And since the original disk def will be changed, new disk def passed
as argument will be free'ed in qemuDomainChangeEjectableMedia, so
we need to copy the orignal disk def before
qemuDomainChangeEjectableMedia, to use it for qemuRemoveSharedDisk.
The disk def could be free'ed by qemuDomainChangeEjectableMedia,
which can thus cause crash if we reference the disk pointer. On
the other hand, we have to remove the added shared disk entry from
the table on error codepath.
The hash entry is changed from "ref" to {ref, @domains}. With this, the
caller can simply call qemuRemoveSharedDisk, without afraid of removing
the entry belongs to other domains. qemuProcessStart will obviously
benifit from it on error codepath (which calls qemuProcessStop to do
the cleanup).
Based on moving various checking into qemuAddSharedDisk, this
avoids the caller using it in wrong ways. Also this adds two
new checking for qemuCheckSharedDisk (disk device not 'lun'
and kernel doesn't support unpriv_sgio simply returns 0).
Automating a sorting check is the only way to ensure we don't
regress. Suggested by Dan Berrange.
* src/check-symsorting.pl (check_sorting): Add a parameter,
validate that groups are in order, and that files exist.
* src/Makefile.am (check-symsorting): Adjust caller.
* src/libvirt_private.syms: Fix typo.
* src/libvirt_linux.syms: Fix file name.
* src/libvirt_vmx.syms: Likewise.
* src/libvirt_xenxs.syms: Likewise.
* src/libvirt_sasl.syms: Likewise.
* src/libvirt_libssh2.syms: Likewise.
* src/libvirt_esx.syms: Mention file name.
* src/libvirt_openvz.syms: Likewise.
Due to "feature"/"features" nasty typo, any features marked as mandatory
by one side of a migration are silently considered optional by the other
side. The following is the code that formats mandatory features in
migration cookie:
for (i = 0 ; i < QEMU_MIGRATION_COOKIE_FLAG_LAST ; i++) {
if (mig->flagsMandatory & (1 << i))
virBufferAsprintf(buf, " <feature name='%s'/>\n",
qemuMigrationCookieFlagTypeToString(i));
}
Some functions were using virDomainDeviceInfo where virDevicePCIAddress
would suffice. Some were only using integers for slots and functions,
assuming the bus numbers are always 0.
Switch from virDomainDeviceInfoPtr to virDevicePCIAddressPtr:
qemuPCIAddressAsString
qemuDomainPCIAddressCheckSlot
qemuDomainPCIAddressReserveAddr
qemuDomainPCIAddressReleaseAddr
Switch from int slot to virDevicePCIAddressPtr:
qemuDomainPCIAddressReserveSlot
qemuDomainPCIAddressReleaseSlot
qemuDomainPCIAddressGetNextSlot
Deleted functions (they would take the same parameters
as ReserveAddr/ReleaseAddr do now.)
qemuDomainPCIAddressReserveFunction
qemuDomainPCIAddressReleaseFunction
Recent renames were not reflected into the comments of
libvirt_private.syms; furthermore, since we mix private headers from
several directories into this file, knowing where the file lives
can be helpful.
* src/libvirt_private.sym: Reflect recent names.
We pass over the address/port start/end values many times so we put
them in structs.
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Laine Stump <laine@laine.org>
Let users set the port range to be used for forward mode NAT:
...
<forward mode='nat'>
<nat>
<port start='1024' end='65535'/>
</nat>
</forward>
...
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Laine Stump <laine@laine.org>
Support setting which public ip to use for NAT via attribute
address in subelement <nat> in <forward>:
...
<forward mode='nat'>
<address start='1.2.3.4' end='1.2.3.10'/>
</forward>
...
This will construct an iptables line using:
'-j SNAT --to-source <start>-<end>'
instead of:
'-j MASQUERADE'
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Signed-off-by: Laine Stump <laine@laine.org>
We need to drop the server lock before calling virObjectUnlock(client)
since in case we had the last reference to the client, its dispose
callback would be called and that could possibly try to lock the server
and cause a deadlock. This is exactly what happens when there is only
one QEMU domain running and it is marked to be autodestroyed when the
connection dies. This results in qemuProcessAutoDestroy ->
qemuProcessStop -> virNetServerRemoveShutdownInhibition call sequence,
where the last function locks the server.
The function does not report any errors so there should be no need too
reset an existing error first. Moreover, virTypedParamsFree is mostly
called in cleanup phase where it has the potential to reset any useful
reported earlier.
Gcc lets you do:
int ATTRIBUTE_NONNULL(1) foo(void *param);
int foo(void *param) ATTRIBUTE_NONNULL(1);
int ATTRIBUTE_NONNULL(1) foo(void *param) { ... }
but chokes on:
int foo(void *param) ATTRIBUTE_NONNULL(1) { ... }
However, since commit eefb881, we have intentionally been disabling
ATTRIBUTE_NONNULL because of lame gcc handling of the attribute (that
is, gcc doesn't do decent warning reporting, then compiles code that
mysteriously fails if you break the contract of the attribute, which
is surprisingly easy to do), leaving it on only for Coverity (which
does a much better job of improved static analysis when the attribute
is present).
But completely eliding the macro makes it too easy to write code that
uses the fourth syntax option, if you aren't using Coverity. So this
patch forces us to avoid syntax errors, even when not using the
attribute under gcc. It also documents WHY we disable the warning
under gcc, rather than forcing you to find the commit log.
* src/internal.h (ATTRIBUTE_NONNULL): Expand to empty attribute,
rather than nothing, when on gcc.
The conversion to qemuCaps dropped the ability with qemu{,-kvm} 1.2 and
newer to set the lost tick policy for the PIT. While the
-no-kvm-pit-reinjection option is depreacated, it is still supported at
least through 1.4, it is better to not lose the functionality.
Coverity found the DACGenLabel was checking for mgr == NULL after a
possible dereference; however, in order to get into the function the
virSecurityManagerGenLabel would have already dereferenced sec_managers[i]
so the check was unnecessary. Same check is made in SELinuxGenSecurityLabel.
Between revision 65fb9d49 and before this patch, an upgrade of libvirt while
VMs are running and instantiating iptables filtering rules due to nwfilter
rules, may leave stray iptables rules behind when shutting VMs down.
Left-over iptables rules may look like this:
Chain FP-vnet0 (1 references)
target prot opt source destination
DROP tcp -- 0.0.0.0/0 0.0.0.0/0 tcp spt:122
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
[...]
Chain libvirt-out (1 references)
target prot opt source destination
FO-vnet0 all -- 0.0.0.0/0 0.0.0.0/0 [goto] PHYSDEV match --physdev-out vnet0
The reason is that the recent nwfilter code only removed filtering rules in
the libvirt-out chain that contain the --physdev-is-bridged parameter.
Older rules didn't match and were not removed.
Note that the user-defined chain FO-vnet0 could not be removed due to the
reference from the rule in libvirt-out.
Often the work around may be done through
service iptables restart
kill -SIGHUP $(pidof libvirtd)
This patch now also removes older libvirt versions' iptables rules.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
If you have a qcow2 file /path1/to/file pointed to by symlink
/path2/symlink, and pass qemu /path2/symlink, then qemu treats
a relative backing file in the qcow2 metadata as being relative
to /path2, not /path1/to. Yes, this means that it is possible
to create a qcow2 file where the choice of WHICH directory and
symlink you access its contents from will then determine WHICH
backing file (if any) you actually find; the results can be
rather screwy, but we have to match what qemu does.
Libvirt and qemu default to creating absolute backing file
names, so most users don't hit this. But at least VDSM uses
symlinks and relative backing names alongside the
--reuse-external flags to libvirt snapshot operations, with the
result that libvirt was failing to follow the intended chain of
backing files, and then backing files were not granted the
necessary sVirt permissions to be opened by qemu.
See https://bugzilla.redhat.com/show_bug.cgi?id=903248 for
more gory details. This fixes a regression introduced in
commit 8250783.
I tested this patch by creating the following chain:
ls /home/eblake/Downloads/Fedora.iso # raw file for base
cd /var/lib/libvirt/images
qemu-img create -f qcow2 \
-obacking_file=/home/eblake/Downloads/Fedora.iso,backing_fmt=raw one
mkdir sub
cd sub
ln -s ../one onelink
qemu-img create -f qcow2 \
-obacking_file=../sub/onelink,backing_fmt=qcow2 two
mv two ..
ln -s ../two twolink
qemu-img create -f qcow2 \
-obacking_file=../sub/twolink,backing_fmt=qcow2 three
mv three ..
ln -s ../three threelink
then pointing my domain at /var/lib/libvirt/images/sub/threelink.
Prior to this patch, I got complaints about missing backing
files; afterwards, I was able to verify that the backing chain
(and hence DAC and SELinux relabels) of the entire chain worked.
* src/util/virstoragefile.h (_virStorageFileMetadata): Add
directory member.
* src/util/virstoragefile.c (absolutePathFromBaseFile): Drop,
replaced by...
(virFindBackingFile): ...better function.
(virStorageFileGetMetadataInternal): Add an argument.
(virStorageFileGetMetadataFromFD, virStorageFileChainLookup)
(virStorageFileGetMetadata): Update callers.
Prior to this patch, we had the callchains:
external users
\-> virStorageFileGetMetadataFromFD
\-> virStorageFileGetMetadataFromBuf
virStorageFileGetMetadataRecurse
\-> virStorageFileGetMetadataFromFD
\-> virStorageFileGetMetadataFromBuf
However, a future patch wants to add an additional parameter to
the bottom of the chain, for use by virStorageFileGetMetadataRecurse,
without affecting existing external callers. Since there is only a
single caller of the internal function, we can repurpose it to fit
our needs, with this patch giving us:
external users
\-> virStorageFileGetMetadataFromFD
\-> virStorageFileGetMetadataInternal
virStorageFileGetMetadataRecurse /
\-> virStorageFileGetMetadataInternal
* src/util/virstoragefile.c (virStorageFileGetMetadataFromFD):
Move most of the guts...
(virStorageFileGetMetadataFromBuf): ...here, and rename...
(virStorageFileGetMetadataInternal): ...to this.
(virStorageFileGetMetadataRecurse): Use internal helper.
virStorageFileGetMetadataFromFD is the only caller of
virStorageFileGetMetadataFromBuf; and it doesn't care about the
difference between a return of 0 (total success) or 1
(metadata was inconsistent, but pointer was populated as best
as possible); only about a return of -1 (could not read metadata
or out of memory). Changing the return type, and normalizing
the variable names used, will make merging the functions easier
in the next commit.
* src/util/virstoragefile.c (virStorageFileGetMetadataFromBuf):
Change return value, and rename some variables.
(virStorageFileGetMetadataFromFD): Rename some variables.
No semantic change; done so the next patch doesn't need a forward
declaration of a static function.
* src/util/virstoragefile.c (virStorageFileProbeFormatFromBuf):
Hoist earlier.
More mingw build failures:
CCLD libvirt-lxc.la
/usr/lib64/gcc/i686-w64-mingw32/4.7.2/../../../../i686-w64-mingw32/bin/ld: cannot find libvirt_lxc.def: No such file or directory
CC virportallocatortest-virportallocatortest.o
../../tests/virportallocatortest.c: In function 'main':
../../tests/virportallocatortest.c:195:1: error: implicit declaration of function 'setenv' [-Werror=implicit-function-declaration]
* src/Makefile.am (GENERATED_SYM_FILES): Also generate
libvirt_lxc.def.
* bootstrap.conf (gnulib_modules): Import setenv.
CC libvirt_util_la-vircommand.lo
../../src/util/vircommand.c:2358:1: error: 'virCommandHandshakeChild' defined but not used [-Werror=unused-function]
The function is only implemented inside #ifndef WIN32.
* src/util/vircommand.c (virCommandHandshakeChild): Hoist earlier,
so that win32 build doesn't hit an unused forward declaration.
No need to use HAVE_REGEX_H - our use of gnulib guarantees that
the header exists and works, regardless of platform. Similarly,
we can unconditionally assume a compiling <sys/wait.h> (although
the mingw version of this header is not full-featured).
* src/storage/storage_backend.c: Drop useless conditional.
* tests/testutils.c: Likewise.