Implement the new API for sending signals to processes in a guest
for the LXC driver. Only support sending signals to the init
process for now, because
- The kernel does not appear to expose the mapping between
container PID numbers and host PID numbers anywhere in the
host OS namespace
- There is no race-free way to validate whether a host PID
corresponds to a process in a container.
* src/lxc/lxc_driver.c: Allow sending processes signals
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
we already have virtualize meminfo for container through fuse filesystem,
add function lxcContainerMountProcFuse to mount this meminfo file to
the container's /proc/meminfo.
So we can isolate container's /proc/meminfo from host now.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
with this patch,container's meminfo will be shown based on
containers' mem cgroup.
Right now,it's impossible to virtualize all values in meminfo,
I collect some values such as MemTotal,MemFree,Cached,Active,
Inactive,Active(anon),Inactive(anon),Active(file),Inactive(anon),
Active(file),Inactive(file),Unevictable,SwapTotal,SwapFree.
if I miss something, please let me know.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
this patch addes fuse support for libvirt lxc.
we can use fuse filesystem to generate sysinfo dynamically,
So we can isolate /proc/meminfo,cpuinfo and so on through
fuse filesystem.
we mount fuse filesystem for every container.
the mount name is libvirt,mount point is
localstatedir/run/libvirt/lxc/containername.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
When starting an LXC guest with a virNetwork based NIC device,
if the network was not active, the virNetworkPtr device would
be leaked
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When starting a container, newDef is initialized to a
copy of 'def', but when startup fails newDef is never
removed. This cause later attempts to use 'virDomainDefine'
to lose the new data being defined.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A mistaken initialization of 'ret' caused failure to create
macvtap devices to be ignored. The libvirt_lxc process
would later fail to start due to missing devices
Also make sure code checks '< 0' and not '!= 0' since only
-1 is considered an error condition
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the <interface> device did not contain any <target>
element, LXC would crash on a NULL pointer if starting
the container failed
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The LXC driver relies on use of cgroups to kill off LXC processes
in shutdown. If cgroups aren't available, we're unable to kill
off processes, so we must treat lack of cgroups as a fatal startup
error.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The code setting up LXC cgroups used an 'rc' variable both
for capturing the return value of methods it calls, and
its own return status. The result was that several failures
in setting up cgroups would actually result in success being
returned.
Use a separate 'ret' for tracking return value as per normal
code design in other parts of libvirt
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The initpid will be required long term to enable LXC to
implement various hotplug operations. Thus it needs to be
persisted in the domain status XML. LXC has not used the
domain status XML before, so this introduces use of the
helpers.
Currently the lxcContainerSetupMounts method uses the
virSecurityManagerPtr instance to obtain the mount options
string and then only passes the string down into methods
it calls. As functionality in LXC grows though, those
methods need to have direct access to the virSecurityManagerPtr
instance. So push the code down a level.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The impls of virSecurityManagerGetMountOptions had no way to
return errors, since the code was treating 'NULL' as a success
value. This is somewhat pointless, since the calling code did
not want NULL in the first place and has to translate it into
the empty string "". So change the code so that the impls can
return "" directly, allowing use of NULL for error reporting
once again
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When no security driver is specified libvirt_lxc segfaults as a debug
message tries to access security labels for the container that are not
present.
This problem was introduced in commit 6c3cf57d6c.
Early jumps to the cleanup label caused a crash of the libvirt_lxc
container helper as the cleanup section called
virLXCControllerDeleteInterfaces(ctrl) without checking the ctrl argument
for NULL. The argument was de-referenced soon after.
$ /usr/libexec/libvirt_lxc
/usr/libexec/libvirt_lxc: missing --name argument for configuration
Segmentation fault
The virLXCControllerClientCloseHook method was mistakenly
assuming that the private data associated with the network
client was the virLXCControllerPtr. In fact it was just a
dummy int, so we were derefencing a bogus struct. The
frequent result of this was that we would never quit, because
we tried to arm a non-existant timer.
Fix the code by removing the dummy private data and just
using the virLXCControllerPtr instance as private data
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the LXC driver logs audit messages when a container
is started or stopped. These audit messages, however, contain
the PID of the libvirt_lxc supervisor process. To enable
sysadmins to correlate with audit messages generated by
processes /inside/ the container, we need to include the
container init process PID.
We can't do this in the main 'start' audit message, since
the init PID is not available at that point. Instead we output
a completely new audit record, that lists both PIDs.
type=VIRT_CONTROL msg=audit(1353433750.071:363): pid=20180 uid=0 auid=501 ses=3 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 msg='virt=lxc op=init vm="busy" uuid=dda7b947-0846-1759-2873-0f375df7d7eb vm-pid=20371 init-pid=20372 exe="/home/berrange/src/virt/libvirt/daemon/.libs/lt-libvirtd" hostname=? addr=? terminal=pts/6 res=success'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The LXC controller code currently directly invokes the
libvirt main loop code. The problem is that this misses
the cleanup of virNetServerClient connections that
virNetServerRun takes care of.
The result is that when libvirtd is stopped, the
libvirt_lxc controller process gets stuck in a I/O loop.
When libvirtd is then started again, it fails to connect
to the controller and thus kills off the entire domain.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
For S390, the default console target type cannot be of type 'serial'.
It is necessary to at least interpret the 'arch' attribute
value of the os/type element to produce the correct default type.
Therefore we need to extend the signature of defaultConsoleTargetType
to account for architecture. As a consequence all the drivers
supporting this capability function must be updated.
Despite the amount of changed files, the only change in behavior is
that for S390 the default console target type will be 'virtio'.
N.B.: A more future-proof approach could be to to use hypervisor
specific capabilities to determine the best possible console type.
For instance one could add an opaque private data pointer to the
virCaps structure (in case of QEMU to hold capsCache) which could
then be passed to the defaultConsoleTargetType callback to determine
the console target type.
Seems to be however a bit overengineered for the use case...
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This needs to be done before the container starts. Turning
off the mknod capability is noticed by systemd, which will
no longer attempt to create device nodes.
This eliminates SELinux AVC messages and ugly failure messages in the journal.
Add two new APIs virNetServerClientNewPostExecRestart and
virNetServerClientPreExecRestart which allow a virNetServerClientPtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.
This includes serialization of the connected socket associated
with the client
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There was an inverted return value in lxcCgroupControllerActive().
The function assumes cgroups are active and do couple of checks
to prove that. If any of them fails, false is returned. Therefore,
at the end, after all checks are done we must return true, not false.
Many parts of virDomainDefPtr were using 'int' variables as
array length counts. Replace all these with size_t and update
various format strings & API signatures to adapt
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Depending on the scenario in which LXC containers exit, it is
possible for the EOF callback of the LXC monitor to deadlock
the driver.
#0 0x00000038a0a0de4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00000038a0a09ca6 in _L_lock_840 () from /lib64/libpthread.so.0
#2 0x00000038a0a09ba8 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007f4bd9579d55 in virMutexLock (m=<optimized out>) at util/threads-pthread.c:85
#4 0x00007f4bcacc7597 in lxcDriverLock (driver=0x7f4bc40c8290) at lxc/lxc_conf.h:81
#5 virLXCProcessMonitorEOFNotify (mon=<optimized out>, vm=0x7f4bb4000b00) at lxc/lxc_process.c:581
#6 0x00007f4bd9645c91 in virNetClientCloseLocked (client=client@entry=0x7f4bb4009e60)
at rpc/virnetclient.c:554
#7 0x00007f4bd96460f8 in virNetClientIOEventLoopPassTheBuck (thiscall=0x0, client=0x7f4bb4009e60)
at rpc/virnetclient.c:1306
#8 virNetClientIOEventLoopPassTheBuck (client=0x7f4bb4009e60, thiscall=0x0)
at rpc/virnetclient.c:1287
#9 0x00007f4bd96467a2 in virNetClientCloseInternal (reason=3, client=0x7f4bb4009e60)
at rpc/virnetclient.c:589
#10 virNetClientCloseInternal (client=0x7f4bb4009e60, reason=3) at rpc/virnetclient.c:561
#11 0x00007f4bcacc4a82 in virLXCMonitorClose (mon=0x7f4bb4000a00) at lxc/lxc_monitor.c:201
#12 0x00007f4bcacc55ac in virLXCProcessCleanup (reason=<optimized out>, vm=0x7f4bb4000b00,
driver=0x7f4bc40c8290) at lxc/lxc_process.c:240
#13 virLXCProcessStop (driver=0x7f4bc40c8290, vm=vm@entry=0x7f4bb4000b00,
reason=reason@entry=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:735
#14 0x00007f4bcacc5bd2 in virLXCProcessAutoDestroyDom (payload=<optimized out>,
name=0x7f4bb4003c80, opaque=0x7fff41af2df0) at lxc/lxc_process.c:94
#15 0x00007f4bd9586649 in virHashForEach (table=0x7f4bc409b270,
iter=iter@entry=0x7f4bcacc5ab0 <virLXCProcessAutoDestroyDom>, data=data@entry=0x7fff41af2df0)
at util/virhash.c:514
#16 0x00007f4bcacc52d7 in virLXCProcessAutoDestroyRun (driver=driver@entry=0x7f4bc40c8290,
conn=conn@entry=0x7f4bb8000ab0) at lxc/lxc_process.c:120
#17 0x00007f4bcacca628 in lxcClose (conn=0x7f4bb8000ab0) at lxc/lxc_driver.c:128
#18 0x00007f4bd95e67ab in virReleaseConnect (conn=conn@entry=0x7f4bb8000ab0) at datatypes.c:114
When the driver calls virLXCMonitorClose, there is really no
need for the EOF callback to be invoked in this case, since
the caller can easily handle events itself. In changing this,
the monitor needs to take a deep copy of the callback list,
not merely a reference.
Also adds debug statements in various places to aid
troubleshooting
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Asynchronously setting priv->mon to NULL was pointless,
just remove the destroy callback entirely.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Continue consolidation of process functions by moving some
helpers out of command.{c,h} into virprocess.{c,h}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some kernel versions (at least RHEL-6 2.6.32) do not let you over-mount
an existing selinuxfs instance with a new one. Thus we must unmount the
existing instance inside our namespace.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.
* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/; If/. If/
The introduction of /sys/fs/cgroup came in fairly recent kernels.
Prior to that time distros would pick a custom directory like
/cgroup or /dev/cgroup. We need to auto-detect where this is,
rather than hardcoding it
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This bug was revealed by the crash described in
https://bugzilla.redhat.com/show_bug.cgi?id=852383
The vlan info pointer sent to virNetDevOpenvswitchAddPort should never
be non-NULL unless there is at least one tag. The factthat such a vlan
info pointer was receveid pointed out that a caller was passing the
wrong pointer. Instead of sending &net->vlan, the result of
virDomainNetGetActualVlan(net) should be sent - that function will
look for vlan info in net->data.network.actual->vlan, and in cany case
return NULL instead of a pointer if the vlan info it finds has no
tags.
Aside from causing the crash, sending a hardcoded &net->vlan has the
effect of ignoring vlan info from a <network> or <portgroup> config.
This patch updates the structures that store information about each
domain and each hypervisor to support multiple security labels and
drivers. It also updates all the remaining code to use the new fields.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
Add the ability to support VLAN tags for Open vSwitch virtual port
types. To accomplish this, modify virNetDevOpenvswitchAddPort and
virNetDevTapCreateInBridgePort to take a virNetDevVlanPtr
argument. When adding the port to the OVS bridge, setup either a
single VLAN or a trunk port based on the configuration from the
virNetDevVlanPtr.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Currently there is a hook function that is invoked when a
new client connection comes in, which allows an app to
setup private data. This setup will make it difficult to
serialize client state during process re-exec(). Change to
a model where the app registers a callback when creating
the virNetServerPtr instance, which is used to allocate
the client private data immediately during virNetClientPtr
construction.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the virNetClientPtr constructor will always register
the async IO event handler and the keepalive objects. In the
case of the lock manager, there will be no event loop available
nor keepalive support required. Split this setup out of the
constructor and into separate methods.
The remote driver will enable async IO and keepalives, while
the LXC driver will only enable async IO
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As the consensus in:
https://www.redhat.com/archives/libvir-list/2012-July/msg01692.html,
this patch is to destroy conf/virdomainlist.[ch], folding the
helpers into conf/domain_conf.[ch].
* src/Makefile.am:
- Various indention fixes incidentally
- Add macro DATATYPES_SOURCES (datatypes.[ch])
- Link datatypes.[ch] for libvirt_lxc
* src/conf/domain_conf.c:
- Move all the stuffs from virdomainlist.c into it
- Use virUnrefDomain and virUnrefDomainSnapshot instead of
virDomainFree and virDomainSnapshotFree, which are defined
in libvirt.c, and we don't want to link to it.
- Remove "if" before "free" the object, as virObjectUnref
is in the list "useless_free_options".
* src/conf/domain_conf.h:
- Move all the stuffs from virdomainlist.h into it
- s/LIST_FILTER/LIST_DOMAINS_FILTER/
* src/libxl/libxl_driver.c:
- s/LIST_FILTER/LIST_DOMAINS_FILTER/
- no (include "virdomainlist.h")
* src/libxl/libxl_driver.c: Likewise
* src/lxc/lxc_driver.c: Likewise
* src/openvz/openvz_driver.c: Likewise
* src/parallels/parallels_driver.c: Likewise
* src/qemu/qemu_driver.c: Likewise
* src/test/test_driver.c: Likewise
* src/uml/uml_driver.c: Likewise
* src/vbox/vbox_tmpl.c: Likewise
* src/vmware/vmware_driver.c: Likewise
* tools/virsh-domain-monitor.c: Likewise
* tools/virsh.c: Likewise
The meat of this patch is just moving the calls to
virNWFilterRegisterCallbackDriver from each hypervisor's "register"
function into its "initialize" function. The rest is just code
movement to allow that, and a new virNWFilterUnRegisterCallbackDriver
function to undo what the register function does.
The long explanation:
There is an array in nwfilter called callbackDrvArray that has
pointers to a table of functions for each hypervisor driver that are
called by nwfilter. One of those function pointers is to a function
that will lock the hypervisor driver. Entries are added to the table
by calling each driver's "register" function, which happens quite
early in libvirtd's startup.
Sometime later, each driver's "initialize" function is called. This
function allocates a driver object and stores a pointer to it in a
static variable that was previously initialized to NULL. (and here's
the important part...) If the "initialize" function fails, the driver
object is freed, and that pointer set back to NULL (but the entry in
nwfilter's callbackDrvArray is still there).
When the "lock the driver" function mentioned above is called, it
assumes that the driver was successfully loaded, so it blindly tries
to call virMutexLock on "driver->lock".
BUT, if the initialize never happened, or if it failed, "driver" is
NULL. And it just happens that "lock" is always the first field in
driver so it is also NULL.
Boom.
To fix this, the call to virNWFilterRegisterCallbackDriver for each
driver shouldn't be called until the end of its (*already guaranteed
successful*) "initialize" function, not during its "register" function
(which is currently the case). This implies that there should also be
a virNWFilterUnregisterCallbackDriver() function that is called in a
driver's "shutdown" function (although in practice, that function is
currently never called).