libvirt

mirror of https://gitlab.com/libvirt/libvirt.git synced 2024-11-02 11:21:12 +00:00

Author	SHA1	Message	Date
Eric Blake	8de47efd3f	maint: fix comment typos * src/lxc/lxc_controller.c (virLXCControllerSetupDisk): Fix typo. * src/lxc/lxc_driver.c (lxcDomainAttachDeviceDiskLive): Likewise. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-09-26 15:40:34 -06:00
Chen Hanxiao	c82513acc2	LXC: free dst before lxcDomainAttachDeviceDiskLive returns Free dst before lxcDomainAttachDeviceDiskLive returns Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>	2013-09-26 15:13:55 +02:00
Chen Hanxiao	9a08e2cbc6	LXC: Check the existence of dir before resolving symlinks If a dir does not exist, raise an immediate error in logs rather than letting virFileResolveAllLinks fail, since this gives better error reporting to the user. Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>	2013-09-23 11:22:17 +01:00
Chen Hanxiao	0e618c6aff	LXC: follow the unit style of /proc/meminfo When FUSE is enabled, the LXC container is setup with a custom /proc/meminfo file. This file uses "KB" as a suffix, rather than "kB" which is the kernel's style. Fix this inconsistency to avoid confusing apps. Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>	2013-09-23 11:01:07 +01:00
Peter Krempa	f9c7b32e5d	lxc: Add metadata modification APIs	2013-09-17 09:42:50 +02:00
Peter Krempa	d79fe8b50b	cgroup: Move [qemu|lxc]GetCpuBWStatus to vircgroup.c and refactor it The function existed in two identical instances in lxc and qemu. Move it to vircgroup.c and simplify it. Refactor the callers too.	2013-09-16 11:32:49 +02:00
Gao feng	1c7037cff4	LXC: don't try to mount selinux filesystem when user namespace enabled Right now we mount selinuxfs even when user namespace is enabled, and ignore the error. But we shouldn't ignore these errors when user namespace is not enabled. This patch skips mounting selinuxfs when user namespace is enabled. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-09-12 15:18:01 +01:00
Daniel P. Berrange	75235a52bc	Ensure root filesystem is recursively mounted readonly If the guest is configured with <filesystem type='mount'> <source dir='/'/> <target dir='/'/> <readonly/> </filesystem> Then any submounts under / should also end up readonly, except for those setup as basic mounts. eg if the user has /home on a separate volume, they'd expect /home to be readonly, but we should not touch the /sys, /proc, etc dirs we setup ourselves. Users can selectively make sub-mounts read-write again by simply listing them as new mounts without the <readonly> flag set <filesystem type='mount'> <source dir='/home'/> <target dir='/home'/> </filesystem> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-09-12 12:01:49 +01:00
Daniel P. Berrange	f27f5f7edd	Move array of mounts out of lxcContainerMountBasicFS Move the array of basic mounts out of the lxcContainerMountBasicFS function, to a global variable. This is to allow it to be referenced by other methods wanting to know what the basic mount paths are. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-09-12 11:52:12 +01:00
Gao feng	66e2adb2ba	LXC: introduce lxcContainerUnmountForSharedRoot Move the unmounting private or useless filesystems for container to this function. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-09-11 13:09:31 +01:00
Gao feng	4142bf46b8	LXC: umount the temporary filesystem created by libvirt The devpts, dev and fuse filesystems are mounted temporarily. There is no need to export them to the container if the container shares the root directory with the host. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-09-11 13:09:31 +01:00
Hongwei Bi	46c9bce4c8	LXC: Free variable vroot in lxcDomainDetachDeviceHostdevUSBLive() The variable vroot should be freed in label cleanup.	2013-09-09 10:40:13 +02:00
Chen Hanxiao	744fb50831	LXC: fix typos in lxc_container.c Fix docs and error message typos in lxc_container.c Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com>	2013-09-06 12:14:00 +01:00
Gao feng	1583dfda7c	LXC: Don't mount securityfs when user namespace enabled Right now, securityfs is disallowed to be mounted in non-initial user namespace, so we must avoid trying to mount securityfs in a container which has user namespace enabled. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-09-05 12:00:07 +01:00
Daniel P. Berrange	c13a2c282b	Ensure that /dev exists in the container root filesystem If booting a container with a root FS that isn't the host's root, we must ensure that the /dev mount point exists. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-08-13 16:26:44 +01:00
Daniel P. Berrange	2d07f84302	Honour root prefix in lxcContainerMountFSBlockAuto The lxcContainerMountFSBlockAuto method can be used to mount the initial root filesystem, so it cannot assume a prefix of /.oldroot. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-08-13 14:04:28 +01:00
Dan Walsh	6807238d87	Ensure securityfs is mounted readonly in container If securityfs is available on the host, we should ensure to mount it read-only in the container. This will avoid systemd trying to mount it during startup causing SELinux AVCs. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-08-08 14:25:50 +01:00
Michal Privoznik	1199edb1d4	Introduce max_queued_clients This configuration knob lets the user set the length of the queue of connection requests waiting to be accept()-ed by the daemon. IOW, it just controls the @backlog passed to listen: int listen(int sockfd, int backlog);	2013-08-05 11:03:01 +02:00
Daniel P. Berrange	1166eeba61	Fix crashing upgrading from older libvirts with running guests If upgrading from a libvirt that is older than 1.0.5, we can not assume that vm->def->resource is non-NULL. This bogus assumption caused libvirtd to crash Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-08-02 15:32:26 +01:00
Daniel P. Berrange	2fe2470181	Enable support for systemd-machined in cgroups creation Make the virCgroupNewMachine method try to use systemd-machined first. If that fails, then fallback to using the traditional cgroup setup code path. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-31 19:29:19 +01:00
Yuri Chornoivan	5b4c035b08	Fix minor typos in messages and docs Signed-off-by: Eric Blake <eblake@redhat.com>	2013-07-30 07:07:33 -06:00
Daniel P. Berrange	35fe8d97c0	Set default partition in libvirtd instead of libvirt_lxc By setting the default partition in libvirt_lxc it is not visible when querying the live XML. Move setting of the default partition into libvirtd virLXCProcessStart Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-26 17:46:22 +01:00
John Ferlan	cefb97fb81	virStateDriver - Separate AutoStart from Initialize Adjust these drivers to handle their Autostart functionality after each of the drivers has gone through their Initialization functions	2013-07-26 09:30:53 -04:00
Daniel P. Berrange	5ec5a22493	Add 'controllers' arg to virCgroupNewDetect When detecting cgroups we must honour any controllers whitelist the driver may have. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 19:55:47 +01:00
Daniel P. Berrange	a45b99ead9	Introduce a more convenient virCgroupNewDetectMachine Instead of requiring drivers to use a combination of calls to virCgroupNewDetect and virCgroupIsValidMachine, combine the two into virCgroupNewDetectMachine Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 19:47:30 +01:00
Daniel P. Berrange	f6c5f9077c	Convert LXC driver to use virCgroupNewMachine Convert the LXC driver code to use the new atomic API for setup of cgroups Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-25 11:42:48 +01:00
Michal Privoznik	4e5f0dd2d3	virLXCMonitorClose: Unlock domain while closing monitor There's a race in the lxc driver causing a deadlock. If a domain is destroyed immediately after being started, the deadlock can occur. When the domain is started, the event loop tries to connect to the monitor. If the connection succeeds, virLXCProcessMonitorInitNotify() is called with @mon->client locked. The first thing that callee does is virObjectLock(vm). So the order of locking is: 1) @mon->client, 2) @vm. However, if there's another thread executing virDomainDestroy on the very same domain, the first thing done here is locking the @vm. Then, the corresponding libvirt_lxc process is killed and the monitor is closed by calling virLXCMonitorClose(). This callee tries to lock @mon->client too. So the order is reversed compared to the first case. This situation results in a deadlock and an unresponsive libvirtd (since the event loop is involved). The proper solution is to unlock the @vm in virLXCMonitorClose prior to entering virNetClientClose().
See the backtrace as follows: Thread 25 (Thread 0x7f1b7c9b8700 (LWP 16312)): 0 0x00007f1b80539714 in __lll_lock_wait () from /lib64/libpthread.so.0 1 0x00007f1b8053516c in _L_lock_516 () from /lib64/libpthread.so.0 2 0x00007f1b80534fbb in pthread_mutex_lock () from /lib64/libpthread.so.0 3 0x00007f1b82a637cf in virMutexLock (m=0x7f1b3c0038d0) at util/virthreadpthread.c:85 4 0x00007f1b82a4ccf2 in virObjectLock (anyobj=0x7f1b3c0038c0) at util/virobject.c:320 5 0x00007f1b82b861f6 in virNetClientCloseInternal (client=0x7f1b3c0038c0, reason=3) at rpc/virnetclient.c:696 6 0x00007f1b82b862f5 in virNetClientClose (client=0x7f1b3c0038c0) at rpc/virnetclient.c:721 7 0x00007f1b6ee12500 in virLXCMonitorClose (mon=0x7f1b3c007210) at lxc/lxc_monitor.c:216 8 0x00007f1b6ee129f0 in virLXCProcessCleanup (driver=0x7f1b68100240, vm=0x7f1b680ceb70, reason=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:174 9 0x00007f1b6ee14106 in virLXCProcessStop (driver=0x7f1b68100240, vm=0x7f1b680ceb70, reason=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:710 10 0x00007f1b6ee1aa36 in lxcDomainDestroyFlags (dom=0x7f1b5c002560, flags=0) at lxc/lxc_driver.c:1291 11 0x00007f1b6ee1ab1a in lxcDomainDestroy (dom=0x7f1b5c002560) at lxc/lxc_driver.c:1321 12 0x00007f1b82b05be5 in virDomainDestroy (domain=0x7f1b5c002560) at libvirt.c:2303 13 0x00007f1b835a7e85 in remoteDispatchDomainDestroy (server=0x7f1b857419d0, client=0x7f1b8574ae40, msg=0x7f1b8574acf0, rerr=0x7f1b7c9b7c30, args=0x7f1b5c004a50) at remote_dispatch.h:3143 14 0x00007f1b835a7d78 in remoteDispatchDomainDestroyHelper (server=0x7f1b857419d0, client=0x7f1b8574ae40, msg=0x7f1b8574acf0, rerr=0x7f1b7c9b7c30, args=0x7f1b5c004a50, ret=0x7f1b5c0029e0) at remote_dispatch.h:3121 15 0x00007f1b82b93704 in virNetServerProgramDispatchCall (prog=0x7f1b8573af90, server=0x7f1b857419d0, client=0x7f1b8574ae40, msg=0x7f1b8574acf0) at rpc/virnetserverprogram.c:435 16 0x00007f1b82b93263 in virNetServerProgramDispatch (prog=0x7f1b8573af90, 
server=0x7f1b857419d0, client=0x7f1b8574ae40, msg=0x7f1b8574acf0) at rpc/virnetserverprogram.c:305 17 0x00007f1b82b8c0f6 in virNetServerProcessMsg (srv=0x7f1b857419d0, client=0x7f1b8574ae40, prog=0x7f1b8573af90, msg=0x7f1b8574acf0) at rpc/virnetserver.c:163 18 0x00007f1b82b8c1da in virNetServerHandleJob (jobOpaque=0x7f1b8574dca0, opaque=0x7f1b857419d0) at rpc/virnetserver.c:184 19 0x00007f1b82a64158 in virThreadPoolWorker (opaque=0x7f1b8573cb10) at util/virthreadpool.c:144 20 0x00007f1b82a63ae5 in virThreadHelper (data=0x7f1b8574b9f0) at util/virthreadpthread.c:161 21 0x00007f1b80532f4a in start_thread () from /lib64/libpthread.so.0 22 0x00007f1b7fc4f20d in clone () from /lib64/libc.so.6 Thread 1 (Thread 0x7f1b83546740 (LWP 16297)): 0 0x00007f1b80539714 in __lll_lock_wait () from /lib64/libpthread.so.0 1 0x00007f1b8053516c in _L_lock_516 () from /lib64/libpthread.so.0 2 0x00007f1b80534fbb in pthread_mutex_lock () from /lib64/libpthread.so.0 3 0x00007f1b82a637cf in virMutexLock (m=0x7f1b680ceb80) at util/virthreadpthread.c:85 4 0x00007f1b82a4ccf2 in virObjectLock (anyobj=0x7f1b680ceb70) at util/virobject.c:320 5 0x00007f1b6ee13bd7 in virLXCProcessMonitorInitNotify (mon=0x7f1b3c007210, initpid=4832, vm=0x7f1b680ceb70) at lxc/lxc_process.c:601 6 0x00007f1b6ee11fd3 in virLXCMonitorHandleEventInit (prog=0x7f1b3c001f10, client=0x7f1b3c0038c0, evdata=0x7f1b8574a7d0, opaque=0x7f1b3c007210) at lxc/lxc_monitor.c:109 7 0x00007f1b82b8a196 in virNetClientProgramDispatch (prog=0x7f1b3c001f10, client=0x7f1b3c0038c0, msg=0x7f1b3c003928) at rpc/virnetclientprogram.c:259 8 0x00007f1b82b87030 in virNetClientCallDispatchMessage (client=0x7f1b3c0038c0) at rpc/virnetclient.c:1019 9 0x00007f1b82b876bb in virNetClientCallDispatch (client=0x7f1b3c0038c0) at rpc/virnetclient.c:1140 10 0x00007f1b82b87d41 in virNetClientIOHandleInput (client=0x7f1b3c0038c0) at rpc/virnetclient.c:1312 11 0x00007f1b82b88f51 in virNetClientIncomingEvent (sock=0x7f1b3c0044e0, events=1, opaque=0x7f1b3c0038c0) at 
rpc/virnetclient.c:1832 12 0x00007f1b82b9e1c8 in virNetSocketEventHandle (watch=3321, fd=54, events=1, opaque=0x7f1b3c0044e0) at rpc/virnetsocket.c:1695 13 0x00007f1b82a272cf in virEventPollDispatchHandles (nfds=21, fds=0x7f1b8574ded0) at util/vireventpoll.c:498 14 0x00007f1b82a27af2 in virEventPollRunOnce () at util/vireventpoll.c:645 15 0x00007f1b82a25a61 in virEventRunDefaultImpl () at util/virevent.c:273 16 0x00007f1b82b8e97e in virNetServerRun (srv=0x7f1b857419d0) at rpc/virnetserver.c:1097 17 0x00007f1b8359db6b in main (argc=2, argv=0x7ffff98dbaa8) at libvirtd.c:1512	2013-07-24 17:53:00 +02:00
John Ferlan	8134b37d34	lxc: Resolve Coverity warning Commit 'c8695053' resulted in the following: Coverity error seen in the output: ERROR: REVERSE_INULL FUNCTION: lxcProcessAutoDestroy Due to the 'dom' being checked before 'dom->persistent' since 'dom' is already dereferenced prior to that.	2013-07-23 19:04:48 -04:00
Daniel P. Berrange	da704c8782	Create + setup cgroups atomically for LXC process Currently the LXC driver creates the VM's cgroup prior to forking, and then libvirt_lxc moves the child process into the cgroup. This won't work with systemd whose APIs do the creation of cgroups + attachment of processes atomically. Fortunately we simply move the entire cgroups setup into the libvirt_lxc child process. We make it take place before fork'ing into the background, so by the time virCommandRun returns in the LXC driver, the cgroup is guaranteed to be present. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-23 22:46:31 +01:00
Daniel P. Berrange	87b2e6fa84	Auto-detect existing cgroup placement Use the new virCgroupNewDetect function to determine cgroup placement of existing running VMs. This will allow the legacy cgroups creation APIs to be removed entirely Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-23 22:46:31 +01:00
Daniel P. Berrange	0d7f45aea7	Convert remainder of cgroups code to report errors Convert the remaining methods in vircgroup.c to report errors instead of returning errno values. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-22 13:09:58 +01:00
Daniel P. Berrange	3260fdfab0	Convert the virCgroupKill* APIs to report errors Instead of returning errno values, change the virCgroupKill* APIs to fully report errors. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-22 13:09:58 +01:00
Daniel P. Berrange	b64dabff27	Report full errors from virCgroupNew* Instead of returning raw errno values, report full libvirt errors in virCgroupNew* functions. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-22 13:09:58 +01:00
Daniel P. Berrange	3aac4e5632	LXC: Set default driver for image backed filesystems If no explicit driver is set for an image backed filesystem, set it to use the loop driver (if raw) or nbd driver (if non-raw) Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-22 12:32:25 +01:00
Daniel P. Berrange	2e832b18d6	LXC: Fix some error reporting in filesystem setup A couple of places in LXC setup for filesystems did not do a "goto cleanup" after reporting errors. While fixing this, also add in many more debug statements to aid troubleshooting Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-22 12:32:07 +01:00
Michal Privoznik	dbeb04a65c	Introduce lxcDomObjFromDomain Similarly to qemu driver, we can use a helper function to lookup a domain instead of copying multiple lines around.	2013-07-18 14:16:54 +02:00
Michal Privoznik	eb150c86b4	Remove lxcDriverLock from almost everywhere With the majority of fields in the virLXCDriverPtr struct now immutable or self-locking, there is no need for practically any methods to be using the LXC driver lock. Only a handful of helper APIs now need it.	2013-07-18 14:16:54 +02:00
Michal Privoznik	2a82171aff	lxc: Make activeUsbHostdevs use locks The activeUsbHostdevs item in LXCDriver are lockable, but the lock has to be called explicitly. Call the virObject(Un)Lock() in order to achieve mutual exclusion once lxcDriverLock is removed.	2013-07-18 14:16:54 +02:00
Michal Privoznik	64ec738e58	Stop accessing driver->caps directly in LXC driver The 'driver->caps' pointer can be changed on the fly. Accessing it currently requires the global driver lock. Isolate this access in a single helper, so a future patch can relax the locking constraints.	2013-07-18 14:16:54 +02:00
Michal Privoznik	c86950533a	lxc: switch to virCloseCallbacks API	2013-07-18 14:16:54 +02:00
Michal Privoznik	4deeb74d01	Introduce annotations for virLXCDriverPtr fields Annotate the fields in virLXCDriverPtr to indicate the locking rules for their use.	2013-07-18 14:16:54 +02:00
Michal Privoznik	29bed27eb4	lxc: Use atomic ops for driver->nactive	2013-07-18 14:16:54 +02:00
Michal Privoznik	7fca37554c	Introduce a virLXCDriverConfigPtr object Currently the virLXCDriverPtr struct contains a wide variety of data with varying access needs. Move all the static config data into a dedicated virLXCDriverConfigPtr object. The only locking requirement is to hold the driver lock, while obtaining an instance of virLXCDriverConfigPtr. Once a reference is held on the config object, it can be used completely lockless since it is immutable. NB, not all APIs correctly hold the driver lock while getting a reference to the config object in this patch. This is safe for now since the config is never updated on the fly. Later patches will address this fully.	2013-07-18 14:16:53 +02:00
Michal Privoznik	7e94a1a4ea	virLXCDriver: Drop unused @cgroup It is not used anywhere, so it makes no sense to have it there.	2013-07-18 14:16:53 +02:00
Daniel P. Berrange	040d996342	Merge virCommandPreserveFD / virCommandTransferFD Merge the virCommandPreserveFD / virCommandTransferFD methods into a single virCommandPassFD method, and use a new VIR_COMMAND_PASS_FD_CLOSE_PARENT to indicate their difference in behaviour Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-18 12:18:24 +01:00
Daniel P. Berrange	11693bc6f0	LXC: Wire up the virDomainCreate{XML}WithFiles methods Wire up the new virDomainCreate{XML}WithFiles methods in the LXC driver, so that FDs get passed down to the init process. The lxc_container code needs to do a little dance in order to renumber the file descriptors it receives into linear order, starting from STDERR_FILENO + 1. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-18 12:07:51 +01:00
Michal Privoznik	192a86cadf	lxc_container: Don't call virGetGroupList during exec Commit `75c1256` states that virGetGroupList must not be called between fork and exec, then commit `ee777e99` promptly violated that for lxc. Patch originally posted by Eric Blake <eblake@redhat.com>.	2013-07-17 14:26:09 +02:00
Michal Privoznik	37d96498c6	lxcCapsInit: Allocate primary security driver unconditionally Currently, if the primary security driver is 'none', we skip initializing caps->host.secModels. This means, later, when LXC domain XML is parsed and <seclabel type='none'/> is found (see virSecurityLabelDefsParseXML), the model name is not copied to the seclabel. This leads to subsequent crash in virSecurityManagerGenLabel where we call STREQ() over the model (note, that we are expecting model to be !NULL).	2013-07-17 12:36:45 +02:00
Gao feng	129d25dcd9	LXC: Change the owner of live attached host devices The owner of these host devices should be the root user of the container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-16 09:59:41 -06:00
Gao feng	7a8212aac9	LXC: Change the owner of host devices to the root of container These host devices are created for container, the owner should be the root user of container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-16 09:59:29 -06:00
Gao feng	f87be04fd8	LXC: Create host devices for container on host side Otherwise the container will fail to start if we enable user namespace, since there are no rights to do mknod in a non-initial user namespace. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-16 09:59:24 -06:00
Gao feng	4f41a8e5b2	LXC: Change the owner of live attached disk device The owner of this disk device should be the root user of container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-16 09:59:20 -06:00
Gao feng	14a0c4084d	LXC: Move virLXCControllerChown to lxc_container.c lxc driver will use this function to change the owner of hot added devices. Move virLXCControllerChown to lxc_container.c and Rename it to lxcContainerChown. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-16 09:59:14 -06:00
Gao feng	ae4e916f04	LXC: controller: change the owner of disk to the root of container These disk devices are created for container, the owner should be the root user of container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-16 09:58:53 -06:00
Gao feng	7161f0a385	LXC: Setup disks for container on host side Since mknod in container is forbidden, we should setup disks on host side. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-16 09:57:38 -06:00
Daniel P. Berrange	f45dbdb213	Add a couple of debug statements to LXC driver When failing to start a container due to inaccessible root filesystem path, we did not log any meaningful error. Add a few debug statements to assist diagnosis Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-12 11:06:08 +01:00
Eric Blake	ee777e9949	util: make virSetUIDGID async-signal-safe https://bugzilla.redhat.com/show_bug.cgi?id=964358 POSIX states that multi-threaded apps should not use functions that are not async-signal-safe between fork and exec, yet we were using getpwuid_r and initgroups. Although rare, it is possible to hit deadlock in the child, when it tries to grab a mutex that was already held by another thread in the parent. I actually hit this deadlock when testing multiple domains being started in parallel with a command hook, with the following backtrace in the child: Thread 1 (Thread 0x7fd56bbf2700 (LWP 3212)): #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136 #1 0x00007fd5761e7388 in _L_lock_854 () from /lib64/libpthread.so.0 #2 0x00007fd5761e7257 in __pthread_mutex_lock (mutex=0x7fd56be00360) at pthread_mutex_lock.c:61 #3 0x00007fd56bbf9fc5 in _nss_files_getpwuid_r (uid=0, result=0x7fd56bbf0c70, buffer=0x7fd55c2a65f0 "", buflen=1024, errnop=0x7fd56bbf25b8) at nss_files/files-pwd.c:40 #4 0x00007fd575aeff1d in __getpwuid_r (uid=0, resbuf=0x7fd56bbf0c70, buffer=0x7fd55c2a65f0 "", buflen=1024, result=0x7fd56bbf0cb0) at ../nss/getXXbyYY_r.c:253 #5 0x00007fd578aebafc in virSetUIDGID (uid=0, gid=0) at util/virutil.c:1031 #6 0x00007fd578aebf43 in virSetUIDGIDWithCaps (uid=0, gid=0, capBits=0, clearExistingCaps=true) at util/virutil.c:1388 #7 0x00007fd578a9a20b in virExec (cmd=0x7fd55c231f10) at util/vircommand.c:654 #8 0x00007fd578a9dfa2 in virCommandRunAsync (cmd=0x7fd55c231f10, pid=0x0) at util/vircommand.c:2247 #9 0x00007fd578a9d74e in virCommandRun (cmd=0x7fd55c231f10, exitstatus=0x0) at util/vircommand.c:2100 #10 0x00007fd56326fde5 in qemuProcessStart (conn=0x7fd53c000df0, driver=0x7fd55c0dc4f0, vm=0x7fd54800b100, migrateFrom=0x0, stdin_fd=-1, stdin_path=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=1) at qemu/qemu_process.c:3694 ... 
The solution is to split the work of getpwuid_r/initgroups into the unsafe portions (getgrouplist, called pre-fork) and safe portions (setgroups, called post-fork). * src/util/virutil.h (virSetUIDGID, virSetUIDGIDWithCaps): Adjust signature. * src/util/virutil.c (virSetUIDGID): Add parameters. (virSetUIDGIDWithCaps): Adjust clients. * src/util/vircommand.c (virExec): Likewise. * src/util/virfile.c (virFileAccessibleAs, virFileOpenForked) (virDirCreate): Likewise. * src/security/security_dac.c (virSecurityDACSetProcessLabel): Likewise. * src/lxc/lxc_container.c (lxcContainerSetID): Likewise. * configure.ac (AC_CHECK_FUNCS_ONCE): Check for setgroups, not initgroups. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-07-11 15:46:42 -06:00
John Ferlan	8283ef9ea2	testutils: Resolve Coverity issues Recent changes uncovered a NEGATIVE_RETURNS in the return from sysconf() when processing a for loop in virtTestCaptureProgramExecChild() in testutils.c Code review uncovered 3 other code paths with the same condition that weren't found by Coverity, so fixed those as well.	2013-07-11 14:18:11 -04:00
Gao feng	46a46563ca	LXC: remove some incorrect setting ATTRIBUTE_UNUSED these parameters shouldn't be marked as ATTRIBUTE_UNUSED. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-11 13:43:31 +02:00
Daniel P. Berrange	a4b57dfb9e	Convert 'int i' to 'size_t i' in src/lxc/ files Convert the type of loop iterators named 'i', 'j', 'k', 'ii', 'jj', 'kk', to be 'size_t' instead of 'int' or 'unsigned int', also sanitizing 'ii', 'jj', 'kk' to use the normal 'i', 'j', 'k' naming Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-10 17:55:16 +01:00
Michal Privoznik	56965922ab	Adapt to VIR_ALLOC and virAsprintf in src/lxc/*	2013-07-10 11:07:32 +02:00
Michal Privoznik	8290cbbc38	viralloc: Report OOM error on failure Similarly to VIR_STRDUP, we want the OOM error to be reported in VIR_ALLOC and friends.	2013-07-10 11:07:31 +02:00
Gao feng	468ee0bc4d	LXC: hostdev: create parent directory for hostdev Create the parent directory for a hostdev automatically when we start an lxc domain or attach a hostdev to an lxc domain. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-09 11:16:20 +01:00
Gao feng	c0d8c7c885	LXC: hostdev: introduce lxcContainerSetupHostdevCapsMakePath This helper function is used to create parent directory for the hostdev which will be added to the container. If the parent directory of this hostdev doesn't exist, the mknod of the hostdev will fail. eg with /dev/net/tun Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-09 11:15:11 +01:00
Richard Weinberger	9a0ac6d9c2	LXC: Create /dev/tty within a container Many applications use /dev/tty to read from stdin. e.g. zypper on openSUSE. Let's create this device node to unbreak those applications. As /dev/tty is a synonym for the current controlling terminal it cannot harm the host or any other containers. Signed-off-by: Richard Weinberger <richard@nod.at>	2013-07-09 11:05:14 +01:00
Daniel P. Berrange	763973607d	Add access control filtering of domain objects Ensure that all APIs which list domain objects filter them against the access control system. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-03 15:54:53 +01:00
Gao feng	350fd95f40	LXC: blkio: allow to setup weight_device libvirt lxc can only set a generic weight for a container. This patch allows the user to set up a per-device blkio weight for a container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-03 12:35:54 +01:00
Gao feng	e7b3349f5a	LXC: fix memory leak when userns configuration is incorrect We forgot to free the stack when Kernel doesn't support user namespace. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-03 12:19:50 +01:00
Daniel P. Berrange	1165e39ca3	Add some misc debugging to LXC startup Add some debug logging of LXC wait/continue messages and uid/gid map update code.	2013-07-02 14:00:13 +01:00
Daniel P. Berrange	293f717028	Ignore failure to mount SELinux filesystem in container User namespaces will deny the ability to mount the SELinux filesystem. This is harmless for libvirt's LXC needs, so the error can be ignored. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-07-02 14:00:13 +01:00
Gao feng	5daa1b0132	LXC: fuse: Change files owner to the root user of container The owner of the /proc/meminfo in container should be the root user of container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-02 11:20:05 +01:00
Gao feng	6c7665e150	LXC: controller: change the owner of /dev/pts and ptmx to the root of container These files are created for container, the owner should be the root user of container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-02 11:20:05 +01:00
Gao feng	a591ae6068	LXC: controller: change the owner of devices created on host Since these devices are created for the container, the owner should be the root user of the container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-02 11:20:05 +01:00
Gao feng	40a8fe6d25	LXC: controller: change the owner of /dev to the root user of container The container will create the /dev/pts directory in /dev. The owner of /dev should be the root user of the container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-02 11:20:05 +01:00
Gao feng	ff1a6019e9	LXC: controller: change the owner of tty devices to the root user of container Since these tty devices will be used by container, the owner of them should be the root user of container. This patch also adds a new function virLXCControllerChown, we can use this general function to change the owner of files. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-02 11:20:04 +01:00
Gao feng	e1d32bb955	LXC: Creating devices for container on host side User namespace doesn't allow creating devices in a non-initial userns. We should create devices on the host side. We first mount tmpfs on the dev directory under the state dir of the container, then create devices under this dev dir. Finally, in the container, mount the dev directory created on the host to the /dev/ directory of the container. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-02 11:20:04 +01:00
Gao feng	9a085a228c	LXC: introduce virLXCControllerSetupUserns and lxcContainerSetID This patch introduces new helper function virLXCControllerSetupUserns, in this function, we set the files uid_map and gid_map of the init task of container. lxcContainerSetID is used for creating cred for tasks running in container. Since after setuid/setgid, we may be a new user. This patch calls lxcContainerSetUserns at first to make sure the new created files belong to right user. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-02 11:20:04 +01:00
Gao feng	8b58336eec	LXC: enable user namespace only when user set the uidmap User namespace will be enabled only when the idmap exists in the configuration. If you want to disable user namespace, just remove these elements from the XML. If the kernel doesn't support user namespace and the idmap exists in the configuration file, libvirt lxc will fail to start and return the "Kernel doesn't support user namespace" message. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-07-02 11:20:04 +01:00
Jiri Denemark	c40ed4168a	Rename virTypedParameterArrayValidate as virTypedParamsValidate	2013-06-25 00:38:24 +02:00
Daniel P. Berrange	279866d550	Add ACL checks into the LXC driver Insert calls to the ACL checking APIs in all LXC driver entrypoints. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-06-24 15:25:43 +01:00
John Ferlan	38ada092d1	lxc: Resolve issue with GetScheduler APIs for non running domain As a consequence of the cgroup layout changes from commit 'cfed9ad4', the lxcDomainGetSchedulerParameters[Flags]() and lxcGetSchedulerType() APIs failed to return data for a non running domain. This can be seen through a 'virsh schedinfo <domain>' command which returns: Scheduler : Unknown error: Requested operation is not valid: cgroup CPU controller is not mounted Prior to that change a non running domain would return: Scheduler : posix cpu_shares : 0 vcpu_period : 0 vcpu_quota : 0 emulator_period: 0 emulator_quota : 0 This patch will restore the capability to return configuration only data for a non running domain regardless of whether cgroups are available.	2013-06-19 15:01:48 -04:00
Richard Weinberger	1133404c73	LXC: s/chroot/chdir in lxcContainerPivotRoot() ...fixes a trivial copy&paste error. Signed-off-by: Richard Weinberger <richard@nod.at>	2013-06-14 11:24:41 +02:00
Ján Tomko	e557766c3b	Replace two-state local integers with bool Found with 'git grep "= 1"'.	2013-06-06 17:22:53 +02:00
Daniel P. Berrange	922ebe4ead	Ensure non-root can read /proc/meminfo file in LXC containers By default files in a FUSE mount can only be accessed by the user which created them, even if the file permissions would otherwise allow it. To allow other users to access the FUSE mount the 'allow_other' mount option must be used. This bug prevented non-root users in an LXC container from reading the /proc/meminfo file. https://bugzilla.redhat.com/show_bug.cgi?id=967977 Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-06-05 14:02:20 +01:00
Daniel P. Berrange	61e672b23e	Remove legacy code for single-instance devpts filesystem Earlier commit `f7e8653f` dropped support for using LXC with kernels having single-instance devpts filesystem from the LXC controller. It forgot to remove the same code from the LXC container setup. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-06-05 14:01:54 +01:00
Eric Blake	1add9c78da	maint: don't use config.h in .h files Enforce the rule that .h files don't need to (redundantly) include <config.h>. * cfg.mk (sc_prohibit_config_h_in_headers): New rule. (_virsh_includes): Delete; instead, inline a smaller number of exclusions... (exclude_file_name_regexp--sc_require_config_h) (exclude_file_name_regexp--sc_require_config_h_first): ...here. * daemon/libvirtd.h (includes): Fix offenders. * src/driver.h (includes): Likewise. * src/gnutls_1_0_compat.h (includes): Likewise. * src/libxl/libxl_conf.h (includes): Likewise. * src/libxl/libxl_driver.h (includes): Likewise. * src/lxc/lxc_conf.h (includes): Likewise. * src/lxc/lxc_driver.h (includes): Likewise. * src/lxc/lxc_fuse.h (includes): Likewise. * src/network/bridge_driver.h (includes): Likewise. * src/phyp/phyp_driver.h (includes): Likewise. * src/qemu/qemu_conf.h (includes): Likewise. * src/util/virnetlink.h (includes): Likewise. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-06-05 05:53:25 -06:00
Osier Yang	1ea88abd7e	src/lxc: Remove the whitespace before ";"	2013-05-21 23:41:45 +08:00
Gao feng	7adfda0d6d	LXC: move the comments to the proper place The comments is for virLXCControllerSetupPrivateNS. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-05-20 12:45:02 -06:00
Gao feng	2a3466fafb	LXC: fix memory leak in virLXCControllerSetupDevPTS We forgot to free the mount_options. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-05-20 12:45:02 -06:00
Gao feng	eae1c286a1	LXC: remove unnecessary check on root filesystem After commit `c131525bec` "Auto-add a root <filesystem> element to LXC containers on startup" for libvirt lxc, root must be existent. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-05-20 12:45:01 -06:00
Daniel P. Berrange	63ea1e5432	Re-add selinux/selinux.h to lxc_container.c Re-add the selinux header to lxc_container.c since other functions now use it, beyond the patch that was just reverted. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-17 10:59:25 +01:00
Daniel P. Berrange	7bebd88871	Revert "Change label of fusefs mounted at /proc/meminfo in lxc containers" This reverts commit `940c6f1085`.	2013-05-17 10:22:54 +01:00
Daniel P. Berrange	95c6cc344b	Don't mount selinux fs in LXC if selinux is disabled Before trying to mount the selinux filesystem in a container use is_selinux_enabled() to check if the machine actually has selinux support (eg not booted with selinux=0) Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-16 16:28:53 +01:00
Daniel P. Berrange	d7d7581b03	Fix LXC startup when /var/run is an absolute symlink During startup, the LXC driver uses paths such as /.oldroot/var/run/libvirt/lxc/... to access directories from the previous root filesystem after doing a pivot_root(). Unfortunately if /var/run is an absolute symlink to /run, instead of a relative symlink to ../run, these paths break. At least one Linux distro is known to use an absolute symlink for /var/run, so workaround this, by resolving all symlinks before doing the pivot_root(). Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-16 16:28:53 +01:00
Dan Walsh	940c6f1085	Change label of fusefs mounted at /proc/meminfo in lxc containers We do not want to allow contained applications to be able to read fusefs_t. So we want /proc/meminfo label to match the system default proc_t. Fix checking of error codes	2013-05-15 17:39:22 +02:00
Daniel P. Berrange	7bb7510de7	Remove obsolete skipRoot flag in LXC driver The lxcContainerMountAllFS method had a 'bool skipRoot' flag to control whether it mounts the / filesystem. Since removal of the non-pivot root container setup codepaths, this flag is obsolete as the only caller always passes 'true'. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-15 17:29:35 +02:00
Daniel P. Berrange	31453a837b	Stop passing around old root directory prefix Many methods accept a string parameter specifying the old root directory prefix. Since removal of the non-pivot root container setup codepaths, this parameter is obsolete in many methods where the callers always pass "/.oldroot". Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-15 17:29:35 +02:00
Daniel P. Berrange	37cebfec92	Remove obsolete pivotRoot flag in LXC driver The lxcContainerMountBasicFS method had a 'bool pivotRoot' flag to control whether it mounted a private /dev. Since removal of the non-pivot root container setup codepaths, this flag is obsolete as the only caller always passes 'true'. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-15 17:29:35 +02:00
Daniel P. Berrange	6b5f12c805	Support NBD backed disks/filesystems in LXC driver The LXC driver can already configure <disk> or <filesystem> devices to use the loop device. This extends it to also allow for use of the NBD device, to support non-raw formats. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-13 13:15:19 +01:00
Daniel P. Berrange	c8fa7e8c55	Re-arrange code setting up ifs/disk loop devices for LXC The current code for setting up loop devices to LXC disks first does a switch() based on the disk format, then looks at the disk driver name. Reverse this so it first looks at the driver name, and then the disk format. This is more useful since the list of supported disk formats depends on what driver is used. The code for setting loop devices for LXC fs entries also needs to have the same logic added, now the XML schema supports this. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-13 13:15:19 +01:00
Michal Privoznik	a96d7f3c8f	Adapt to VIR_STRDUP and VIR_STRNDUP in src/lxc/*	2013-05-09 14:00:45 +02:00
John Ferlan	649ecb704f	lxc: Coverity false positive USE_AFTER_FREE	2013-05-08 06:16:53 -04:00
Daniel P. Berrange	a605b7e041	Unmerge attach/update/modify device APIs in drivers The LXC, QEMU, and LibXL drivers have all merged their handling of the attach/update/modify device APIs into one large 'xxxxDomainModifyDeviceFlags' which then does a 'switch()' based on the actual API being invoked. While this saves some lines of code, it is not really all that significant in the context of the driver API impls as a whole. This merger of the handling of different APIs creates pain when wanting to do automated analysis of the code and do things which are specific to individual APIs. The slight duplication of code from unmerging the API impls is preferable to allow for easier automated analysis. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-08 10:47:48 +01:00
Daniel P. Berrange	4a044d0256	Separate internal node suspend APIs from public API The individual hypervisor drivers were directly referencing APIs in virnodesuspend.c in their virDriverPtr struct. Separate these methods, so there is always a wrapper in the hypervisor driver. This allows the unused virConnectPtr args to be removed from the virnodesuspend.c file. Again this will ensure that ACL checks will only be performed on invocations that are directly associated with public API usage. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-08 10:47:47 +01:00
Daniel P. Berrange	1c6d4ca557	Separate internal node device APIs from public API The individual hypervisor drivers were directly referencing APIs in src/nodeinfo.c in their virDriverPtr struct. Separate these methods, so there is always a wrapper in the hypervisor driver. This allows the unused virConnectPtr args to be removed from the nodeinfo.c file. Again this will ensure that ACL checks will only be performed on invocations that are directly associated with public API usage. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-08 10:47:47 +01:00
Daniel P. Berrange	ead630319d	Separate virGetHostname() API contract from driver APIs Currently the virGetHostname() API has a bogus virConnectPtr parameter. This is because virtualization drivers directly reference this API in their virDriverPtr tables, tying its API design to the public virConnectGetHostname API design. This also causes problems for access control checks since these must only be done for invocations from the public API, not internal invocation. Remove the bogus virConnectPtr parameter, and make each hypervisor driver provide a dedicated function for the driver API impl. This will allow access control checks to be easily inserted later. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-05-08 10:47:47 +01:00
Michal Privoznik	7c9a2d88cd	virutil: Move string related functions to virstring.c The source code base needs to be adapted as well. Some files include virutil.h just for the string related functions (here, the include is substituted to match the new file), some include virutil.h without any need (here, the include is removed), and some require both.	2013-05-02 16:56:55 +02:00
Daniel P. Berrange	90430791ae	Make driver method names consistent with public APIs Ensure that all drivers implementing public APIs use a naming convention for their implementation that matches the public API name. eg for the public API virDomainCreate make sure QEMU uses qemuDomainCreate and not qemuDomainStart Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-24 11:00:18 +01:00
Daniel P. Berrange	abe038cfc0	Extend previous check to validate driver struct field names Ensure that the driver struct field names match the public API names. For an API virXXXX we must have a driver struct field xXXXX. ie strip the leading 'vir' and lowercase any leading uppercase letters. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-24 10:59:53 +01:00
Daniel P. Berrange	1e05073fbb	Replace more cases of /system with /machine The change in commit `aed4986322` was incomplete, missing a couple of cases of /system. This caused failure to start VMs. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-22 17:11:36 +01:00
Daniel P. Berrange	aed4986322	Change default resource partition to /machine After discussions with systemd developers it was decided that a better default policy for resource partitions is to have 3 default partitions at the top level /system - system services /machine - virtual machines / containers /user - user login session This ensures that the default policy isolates guest from user login sessions & system services, so a mis-behaving guest can't consume 100% of CPU usage if other things are contending for it. Thus we change the default partition from /system to /machine Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-22 12:10:12 +01:00
Eric Blake	1bf25ba249	docs: fix usage of 'onto' http://www.uhv.edu/ac/newsletters/writing/grammartip2009.07.01.htm (and several other sites) give hints that 'onto' is best used if you can also add 'up' just before it and still make sense. In many cases in the code base, we really want the two-word form, or even a simplification to just 'on' or 'to'. * docs/hacking.html.in: Use correct 'on to'. * python/libvirt-override.c: Likewise. * src/lxc/lxc_controller.c: Likewise. * src/util/virpci.c: Likewise. * daemon/THREADS.txt: Use simpler 'on'. * docs/formatdomain.html.in: Better usage. * docs/internals/rpc.html.in: Likewise. * src/conf/domain_event.c: Likewise. * src/rpc/virnetclient.c: Likewise. * tests/qemumonitortestutils.c: Likewise. * HACKING: Regenerate. Signed-off-by: Eric Blake <eblake@redhat.com>	2013-04-19 14:31:16 -06:00
Daniel P. Berrange	ff66b45e2b	Replace LXC cgroup mount code with call to virCgroupIsolateMount The LXC driver currently has code to detect cgroups mounts and then re-mount them inside the new root filesystem. Replace this fragile code with a call to virCgroupIsolateMount. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:32 +01:00
Daniel P. Berrange	767596bdb4	Remove non-functional code for setting up non-root cgroups The virCgroupNewDriver method had a 'bool privileged' param. If a false value was ever passed in, it would simply not work, since non-root users don't have any privileges to create new cgroups. Just delete this broken code entirely and make the QEMU driver skip cgroup setup in non-privileged mode Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:31 +01:00
Daniel P. Berrange	db44eb1b5f	Change default cgroup layout for QEMU/LXC and honour XML config Historically QEMU/LXC guests have been placed in a cgroup layout that is $LOCATION-OF-LIBVIRTD/libvirt/{qemu,lxc}/$VMNAME This is bad for a number of reasons - The cgroup hierarchy gets very deep which seriously impacts kernel performance due to cgroups scalability limitations. - It is hard to setup cgroup policies which apply across services and virtual machines, since all VMs are underneath the libvirtd service. To address this the default cgroup location is changed to be /system/$VMNAME.{lxc,qemu}.libvirt This puts virtual machines at the same level in the hierarchy as system services, allowing consistent policy to be setup across all of them. This also honours the new resource partition location from the XML configuration, for example <resource> <partition>/virtualmachines/production</partition> </resource> will result in the VM being placed at /virtualmachines/production/$VMNAME.{lxc,qemu}.libvirt NB, with the exception of the default, /system, path which is intended to always exist, libvirt will not attempt to auto-create the partitions in the XML. It is the responsibility of the admin/app to configure the partitions. Later libvirt APIs will provide a way to do this. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:31 +01:00
Daniel P. Berrange	aa8604dd45	Add a new virCgroupNewPartition for setting up resource partitions A resource partition is an absolute cgroup path, ignoring the current process placement. Expose a virCgroupNewPartition API for constructing such cgroups Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:31 +01:00
Daniel P. Berrange	04c18d25f1	Rename virCgroupForXXX to virCgroupNewXXX Rename all the virCgroupForXXX methods to use the form virCgroupNewXXX since they are all constructors. Also make sure the output parameter is the last one in the list, and annotate all pointers as non-null. Fix up all callers, and make sure they use true/false not 0/1 for the boolean parameters Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:31 +01:00
Daniel P. Berrange	cfed9ad4fb	Store a virCgroupPtr instance in virLXCDomainObjPrivatePtr Instead of calling virCgroupForDomain every time we need the virCgroupPtr instance, just do it once at VM startup and cache a reference to the object in virLXCDomainObjPrivatePtr until shutdown of the VM. Removing the virCgroupPtr from the LXC driver state also means we don't have stale mount info, if someone mounts the cgroups filesystem after libvirtd has been started Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-15 17:35:31 +01:00
Osier Yang	1bbc1e7524	cleanup: Change datatype of hostdev->missing to boolean	2013-04-11 11:36:28 +08:00
Daniel P. Berrange	1bd955ed60	Unmount existing filesystems under user specified mounts in LXC If the user requests a mount for /run, this may hide any existing mounts that are lower down in /run. The result is that the container still sees the mounts in /proc/mounts, but cannot access them sh-4.2# df df: '/run/user/501/gvfs': No such file or directory df: '/run/media/berrange/LIVE': No such file or directory df: '/run/media/berrange/SecureDiskA1': No such file or directory df: '/run/libvirt/lxc/sandbox': No such file or directory Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/vg_t500wlan-lv_root 151476396 135390200 8384900 95% / tmpfs 1970888 3204 1967684 1% /run /dev/sda1 194241 155940 28061 85% /boot devfs 64 0 64 0% /dev tmpfs 64 0 64 0% /sys/fs/cgroup tmpfs 1970888 1200 1969688 1% /etc/libvirt-sandbox/scratch Before mounting any filesystem at a particular location, we must recursively unmount anything at or below the target mount point Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-08 17:40:08 +01:00
Daniel P. Berrange	2863ca22f3	Move lxcContainerUnmountSubtree further up in file Ensure lxcContainerUnmountSubtree is at the top of the lxc_container.c file so it is easily referenced from any other method. No functional change Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-08 17:40:08 +01:00
Bogdan Purcareata	442d6a0527	Implement support for <hostdev caps=net> This allows a container-type domain to have exclusive access to one of the host's NICs. Wire <hostdev caps=net> with the lxc_controller - when moving the newly created veth devices into a new namespace, also look for any hostdev devices that should be moved. Note: once the container domain has been destroyed, there is no code that moves the interfaces back to the original namespace. This does happen, though, probably due to default cleanup on namespace destruction. Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>	2013-04-08 17:40:08 +01:00
Daniel P. Berrange	dca927c82f	Rename virCgroupMounted to virCgroupHasController & make it more robust The virCgroupMounted method is badly named, since a controller can be mounted, but disabled in the current object. Rename the method to be virCgroupHasController. Also make it tolerant to a NULL virCgroupPtr and out-of-range controller index, to avoid duplication of these checks in all callers Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-08 14:49:12 +01:00
Daniel P. Berrange	56f27b3bbc	Don't create dirs in cgroup controllers we don't want to use Currently when getting an instance of virCgroupPtr we will create the path in all cgroup controllers. Only at the virt driver layer are we attempting to filter controllers. This is bad because the mere act of creating the dirs in the controllers can have a functional impact on the kernel, particularly for performance. Update the virCgroupForDriver() method to accept a bitmask of controllers to use. Only create dirs in the controllers that are requested. When creating cgroups for domains, respect the active controller list from the parent cgroup Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-05 10:41:54 +01:00
Daniel P. Berrange	804a809a06	Rename virCgroupGetAppRoot to virCgroupForSelf The virCgroupGetAppRoot is not clear in its meaning. Change to virCgroupForSelf to highlight that this returns the cgroup config for the caller's process Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-05 10:41:54 +01:00
Peter Krempa	482e5f159c	virCaps: get rid of defaultConsoleTargetType callback This patch refactors various places to allow removing of the defaultConsoleTargetType callback from the virCaps structure. A new console character device target type is introduced - VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_NONE - to mark that no type was specified in the XML. This type is at the end converted to the standard VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL. Other types that are different from this default have to be processed separately in the device post parse callback.	2013-04-04 22:42:39 +02:00
Peter Krempa	46becc18ba	virCaps: get rid of macPrefix field Use the virDomainXMLConf structure to hold this data and tweak the code to avoid semantic change. Without configuration the KVM mac prefix is used by default. I chose it as it's in the privately administered segment so it should be usable for any purposes.	2013-04-04 22:42:38 +02:00
Peter Krempa	b5def001cc	virCaps: get rid of emulatorRequired This patch removes the emulatorRequired field and associated infrastructure from the virCaps object. Instead the driver specific callbacks are used as this field isn't enforced by all drivers. This patch implements the appropriate callbacks in the qemu and lxc driver and moves to check to that location.	2013-04-04 22:42:38 +02:00
Peter Krempa	ad0d10b2b1	conf callback: Rearrange function parameters Move the xmlopt and caps arguments to the end of the argument list.	2013-04-04 22:41:19 +02:00
Peter Krempa	43b99fc4c0	conf: Add post XML parse callbacks and prepare for cleaning of virCaps This patch adds instrumentation that will allow hypervisor drivers to fill and validate domain and device definitions after parsed by the XML parser. With this patch, after the XML is parsed, a callback to the driver is issued requesting to fill and validate driver specific details of the configuration. This allows to use sensible defaults and checks on a per driver basis at the time the XML is parsed. Two callback pointers are stored in the new virDomainXMLConf object: * virDomainDeviceDefPostParseCallback (devicesPostParseCallback) - called for a single device parsed and for every single device in a domain config. A virDomainDeviceDefPtr is passed along with the domain definition and virCaps. * virDomainDefPostParseCallback, (domainPostParseCallback) - A callback that is meant to process the domain config after it's parsed. A virDomainDefPtr is passed along with virCaps. Both types of callbacks support arbitrary opaque data passed for the callback functions. Errors may be reported in those callbacks resulting in a XML parsing failure.	2013-04-04 22:29:48 +02:00
Peter Krempa	e84b19316a	maint: Rename xmlconf to xmlopt and virDomainXMLConfig to virDomainXMLOption This patch is the result of running: for i in $(git ls-files \| grep -v html \| grep -v \.po$ ); do sed -i -e "s/virDomainXMLConf/virDomainXMLOption/g" -e "s/xmlconf/xmlopt/g" $i done and a few manual tweaks.	2013-04-04 22:18:56 +02:00
Daniel P. Berrange	6263fc5a5b	Wire up sysinfo for LXC driver The sysinfo code used by QEMU is trivially portable to the LXC driver	2013-04-04 11:07:00 +01:00
Daniel P. Berrange	edd87fa2ea	Revert "lxc: Prevent shutting down the host" This reverts commit `c9c87376f2`. Now that we force all containers to have a root filesystem, there is no way the host's /dev is ever exposed Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-04 10:51:59 +01:00
Daniel P. Berrange	c131525bec	Auto-add a root <filesystem> element to LXC containers on startup Currently the LXC container code has two codepaths, depending on whether there is a <filesystem> element with a target path of '/'. If we automatically add a <filesystem> device with src=/ and dst=/, for any container which has not specified a root filesystem, then we only need one codepath for setting up the filesystem. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-04 10:51:59 +01:00
Daniel P. Berrange	f7e8653f7e	Remove support for old kernels lacking private devpts Early on kernel support for private devpts was not widespread, so we had compatibility codepaths. Such old kernels are not seriously used for LXC these days, so the compat code can go away Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-04-04 10:51:59 +01:00
Martin Kletzander	c9c87376f2	lxc: Prevent shutting down the host When the container has the same '/dev' mount as host (no chroot), calling domainShutdown(WithFlags) shouldn't shutdown the host it is running on.	2013-03-23 11:07:57 +01:00
Daniel P. Berrange	8dbe85886c	Ensure root filesystem is mounted if a file/block mount. For a root filesystem with type=file or type=block, the LXC container was forgetting to actually mount it, before doing the pivot root step. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-03-22 17:27:01 +00:00
Daniel P. Berrange	7e1a7444c6	Mount temporary devpts on /var/lib/libvirt/lxc/$NAME.devpts Currently the lxc controller sets up the devpts instance on $rootfsdef->src, but this only works if $rootfsdef is using type=mount. To support type=block or type=file for the root filesystem, we must use /var/lib/libvirt/lxc/$NAME.devpts for the temporary devpts mount in the controller	2013-03-22 17:27:01 +00:00
Daniel P. Berrange	05f664b12c	Move FUSE mount to /var/lib/libvirt/lxc/$NAME.fuse Instead of using /var/lib/libvirt/lxc/$NAME for the FUSE filesystem, use /var/lib/libvirt/lxc/$NAME.fuse. This allows room for other temporary mounts in the same directory	2013-03-22 17:27:01 +00:00
Daniel P. Berrange	d50cb2b115	Fix thread safety in LXC callback handling Some of the LXC callbacks did not lock the virDomainObjPtr instance. This caused transient errors like error: Failed to start domain busy-mount error: cannot rename file '/var/run/libvirt/lxc/busy-mount.xml.new' as '/var/run/libvirt/lxc/busy-mount.xml': No such file or directory as 2 threads tried to update the status file concurrently Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-03-22 17:27:01 +00:00
Daniel P. Berrange	c5f28d0117	Fix free of uninitialized value in LXC numad setup The 'nodeset' variable was never initialized, causing a later VIR_FREE(nodeset) to free uninitialized memory. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-03-22 11:44:35 +00:00
Gao feng	4dceffadc9	LXC: add cpuset cgroup support for lxc This patch adds cpuset cgroup support for LXC. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-03-20 19:37:16 +08:00
Gao feng	45e9d27ad8	NUMA: cleanup for numa related codes Intend to reduce the redundant code,use virNumaSetupMemoryPolicy to replace virLXCControllerSetupNUMAPolicy and qemuProcessInitNumaMemoryPolicy. This patch also moves the numa related codes to the file virnuma.c and virnuma.h Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-03-20 19:37:00 +08:00
Gao feng	c9759a7b63	LXC: allow uses advisory nodeset from querying numad Allow lxc using the advisory nodeset from querying numad, this means if user doesn't specify the numa nodes that the lxc domain should assign to, libvirt will automatically bind the lxc domain to the advisory nodeset which queried from numad. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>	2013-03-19 20:03:29 -06:00
Daniel P. Berrange	0a418355cc	Do not prematurely close loop devices in LXC controller The LXC controller is closing loop devices as soon as the container has started. This is fine if the loop device was setup as a mounted filesystem, but if we're just passing through the loop device as a disk, nothing else is keeping it open. Thus we must keep the loop device FDs open for as long as the libvirt_lxc process is running. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-03-19 14:46:40 +00:00
Daniel P. Berrange	1760258cc3	Setup LXC cgroups in two phases Currently the LXC controller creates the cgroup, configures the resources and adds the task all in one go. This is not sufficiently flexible for the forthcoming NBD integration. We need to make sure the NBD process gets into the right cgroup immediately, but we can not have limits (in particular the device ACL) applied at the point where we start qemu-nbd. So create a virLXCCgroupCreate method which creates the cgroup and adds the current task to be called early, and leave virLXCCgroupSetup to only do resource config. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-03-19 14:46:35 +00:00
Daniel P. Berrange	403594eb8c	Fix generation of systemtap probes for RPC protocols The naming used in the RPC protocols for the LXC monitor and lock daemon confused the script used to generate systemtap helper functions. Rename the LXC monitor protocol symbols to reduce confusion. Adapt the gensystemtap.pl script to cope with the LXC monitor / lock daemon naming conversions. This has no functional impact on RPC wire protocol, since names are only used in the C layer Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-03-14 12:42:22 +00:00
Daniel P. Berrange	2f98a7f7ba	Avoid closing uninitialized FDs when LXC startup fails If an LXC domain failed to start because of a bogus SELinux label, virLXCProcessStart would call VIR_CLOSE(0) by mistake. This is because the code which initializes the member of the ttyFDs array to -1 got moved too far away from the place where the array is first allocated. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-03-14 12:42:21 +00:00
Daniel P. Berrange	e31f32c6a3	Daemonize fuse thread in libvirt_lxc In some startup failure modes, the fuse thread may get itself wedged. This will cause the entire libvirt_lxc process to hang trying to join the thread. There is no compelling reason to wait for the thread to exit if the whole process is exiting, so just daemonize the fuse thread instead. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2013-03-13 15:54:06 +00:00
Daniel P. Berrange	a08810195c	Fix query of LXC security label The virDomainGetSecurityLabel method is currently (mistakenly) showing the label of the libvirt_lxc process: ...snip... Security model: selinux Security DOI: 0 Security label: system_u:system_r:virtd_t:s0-s0:c0.c1023 (permissive) when it should be showing the init process label ...snip... Security model: selinux Security DOI: 0 Security label: system_u:system_r:svirt_t:s0:c724,c995 (permissive)	2013-03-13 15:16:42 +00:00

731 Commits