The O_NONBLOCK flag doesn't work as desired on plain files
or block devices. Introduce an I/O helper program that does
the blocking I/O operations, communicating over a pipe that
can support O_NONBLOCK
* src/fdstream.c, src/fdstream.h: Add non-blocking I/O
on plain files/block devices
* src/Makefile.am, src/util/iohelper.c: I/O helper program
* src/qemu/qemu_driver.c, src/lxc/lxc_driver.c,
src/uml/uml_driver.c, src/xen/xen_driver.c: Update for
streams API change
THe veth setup in LXC had a couple of flaws, first brInit did
not report any error when it failed. Second vethCreate() did
not correctly initialize the variable containing the return
code, so could report failure even when it succeeded.
* src/lxc/lxc_driver.c: Report error when brInit fails
* src/lxc/veth.c: Fix uninitialized variable
It is possible to set a migration speed limit when starting
migration. This new API allows the speed limit to be changed
on the fly to adjust to changing conditions
* src/driver.h, src/libvirt.c, src/libvirt_public.syms,
include/libvirt/libvirt.h.in: Add virDomainMigrateSetMaxSpeed
* src/esx/esx_driver.c, src/lxc/lxc_driver.c,
src/opennebula/one_driver.c, src/openvz/openvz_driver.c,
src/phyp/phyp_driver.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c,
src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/vmware/vmware_driver.c, src/xen/xen_driver.c,
src/libxl/libxl_driver.c: Stub new API
* Correct the documentation for cgroup: the swap_hard_limit indicates
mem+swap_hard_limit.
* Change cgroup private apis to: virCgroupGet/SetMemSwapHardLimit
Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
The current LXC I/O controller looks for HUP to detect
when a guest has quit. This isn't reliable as during
initial bootup it is possible that 'init' will close
the console and let mingetty re-open it. The shutdown
of containers was also flakey because it only killed
the libvirt I/O controller and expected container
processes to gracefully follow.
Change the I/O controller such that when it see HUP
or an I/O error, it uses kill($PID, 0) to see if the
process has really quit.
Change the container shutdown sequence to use the
virCgroupKillPainfully function to ensure every
really goes away
This change makes the use of the 'cpu', 'devices'
and 'memory' cgroups controllers compulsory with
LXC
* docs/drvlxc.html.in: Document that certain cgroups
controllers are now mandatory
* src/lxc/lxc_controller.c: Check if PID is still
alive before quitting on I/O error/HUP
* src/lxc/lxc_driver.c: Use virCgroupKillPainfully
This patch introduces a new libvirt API (virDomainSetMemoryFlags) and
a flag (virDomainMemoryModFlags).
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Adding audit points showed that we were granting too much privilege
to qemu; it should not need any mknod rights to recreate any
devices. On the other hand, lxc should have all device privileges.
The solution is adding a flag parameter.
This also lets us restrict write access to read-only disks.
* src/util/cgroup.h (virCgroup*Device*): Adjust prototypes.
* src/util/cgroup.c (virCgroupAllowDevice)
(virCgroupAllowDeviceMajor, virCgroupAllowDevicePath)
(virCgroupDenyDevice, virCgroupDenyDeviceMajor)
(virCgroupDenyDevicePath): Add parameter.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Update clients.
* src/lxc/lxc_controller.c (lxcSetContainerResources): Likewise.
* src/qemu/qemu_cgroup.c: Likewise.
(qemuSetupDiskPathAllow): Also, honor read-only disks.
virRun gives pretty useful error output, let's not overwrite it unless there
is a good reason. Some places were providing more information about what
the commands were _attempting_ to do, however that's usually less useful from
a debugging POV than what actually happened.
Relax the restriction that the hash table key must be a string
by allowing an arbitrary hash code generator + comparison func
to be provided
* util/hash.c, util/hash.h: Allow any pointer as a key
* internal.h: Include stdbool.h as standard.
* conf/domain_conf.c, conf/domain_conf.c,
conf/nwfilter_params.c, nwfilter/nwfilter_gentech_driver.c,
nwfilter/nwfilter_gentech_driver.h, nwfilter/nwfilter_learnipaddr.c,
qemu/qemu_command.c, qemu/qemu_driver.c,
qemu/qemu_process.c, uml/uml_driver.c,
xen/xm_internal.c: s/char */void */ in hash callbacks
Using the 'personality(2)' system call, we can make a container
on an x86_64 host appear to be i686. Likewise for most other
Linux 64bit arches.
* src/lxc/lxc_conf.c: Fill in 32bit capabilities for x86_64 hosts
* src/lxc/lxc_container.h, src/lxc/lxc_container.c: Add API to
check if an arch has a 32bit alternative
* src/lxc/lxc_controller.c: Set the process personality when
starting guest
When spawning 'init' in the container, set
LIBVIRT_LXC_UUID=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
LIBVIRT_LXC_NAME=YYYYYYYYYYYY
to allow guest software to detect & identify that they
are in a container
* src/lxc/lxc_container.c: Set LIBVIRT_LXC_UUID and
LIBVIRT_LXC_NAME env vars
Normal practice for /dev/pts is to have it mode=620,gid=5
but LXC was leaving mode=000,gid=0 preventing unprivilegd
users in the guest use of PTYs
* src/lxc/lxc_controller.c: Fix /dev/pts setup
Done mechanically with:
$ git grep -l '\bDEBUG0\? *(' | xargs -L1 sed -i 's/\bDEBUG0\? *(/VIR_&/'
followed by manual deletion of qemudDebug in daemon/libvirtd.c, along
with a single 'make syntax-check' fallout in the same file, and the
actual deletion in src/util/logging.h.
* src/util/logging.h (DEBUG, DEBUG0): Delete.
* daemon/libvirtd.h (qemudDebug): Likewise.
* global: Change remaining clients over to VIR_DEBUG counterpart.
Until now, user namespaces have not done much, but (for that
reason) have been innocuous to glob in with other CLONE_
flags. Upcoming userns development, however, will make tasks
cloned with CLONE_NEWUSER far more restricted. In particular,
for some time they will be unable to access files with anything
other than the world access perms.
This patch assumes that noone really needs the user namespaces
to be enabled. If that is wrong, then we can try a more
baroque patch where we create a file owned by a test userid with
700 perms and, if we can't access it after setuid'ing to that
userid, then return 0. Otherwise, assume we are using an
older, 'harmless' user namespace implementation.
Comments appreciated. Is it ok to do this?
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
This will allow us to record transient runtime state in vm->def, like
default VNC parameters. Accomplish this by adding an extra 'live' parameter
to SetDefTransient, with similar semantics to the 'live' flag for
AssignDef.
Display or set unlimited values for memory parameters. Unlimited is
represented by INT64_MAX in memory cgroup.
Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
Reported-by: Justin Clift <jclift@redhat.com>
Except LXC and UML driver, implementations of all other drivers
simply return 0, because these drivers doesn't have config both
in memory and on disk, no need to track if the domain of these
drivers updated or not.
Rename "xenUnifiedDomainisPersistent" to "xenUnifiedDomainIsPersistent"
* esx/esx_driver.c
* lxc/lxc_driver.c
* opennebula/one_driver.c
* openvz/openvz_driver.c
* phyp/phyp_driver.c
* test/test_driver.c
* uml/uml_driver.c
* vbox/vbox_tmpl.c
* xen/xen_driver.c
* xenapi/xenapi_driver.c
The current semantics of non-persistent hotplug/update are confusing: the
changes will persist as long as the in memory domain definition isn't
overwritten. This means hotplug changes stay around until the domain is
redefined or libvirtd is restarted.
Call virDomainObjSetDefTransient at VM startup, so that we properly discard
hotplug changes when the VM is shutdown.
Per the gettext developer:
http://lists.gnu.org/archive/html/bug-gnu-utils/2010-10/msg00019.htmlhttp://lists.gnu.org/archive/html/bug-gnu-utils/2010-10/msg00021.html
gettext() doesn't work correctly on all platforms unless you have
called setlocale(). Furthermore, gnulib's gettext.h has provisions
for setting up a default locale, which is the preferred method for
libraries to use gettext without having to call textdomain() and
override the main program's default domain (virInitialize already
calls bindtextdomain(), but this is insufficient without the
setlocale() added in this patch; and a redundant bindtextdomain()
in this patch doesn't hurt, but serves as a good example for other
packages that need to bind a second translation domain).
This patch is needed to silence a new gnulib 'make syntax-check'
rule in the next patch.
* daemon/libvirtd.c (main): Setup locale and gettext.
* src/lxc/lxc_controller.c (main): Likewise.
* src/security/virt-aa-helper.c (main): Likewise.
* src/storage/parthelper.c (main): Likewise.
* tools/virsh.c (main): Fix exit status.
* src/internal.h (DEFAULT_TEXT_DOMAIN): Define, for gettext.h.
(_): Simplify definition accordingly.
* po/POTFILES.in: Add src/storage/parthelper.c.
Introduce implementations of the virDomainOpenConsole() API
for LXC, Xen and UML drivers.
* src/lxc/lxc_driver.c, src/lxc/lxc_driver.c,
src/xen/xen_driver.c: Wire up virDomainOpenConsole
To enable virsh console (or equivalent) to be used remotely
it is necessary to provide remote access to the /dev/pts/XXX
pseudo-TTY associated with the console/serial/parallel device
in the guest. The virStream API provide a bi-directional I/O
stream capability that can be used for this purpose. This
patch thus introduces a virDomainOpenConsole API that uses
the stream APIs.
* src/libvirt.c, src/libvirt_public.syms,
include/libvirt/libvirt.h.in, src/driver.h: Define the
new virDomainOpenConsole API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c,
src/opennebula/one_driver.c, src/openvz/openvz_driver.c,
src/phyp/phyp_driver.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c,
src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c, src/xenapi/xenapi_driver.c: Stub
API entry point
The /dev/console device inside the container must NOT map
to the real /dev/console device node, since this allows the
container control over the current host console. A fun side
effect of this is that starting a container containing a
real Fedora OS will kill off your X server.
Remove the /dev/console node, and replace it with a symlink
to the primary console TTY
* src/lxc/lxc_container.c: Replace /dev/console with a
symlink to /dev/pty/0
* src/lxc/lxc_controller.c: Remove /dev/console from cgroups
ACL
Using automated replacement with sed and editing I have now replaced all
occurrences of close() with VIR_(FORCE_)CLOSE() except for one, of
course. Some replacements were straight forward, others I needed to pay
attention. I hope I payed attention in all the right places... Please
have a look. This should have at least solved one more double-close
error.
There is no point in trying to fill params beyond the first error,
because when lxcDomainGetMemoryParameters returns -1 then the caller
cannot detect which values in params are valid.
Adding parsing code for memory tunables in the domain xml file
also change the internal define structures used for domain memory
informations
Adds a new specific test
Public api to set/get memory tunables supported by the hypervisors.
dv:
* some cleanups in libvirt.c
* adding extra checks in libvirt.c new entry points
v4:
* Move exporting public API to this patch
* Add unsigned int flags to the public api for future extensions
v3:
* Add domainGetMemoryParamters and NULL in all the driver interface
v2:
* Initialize domainSetMemoryParameters to NULL in all the driver
interface structure.
src/lxc/veth.c:150: VIR_DEBUG(_("Failed to delete '%s' (%d)"),
src/lxc/veth.c:188: VIR_DEBUG(_("Failed to disable '%s' (%d)"),
maint.mk: do not mark these strings for translation
* src/lxc/veth.c (vethDelete, vethInterfaceUpOrDown): Don't
translate VIR_DEBUG.
Previously, the functions in src/lxc/veth.c could sometimes return
positive values on failure rather than -1. This made accurate error
reporting difficult, and led to one failure to catch an error in a
calling function.
This patch makes all the functions in veth.c consistently return 0 on
success, and -1 on failure. It also fixes up the callers to the veth.c
functions where necessary.
Note that this patch may be related to the bug:
https://bugzilla.redhat.com/show_bug.cgi?id=607496.
It will not fix the bug, but should unveil what happens.
* po/POTFILES.in - add veth.c, which previously had no translatable strings
* src/lxc/lxc_controller.c
* src/lxc/lxc_container.c
* src/lxc/lxc_driver.c - fixup callers to veth.c, and remove error logs,
as they are now done in veth.c
* src/lxc/veth.c - make all functions consistently return -1 on error.
* src/lxc/veth.h - use ATTRIBUTE_NONNULL to protect against NULL args.
Add the library entry point for the new virDomainQemuMonitorCommand()
entry point. Because this is not part of the "normal" libvirt API,
it gets its own header file, library file, and will eventually
get its own over-the-wire protocol later in the series.
Changes since v1:
- Go back to using the virDriver table for qemuDomainMonitorCommand, due to
linking issues
- Added versioning information to the libvirt-qemu.so
Changes since v2:
- None
Changes since v3:
- Add LGPL header to libvirt-qemu.c
- Make virLibConnError and virLibDomainError macros instead of function calls
Changes since v4:
- Move exported symbols to libvirt_qemu.syms
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Init process may remain after sending SIGTERM for some reason.
For example, if original init program is used, it is definitely
not killed by SIGTERM.
* src/lxc/lxc_controller.c: kill with SIGKILL if SIGTERM wasn't
sufficient
Because tty path is unexpectedly not saved in the live configuration
file of a domain, libvirtd cannot get the console of the domain back
after restarting.
The reason why the tty path isn't saved is that, to save the tty path,
the save function, virDomainSaveConfig, requires that the target domain
is running (pid != -1), however, lxc driver calls the function before
starting the domain to pass the configuration to controller.
To ensure to save the tty path, the patch lets lxc driver call the save
function again after starting the domain.
The function is expected to return negative value on failure,
however, it returns positive value when either setInterfaceName
or vethInterfaceUpOrDown fails. Because the function returns
the return value of either as is, however, the two functions
may return positive value on failure.
The patch fixes the defects and add error messages.