Sanlock is a project that implements a disk-paxos locking
algorithm. This is suitable for cluster deployments with
shared storage.
* src/Makefile.am: Add dlopen plugin for sanlock
* src/locking/lock_driver_sanlock.c: Sanlock driver
* configure.ac: Check for sanlock
* libvirt.spec.in: Add a libvirt-lock-sanlock RPM
* src/conf/domain_conf.c, src/conf/domain_conf.h: APIs for
inserting/finding/removing virDomainLeaseDefPtr instances
* src/qemu/qemu_driver.c: Wire up hotplug/unplug for leases
* src/qemu/qemu_hotplug.h, src/qemu/qemu_hotplug.c: Support
for hotplug and unplug of leases
Some lock managers associate state with leases, allowing a process
to temporarily release its leases, and re-acquire them later, safe
in the knowledge that no other process has acquired + released the
leases in between.
This is already used between suspend/resume operations, and must
also be used across migration. This passes the lockstate in the
migration cookie. If the lock manager uses lockstate, then it
becomes compulsory to use the migration v3 protocol to get the
cookie support.
* src/qemu/qemu_driver.c: Validate that migration v2 protocol is
not used if lock manager needs state transfer
* src/qemu/qemu_migration.c: Transfer lock state in migration
cookie XML
The QEMU integrates with the lock manager instructure in a number
of key places
* During startup, a lock is acquired in between the fork & exec
* During startup, the libvirtd process acquires a lock before
setting file labelling
* During shutdown, the libvirtd process acquires a lock
before restoring file labelling
* During hotplug, unplug & media change the libvirtd process
holds a lock while setting/restoring labels
The main content lock is only ever held by the QEMU child process,
or libvirtd during VM shutdown. The rest of the operations only
require libvirtd to hold the metadata locks, relying on the active
QEMU still holding the content lock.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h,
src/qemu/libvirtd_qemu.aug, src/qemu/test_libvirtd_qemu.aug:
Add config parameter for configuring lock managers
* src/qemu/qemu_driver.c: Add calls to the lock manager
To facilitate use of the locking plugins from hypervisor drivers,
introduce a higher level API for locking virDomainObjPtr instances.
In includes APIs targetted to VM startup, and hotplug/unplug
* src/Makefile.am: Add domain lock API
* src/locking/domain_lock.c, src/locking/domain_lock.h: High
level API for domain locking
To allow hypervisor drivers to assume that a lock driver impl
will be guaranteed to exist, provide a 'nop' impl that is
compiled into the library
* src/Makefile.am: Add nop driver
* src/locking/lock_driver_nop.c, src/locking/lock_driver_nop.h:
Nop lock driver implementation
* src/locking/lock_manager.c: Enable direct access of 'nop'
driver, instead of dlopen()ing it.
Define the basic framework lock manager plugins. The
basic plugin API for 3rd parties to implemented is
defined in
src/locking/lock_driver.h
This allows dlopen()able modules for alternative locking
schemes, however, we do not install the header. This
requires lock plugins to be in-tree allowing changing of
the lock manager plugin API in future.
The libvirt code for loading & calling into plugins
is in
src/locking/lock_manager.{c,h}
* include/libvirt/virterror.h, src/util/virterror.c: Add
VIR_FROM_LOCKING
* src/locking/lock_driver.h: API for lock driver plugins
to implement
* src/locking/lock_manager.c, src/locking/lock_manager.h:
Internal API for managing locking
* src/Makefile.am: Add locking code
A lock manager may operate in various modes. The direct mode of
operation is to obtain locks based on the resources associated
with devices in the XML. The indirect mode is where the app
creating the domain provides explicit leases for each resource
that needs to be locked. This XML extension allows for listing
resources in the XML
<devices>
...
<lease>
<lockspace>somearea</lockspace>
<key>thequickbrownfoxjumpsoverthelazydog</key>
<target path='/some/lease/path' offset='23432'/>
</lease>
...
</devices>
The 'lockspace' is a unique identifier for the lockspace which
the lease is associated
The 'key' is a unique identifier for the resource associated
with the lease.
The 'target' is the file on disk where the leases are held.
* docs/schemas/domain.rng: Add lease schema
* src/conf/domain_conf.c, src/conf/domain_conf.h: parsing and
formatting for leases
* tests/qemuxml2argvdata/qemuxml2argv-lease.args,
tests/qemuxml2argvdata/qemuxml2argv-lease.xml,
tests/qemuxml2xmltest.c: Test XML handling for leases
Allow the parent process to perform a bi-directional handshake
with the child process during fork/exec. The child process
will fork and do its initial setup. Immediately prior to the
exec(), it will stop & wait for a handshake from the parent
process. The parent process will spawn the child and wait
until the child reaches the handshake point. It will do
whatever extra setup work is required, before signalling the
child to continue.
The implementation of this is done using two pairs of blocking
pipes. The first pair is used to block the parent, until the
child writes a single byte. Then the second pair pair is used
to block the child, until the parent confirms with another
single byte.
* src/util/command.c, src/util/command.h,
src/libvirt_private.syms: Add APIs to perform a handshake
Regression introduced in commit d6623003 (v0.8.8) - using the
wrong sizeof operand meant that security manager private data
was overlaying the allowDiskFormatProbing member of struct
_virSecurityManager. This reopens disk probing, which was
supposed to be prevented by the solution to CVE-2010-2238.
* src/security/security_manager.c
(virSecurityManagerGetPrivateData): Use correct offset.
Alas, /usr/bin/kvm is also not directly supported by testutilsqemu.c.
In fact, _any_ test that uses <cpu match=...> has to use our faked
qemu.sh in order to properly answer the 'qemu -cpu ?' probe done
during qemu command line building.
* tests/qemuxml2argvdata/*graphics-spice-timeout*: Switch emulator, again.
I noticed this while building from libvirt.git on RHEL 5.6:
Generating internals/command.html.tmp
mkdir: cannot create directory `/internals': Permission denied
If I had been building as root instead, this pollutes /.
Older autoconf lacks $(builddir), but it is rigorously equal to '.'
in newer autoconf, so we could use '$(MKDIR_P) internals' instead.
However, since internals/command.html is part of the tarball, we
_already_ build it in $(srcdir), not $(builddir) during VPATH
builds, so the mkdir is wasted effort!
* docs/Makefile.am (internals/%.html.tmp): Drop unused mkdir.
Commit 2d6adabd53 replaced qsorting disk
and controller devices with inserting them at the right position. That
was to fix unnecessary reordering of devices. However, when parsing
domain XML devices are just taken in the order in which they appear in
the XML since. Use the correct insertion algorithm to honor device
target.
Remove some special case code that took care of mapping hyper to the
correct C types.
As the list of procedures that is allowed to map hyper to long is fixed
put it in the generator instead annotations in the .x files. This
results in simpler .x file parsing code.
Use macros for hyper to long assignments that perform overflow checks
when long is smaller than hyper. Map hyper to long long by default.
Suggested by Eric Blake.
The gnutls_certificate_type_set_priority method is deprecated.
Since we already set the default gnutls priority, and do not
support OpenGPG credentials in any case, it was not serving
any useful purpose and can be removed
* src/remote/remote_driver.c: Remove src/remote/remote_driver.c
call
Convert openvzLocateConfFile to a replaceable function pointer to
allow testing the config file parsing without rewriting the whole
OpenVZ config parsing to a more testable structure.
Substitute VIR_ERR_NO_SUPPORT with VIR_ERR_INTERNAL_ERROR. Error
like following is not what user want to see.
error : pciDeviceIsAssignable:1487 : this function is not supported
by the connection driver: Device 0000:07:10.0 is behind a switch
lacking ACS and cannot be assigned
This function is also affected by getline conversion. But this
didn't result in a regression in general, because the difference
would only affect the behavior of the function when the line in
/proc/vz/vestat for the given vpsid wasn't found. Under normal
conditions this should not happen.
The regression fix in 3aab7f2d6b altered the error handling.
getline returns -1 on failure to read a line (including EOF). The
original openvzReadConfigParam function using openvz_readline only
treated EOF as not-found. The current getline version treats all
getline failures as not-found.
This patch fixes this and distinguishes EOF from other getline
failures.
Since directories can be used for <filesystem> passthrough, they are
basically storage volumes.
v2:
Skip ., .., lost+found dirs
v3:
Use gnulib last_component
v4:
Use gnulib "dirname.h", not system <dirname.h>
Don't skip lost+found
This fixes this three warnings from the parser by allowing the parser
to ignore some macros in the same way as it can ignore functions.
Parsing ./../include/libvirt/libvirt.h
Misformatted macro comment for _virSchedParameter
Expecting '* _virSchedParameter:' got '* virSchedParameter:'
Misformatted macro comment for _virBlkioParameter
Expecting '* _virBlkioParameter:' got '* virBlkioParameter:'
Misformatted macro comment for _virMemoryParameter
Expecting '* _virMemoryParameter:' got '* virMemoryParameter:'
If spice graphics has no <channel> elements, the output graphics XML
is messed up. To prevent this, we need to end the <graphics> element
just before adding any compression selecting elements.
The virSysinfoIsEqual method was mistakenly inside a #ifndef WIN32
conditional.
The existing virSysinfoFormat is also stubbed out on Win32, even
though the code works without any trouble. This breaks XML output
on Win32, so the stub is removed.
virsh migrate mistakenly had some variables inside the conditional
* src/util/sysinfo.c: Build virSysinfoIsEqual on Win32 and remove
Win32 stub for virSysinfoFormat
* tools/virsh.c: Fix variable declaration on Win32
Update the qemuDomainMigrateBegin method so that it accepts
an optional incoming XML document. This will be validated
for ABI compatibility against the current domain config,
and if this check passes, will be passed back out for use
by the qemuDomainMigratePrepare method on the target
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h,
src/qemu/qemu_migration.c: Allow custom XML to be passed
Switch virsh migrate over to use virDomainMigrate2 and
virDomainMigrateToURI2. This is still compatible with
older libvirts, because these methods dynamically choose
whether to perform v1, v2 or v3 migration based on declared
RPC support from the libvirtd instances
Add a --xml arg which allows the user to pass in a custom
XML document. This XML document must be ABI compatible
with the current *live* XML document for the running guest
on the source host. ABI compatibility will be enforced by
any driver supporting this function
* tools/virsh.c: Add '--xml' arg to migrate command
To allow a client app to pass in custom XML during migration
of a guest it is neccessary to ensure the guest ABI remains
unchanged. The virDomainDefCheckABIStablity method accepts
two virDomainDefPtr structs and compares everything in them
that could impact the guest machine ABI
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/libvirt_private.syms: Add virDomainDefCheckABIStablity
* src/conf/cpu_conf.c, src/conf/cpu_conf.h: Add virCPUDefIsEqual
* src/util/sysinfo.c, src/util/sysinfo.h: Add virSysinfoIsEqual
The virDomainHostdevDef struct contains a 'char *target'
field. This is set to 'NULL' when parsing XML and never
used / set anywhere else. Clearly it is bogus & unused
* src/conf/domain_conf.c, src/conf/domain_conf.h: Remove
target from virDomainHostdevDef
This patch seperate the domain config loading just as qemu driver
does, first loading config of running or trasient domains, then
of persistent inactive domains. And only try to reconnect the
monitor of running domains, so that it won't always throws errors
saying can't connect to domain monitor.
And as "virDomainLoadConfig->virDomainAssignDef->virDomainObjAssignDef",
already do things like "vm->newDef = def", removed the codes
in "lxcReconnectVM" that does the same work.
Add support to set the maximum memory of the domain.
Also add support to change the memory of the current
state of the domain, which translates to a running
domain or the config of the domain.
Based on the code from the qemu driver.
v3:
* initialize xml pointer to avoid segfault
* throw error message if domain is paused as
libxenlight itself will pause it
v2:
* header is now padded and has a version field
* the correct restore function from libxl is used
* only create the restore event once in libxlVmStart
This patch fixes the population of the
libxenlight data structures. Now the devices
should be removed correctly from the xenstore
if they are detached.
Currently the QEMU monitor I/O handler code uses errno values
to report errors. This results in a sub-optimal error messages
on certain conditions, in particular when parsing JSON strings
malformed data simply results in 'EINVAL'.
This changes the code to use the standard libvirt error reporting
APIs. The virError is stored against the qemuMonitorPtr struct,
and when a monitor API is run, any existing stored error is copied
into that thread's error local
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_text.c: Use
virError APIs for all monitor I/O handling code
Currently whenever there is any failure with parsing the monitor,
this is treated in the same was as end-of-file (ie QEMU quit).
The domain is terminated, if not already dead.
With this change, failures in parsing the monitor stream do not
result in the death of QEMU. The guest continues running unchanged,
but all further use of the monitor will be disabled.
The VMM_FAILURE event will be emitted, and the mgmt application
can decide when to kill/restart the guest to re-gain control
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Run a
different callback for monitor EOF vs error conditions.
* src/qemu/qemu_process.c: Emit VMM_FAILURE event when monitor
fails
This introduces a new domain
VIR_DOMAIN_EVENT_ID_CONTROL_ERROR
Which uses the existing generic callback
typedef void (*virConnectDomainEventGenericCallback)(virConnectPtr conn,
virDomainPtr dom,
void *opaque);
This event is intended to be emitted when there is a failure in
some part of the domain virtualization system. Whether the domain
continues to run/exist after the failure is an implementation
detail specific to the hypervisor.
The idea is that with some types of failure, hypervisors may
prefer to leave the domain running in a "degraded" mode of
operation. For example, if something goes wrong with the QEMU
monitor, it is possible to leave the guest OS running quite
happily. The mgmt app will simply loose the ability todo various
tasks. The mgmt app can then choose how/when to deal with the
failure that occured.
* daemon/remote.c: Dispatch of new event
* examples/domain-events/events-c/event-test.c: Demo catch
of event
* include/libvirt/libvirt.h.in: Define event ID and callback
* src/conf/domain_event.c, src/conf/domain_event.h: Internal
event handling
* src/remote/remote_driver.c: Receipt of new event from daemon
* src/remote/remote_protocol.x: Wire protocol for new event
* src/remote_protocol-structs: add new event for checks