The qemuAuditDisk calls in disk hotunplug operations were being
passed 'ret >= 0', but the code which sets ret to 0 was not yet
executed, and the error path had already jumped to the 'cleanup'
label. This meant hotunplug failures were never audited, and
hotunplug success was audited as a failure
* src/qemu/qemu_hotplug.c: Fix auditing of hotunplug
When virLockDriverAcquire is invoked during hotplug the state
parameter will be left as NULL.
* src/locking/lock_driver_nop.c,
src/locking/lock_driver_sanlock.c: Don't reference NULL state
parameter
Refactoring of the lock manager hotplug methods lost the
ret = 0 assignment for successful return path
* src/locking/domain_lock.c: Add missing ret = 0 assignments
Commit 4454a9efc7 introduced bad
behaviour on the VIR_EVENT_HANDLE_ERROR condition. This condition
is only hit when an invalid FD is used in poll() (typically due
to a double-close bug). The QEMU monitor code was treating this
condition as non-fatal, and thus libvirt would poll() in a fast
loop forever burning 100% CPU. VIR_EVENT_HANDLE_ERROR must be
handled in the same way as VIR_EVENT_HANDLE_HANGUP, killing the
QEMU instance.
* src/qemu/qemu_monitor.c: Treat VIR_EVENT_HANDLE_ERROR as EOF
In between fork and exec, a connection to sanlock is acquired
and the socket file descriptor is intionally leaked to the
child process. sanlock watches this FD for POLL_HANGUP to
detect when QEMU has exited. We don't want a rogus/compromised
QEMU from issuing sanlock RPC calls on the leaked FD though,
since that could be used to DOS other guests. By calling
sanlock_restrict() on the socket before exec() we can lock
it down.
* configure.ac: Check for sanlock_restrict API
* src/locking/domain_lock.c: Restrict lock acquired in
process startup phase
* src/locking/lock_driver.h: Add VIR_LOCK_MANAGER_ACQUIRE_RESTRICT
* src/locking/lock_driver_sanlock.c: Add call to sanlock_restrict
when requested by VIR_LOCK_MANAGER_ACQUIRE_RESTRICT flag
Based on the equivalent qemu driver code
* src/libxl/libxl_driver.c: refactor the Start save and restore
routines of the driver and adds the new entry points for
managed saves handling
Sanlock is a project that implements a disk-paxos locking
algorithm. This is suitable for cluster deployments with
shared storage.
* src/Makefile.am: Add dlopen plugin for sanlock
* src/locking/lock_driver_sanlock.c: Sanlock driver
* configure.ac: Check for sanlock
* libvirt.spec.in: Add a libvirt-lock-sanlock RPM
* src/conf/domain_conf.c, src/conf/domain_conf.h: APIs for
inserting/finding/removing virDomainLeaseDefPtr instances
* src/qemu/qemu_driver.c: Wire up hotplug/unplug for leases
* src/qemu/qemu_hotplug.h, src/qemu/qemu_hotplug.c: Support
for hotplug and unplug of leases
Some lock managers associate state with leases, allowing a process
to temporarily release its leases, and re-acquire them later, safe
in the knowledge that no other process has acquired + released the
leases in between.
This is already used between suspend/resume operations, and must
also be used across migration. This passes the lockstate in the
migration cookie. If the lock manager uses lockstate, then it
becomes compulsory to use the migration v3 protocol to get the
cookie support.
* src/qemu/qemu_driver.c: Validate that migration v2 protocol is
not used if lock manager needs state transfer
* src/qemu/qemu_migration.c: Transfer lock state in migration
cookie XML
The QEMU integrates with the lock manager instructure in a number
of key places
* During startup, a lock is acquired in between the fork & exec
* During startup, the libvirtd process acquires a lock before
setting file labelling
* During shutdown, the libvirtd process acquires a lock
before restoring file labelling
* During hotplug, unplug & media change the libvirtd process
holds a lock while setting/restoring labels
The main content lock is only ever held by the QEMU child process,
or libvirtd during VM shutdown. The rest of the operations only
require libvirtd to hold the metadata locks, relying on the active
QEMU still holding the content lock.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h,
src/qemu/libvirtd_qemu.aug, src/qemu/test_libvirtd_qemu.aug:
Add config parameter for configuring lock managers
* src/qemu/qemu_driver.c: Add calls to the lock manager
To facilitate use of the locking plugins from hypervisor drivers,
introduce a higher level API for locking virDomainObjPtr instances.
In includes APIs targetted to VM startup, and hotplug/unplug
* src/Makefile.am: Add domain lock API
* src/locking/domain_lock.c, src/locking/domain_lock.h: High
level API for domain locking
To allow hypervisor drivers to assume that a lock driver impl
will be guaranteed to exist, provide a 'nop' impl that is
compiled into the library
* src/Makefile.am: Add nop driver
* src/locking/lock_driver_nop.c, src/locking/lock_driver_nop.h:
Nop lock driver implementation
* src/locking/lock_manager.c: Enable direct access of 'nop'
driver, instead of dlopen()ing it.
Define the basic framework lock manager plugins. The
basic plugin API for 3rd parties to implemented is
defined in
src/locking/lock_driver.h
This allows dlopen()able modules for alternative locking
schemes, however, we do not install the header. This
requires lock plugins to be in-tree allowing changing of
the lock manager plugin API in future.
The libvirt code for loading & calling into plugins
is in
src/locking/lock_manager.{c,h}
* include/libvirt/virterror.h, src/util/virterror.c: Add
VIR_FROM_LOCKING
* src/locking/lock_driver.h: API for lock driver plugins
to implement
* src/locking/lock_manager.c, src/locking/lock_manager.h:
Internal API for managing locking
* src/Makefile.am: Add locking code
A lock manager may operate in various modes. The direct mode of
operation is to obtain locks based on the resources associated
with devices in the XML. The indirect mode is where the app
creating the domain provides explicit leases for each resource
that needs to be locked. This XML extension allows for listing
resources in the XML
<devices>
...
<lease>
<lockspace>somearea</lockspace>
<key>thequickbrownfoxjumpsoverthelazydog</key>
<target path='/some/lease/path' offset='23432'/>
</lease>
...
</devices>
The 'lockspace' is a unique identifier for the lockspace which
the lease is associated
The 'key' is a unique identifier for the resource associated
with the lease.
The 'target' is the file on disk where the leases are held.
* docs/schemas/domain.rng: Add lease schema
* src/conf/domain_conf.c, src/conf/domain_conf.h: parsing and
formatting for leases
* tests/qemuxml2argvdata/qemuxml2argv-lease.args,
tests/qemuxml2argvdata/qemuxml2argv-lease.xml,
tests/qemuxml2xmltest.c: Test XML handling for leases
Allow the parent process to perform a bi-directional handshake
with the child process during fork/exec. The child process
will fork and do its initial setup. Immediately prior to the
exec(), it will stop & wait for a handshake from the parent
process. The parent process will spawn the child and wait
until the child reaches the handshake point. It will do
whatever extra setup work is required, before signalling the
child to continue.
The implementation of this is done using two pairs of blocking
pipes. The first pair is used to block the parent, until the
child writes a single byte. Then the second pair pair is used
to block the child, until the parent confirms with another
single byte.
* src/util/command.c, src/util/command.h,
src/libvirt_private.syms: Add APIs to perform a handshake
Regression introduced in commit d6623003 (v0.8.8) - using the
wrong sizeof operand meant that security manager private data
was overlaying the allowDiskFormatProbing member of struct
_virSecurityManager. This reopens disk probing, which was
supposed to be prevented by the solution to CVE-2010-2238.
* src/security/security_manager.c
(virSecurityManagerGetPrivateData): Use correct offset.
Commit 2d6adabd53 replaced qsorting disk
and controller devices with inserting them at the right position. That
was to fix unnecessary reordering of devices. However, when parsing
domain XML devices are just taken in the order in which they appear in
the XML since. Use the correct insertion algorithm to honor device
target.
Remove some special case code that took care of mapping hyper to the
correct C types.
As the list of procedures that is allowed to map hyper to long is fixed
put it in the generator instead annotations in the .x files. This
results in simpler .x file parsing code.
Use macros for hyper to long assignments that perform overflow checks
when long is smaller than hyper. Map hyper to long long by default.
Suggested by Eric Blake.
The gnutls_certificate_type_set_priority method is deprecated.
Since we already set the default gnutls priority, and do not
support OpenGPG credentials in any case, it was not serving
any useful purpose and can be removed
* src/remote/remote_driver.c: Remove src/remote/remote_driver.c
call
Convert openvzLocateConfFile to a replaceable function pointer to
allow testing the config file parsing without rewriting the whole
OpenVZ config parsing to a more testable structure.
Substitute VIR_ERR_NO_SUPPORT with VIR_ERR_INTERNAL_ERROR. Error
like following is not what user want to see.
error : pciDeviceIsAssignable:1487 : this function is not supported
by the connection driver: Device 0000:07:10.0 is behind a switch
lacking ACS and cannot be assigned
This function is also affected by getline conversion. But this
didn't result in a regression in general, because the difference
would only affect the behavior of the function when the line in
/proc/vz/vestat for the given vpsid wasn't found. Under normal
conditions this should not happen.
The regression fix in 3aab7f2d6b altered the error handling.
getline returns -1 on failure to read a line (including EOF). The
original openvzReadConfigParam function using openvz_readline only
treated EOF as not-found. The current getline version treats all
getline failures as not-found.
This patch fixes this and distinguishes EOF from other getline
failures.
Since directories can be used for <filesystem> passthrough, they are
basically storage volumes.
v2:
Skip ., .., lost+found dirs
v3:
Use gnulib last_component
v4:
Use gnulib "dirname.h", not system <dirname.h>
Don't skip lost+found
If spice graphics has no <channel> elements, the output graphics XML
is messed up. To prevent this, we need to end the <graphics> element
just before adding any compression selecting elements.
The virSysinfoIsEqual method was mistakenly inside a #ifndef WIN32
conditional.
The existing virSysinfoFormat is also stubbed out on Win32, even
though the code works without any trouble. This breaks XML output
on Win32, so the stub is removed.
virsh migrate mistakenly had some variables inside the conditional
* src/util/sysinfo.c: Build virSysinfoIsEqual on Win32 and remove
Win32 stub for virSysinfoFormat
* tools/virsh.c: Fix variable declaration on Win32
Update the qemuDomainMigrateBegin method so that it accepts
an optional incoming XML document. This will be validated
for ABI compatibility against the current domain config,
and if this check passes, will be passed back out for use
by the qemuDomainMigratePrepare method on the target
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h,
src/qemu/qemu_migration.c: Allow custom XML to be passed
To allow a client app to pass in custom XML during migration
of a guest it is neccessary to ensure the guest ABI remains
unchanged. The virDomainDefCheckABIStablity method accepts
two virDomainDefPtr structs and compares everything in them
that could impact the guest machine ABI
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/libvirt_private.syms: Add virDomainDefCheckABIStablity
* src/conf/cpu_conf.c, src/conf/cpu_conf.h: Add virCPUDefIsEqual
* src/util/sysinfo.c, src/util/sysinfo.h: Add virSysinfoIsEqual
The virDomainHostdevDef struct contains a 'char *target'
field. This is set to 'NULL' when parsing XML and never
used / set anywhere else. Clearly it is bogus & unused
* src/conf/domain_conf.c, src/conf/domain_conf.h: Remove
target from virDomainHostdevDef
This patch seperate the domain config loading just as qemu driver
does, first loading config of running or trasient domains, then
of persistent inactive domains. And only try to reconnect the
monitor of running domains, so that it won't always throws errors
saying can't connect to domain monitor.
And as "virDomainLoadConfig->virDomainAssignDef->virDomainObjAssignDef",
already do things like "vm->newDef = def", removed the codes
in "lxcReconnectVM" that does the same work.
Add support to set the maximum memory of the domain.
Also add support to change the memory of the current
state of the domain, which translates to a running
domain or the config of the domain.
Based on the code from the qemu driver.
v3:
* initialize xml pointer to avoid segfault
* throw error message if domain is paused as
libxenlight itself will pause it
v2:
* header is now padded and has a version field
* the correct restore function from libxl is used
* only create the restore event once in libxlVmStart
This patch fixes the population of the
libxenlight data structures. Now the devices
should be removed correctly from the xenstore
if they are detached.
Currently the QEMU monitor I/O handler code uses errno values
to report errors. This results in a sub-optimal error messages
on certain conditions, in particular when parsing JSON strings
malformed data simply results in 'EINVAL'.
This changes the code to use the standard libvirt error reporting
APIs. The virError is stored against the qemuMonitorPtr struct,
and when a monitor API is run, any existing stored error is copied
into that thread's error local
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_text.c: Use
virError APIs for all monitor I/O handling code
Currently whenever there is any failure with parsing the monitor,
this is treated in the same was as end-of-file (ie QEMU quit).
The domain is terminated, if not already dead.
With this change, failures in parsing the monitor stream do not
result in the death of QEMU. The guest continues running unchanged,
but all further use of the monitor will be disabled.
The VMM_FAILURE event will be emitted, and the mgmt application
can decide when to kill/restart the guest to re-gain control
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Run a
different callback for monitor EOF vs error conditions.
* src/qemu/qemu_process.c: Emit VMM_FAILURE event when monitor
fails
This introduces a new domain
VIR_DOMAIN_EVENT_ID_CONTROL_ERROR
Which uses the existing generic callback
typedef void (*virConnectDomainEventGenericCallback)(virConnectPtr conn,
virDomainPtr dom,
void *opaque);
This event is intended to be emitted when there is a failure in
some part of the domain virtualization system. Whether the domain
continues to run/exist after the failure is an implementation
detail specific to the hypervisor.
The idea is that with some types of failure, hypervisors may
prefer to leave the domain running in a "degraded" mode of
operation. For example, if something goes wrong with the QEMU
monitor, it is possible to leave the guest OS running quite
happily. The mgmt app will simply loose the ability todo various
tasks. The mgmt app can then choose how/when to deal with the
failure that occured.
* daemon/remote.c: Dispatch of new event
* examples/domain-events/events-c/event-test.c: Demo catch
of event
* include/libvirt/libvirt.h.in: Define event ID and callback
* src/conf/domain_event.c, src/conf/domain_event.h: Internal
event handling
* src/remote/remote_driver.c: Receipt of new event from daemon
* src/remote/remote_protocol.x: Wire protocol for new event
* src/remote_protocol-structs: add new event for checks
Well, the remaining drivers that already had the get/set
scheduler parameter functionality to begin with.
For now, this blindly treats VIR_DOMAIN_SCHEDINFO_CURRENT as
the only supported operation for these 5 domains; it will
take domain-specific patches if more specific behavior is
preferred.
* src/esx/esx_driver.c (esxDomainGetSchedulerParameters)
(esxDomainSetSchedulerParameters): Move guts...
(esxDomainGetSchedulerParametersFlags)
(esxDomainSetSchedulerParametersFlags): ...to new functions.
* src/libxl/libxl_driver.c (libxlDomainGetSchedulerParameters)
(libxlDomainSetSchedulerParameters)
(libxlDomainGetSchedulerParametersFlags)
(libxlDomainSetSchedulerParametersFlags): Likewise.
* src/lxc/lxc_driver.c (lxcGetSchedulerParameters)
(lxcSetSchedulerParameters, lxcGetSchedulerParametersFlags)
(lxcSetSchedulerParametersFlags): Likewise.
* src/test/test_driver.c (testDomainGetSchedulerParams)
(testDomainSetSchedulerParams, testDomainGetSchedulerParamsFlags)
(testDomainSetSchedulerParamsFlags): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainGetSchedulerParameters)
(xenUnifiedDomainSetSchedulerParameters)
(xenUnifiedDomainGetSchedulerParametersFlags)
(xenUnifiedDomainSetSchedulerParametersFlags): Likewise.
* src/qemu/qemu_driver.c (qemuGetSchedulerParameters): Move
guts...
(qemuGetSchedulerParametersFlags): ...to new callback, and honor
flags more accurately.
If we can choose live or config when setting, then we need to
be able to choose which one we are querying.
Also, make the documentation clear that set must use a non-empty
subset (some of the hypervisors fail if params is NULL).
* include/libvirt/libvirt.h.in
(virDomainGetSchedulerParametersFlags): New prototype.
* src/libvirt.c (virDomainGetSchedulerParametersFlags): Implement
it.
* src/libvirt_public.syms: Export it.
* python/generator.py (skip_impl): Don't auto-generate.
* src/driver.h (virDrvDomainGetSchedulerParametersFlags): New
callback.
Apparently introdunced in commit 376e1d9420
the generator produces u_int flags not unsigned int flags.
* src/remote_protocol-structs: fix to the actual expected type and
alignment
This patch reorders the locks for the nwfilter updates and the access
the nwfilter objects. In the case that the IP address learning thread
was instantiating filters while an update happened, the previous order
lead to a deadlock.
It was suggested during review of a different patch that the libvirt
interface driver API's should have "netcf:" in their log
messages. This patch eliminates that from all interface driver API
functions, and also eliminates the extra " - " in the case that netcf
returns no details in its error info (which *never* happens at
present, but could happen sometime in the future.
This is the API agreed on in:
https://www.redhat.com/archives/libvir-list/2011-May/msg00026.html
(with a slight name change to use "...begin" rather than
"...start"). This implements transactional changes to the host network
config. When a transaction is begun with ncf_change_begin(), all other
netcf APIs will continue to work as they always have, but a snapshot
of the existing config will be taken, allowing reversion (rollback,
using ncf_change_rollback()) to the exact state of config at the time
ncf_change_begin() was called. Alternately, if it's determined that
the new changes are acceptable, ncf_change_commit() can be called,
which will eliminate the snapshot and make the changes permanent.
As a failsafe measure, if neither ncf_change_commit() or
ncf_change_rollback() is called by the next time the system reboots,
the netcf-transaction initscript will be automatically called to
rollback the changes.
Commit f044376530 replaced openvz_readline with getline and
changed EOF-handling in the openvzGetVPSUUID.
This patch restores original EOF-handling.
Reported by Jean-Baptiste Rouault.
This patch allows to modify interfaces of domain(qemu)
* src/conf/domain_conf.c src/conf/domain_conf.h src/libvirt_private.syms:
(virDomainNetInsert) : Insert a network device to domain definition.
(virDomainNetIndexByMac) : Returns an index of net device in array.
(virDomainNetRemoveByMac): Remove a NIC of passed MAC address.
* src/qemu/qemu_driver.c
(qemuDomainAttachDeviceConfig): add codes for NIC.
(qemuDomainDetachDeviceConfig): add codes for NIC.
Before commit 145d6cb05c (in August 2010) absolute file names
in VMX and domain XML configs were handled correctly. But this got
lost during the refactoring. The test cases didn't highlight this
problem because they have their own set of file name handling
functions. The actual ones require a real connection to an ESX
server. Also the test case functions always worked correctly.
Fix the regression and add a new in-the-wild VMX file that contains
such a problematic absolute path. Even though this test case won't
protect against new regressions.
Reported by lofic (IRC nick)
As reported by Diego Blanco in
https://bugzilla.redhat.com/show_bug.cgi?id=702602
commit f0443765 which replaced openvz_readline to getline(3)
broke OpenVZ driver as it changed semantics of EOF-handling
when parsing OpenVZ configuration.
There're several other issues reported with current OpenVZ driver:
#1: unclear error message when parsing "CPUS=" line
#2: openvz driver goes into crashing loop
#3: "NETIF=" line in configuration is not parsed correctly
#4: aborts even when optional parameter is missing
#5: there's a potential memory leak
This updated patch to fix #[145]. This patch does not fix #[23]
as I haven't verified these yet, but this at least got me to run
OpenVZ on libvirt once again.
Coverity spotted this off-by-one. Thankfully, no one in libvirt
was ever calling virAuditSend with an argument of 3.
* src/util/virtaudit.c (virAuditSend): Use correct comparison.
Originally most of libvirt domain-specific calls were blocking
during a migration.
A new mechanism to allow specific calls (blkstat/blkinfo) to be
executed in such condition has been implemented.
In the long term it'd be desirable to get a more general
solution to mark further APIs as migration safe, without needing
special case code.
* src/qemu/qemu_migration.c: add some additional job signal
flags for doing blkstat/blkinfo during a migration
* src/qemu/qemu_domain.c: add a condition variable that can be
used to efficiently wait for the migration code to clear the
signal flag
* src/qemu/qemu_driver.c: execute blkstat/blkinfo using the
job signal flags during migration
Based on the device attach/detach code from the QEMU driver,
but using the new functions to create the structures associated.
* src/libxl/libxl_driver.c: implements domainAttachDevice,
domainAttachDeviceFlags, domainDetachDevice, domainDetachDeviceFlags
and domainUpdateDeviceFlags
Create 3 new function refactored from previous list ones and
exports them internally to the driver
* src/libxl/libxl_conf.c src/libxl/libxl_conf.h: create libxlMakeDisk,
libxlMakeNic libxlMakeVfb out of the exsting static List functions
and exports them
When modifying the disk devices of a live domain and the domain
configuration, the function qemuDomainAttachDeviceConfig
first sets dev->data->disk to NULL. Later qemuDomainAttachDeviceLive
accesses dev->data.disk and causes a segfault.
* src/qemu/qemu_driver.c: fix qemuDomainModifyDeviceFlags() accordingly
Anything generated that must end up in the tarball must either
have unconditional rules for generation (remote_protocol.c) or
must live in libvirt.git for the case where the person running
'make dist' has disabled the configure options that control the
rebuild of the generated file (remote_protocol-structs).
* src/Makefile.am (remote_protocol-structs): Add a dependency and
document why it must live in git.
($(srcdir)/remote/%_protocol.c, $(srcdir)/remote/%_protocol.c):
Unconditionally generate.
http://lists.gnu.org/archive/html/qemu-devel/2011-05/threads.html#02162
Currently, qemu silently clips any JSON integer in the range
0x8000000000000000 - 0xffffffffffffffff (all numbers in this range
will be clipped to 0x7fffffffffffffff == LLONG_MAX).
To avoid this, pass these as signed 64 bit integers in the QMP
request.
In most cases this affects flags parameters that are unsigned in the
public and driver API but signed in the XDR protocol. Switch the
XDR protocol to unsigned for those.
A counterexample is virNWFilterGetXMLDesc. Its flags parameter is signed
in the public API and XDR protocol, but unsigned in the driver API.
virNodeGetFreeMemory used unsigned long long in the public API but
signed hyper in the XDR protocol. Convert the XDR protocol to use
unsigned hyper.
As explained by Eric before, this doesn't affect the on-the-wire protocol.
Several functions return values by reference parameters. This is realized
by passing the members of remote_CALL_ret by reference to the called
function.
The position of this parameters in the function signature follows some
patterns with some exceptions. This patterns and exceptions are hardcoded
in the generator.
Add an insert@<offset> annotation to the remote_CALL_ret struct members
for functions that return lists to remove some of the hardcoded patterns
and exceptions.
The current virDomainMigrateFinish3 method signature attempts to
distinguish two types of errors, by allowing return with ret== 0,
but ddomain == NULL, to indicate a failure to start the guest.
This is flawed, because when ret == 0, there is no way for the
virErrorPtr details to be sent back to the client.
Change the signature of virDomainMigrateFinish3 so it simply
returns a virDomainPtr, in the same way as virDomainMigrateFinish2
The disk locking code will protect against the only possible
failure mode this doesn't account for (loosing conenctivity to
libvirtd after Finish3 starts the CPUs, but before the client
sees the reply for Finish3).
* src/driver.h, src/libvirt.c, src/libvirt_internal.h: Change
virDomainMigrateFinish3 to return a virDomainPtr instead of int
* src/remote/remote_driver.c, src/remote/remote_protocol.x,
daemon/remote.c, src/qemu/qemu_driver.c, src/qemu/qemu_migration.c:
Update for API change
When doing migration, if an error occurs in Perform, it must not
be overwritten during Finish/Confirm steps. If an error occurs
in Finish, it must not be overwritten in Confirm.
Previous commit a9d12c2444 added
code to qemudDomainMigrateFinish2 to preserve the error. This
is not the right place, because it is not applicable in non-p2p
migration. The src/libvirt.c virDomainMigrateV2/3 methods need
code to preserve errors for non-p2p migration, while the
doPeer2PeerMigrate2 and doPeer2PeerMigrate3 methods contain
code to preverse errors for p2p migration.
Remove the bogus error preservation from qemudDomainMigrateFinish2
and qemudDomainMigrateFinish3.
Fix virDomainMigrateV3 and doPeer2PeerMigrate3 so that they
preserve any error hit during the Finish3 step, before invoking
Confirm3.
Finally if qemuMigrationFinish fails to resume the CPUs, it must
preserve the error before tearing down the VM, so that VM cleanup
doesn't overwrite it.
* src/libvirt.c: Preserve error before invoking Confirm3
* src/qemu/qemu_driver.c: Remove bogus error preservation
code in qemudDomainMigrateFinish2/qemudDomainMigrateFinish3
* src/qemu/qemu_migration.c: Preserve error before invoking Confirm3
and after resume fails in qemuMigrationFinish.
* src/libvirt.c: Add further debug lines in helper APIs for
migration
* src/qemu/qemu_migration.c: Add debug lines for all internal
migration API parameters
Even when failing to start CPUs, the finish method was returning
a success result. Fix this so that the QEMU process is killed
off when finish fails under v3 protocol. Also rename the
killOnFinish boolean to 'v3proto' to make it clearer that this
is a tunable based on the migration protocol version
* src/qemu/qemu_driver.c: Update for API change
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Kill
VM in qemuMigrationFinish if failing to start CPUs
The SPICE seamless migration process requires data to be passed
back from the target host, to the source host via a cookie.
The cookie includes the target host's hostname, but this was not
stored, merely validated. This patch explicitly records the
remote hostname after parsing the cookie, and uses it when
initiating the SPICE migration
* qemu/qemu_migration.c: Fix SPICE seamless migration hostname
Before running perform in peer-2-peer migration, the current
guest state must be recorded, so that non-live migration can
currently unpause a running guest on completion.
* src/qemu/qemu_migration.c: Move check for offline guest
to fix non-live migration
There are two pieces of information which are desirable for
migration, which cannot be supplied by applications
- The explicit QEMU migration URI, while using Peer2Peer
migration
- An override for the target VM XML
This introduces two new public APIs to support these extra
parameters. There is no need for extra wire protocool changes,
since this is supported by the v3 migration enhancements
* include/libvirt/libvirt.h.in,
src/libvirt.c, src/libvirt_public.syms: Add virDomainMigrate2
and virDomainMigrateToURI2
The virDomainMigratePerform3 currently has a single URI parameter
whose meaning varies. It is either
- A QEMU migration URI (normal migration)
- A libvirtd connection URI (peer2peer migration)
Unfortunately when using peer2peer migration, without also
using tunnelled migration, it is possible that both URIs are
required.
This adds a second URI parameter to the virDomainMigratePerform3
method, to cope with this scenario. Each parameter how has a fixed
meaning.
NB, there is no way to actually take advantage of this yet,
since virDomainMigrate/virDomainMigrateToURI do not have any
way to provide the 2 separate URIs
* daemon/remote.c, src/remote/remote_driver.c,
src/remote/remote_protocol.x, src/remote_protocol-structs: Add
the second URI parameter to perform3 message
* src/driver.h, src/libvirt.c, src/libvirt_internal.h: Add
the second URI parameter to Perform3 method
* src/libvirt_internal.h, src/qemu/qemu_migration.c,
src/qemu/qemu_migration.h: Update to handle URIs correctly
This extends the v3 migration protocol such that the
virDomainMigrateBegin3 and virDomainMigratePerform3
methods accept an application supplied XML config for
the target VM.
If the 'xmlin' parameter is NULL, then Begin3 uses the
current guest XML as normal. A driver implementing the
Begin3 method should either reject all non-NULL 'xmlin'
parameters, or strictly validate that the app supplied
XML does not change guest ABI.
The Perform3 method also needed the xmlin parameter to
cope with the Peer2Peer migration sequence.
NB it is not yet possible to use this capability since
neither of the public virDomainMigrate/virDomainMigrateToURI
methods have a way to pass in XML.
* daemon/remote.c, src/remote/remote_driver.c,
src/remote/remote_protocol.x, src/remote_protocol-structs:
Add 'remote_string xmlin' parameter to begin3/perform3
RPC messages
* src/libvirt.c, src/driver.h, src/libvirt_internal.h: Add
'const char *xmlin' parameter to Begin3/Perform3 methods
* src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
src/qemu/qemu_migration.h: Pass xmlin parameter around
migration methods
Otherwise an attempt to use virConnectOpen or virConnectOpenAuth without
auth pointer results in the driver declining the URI and libvirt falling
back to the remote driver for an esx:// URI.
The cur_vcpus member of struct libxl_domain_build_info was incorrectly
initialized to the number of vcpus, when it should have been interpreted
as a bitmap, where bit X corresponds to online/offline status of vcpuX.
To complicate matters, cur_vcpus is an int, so only 32 vcpus can be
set online. Add a check to ensure vcpus does not exceed this limit.
V2: Eric Blake noted a compilation pitfal when '1 << 32' on an int.
Account for vcpus == 32.
We don't use the gnulib vsnprintf replacement, which means that
on mingw, vsnprintf doesn't support %zn or %lln.
And as it turns out, VIR_GET_VAR_STR was a rather inefficient
reimplementation of virVasprintf logic.
* src/util/logging.c (VIR_GET_VAR_STR): Drop.
(virLogMessage): Inline a simpler version here.
* src/util/virterror.c (VIR_GET_VAR_STR, virRaiseErrorFull):
Likewise.
Reported by Matthias Bolte.
Saving domain to previously created file changes also its ownership.
This is certainly not what users want if some conditions are met:
it is a regular, local file and dynamic_ownership is off.
NB: the enum that uses the string vnet-host (now changed to vhost-net)
is used in XML, but fortunately that hasn't been in an official
release yet, so it can still be fixed.
Since -vnc uses ':' to separate the address from the port, raw
IPv6 addresses need to be escaped like [addr]:port
* src/qemu/qemu_command.c: Escape raw IPv6 addresses with []
* tests/qemuxml2argvdata/qemuxml2argv-graphics-vnc.args,
tests/qemuxml2argvdata/qemuxml2argv-graphics-vnc.xml: Tweak
to test Ipv6 escaping
* docs/schemas/domain.rng: Allow Ipv6 addresses, or hostnames
in <graphics> listen attributes
The qemuMigrationConfirm method shouldn't deal with final VM
cleanup, since it can be called from the peer2peer migration,
which expects to still use the 'vm' object afterwards.
Push the cleanup code out of qemuMigrationConfirm, into its
caller, qemuDomainMigrateConfirm3
* src/qemu/qemu_driver.c: Add VM cleanup code to
qemuDomainMigrateConfirm3
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Remove
job handling cleanup from qemuMigrationConfirm
To allow new mandatory migration cookie data to be introduced,
add support for checking supported feature flags when parsing
migration cookie.
* src/qemu/qemu_migration.c: Feature flag checking in migration
cookie parsing
Two additional places need initgroups call to properly work in an
environment where the UID is allowed to open/create stuff through its
supplementary groups.
This adds a streaming-video=filter|all|off attribute. It is used to change
the behavior of video stream detection in spice, the default is filter (the
default for libvirt is not to specify it - the actual default is defined in
libspice-server.so).
Usage:
<graphics type='spice' autoport='yes'>
<streaming mode='off'/>
</graphics>
Tested with the above and with tests/qemuxml2argvtest.
Signed-off-by: Alon Levy <alevy@redhat.com>
This patch enables filtering of gratuitous ARP packets using the following XML:
<rule action='accept' direction='in' priority='425'>
<arp gratuitous='true'/>
</rule>
This was discussed in:
https://www.redhat.com/archives/libvir-list/2011-May/msg01370.html
The capabilities code only sets the flag to allow use of vhost-net if
kvm is detected (set if the help string contains "(qemu-kvm-" or
"(kvm-"), but actually vhost-net is available in some qemu builds that
don't have kvm in their name, so just checking for ",vhost=" is enough.
When using TLS authentication and operating as the non-root user,
initially attempt to use that specific user's TLS certificates before
attempting to use the system wide TLS certificates.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Otherwise qemu is unable to write to it, with the error:
libvir: QEMU error : internal error unable to execute QEMU command 'memsave': Could not open '/var/cache/libvirt/qemu/qemu.mem.RRNvLv'
The v2 migration protocol had a limit on cookie length that was
too small to be useful for QEMU. Avoid generating cookies with
v2 protocol, so that old libvirtd can still reliably migrate a
guest to new libvirtd uses v2 protocol.
* src/qemu/qemu_driver.c: Avoid migration cookies with v2
migration
When generating a cookie for a guest with no data, the
QEMU_MIGRATION_COOKIE_GRAPHICS flag was set even if no
graphics data was added. Avoid setting the flag unless
it was needed, also add a safety check for mig->graphics
being non-NULL
* src/qemu/qemu_migration.c: Avoid cookie crash for guest
with no graphics
The internal virDomainMigratePeer2Peer and virDomainMigrateDirect
helper methods were not checking whether the target supports the
v3 migration protocol.
* src/libvirt.c: Use v3 migration protocol for p2p/direct
migration if available.
Some bogus apps are generating a VNC/SPICE/RFB listen attribute
with no content. This then causes a failure with the graphics
migration cookie parsing. Blank out the 'listenAddr' parameter
after parsing domain XML if it is the empty string, so the host
default takes over
* src/qemu/qemu_migration.c: Blank out listenAddr parameter
if empty
The on-the-wire protocol is identical; XDR guarantees that
both 'hyper' and 'unsigned hyper' are transmitted as 8 bytes.
* src/remote/remote_protocol.x (remote_get_version_ret)
(remote_get_lib_version_ret): Match public API.
* daemon/remote_generator.pl: Drop special case.
* src/remote_protocol-structs: Reflect updated type.
Clang couldn't quite see that the same condition of
(flags & VIR_DOMAIN_MEM_CONFIG) is used twice, such that
the second block is guaranteed that def was assigned in
the first block.
* src/libxl/libxl_driver.c (libxlDomainSetMemoryFlags): Add a hint
for clang.
Improve invalid argument checks in the size query case. The drivers already
relied on this unchecked behavior.
Relax the implementation of virDomainGet(Memory|Blkio)MemoryParameters
in the drivers and allow to pass more memory than necessary for all
parameters.
Add invalid argument checks for params and nparams to the public API
and remove them from the drivers (e.g. xend).
Add subset handling to libxl and test drivers.
params and nparams are essential and cannot be NULL. Check this in
libvirt.c and remove redundant checks from the drivers (e.g. xend).
Instead of enforcing that nparams must point to exact same value as
returned by virDomainGetSchedulerType relax this to a lower bound
check. This is what some drivers (e.g. xen hypervisor and esx)
already did. Other drivers (e.g. xend) didn't check nparams at all
and assumed that there is enough space in params.
Unify the behavior in all drivers to a lower bound check and update
nparams to the number of valid values in params on success.
Some drivers assumed it can be NULL (e.g. qemu and lxc) and check it
before assigning to it, other drivers assumed it must be non-NULL
(e.g. test and esx) and just assigned to it.
Unify this to nparams being optional and document it.
This error code has existed since the dawn of time, yet the messages it
generates are almost universally busted. Here's a small sampling:
src/conf/domain_conf.c:4889 : XML description for missing root element is not well formed or invalid
src/conf/domain_conf.c:4951 : XML description for unknown device type is not well formed or invalid
src/conf/domain_conf.c:5460 : XML description for maximum vcpus must be an integer is not well formed or invalid
src/conf/domain_conf.c:5468 : XML description for invalid maxvcpus %(count)lu is not well formed or invalid
Fix up the error code to instead be
XML error: <msg>
Adjust the few locations that were using the original correctly (or shouldn't
have been using the error code at all).
v2:
Fix wording of error code without a passed argument
starting with kernel 2.6.38 macvtap supports a 'passthru' mode for
attaching virtual functions of a SRIOV capable network card directly to a VM.
This patch adds the capability to configure such a device.
Signed-off-by: Dirk Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
<sys/syslimits.h> is not standardized, so portable programs should
not need to rely on it. If there really is something that we need
where <sys/syslimits.h> provided the limit but <limits.h> did not,
then that would be a candidate for fixing in gnulib. But this patch
did not turn up any compilation failures on Linux.
* src/internal.h (includes): Drop unused header.
* daemon/libvirtd.h (includes): Likewise.
* configure.ac (AC_CHECK_HEADERS): Likewise.
Based on a report by Matthias Bolte.
POSIX allows sysconf(_SC_GETPW_R_SIZE_MAX) to return -1 if there
is no fixed limit, and requires ERANGE errors to track real size.
Model our behavior after the example in POSIX itself:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/getpwuid_r.html
Also, on error for get*_r functions, errno is undefined, and the
real error was the return value.
* src/util/util.c (virGetUserEnt, virGetUserID, virGetGroupID)
(virSetUIDGID): Cope with sysconf failure or too small buffer.
Reported by Matthias Bolte.
virRunWithHook is now unused, so we can drop it. Tested w/ raw + qcow2
volume creation and copying.
v2:
Use opaque data to skip hook second time around
Simply command building
v3:
Drop explicit FindFileInPath
virStreamNew needs to dispatch the error that virGetStream reports
on failure.
remoteCreateClientStream can fail due to virStreamNew or due to
VIR_ALLOC. Report OOM error for VIR_ALLOC failure to report errors
in all error cases.
Remove OOM error reporting from remoteCreateClientStream callers.
When the session has expired then multiple threads can race while
reestablishing it.
This race condition is not that critical as it requires a special usage
pattern to be triggered. It can only happen when an application doesn't
do API calls for quite some time (the session expires after 30 min
inactivity) and then multiple threads doing simultaneous API calls and
end up doing simultaneous calls to esxVI_EnsureSession.
virsh didn't call virInitialize(), which (among other things)
initializes virLastErr thread local variable. As a result of that, virsh
could just segfault in virEventRegisterDefaultImpl() since that is the
first call that touches (resets) virLastErr.
I have no idea what lucky coincidence made this bug visible but I was
able to reproduce it in 100% cases but only in one specific environment
which included building in sandbox.
Mingw execve() has a broken signature. Disable this
function until gnulib fixes the signature, since we
don't really need this on Win32 anyway.
* src/util/command.c: Disable virCommandExec on Win32
When failing to marshall an XDR message, include the
full program/version/status/proc/type info, to allow
easier debugging & diagnosis of the problem.
* src/remote/remote_driver.c: Improve error when marshalling
fails
By running the doTunnelSendAll code in a separate thread, the
main thread can do qemuMigrationWaitForCompletion as with
normal migration. This in turn ensures that job signals work
correctly and that progress monitoring can be done
* src/qemu/qemu_migration.c: Run tunnelled migration in
separate thread
Cancelling the QEMU migration may cause QEMU to flush pending
data on the migration socket. This may in turn block QEMU if
nothing reads from the other end of the socket. Closing the
socket before cancelling QEMU migration avoids this possible
deadlock.
* src/qemu/qemu_migration.c: Close sockets before cancelling
migration on failure
The 'nbytes' variable was not re-initialized to the
buffer size on each iteration of the tunnelled migration
loop. While saferead() will ensure a full read, except
on EOF, it is clearer to use the real buffer size
* src/qemu/qemu_migration.c: Always read full buffer of data
The qemuMigrationWaitForCompletion method contains a loop which
repeatedly queries QEMU to check migration progress, and also
processes job signals (pause, setspeed, setbandwidth, cancel).
The tunnelled migration loop does not currently support this
functionality, but should. Refactor the code to allow it to
be used with tunnelled migration.
Implement the v3 migration protocol, which has two extra
steps, 'begin' on the source host and 'confirm' on the
source host. All other methods also gain both input and
output cookies to allow bi-directional data passing at
all stages.
The QEMU peer2peer migration method gains another impl
to provide the v3 migration. This finally allows migration
cookies to work with tunnelled migration, which is required
for Spice seamless migration & the lock manager transfer
* src/qemu/qemu_driver.c: Wire up migrate v3 APIs
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Add
begin & confirm methods, and peer2peer impl of v3
Merge the doNonTunnelMigrate2 and doTunnelMigrate2 methods
into one doPeer2PeerMigrate2 method, since they are substantially
the same. With the introduction of v3 migration, this will be
even more important, to avoid massive code duplication.
* src/qemu/qemu_migration.c: Merge tunnel & non-tunnel migration
The v2 migration protocol was accidentally missing out the
finish step, when prepare succeeded, but returned an invalid
URI
* src/libvirt.c: Teardown VM if prepare returns invalid URI
To facilitate the introduction of the v3 migration protocol,
the doTunnelMigrate method is refactored into two pieces. One
piece is intended to mirror the flow of virDomainMigrateVersion2,
while the other is the helper for setting up sockets and processing
the data.
Previously socket setup would be done before the 'prepare' step,
so errors could be dealt with immediately, avoiding need to shut
off the destination QEMU. In the new split, socket setup is done
after the 'prepare' step. This is not a serious problem, since
the control flow already requires calling 'finish' to tear down
the destination QEMU upon several errors.
* src/qemu/qemu_migration.c:
Use the graphics information from the QEMU migration cookie to
issue a 'client_migrate_info' monitor command to QEMU. This causes
the SPICE client to automatically reconnect to the target host
when migration completes
* src/qemu/qemu_migration.c: Set data for SPICE client relocation
before starting migration on src
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
new qemuMonitorGraphicsRelocate() command
Extend the QEMU migration cookie structure to allow information
about the destination host graphics setup to be passed by to
the source host. This will enable seamless migration of any
connected graphics clients
* src/qemu/qemu_migration.c: Add graphics info to migration
cookies
* daemon/libvirtd.c: Always initialize gnutls to enable
x509 cert parsing in QEMU
The migration protocol has support for a 'cookie' parameter which
is an opaque array of bytes as far as libvirt is concerned. Drivers
may use this for passing around arbitrary extra data they might
need during migration. The QEMU driver needs to do a few things:
- Pass hostname/uuid to allow strict protection against localhost
migration attempts
- Pass SPICE/VNC server port from the target back to the source to
allow seamless relocation of client sessions
- Pass lock driver state from source to destination
This patch introduces the basic glue for handling cookies
but only includes the host/guest UUID & name.
* src/libvirt_private.syms: Export virXMLParseStrHelper
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Parsing
and formatting of migration cookies
* src/qemu/qemu_driver.c: Pass in cookie parameters where possible
* src/remote/remote_protocol.h, src/remote/remote_protocol.x: Change
cookie max length to 16384 bytes
The qemuMigrationPrepareTunnel method should not unlock the
qemu driver, since that is the caller's job.
* src/qemu/qemu_migration.c: Fix qemuMigrationPrepareTunnel
unlocking of QEMU driver
Migration just seems to go from bad to worse. We already had to
introduce a second migration protocol when adding the QEMU driver,
since the one from Xen was insufficiently flexible to cope with
passing the data the QEMU driver required.
It turns out that this protocol still has some flaws that we
need to address. The current sequence is
* Src: DumpXML
- Generate XML to pass to dst
* Dst: Prepare
- Get ready to accept incoming VM
- Generate optional cookie to pass to src
* Src: Perform
- Start migration and wait for send completion
- Kill off VM if successful, resume if failed
* Dst: Finish
- Wait for recv completion and check status
- Kill off VM if unsuccessful
The problems with this are:
- Since the first step is a generic 'DumpXML' call, we can't
add in other migration specific data. eg, we can't include
any VM lease data from lock manager plugins
- Since the first step is a generic 'DumpXML' call, we can't
emit any 'migration begin' event on the source, or have
any hook that runs right at the start of the process
- Since there is no final step on the source, if the Finish
method fails to receive all migration data & has to kill
the VM, then there's no way to resume the original VM
on the source
This patch attempts to introduce a version 3 that uses the
improved 5 step sequence
* Src: Begin
- Generate XML to pass to dst
- Generate optional cookie to pass to dst
* Dst: Prepare
- Get ready to accept incoming VM
- Generate optional cookie to pass to src
* Src: Perform
- Start migration and wait for send completion
- Generate optional cookie to pass to dst
* Dst: Finish
- Wait for recv completion and check status
- Kill off VM if failed, resume if success
- Generate optional cookie to pass to src
* Src: Confirm
- Kill off VM if success, resume if failed
The API is designed to allow both input and output cookies
in all methods where applicable. This lets us pass around
arbitrary extra driver specific data between src & dst during
migration. Combined with the extra 'Begin' method this lets
us pass lease information from source to dst at the start of
migration
Moving the killing of the source VM out of Perform and
into Confirm, means we can now recover if the dst host
can't successfully Finish receiving migration data.
Change all the driver struct initializers to use the
C99 style, leaving out unused fields. This will make
it possible to add new APIs without changing every
driver. eg change:
qemudDomainResume, /* domainResume */
qemudDomainShutdown, /* domainShutdown */
NULL, /* domainReboot */
qemudDomainDestroy, /* domainDestroy */
to
.domainResume = qemudDomainResume,
.domainShutdown = qemudDomainShutdown,
.domainDestroy = qemudDomainDestroy,
And get rid of any existing C99 style initializersr which
set NULL, eg change
.listPools = vboxStorageListPools,
.numOfDefinedPools = NULL,
.listDefinedPools = NULL,
.findPoolSources = NULL,
.poolLookupByName = vboxStoragePoolLookupByName,
to
.listPools = vboxStorageListPools,
.poolLookupByName = vboxStoragePoolLookupByName,
Fix some driver names:
s/virDrvCPUCompare/virDrvCompareCPU/
s/virDrvCPUBaseline/virDrvBaselineCPU/
s/virDrvQemuDomainMonitorCommand/virDrvDomainQemuMonitorCommand/
s/virDrvSecretNumOfSecrets/virDrvNumOfSecrets/
s/virDrvSecretListSecrets/virDrvListSecrets/
And some driver struct field names:
s/getFreeMemory/nodeGetFreeMemory/
Only in drivers which use virDomainObj, drivers that query hypervisor
for domain status need to be updated separately in case their hypervisor
supports this functionality.
The reason is also saved into domain state XML so if a domain is not
running (i.e., no state XML exists) the reason will be lost by libvirtd
restart. I think this is an acceptable limitation.
This has been present since the introduction of phypAttachDevice
in commit 444fd07a.
* src/phyp/phyp_driver.c (phypAttachDevice): Don't dereference
NULL.
virFDStreamClose used a mutex after it was freed, and failed
to destroy that mutex on its last use.
* src/fdstream.c (virFDStreamFree): Inline into sole caller...
(virFDStreamClose): ...to avoid use-after-free and leak.
Reported by Matthias Bolte.
Several vSphere API methods are called on global objects like the
FileManager, the PerformanceManager or the SearchIndex. The generator
input file allows to mark such methods and the generator generates
such method in a way that automatically handles marked parameter. This
is done by some special macros. Those were manually written and this
patch moves them to the generator.
One functionality change here is that we no longer force enable the event
timeout for every queued event, only enable it for the first event after
the queue has been flushed. This is how other drivers have already done it,
and I haven't encountered problems in practice.
v3:
Adjust for new virDomainEventStateNew argument
The same code for queueing, flushing, and deregistering events exists
in multiple drivers, which will soon use these common functions.
v2:
Adjust libvirt_private.syms
isDispatching bool fixes
NONNULL tagging
v3:
Add requireTimer parameter to virDomainEventStateNew
This structure will be used to unify lots of duplicated event handling code
across the state drivers.
v2:
Check for state == NULL in StateFree
Add NONNULL tagging
Use bool for isDispatching
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Use capabilities to allow a driver to register a default <init> if none
is specified in the XML. Openvz was already open-coding this to be /sbin/init
LXC currently falls over if no init is specified, so an explicit error is
an improvement IMO.
(Side note: I don't think we can set a default value for LXC. If we use
/sbin/init but the user doesn't specify a separate root FS for their guest,
the container will rerun the host's init which can be traumatic :). For
virt-install I'm thinking of defaulting to /sbin/init if a root FS has
been specified, otherwise require the user to manually specify <init>)
This is needed if we want to transfer a temporary file. If the
transfer is done with iohelper, we might run into a race condition,
where we unlink() file before iohelper is executed.
* src/fdstream.c, src/fdstream.h,
src/util/iohelper.c: Add new option
* src/lxc/lxc_driver.c, src/qemu/qemu_driver.c,
src/storage/storage_driver.c, src/uml/uml_driver.c,
src/xen/xen_driver.c: Expand existing function calls
Add public API for taking screenshots of current domain console.
* include/libvirt/libvirt.h.in: add virDomainScreenshot
* src/libvirt_public.syms: Export new symbol
The public API and RPC over-the-wire format have no flags argument,
so neither should the internal callback API. This simplifies the
RPC generator.
* src/driver.h (virDrvNWFilterDefineXML): Drop argument that does
not match public API.
* src/nwfilter/nwfilter_driver.c (nwfilterDefine): Likewise.
* src/libvirt.c (virNWFilterDefineXML): Likewise.
* daemon/remote_generator.pl: Drop special case.
We were 31/73 on whether to translate; since less than 50% translated
and since VIR_INFO is less than VIR_WARN which also doesn't translate,
this makes sense.
* cfg.mk (sc_prohibit_gettext_markup): Add VIR_INFO, since it
falls between WARN and DEBUG.
* daemon/libvirtd.c (qemudDispatchSignalEvent, remoteCheckAccess)
(qemudDispatchServer): Adjust offenders.
* daemon/remote.c (remoteDispatchAuthPolkit): Likewise.
* src/network/bridge_driver.c (networkReloadIptablesRules)
(networkStartNetworkDaemon, networkShutdownNetworkDaemon)
(networkCreate, networkDefine, networkUndefine): Likewise.
* src/qemu/qemu_driver.c (qemudDomainDefine)
(qemudDomainUndefine): Likewise.
* src/storage/storage_driver.c (storagePoolCreate)
(storagePoolDefine, storagePoolUndefine, storagePoolStart)
(storagePoolDestroy, storagePoolDelete, storageVolumeCreateXML)
(storageVolumeCreateXMLFrom, storageVolumeDelete): Likewise.
* src/util/bridge.c (brProbeVnetHdr): Likewise.
* po/POTFILES.in: Drop src/util/bridge.c.
This one's tricker than the VIR_DEBUG0() removal, but the end
result is still C99 compliant, and reasonable with enough comments.
* src/libvirt.c (VIR_ARG10, VIR_HAS_COMMA)
(VIR_DOMAIN_DEBUG_EXPAND, VIR_DOMAIN_DEBUG_PASTE): New macros.
(VIR_DOMAIN_DEBUG): Rewrite to handle one argument, moving
multi-argument guts to...
(VIR_DOMAIN_DEBUG_1): New macro.
(VIR_DOMAIN_DEBUG0): Rename to VIR_DOMAIN_DEBUG_0.
Use of ',##__VA_ARGS__' is a gcc extension not guaranteed by
C99; thankfully, we can avoid it by lumping the format argument
into the var-args set.
* src/util/logging.h (VIR_DEBUG_INT, VIR_INFO_INT, VIR_WARN_INT)
(VIR_ERROR_INT, VIR_DEBUG, VIR_INFO, VIR_WARN, VIR_ERROR): Stick
to C99 var-arg macro syntax.
* examples/domain-events/events-c/event-test.c (VIR_DEBUG):
Simplify.
These VIR_XXXX0 APIs make us confused, use the non-0-suffix APIs instead.
How do these coversions works? The magic is using the gcc extension of ##.
When __VA_ARGS__ is empty, "##" will swallow the "," in "fmt," to
avoid compile error.
example: origin after CPP
high_level_api("%d", a_int) low_level_api("%d", a_int)
high_level_api("a string") low_level_api("a string")
About 400 conversions.
8 special conversions:
VIR_XXXX0("") -> VIR_XXXX("msg") (avoid empty format) 2 conversions
VIR_XXXX0(string_literal_with_%) -> VIR_XXXX(%->%%) 0 conversions
VIR_XXXX0(non_string_literal) -> VIR_XXXX("%s", non_string_literal)
(for security) 6 conversions
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
If we plow on after udev_device_get_syspath fails, we will hit a NULL
dereference. Clang found one due to strdup later in udevSetParent,
but in fact we hit a NULL dereference sooner because of the use of
STREQ within virNodeDeviceFindBySysfsPath.
* src/conf/node_device_conf.h (virNodeDeviceFindBySysfsPath): Mark
path argument non-null.
* src/node_device/node_device_udev.c (udevSetParent): Avoid null
dereference.
No syntactic effect; this merely silences some clang warnings.
* src/libxl/libxl_driver.c (libxlDomainSetVcpusFlags): Drop
redundant ret=0 statement.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextDriveDel):
Likewise.
Introduce a virProcessKill function that can be safely called
even when the job mutex is held. This allows virDomainDestroy
to kill any VM even if it is asleep in a monitor job. The PID
will die and the thread asleep on the monitor will then wake
up releasing the job mutex.
* src/qemu/qemu_driver.c: Kill process before using qemuProcessStop
to ensure job is released
* src/qemu/qemu_process.c: Add virProcessKill for killing off
QEMU processes
Version 2.0.0 or yajl changed API. It is fairly trivial for us to
cope with both APIs in libvirt, so adapt.
* configure.ac: Probe for yajl2 API
* src/util/json.c: Conditional support for yajl2 API
libxl accepts hpet configuration in its domain info struct. Parse the
domain definition's <clock> element in order to set the value.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Apologies from Eric Blake, for mistakenly committing the broken
intermediate version.
libxl accepts hpet configuration in its domain info struct. Parse the
domain definition's <clock> element in order to set the value.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Recent versions of Xen disable the virtual HPET by default. This is
usually more precise because tick policies are not implemented for
the HPET in Xen. However, there may be several reasons to control
the HPET manually: 1) to test the emulation; 2) because distros may
provide the knob while leaving the default to "enabled" for compatibility
reasons.
This patch provides support for the hpet item in both sexpr and xm
formats, and translates it to a <timer> element.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow the CA certificate to come from the user's home directory or from
the global location independently of the client certificate/key pair.
Mostly for the case when each user on a system has their own cert/key
pair but the system as a whole shares the same CA.
Signed-off-by: Doug Goldstein <cardoe@gentoo.org>
This matches the public API and helps to get rid of some special
case code in the remote generator.
Rename driver API functions and XDR protocol structs.
No functional change included outside of the remote generator.
Actually execs the argv/env we've generated, replacing the current process.
Kind of has a limited usage, but allows us to use virCommand in LXC
driver to launch the 'init' process
assert() is forbidden in libvirt code, and these two cases would
in fact never execute due to earlier error checks.
* src/libvirt.c: Remove assert() usage
Noticed this while trying to run rpcgen on cygwin.
* src/Makefile.am ($(srcdir)/remote/%_protocol.h)
($(srcdir)/remote/%_protocol.c): Add a dependency.
Stop storing the generated files for the remote protocol client
and server in source control. The generated files will still be
included in the result of 'make dist' to avoid end-users needing
to generate the files
Signed-off-by: Eric Blake <eblake@redhat.com>
Unfortunately, this means that the strings marked for translation
in generated files are not picked up by gnulib's syntax-check,
I'm working on fixing that in gnulib.
* .gitignore, cfg.mk, po/POTFILES.in: Reflect deletion.
Always generate the rpc files, and require rpcgen during bootstrap.
* daemon/Makefile.am: Removed generated files with
maintainer-clean target
* src/Makefile.am: Removed generated files with
maintainer-clean target. Always run 'rpcgen' if
generated files are missing
In preparation for removing generated files, it is necessary
to tell automake that the generated files must be distributed
but not directly compiled (since they are included into the
body of a larger .c file that is compiled). Hence, even though
these files are code and not headers in the strict sense of
the word, it is easier to rename them to .h for automake's sake.
* daemon/remote_client_bodies.c: Rename to .h.
* daemon/qemu_client_bodies.c: Likewise.
* src/remote/remote_client_bodies.c: Likewise.
* src/remote/qemu_client_bodies.c: Likewise.
* daemon/Makefile.am (remote_dispatch_bodies.c)
(qemu_dispatch_bodies.c): Rename to .h.
(remote.c, EXTRA_DIST): Reflect rename.
* daemon/remote.c: Likewise.
* daemon/remote_generator.pl: Likewise.
* src/Makefile.am (remote/remote_driver.c): Likewise.
* src/remote/remote_driver.c: Likewise.
* po/POTFILES.in: Likewise.
* cfg.mk (exclude_file_name_regexp--sc_require_config_h)
(exclude_file_name_regexp--sc_require_config_h_first)
(exclude_file_name_regexp--sc_prohibit_empty_lines_at_EOF):
Likewise.
This patch just covers the simple functions without explicit return
values. There is more to be handled.
The generator collects the members of the XDR argument structs and uses
this information to generate the function bodies.
Exclude the generated files from offending syntax-checks.
Suggested by Richard W.M. Jones
Creating a domU on a freshly booted dom0 does not work,
because the libxl driver does not allocate memory for the domU.
After creating a domain with xl libvirt is able to create domains too.
This patch reserves enough memory for the domU first.
Users often edit XML file stored in configuration directory
thinking of modifying a domain/network/pool/etc. Thus it is wise
to let them know they are using the wrong way and give them hint.
When setting up a FIFO for QEMU, it allows either a pair
of fifos used unidirectionally, or a single fifo used
bidirectionally. Look for the bidirectional fifo first
when labelling since that is more useful
* src/security/security_dac.c,
src/security/security_selinux.c: Fix fifo handling
As well as taint warnings going to the main libvirt log,
add taint warnings to the per-domain logfile
Domain id=3 is tainted: high-privileges
Domain id=3 is tainted: disk-probing
Domain id=3 is tainted: shell-scripts
Domain id=3 is tainted: custom-monitor
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Enhance
qemuDomainTaint to also log to the domain logfile
* src/qemu/qemu_driver.c: Pass -1 for logFD to taint methods to
auto-append to logfile
* src/qemu/qemu_process.c: Pass open logFD at startup for taint
methods
The qemuDomainAppendLog method allows writing a formatted string
to the end of the domain logfile, optionally opening it if needed.
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Add
qemuDomainAppendLog
Move the qemuProcessLogReadFD and qemuProcessLogFD methods
into qemu_domain.c, renaming them to qemuDomainCreateLog
and qemuDomainOpenLog.
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Add
qemuDomainCreateLog and qemuDomainOpenLog.
* src/qemu/qemu_process.c: Remove qemuProcessLogFD
and qemuProcessLogReadFD
Wire up logging of VM tainting to the QEMU driver
- If running QEMU as root user/group or without capabilities
being cleared
- If passing custom QEMU command line args
- If issuing custom QEMU monitor commands
- If using a network interface config with an associated
shell script
- If using a disk config relying on format probing
The warnings, per-VM appear in the main libvirtd logs
11:56:17.571: 10832: warning : qemuDomainObjTaint:712 : Domain id=1 name='l2' uuid=c7a3edbd-edaf-9455-926a-d65c16db1802 is tainted: high-privileges
11:56:17.571: 10832: warning : qemuDomainObjTaint:712 : Domain id=1 name='l2' uuid=c7a3edbd-edaf-9455-926a-d65c16db1802 is tainted: disk-probing
The taint flags are reset when the VM is stopped.
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Helper APIs
for logging taint warnings
* src/qemu/qemu_driver.c: Log tainting with custom QEMU monitor
commands and disk/net hotplug with unsupported configs
* src/qemu/qemu_process.c: Log tainting at startup based on
unsupported configs
Some configuration setups for guests are allowed, but strongly
discouraged and unsupportable in production systems. Introduce
a concept of 'tainting' to virDomainObjPtr to allow such setups
to be identified. Drivers can then log warnings at suitable
times
* src/conf/domain_conf.c, src/conf/domain_conf.h: Declare taint
flags and add parsing/formatting of domain status XML
Print the name of the CA cert, certificate, and key file that resulted
in the failure so that the user has an idea what to troubleshoot.
Signed-off-by: Doug Goldstein <cardoe@gentoo.org>
Match the fact that we have virAsprintf and virVasprintf.
* src/util/buf.h (virBufferVasprintf): New prototype.
* src/util/buf.c (virBufferAsprintf): Move guts...
(virBufferVasprintf): ...to new function.
* src/libvirt_private.syms (buf.h): Export it.
* bootstrap.conf (gnulib_modules): Add stdarg, for va_copy.
We already have virAsprintf, so picking a similar name helps for
seeing a similar purpose. Furthermore, the prefix V before printf
generally implies 'va_list', even though this variant was '...', and
the old name got in the way of adding a new va_list version.
global rename performed with:
$ git grep -l virBufferVSprintf \
| xargs -L1 sed -i 's/virBufferVSprintf/virBufferAsprintf/g'
then revert the changes in ChangeLog-old.
The qemuMigrationToFile method was accidentally annotated for
the 'compressor' parameter to be non-null, instead of the
'path' parameter. Thus GCC with -O2, unhelpfully deleted the
entire 'if (compressor == NULL)' block of code during
optimization. Thus NULL was passed to virCommandNew() with
predictably bad results.
* src/qemu/qemu_migration.h: Fix non-null annotation to be
against path instead of compressor
To cope with the QEMU binary being changed while a VM is running,
it is neccessary to persist the original qemu capabilities at the
time the VM is booted.
* src/qemu/qemu_capabilities.c, src/qemu/qemu_capabilities.h: Add
an enum for a string rep of every capability
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Support for
storing capabilities in the domain status XML
* src/qemu/qemu_process.c: Populate & free QEMU capabilities at
domain startup
Add missing early exits and convert error logging to proper API level
error reporting.
Centralize cleanup code for the PerfQuerySpec object.
Reported by Eric Blake, detected by clang.
The ++ on preliminaryFileName was a left over from a previous version
of this function that explicitly returned the filename and did a strdup
on preliminaryFileName afterwards.
As the filename isn't returned explicitly anymore remove the preliminary
variable for it and reuse the tmp variable instead.
Reported by Eric Blake, detected by clang.
Clang warned about a dead assignment. In the process, I noticed
that we are only using the function for a bool value. I audited
all other callers in qemu_{migration,cgroup,driver,hotplug), and
all were making the call in a bool context.
Also, do bounds checking on the argument.
* src/qemu/qemu_cgroup.c (qemuSetupCgroup): Delete dead
assignment.
(qemuCgroupControllerActive): Change return type to bool.
* src/qemu/qemu_cgroup.h (qemuCgroupControllerActive): Likewise.
Clang noticed a dead assignment, which turned out to be the use
of the wrong variable. rc starts life as -1, and is only ever
assigned to 0 just before a successful cleanup.
* src/lxc/lxc_driver.c (lxcSetupInterfaces): Don't call
virReportSystemError(-1).
Detected by gcc:
libxl/libxl_driver.c: In function 'libxlDomainDestroy':
libxl/libxl_drier.c:1351:30: error: variable 'priv' set but not used [-Werror=unused-but-set-variable]
* src/libxl/libxl_driver.c (libxlDomainDestroy): Delete unused
variable.
clang didn't like the last increment to nargs. But why even
track nargs ourselves, when virCommand does it for us?
* src/storage/storage_backend_iscsi.c
(virStorageBackendISCSIConnection): Switch to virCommand to avoid
a dead-store warning on nargs.
Clang detected a dead store to rc. It turns out that in fixing this,
I also found a FILE* leak.
This is a subtle change in behavior, although unlikely to hit. The
pidfile is a kernel file, so we've probably got more serious problems
under foot if we fail to parse one. However, the previous behavior
was that even if one pid file failed to parse, we tried others,
whereas now we give up on the first failure. Either way, though,
the function returns -1, so the caller will know that something is
going wrong, and that not all pids were necessarily reaped. Besides,
there were other instances already in the code where failure in the
inner loop aborted the outer loop.
* src/util/cgroup.c (virCgroupKillInternal): Abort rather than
resuming loop on fscanf failure, and cleanup file on error.
Clang 2.8 wasn't quite able to follow that persistentDef was
assigned earlier if (flags & VIR_DOMAIN_MEM_CONFIG) is true.
Silence this false positive, to make clang analysis easier to use.
* src/qemu/qemu_driver.c (qemudDomainSetMemoryFlags): Add an
annotation to silence clang's claim of a NULL dereference.
Clang detected a null-pointer dereference regression, introduced
in commit 4e8969eb. Without this patch, a device with
unbind_from_stub set to false would eventually try to call
virFileExists on uncomputed drvdir.
* src/util/pci.c (pciUnbindDeviceFromStub): Ensure drvdir is set
before use.
This code has had problems historically. As originally
written, in commit 6bcf2501 (Jun 08), it could call unlink
on a random string, nuking an unrelated file.
Then commit 182a80b9 (Sep 09), the code was rewritten to
allocate tmp, with both a use-after-free bug and a chance to
call unlink(NULL).
Commit e206946 (Mar 11) fixed the use-after-free, but not the
NULL dereference. Thanks to clang for catching this!
* src/qemu/qemu_driver.c (qemudDomainMemoryPeek): Don't call
unlink on NULL.
This reverts commit 0e7f7f8566.
From the mailing list:
> So, AFAICT, this patch means we will never reconnect to any LXC
> VMs now.
>
> The correct solution, is to refactor LXC driver startup to work
> the same way as the QEMU driver startup.
>
> - Load all the live state XML files (to pick up running VMs)
> - Reconnect to all VMs
> - Load all the persistent config XML files (to pick up any additional
> inactive guets)
But that solution is invasive enough to be post-0.9.1.
This commit fixes
qemu/qemu_driver.c: In function 'qemuDomainModifyDeviceFlags':
qemu/qemu_driver.c:4041:8: warning: 'ret' may be used uninitialized in this
function [-Wuninitialized]
qemu/qemu_driver.c:4013:9: note: 'ret' was declared here
The variable is set to -1 so that the error paths are taken when the code
to set it didn't get a chance to run. Without initializing it, we could
return some an undefined value from this function.
While I was at it, I made a trivial whitespace change in the same function
to improve readability.
Call shutdown functions for all subcomponents in nwfilterDriverShutdown.
Make sure that this shutdown functions can safely be called multiple times
and independent from the actual subcomponents state.
Commit e0d014f237 made binary potentially allocated on the heap.
It was freed in the parent in the error path, but not in the success path
that doesn't goto the cleanup label.
Found by 'make -C tests valgrind'.
Commit 1671d1d introduced a memory leak in virHashFree, and
wholesale table corruption in virHashRemoveSet (elements not
requested to be freed are lost).
* src/util/hash.c (virHashFree): Free bucket array.
(virHashRemoveSet): Don't lose elements.
* tests/hashtest.c (testHashCheckForEachCount): New method.
(testHashCheckCount): Expose the bug.
Support update of disks by MODIFY_CONFIG
This patch includes changes for qemu's disk to support
virDomainUpdateDeviceFlags() with VIR_DOMAIN_DEVICE_MODIFY_CONFIG.
This patch adds support for CDROM/foppy disk types.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
* src/qemu/qemu_driver.c
(qemuDomainUpdateDeviceConfig): support cdrom/floppy.
V2: Use virAsprintf instead of snprintf/strdup
The xend driver will generate a virDomainNetDef ifname if one is not
specified in xend sexpr, even if domain is inactive. The result is
network interface XML containing 'vif-1.Y' on dev attribute of target
element, e.g.
<interface type='bridge'>
<target dev='vif-1.0'/>
...
This patch changes the behavior to only generate the ifname if not
specified in xend sexpr *and* domain is not inactive (id != -1).
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=664059
Reattaching pci device back to host without destroying guest or
detaching device from guest would cause host to crash. This patch adds
a check before doing device reattach. If the device is being assigned
to guest, libvirt refuses to reattach device to host. The patch only
works for Xen, for it just checks xenstore to get pci device
information.
Signed-off-by: Yufang Zhang <yuzhang@redhat.com>
The lone caller to hostsFileWrite (and the callers for at least 3
levels up the return stack) assume that the return value will be < 0
on failure. However, hostsFileWrite returns 0 on success, and a
positive errno on failure. This patch changes hostsFileWrite to return
-errno on failure.
We support to initialize the hooks at daemon reload if there is no
hooks script is defined, we should also support initialize the hooks
at daemon shutdown if no hooks is defined.
To address bz: https://bugzilla.redhat.com/show_bug.cgi?id=688859
Support changes of disks by MODIFY_CONFIG for qemu.
This patch includes patches for qemu's disk to support
virDomainAt(De)tachDeviceFlags with VIR_DOMAIN_DEVICE_MODIFY_CONFIG.
Other devices can be added incrementally.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
* /src/conf/domain_conf.c
(virDomainDiskIndexByName): returns array index of disk in vmdef.
(virDomainDiskRemoveByName): removes a disk which has the name in vmdef.
* src/qemu/qemu_driver.c
(qemuDomainAttachDeviceConfig): add support for Disks.
(qemuDomainDetachDeviceConfig): add support for Disks.
This patch adds functions for modify domain's persistent definition.
To do error recovery in easy way, we use a copy of vmdef and update it.
The whole sequence will be:
make a copy of domain definition.
if (flags & MODIFY_CONFIG)
update copied domain definition
if (flags & MODIF_LIVE)
do hotplug.
if (no error)
save copied one to the file and update cached definition.
else
discard copied definition.
This patch is mixuture of Eric Blake's work and mine.
From: Eric Blake <eblake@redhat.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
(virDomainObjCopyPersistentDef): make a copy of persistent vm definition
(qemuDomainAttach/Detach/UpdateDeviceConfig) : callbacks. now empty
(qemuDomainModifyDeviceFlags): add support for MODIFY_CONFIG and MODIFY_CURRENT
So far first entries for each hash key are stored directly in the hash
table while other entries mapped to the same key are linked through
pointers. As a result of that, the code is cluttered with special
handling for the first items.
This patch makes all entries (even the first ones) linked through
pointers, which significantly simplifies the code and makes it more
maintainable.
This adds several tests for remaining hash APIs (custom
hasher/comparator functions are not covered yet, though).
All tests pass both before and after the "Simplify hash implementation".
Steps to reproduce this bug:
1. # cat net.xml # 00:03.0 has been used
<interface type='network'>
<mac address='52:54:00:04:72:f3'/>
<source network='default'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
2. # virsh attach-device vm1 net.xml
error: Failed to attach device from net.xml
error: internal error unable to reserve PCI address 0:0:3
3. # virsh attach-device vm1 net.xml
error: Failed to attach device from net.xml
error: internal error unable to execute QEMU command 'device_add': Device 'rtl8139' could not be initialized
The reason of this bug is that: we can not reserve PCI address 0:0:3 because it has
been used, but we release PCI address when we reserve it failed.
When buf->error is 1, we do not return buf->content in the function
virBufferContentAndReset(). So we should free buf->content when
vsnprintf() failed.
This was broken by the refactoring in ac1e6586ec. It resulted in a
segfault for 'virsh vol-dumpxml' and related volume functions.
Before the refactoring all users of the ESX_VI__TEMPLATE__DISPATCH
macro dispatched on the item type, as the item is the input to all those
functions.
Commit ac1e6586ec made the dynamically dispatched CastFromAnyType
functions use this macro too, but this functions dispatched on the
actual type of the AnyType object. The item is the output of the
CastFromAnyType functions.
This difference was missed in the refactoring, making CastFromAnyType
functions dereferencing the item pointer that is NULL at the time of
the dispatch.
Found by 'make -C tests valgrind'.
xen_xm.c: Dummy allocation via virDomainChrDefNew is directly
overwritten and lost. Free 'script' in success path too.
vmx.c: Free virtualDev_string in success path too.
domain_conf.c: Free compression in success path too.
We can exploit the fact that gcc warns about int-to-pointer conversion
in ternary cond?(void*):(int) in order to prevent future mistakes of
calling VIR_FREE on a scalar lvalue. For example, between commits
158ba873 and 802e2df, we would have had this warning:
cc1: warnings being treated as errors
remote.c: In function 'remoteDispatchListNetworks':
remote.c:3684:70: error: pointer/integer type mismatch in conditional expression
There are still a number of places that malloc into a const char*;
while it would probably be worth scrubbing them to use char*
instead, that is a separate patch, so we have to cast away const
in VIR_FREE for now.
* src/util/memory.h (VIR_FREE): Make gcc warn about integers.
Iteratively developed from a patch by Christophe Fergeau.
mingw lacks the counterpart to PTHREAD_MUTEX_INITIALIZER, so the
best we can do is portably expose once-only runtime initialization.
* src/util/threads.h (virOnceControlPtr): New opaque type.
(virOnceFunc): New callback type.
(virOnce): New prototype.
* src/util/threads-pthread.h (virOnceControl): Declare.
(VIR_ONCE_CONTROL_INITIALIZER): Define.
* src/util/threads-win32.h (virOnceControl)
(VIR_ONCE_CONTROL_INITIALIZER): Likewise.
* src/util/threads-pthread.c (virOnce): Implement in pthreads.
* src/util/threads-win32.c (virOnce): Implement in WIN32.
* src/libvirt_private.syms: Export it.
This patch strips reusable part of qemuDomainUpdateDeviceFlags()
and consolidate it to qemuDomainModifyDeviceFlags().
No functional changes.
* src/qemu/qemu_driver.c
(qemuDomainChangeDiskMediaLive) : pulled out code for updating disks.
(qemuDomainUpdateDeviceLive) : core of UpdateDevice, extracted from
UpdateDeviceFlags()
(qemuDomainModifyDeviceFlags): add support for updating device in live domain.
(qemuDomainUpdateDeviceFlags): reworked as a wrapper function of
qemuDomainModifyDeviceFlags()
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
clean up At(De)tachDeviceFlags() for consolidation.
qemuDomainAttachDeviceFlags()/qemuDomainDetachFlags()/
qemuDomainUpdateDeviceFlags() has similar logics and copied codes.
This patch series tries to unify them to use shared code when it can.
At first, clean up At(De)tachDeviceFlags() and devide it into functions.
By this, this patch pulls out shared components between functions.
Based on patch series by Eric Blake, I added some modification as
switch-case with QEMU_DEVICE_ATTACH, QEMU_DEVICE_DETACH, QEMU_DEVICE_UPDATE
* src/qemu/qemu_driver.c
(qemuDomainAt(De)tachDeviceFlags) : pulled out to qemuDomainModifyDeviceFlags()
(qemuDomainModifyDeviceFlags) : implements generic code for modifying domain.
(qemuDomainAt(De)tachDeviceFlagsLive) : code for at(de)taching devices to
domain in line. no changes in logic from old code.
(qemuDomainAt(De)tachDeviceDiskLive) : for at(de)taching Disks.
(qemuDomainAt(De)tachDeviceControllerLive) : for at(de)taching Controllers
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Centralize device modification in the more flexible APIs, to allow future
honoring of additional flags. Explicitly reject the
VIR_DOMAIN_DEVICE_MODIFY_FORCE flag on attach/detach.
Based on Eric Blake<eblake@redhat.com>'s work.
* src/qemu/qemu_driver.c
(qemudDomainAttachDevice)(qemudDomainAttachDeviceFlags): Swap bodies,rename...
(qemudDomainDetachDevice, qemudDomainDetachDeviceFlags): Likewise.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Up to now we missed parser for cpuinfo on x390(x) machines. Those machines
have only 1 thread, core, socket. What is missing is information about
CPU frequency.
The two ends of the pipe used for feeding QEMU tunnelled
migration data were interchanged, so QEMU got given the
"write" end instead of the "read" end.
The qemuMigrationPrepareTunnel method was also immediately
closing the "write" end of the pipe, so the stream failed
to actually write anything.
* src/qemu/qemu_migration.c: Swap tunnelled migration
pipe FDs & don't close pipe given to stream
Here is a new version of this patch:
https://www.redhat.com/archives/libvir-list/2011-April/msg00337.html
v2:
- store the cputune info for the whole runtime of the domain
- remove cputune info when domain is destroyed
The nodeGetInfo code had to be moved into a helper
function to reuse it without a virConnectPtr.
Rather than copying and pasting lots of code, factor it into a
single helper function.
This commit adds a warning if tighter integer parsing would fail
due to any stray bytes after the number, but should not change
any behavior other than the bug fix for phypNumDomainsGeneric
looking only at numeric lines.
* src/phyp/phyp_driver.c (phypExecInt): New function.
(phypGetVIOSPartitionID, phypNumDomainsGeneric, phypGetLparID)
(phypGetLparMem, phypGetLparCPUGeneric, phypGetRemoteSlot)
(phypGetVIOSNextSlotNumber, phypAttachDevice)
(phypGetStoragePoolSize, phypStoragePoolNumOfVolumes)
(phypNumOfStoragePools, phypInterfaceDestroy)
(phypInterfaceDefineXML, phypInterfaceLookupByName)
(phypInterfaceIsActive, phypNumOfInterfaces): Use it.
(phypNumDomainsGeneric): Correctly find numeric line.
This last minute addition caused a build failure
cc1: warnings being treated as errors
qemu/qemu_process.c: In function 'qemuProcessHandleWatchdog':
qemu/qemu_process.c:436:34: error: ignoring return value of 'virDomainObjUnref', declared with attribute warn_unused_result [-Wunused-result]
make[3]: *** [libvirt_driver_qemu_la-qemu_process.lo] Error 1
This patch does the following two things:
1. hold an extra reference while handling watchdog event
If the domain is not persistent, and qemu quits unexpectedly before
calling processWatchdogEvent(), vm will be freed and the function
processWatchdogEvent() will be dangerous.
2. unlock qemu driver and vm before returning from processWatchdogEvent()
When the function processWatchdogEvent() failed, we only free wdEvent,
but forget to unlock qemu driver and vm, free dumpfile.
We do not lock qemu_driver when calling virThreadPoolNew(). If it failed,
we will unlock qemu_driver. It is dangerous.
We may use this pool during auto starting domains. So we must create it before
calling qemuAutostartDomains(). Otherwise, libvirtd will crash.
Also mark error messages in block_stats.c for translation, add the
new macro to the msg_gen functions in cfg.mk and add block_stats.c
to po/POTFILES.in
commit d4601696 introduces two more generated files: esx_vi.generated.h
and esx_vi.generated.h. But we do not include them into dist file.
It will break building if using dist file to build.
Use the name 'ret' for all phypExec results, to make it easier
to wrap phypExec. Don't allow a possibly NULL ret through printf.
* src/phyp/phyp_driver.c (phypBuildVolume, phypDestroyStoragePool)
(phypBuildStoragePool, phypBuildLpar): Avoid NULL dereference.
(phypInterfaceDestroy): Avoid redundant free.
(phypVolumeLookupByPath, phypVolumeGetPath): Use consistent
naming.
Ever since commit ebc46f, the destroy function built two command
variants but only used one. I went with the variant that matches
the idiom used in the counterpart of phypBuildStoragePool.
* src/phyp/phyp_driver.c (phypDestroyStoragePool): Avoid
clobbering cmd. Fix error message typo.
This warnings come from partly generated code. Therefore, the best
solution is to mark them as potentially being unused using the
ATTRIBUTE_UNUSED macro. This is suggested by the gcc documentation.
Reported by Christophe Fergeau
This patch addresses:
https://bugzilla.redhat.com/show_bug.cgi?id=694382
In order to give each libvirt-created bridge a fixed MAC address,
commit 5754dbd56d, added code to create
a dummy tap device with guaranteed lowest MAC address and attach it to
the bridge. This tap device was given the name "${bridgename}-nic".
However, an interface device name must be IFNAMSIZ (15) characters or
less, so a valid ${bridgename} such as "verylongname123" (15
characters) would lead to an invalid tap device name
("verylongname123-nic" - 19 characters), and that in turn led to a
failure to bring up the network.
The solution is to shorten the part of the original name used to
generate the tap device name. However, simply truncating it is
insufficient, because the last few characters of an interface name are
often a number used to indicate one of a list of several similar
devices (for example, "verylongname123", "verylongname124", etc) and
simple truncation would lead to duplicate names (eg "verlongnam-nic"
and "verylongnam-nic"). So instead we take the first 8 characters of
$bridgename ("verylong" in the example), add on the final 3 bytes
("123"), then add "-nic" (so "verylong123-nic"). Not pretty, but it
is much more likely to generate a unique name, and is reproducible
(unlike, say, a random number).
Due to differences in /proc/cpuinfo the parsing of the cpu data is
different between architectures. On PPC /proc/cpuinfo looks like this:
[original formatting with tabs]
processor : 0
cpu : PPC970MP, altivec supported
clock : 2297.700000MHz
revision : 1.1 (pvr 0044 0101)
processor : 1
cpu : PPC970MP, altivec supported
clock : 2297.700000MHz
revision : 1.1 (pvr 0044 0101)
[..]
timebase : 14318000
platform : pSeries
model : IBM,8844-AC1
machine : CHRP IBM,8844-AC1
The patch adapts the parsing of the data found in /proc/cpuinfo.
/sys/devices/system/cpu/cpuX/topology/physical_package_id also
always returns -1. Check for it on ppc and make it '0' if found negative.
This patch enables the migration of Qemu VMs between hosts of different endianess. I tested this by migrating a i686 VM between a x86 and ppc64 host.
I am converting the 'int's in the VM's state header to uint32_t assuming this doesn't break compatibility with existing deployments other than Linux.
gcc 4.6 warns when a variable is initialized but isn't used afterwards:
vmware/vmware_driver.c:449:18: warning: variable 'vmxPath' set but not used [-Wunused-but-set-variable]
This patch fixes these warnings. There are still 2 offending files:
- vbox_tmpl.c: the variable is used inside an #ifdef and is assigned several
times outside of #ifdef. Fixing the warning would have required wrapping
all the assignment inside #ifdef which hurts readability.
vbox/vbox_tmpl.c: In function 'vboxAttachDrives':
vbox/vbox_tmpl.c:3918:22: warning: variable 'accessMode' set but not used [-Wunused-but-set-variable]
- esx_vi_types.generated.c: the name implies it's generated code and I
didn't want to dive into the code generator
esx/esx_vi_types.generated.c: In function 'esxVI_FileQueryFlags_Free':
esx/esx_vi_types.generated.c:1203:3: warning: variable 'item' set but not used [-Wunused-but-set-variable]
Make: passed
Make check: passed
Make syntax-check: passed
this is the commit to introduce the function to create new character
device definition for the domain as advised by Cole Robinson
<crobinso@redhat.com>.
The function is used on the relevant places and also new tests has
been added.
Signed-off-by: Michal Novotny <minovotn@redhat.com>
This extends the SPICE XML to allow variable compression settings for audio,
images and streaming:
<graphics type='spice' port='5901' tlsPort='-1' autoport='yes'>
<image compression='auto_glz'/>
<jpeg compression='auto'/>
<zlib compression='auto'/>
<playback compression='on'/>
</graphics>
All new elements are optional.
This fixes: https://bugzilla.redhat.com/show_bug.cgi?id=696660
While starting a network, if brSetForwardDelay() fails, we go to err1
where we want to access macTapIfName variable which was just
VIR_FREE'd a few lines above. Instead, keep macTapIfName until we are
certain of success.
The methods qemuDomain{Get,Set}{Memory,Blkio,Scheduler}Parameters
all forgot to do a check on virDomainIsActive(), resulting in bogus
error messages from later parts of their impl
* src/qemu/qemu_driver.c: Add missing checks on virDomainIsActive()
sizeof(domain->name) is the wrong thing. Instead of using strdup here
rewrite escape_specialcharacters to allocate the buffer itself.
Add a contains_specialcharacters to be used in phypOpen, as phypOpen is
not interested in the escaped version.
Don't pre-allocate 4kb per key, make phypVolumeGetKey allocate the memory.
Make phypBuildVolume return the volume key instead of using pre-allocated
memory to store it.
Also fix a memory leak in phypVolumeLookupByName when phypVolumeGetKey
fails. Fix another memory leak in phypVolumeLookupByPath in the success
path. Fix phypVolumeGetXMLDesc leaking voldef.key.
Move the virInterfacePtr declaration to the top of the
function to avoid jump uninitialized variable warnings
* src/phyp/phyp_driver.c: Fix var declaration
This is the implementation of the previous patch now using virInterface*
API. Ended up this patch got much more simpler, smaller and easier to
review. Here is some details:
* MAC size and interface name are fixed due to specifications on HMC,
both are created automatically and CAN'T be specified from user. They
have the following format:
* MAC: 122980003002
* Interface name: U9124.720.067BE8B-V3-C0
* I did replaced all the |grep|sed following the comments Eric Blake
did on the last patch.
* According to my last email, It's not possible to create a network
interface without assigning it to a specific lpar. Then, I am using
this very minimalistic XML file for testing:
<interface type='ethernet' name='LPAR01'>
</interface>
In this file I am using "name" as the lpar name which I am going to
assign the new network interface. I couldn't find a better way to
refer to it. Comments are welcome.
* Regarding the fact I am sleeping one second waiting for the HMC to
complete creation of the interface, I don't have means to check
if the whole process is done. All I do is execute a command, wait
until is complete (which is not enough in this case) check
the return and the exit status. The process of actually creating
a networking interface seems to take a little longer than just the
return of the ssh control.
In qemuDomainObjBeginJobWithDriver, when virCondWaitUntil timeouts,
the function tries to call qemuDriverLock with virDomainObj locked,
this causes the dead-lock problem. This patch fixes this.
Commit 9677cd33ee made it possible to
remove current entry when iterating through all hash entries. However,
it didn't properly handle a special case of removing first entry
assigned to a given key which contains several entries in its collision
list.
This patch implements the code to support virDomainSetMaxMemory API,
and to support VIR_DOMAIN_MEM_MAXIMUM flag in qemudDomainSetMemoryFlags function.
As a result, we can change the maximum memory size of inactive QEMU guests.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Move "returns" keyword from beginning of API doc lines
when it does not describe return values.
Maybe the API doc extractor could be changed to look for
"returns: " to avoid such confusion.
The libexec program libvirt_iohelper is only for libvirtd. If we build rpm
without libvirtd, we will receive the following messages:
Checking for unpackaged file(s): /usr/lib/rpm/check-files /home/wency/rpmbuild/BUILDROOT/libvirt-0.9.0-1.el6.x86_64
error: Installed (but unpackaged) file(s) found:
/usr/libexec/libvirt_iohelper
This patch adds support for the evaluation of TCP flags in nwfilters.
It adds documentation to the web page and extends the tests as well.
Also, the nwfilter schema is extended.
The following are some example for rules using the tcp flags:
<rule action='accept' direction='in'>
<tcp state='NONE' flags='SYN/ALL' dsptportstart='80'/>
</rule>
<rule action='drop' direction='in'>
<tcp state='NONE' flags='SYN/ALL'/>
</rule>
This patch adds virDomainSetMemoryFlags(,,VIR_DOMAIN_MEM_CURRENT) support
code to qemu driver.
Also, change virDomainObjIsActive to return bool, given its usage.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
This patch introduces VIR_DOMAIN_MEM_CURRENT flag and
modifies virDomainSetMemoryFlags function to support it.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
* .gnulib: Update to latest, for pipe2.
* bootstrap.conf (gnulib_modules): Add pipe2.
* src/util/event_poll.c (virEventPollInit): Use it, to avoid
problematic virSetCloseExec on mingw.
1) Both "qemuDomainStartWithFlags" and "qemuAutostartDomain" try to
restore the domain from managedsave'ed image if it exists (by
invoking "qemuDomainObjRestore"), but it unlinks the image even
if restoring fails, which causes data loss. (This problem exists
for "virsh managedsave dom; virsh start dom").
The fix for is to unlink the managed state file only if restoring
succeeded.
2) For "virsh save dom; virsh restore dom;", it can cause data
corruption if one reuse the saved state file for restoring. Add
doc to tell user about it.
3) In "qemuDomainObjStart", if "managed_save" is NULL, we shouldn't
fallback to start the domain, skipping it to cleanup as a incidental
fix. Discovered by Eric.
We should bind pci device to original driver when pciBindDeviceToStub() failed.
If the pci device is not bound to any driver before calling pciBindDeviceToStub(),
we should only unbind it from pci-stub. If it is bound to pci-stub, we should not
unbind it from pci-stub.
This patch do the following things:
1. rename the function as 'Unbind' is better than 'UnBind'.
2. pciUnbindDeviceFromStub() will be used in the function pciBindDeviceToStub() in
next patch. Float it up, instead of having to have a forward declaration
In file included from util/threads.c:31:
util/threads-pthread.c: In function 'virThreadSelfID':
util/threads-pthread.c:214: warning: cast from function call of type 'pthread_t' to non-matching type 'int' [-Wbad-function-cast]
* src/util/threads-pthread.c (virThreadSelfID) [!SYS_gettid]:
Add intermediate cast to silence gcc.
We're seeing bugs apparently resulting from thread unsafety of
libpciaccess, such as
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/726099
To prevent those, as suggested by danpb on irc, move the
nodeDeviceLock(driverState) higher into the callers. In
particular:
udevDeviceMonitorStartup should hold the lock while calling
udevEnumerateDevices(), and udevEventHandleCallback should hold it
over its entire execution.
It's not clear to me whether it is ok to hold the
nodeDeviceLock while taking the virNodeDeviceObjLock(dev) on a
device. If not, then the lock will need to be dropped around
the calling of udevSetupSystemDev(), and udevAddOneDevice()
may not actually be safe to call from higher layers with the
driverstate lock held.
libvirt 0.8.8 with this patch on it seems to work fine for me.
Assuming it looks ok and I haven't done anything obviously dumb,
I'll ask the bug submitters to try this patch.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
This patch adds max_processes option to qemu.conf which can be used to
override system default limit on number of processes that are allowed to
be running for qemu user.
cc1: warnings being treated as errors
libxl/libxl_driver.c: In function 'libxlDomainSetVcpusFlags':
libxl/libxl_driver.c:1570:14: error: cast from function call of type 'double' to non-matching type 'unsigned int' [-Wbad-function-cast]
libxl/libxl_driver.c:1578:15: error: cast from function call of type 'double' to non-matching type 'unsigned int' [-Wbad-function-cast]
This was the only use of floor() and ceil(), and floating-point
is overkill for power-of-two manipulations.
* src/libxl/libxl_driver.c (libxlDomainSetVcpusFlags): Avoid -lm
for trivial computations.
GCC is a little confused about the cast of beginthread/beginthreadex
from unsigned long -> void *. Go via an intermediate variable avoids
the bogus warning, and makes the code a little cleaner
* src/util/threads-win32.c: Avoid compiler warning in cast
The SCSI volumes get a better 'key' field based on the fully
qualified volume path. All SCSI volumes have a unique serial
available in hardware which can be obtained by sending a
suitable SCSI command. Call out to udev's 'scsi_id' command
to fetch this value
* src/storage/storage_backend_scsi.c: Improve volume key
field value stability and uniqueness
When initializing qemu guest capabilities, we should ignore qemu
binaries that we are not able to extract version/help info from since
they will be unusable for creating domains anyway. Ignoring them is also
much better than letting initialization of qemu driver fail.
A couple of functions were declared using the old style foo()
for no-parameters, instead of foo(void)
* src/xen/xen_hypervisor.c, tests/testutils.c: Replace () with (void)
in some function declarations
* m4/virt-compile-warnings.m4: Enable -Wold-style-definition
Replace openvz_readline with getline in several places to get rid of stack
allocated buffers to hold lines.
openvzReadConfigParam allocates memory for return values instead of
expecting a preexisting buffer.
This patch enables the relative backing file path support provided by
qemu-img create.
If a relative path is specified for the backing file, it is converted
to an absolute path using the storage pool path. The absolute path is
used to verify that the backing file exists. If the backing file exists,
the relative path is allowed and will be provided to qemu-img create.
Even with -Wuninitialized (which is part of autobuild.sh
--enable-compile-warnings=error), gcc does NOT catch this
use of an uninitialized variable:
{
if (cond)
goto error;
int a = 1;
error:
printf("%d", a);
}
which prints 0 (supposing the stack started life wiped) if
cond was true. Clang will catch it, but we don't use clang
as often. Using gcc -Wjump-misses-init catches it, but also
gives false positives:
{
if (cond)
goto error;
int a = 1;
return a;
error:
return 0;
}
Here, a was never used in the scope of the error block, so
declaring it after goto is technically fine (and clang agrees).
However, given that our HACKING already documents a preference
to C89 decl-before-statement, the false positive warning is
enough of a prod to comply with HACKING.
[Personally, I'd _really_ rather use C99 decl-after-statement
to minimize scope, but until gcc can efficiently and reliably
catch scoping and uninitialized usage bugs, I'll settle with
the compromise of enforcing a coding standard that happens to
reject false positives if it can also detect real bugs.]
* acinclude.m4 (LIBVIRT_COMPILE_WARNINGS): Add -Wjump-misses-init.
* src/util/util.c (__virExec): Adjust offenders.
* src/conf/domain_conf.c (virDomainTimerDefParseXML): Likewise.
* src/remote/remote_driver.c (doRemoteOpen): Likewise.
* src/phyp/phyp_driver.c (phypGetLparNAME, phypGetLparProfile)
(phypGetVIOSFreeSCSIAdapter, phypVolumeGetKey)
(phypGetStoragePoolDevice)
(phypVolumeGetPhysicalVolumeByStoragePool)
(phypVolumeGetPath): Likewise.
* src/vbox/vbox_tmpl.c (vboxNetworkUndefineDestroy)
(vboxNetworkCreate, vboxNetworkDumpXML)
(vboxNetworkDefineCreateXML): Likewise.
* src/xenapi/xenapi_driver.c (getCapsObject)
(xenapiDomainDumpXML): Likewise.
* src/xenapi/xenapi_utils.c (createVMRecordFromXml): Likewise.
* src/security/security_selinux.c (SELinuxGenNewContext):
Likewise.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainChangeEjectableMedia):
Likewise.
* src/qemu/qemu_process.c (qemuProcessWaitForMonitor): Likewise.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextGetPtyPaths):
Likewise.
* src/qemu/qemu_driver.c (qemudDomainShutdown)
(qemudDomainBlockStats, qemudDomainMemoryPeek): Likewise.
* src/storage/storage_backend_iscsi.c
(virStorageBackendCreateIfaceIQN): Likewise.
* src/node_device/node_device_udev.c (udevProcessPCI): Likewise.
If strdup("x509dname") or strdup("saslUsername") success, but
strdup(x509dname) or strdup(saslUsername) failed, subject->nidentity
is not the num elements of subject->identities, and we will leak some
memory.
When you happen to have a libvirtd binary compiled with the
libxenlight driver (say you have installed xen-4.1 libraries)
but not running a xen enabled system, then libvirtd fails to start.
The cause is that libxlStartup() returns -1 when failing to initialize
the library, and this propagates to virStateInitialize() which consider
this a failure. We should only exit libxlStartup with an error code
if something like an allocation error occurs, not if the driver failed
to initialize.
* src/libxl/libxl_driver.c: fix libxlStartup() to not return -1
when failing to initialize the libxenlight library
qemu driver uses a 4K buffer for reading qemu log file. This is enough
when only qemu's output is present in the log file. However, when
debugging messages are turned on, intermediate libvirt process fills the
log with a bunch of debugging messages before it executes qemu binary.
In such a case the buffer may become too small. However, we are not
really interested in libvirt messages so they can be filtered out from
the buffer.
It throws errors as long as the cgroup controller is not available,
regardless of whether we really want to use it to do setup or not,
which is not what we want, fixing it with throwing error when need
to use the controller.
And change "VIR_WARN" to "qemuReportError" for memory controller
incidentally.
We create a temporary file to save memory, and we will remove it after reading
memory to buffer. But we free the variable that contains the temporary filename
before we remove it. So we should free tmp after unlinking it.
strcase{cmp/str} have the drawback of being sensitive to the global
locale; this is unacceptable in a library setting. Prefer a
hard-coded C locale alternative for all but virsh, which is user
facing and where the global locale isn't changing externally.
* .gnulib: Update to latest, for c-strcasestr change.
* bootstrap.conf (gnulib_modules): Drop strcasestr, add c-strcase
and c-strcasestr.
* cfg.mk (sc_avoid_strcase): New rule.
(exclude_file_name_regexp--sc_avoid_strcase): New exception.
* src/internal.h (STRCASEEQ, STRCASENEQ, STRCASEEQLEN)
(STRCASENEQLEN): Adjust offenders.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextEjectMedia):
Likewise.
* tools/virsh.c (namesorter): Document exception.
If qemu quited unexpectedly when we call qemuMonitorJSONHMP(),
libvirt will crash.
Steps to reproduce this bug:
1. use gdb to attach libvirtd, and set a breakpoint in the function
qemuMonitorSetCapabilities()
2. start a vm
3. let the libvirtd to run until qemuMonitorJSONSetCapabilities() returns.
4. kill the qemu process
5. continue running libvirtd
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
If the monitor met a error, and we will call qemuProcessHandleMonitorEOF().
But we may try to send monitor command after qemuProcessHandleMonitorEOF()
returned. Then libvirtd will be blocked in qemuMonitorSend().
Steps to reproduce this bug:
1. use gdb to attach libvirtd, and set a breakpoint in the function
qemuConnectMonitor()
2. start a vm
3. let the libvirtd to run until qemuMonitorOpen() returns.
4. kill the qemu process
5. continue running libvirtd
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Currently libvirt's default logging is limited and it is difficult to
determine what was happening when a proglem occurred (especially on a
machines where one don't know the detail.) This patch helps to do that
by making additional logging available for the following events:
creating/defining/undefining domains
creating/defining/undefining/starting/stopping networks
creating/defining/undefining/starting/stopping storage pools
creating/defining/undefining/starting/stopping storage volumes.
* AUTHORS: add Naoya Horiguchi
* src/network/bridge_driver.c src/qemu/qemu_driver.c
src/storage/storage_driver.c: provide more VIR_INFO logging
Not sure if it's the correct way to add cputune xml for xend driver,
and besides, seems "xm driver" and "xen hypervisor" also support
vcpu affinity, do we need to add support for them too?
When domain startup, setting cpu affinity and cpu shares according
to the cputune xml specified in domain xml.
Modify "qemudDomainPinVcpu" to update domain config for vcpupin,
and modify "qemuSetSchedulerParameters" to update domain config
for cpu shares.
v1 - v2:
* Use "VIR_ALLOC_N" instead of "VIR_ALLOC_VAR"
* But keep raising error when it fails on adding vcpupin xml
entry, as I still don't have a better idea yet.
Implementations of following functions:
virDomainVcpupinIsDuplicate
virDomainVcpupinFindByVcpu
virDomainVcpupinAdd
Update "virDomainDefParseXML" to parse, and "virDomainDefFormatXML"
to build cputune xml, also implementations of new internal helper
functions.
v1 - v2:
* Resolve potential crash bug of "virDomainVcpupinAdd"
Also related new functions' declaration, and expose the new introduced
functions in libvirt_private.syms.
v1 - v2:
Don't expose "virAllocVar" in libvirt_private.syms
My earlier testing for commit 34fa0de0 was done while starting
just-built libvirt from an unconfined_t shell, where the fds happened
to work when transferring to qemu. But when installed and run under
virtd_t, failure to label the raw file (with no compression) or the
pipe (with compression) triggers SELinux failures when passing fds
over SCM_RIGHTS to svirt_t qemu.
* src/qemu/qemu_migration.c (qemuMigrationToFile): When passing
FDs, make sure they are labeled.
First fallout of fd: migration - it looks like SELinux enforcing
_does_ require fd labeling (running uninstalled libvirtd from an
unconstrained shell had no problems, but once faked out by doing
chcon `stat -c %C /usr/sbin/libvirtd` daemon/libvirtd
run_init $PWD/daemon/libvirtd
to run it with the same context as an init script service, and with
SELinux enforcing, I got a rather confusing failure:
error: Failed to save domain fedora_12 to fed12.img
error: internal error unable to send TAP file handle: No file descriptor supplied via SCM_RIGHTS
This fixes the error message, then I need to figure out a subsequent
patch that does the fsetfilecon() necessary to keep things happy.
It also appears that libvirtd hangs on a failed fd transfer; I don't
know if that needs an independent fix.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextSendFileHandle):
Improve message, since TAP is no longer only client.
* src/Makefile.am src/libvirt_private.syms configure.ac: share and
reuse the sexpr routines from sexpr.h of the old xen driver
* src/libxl/libxl_driver.c: implements libxlDomainXMLFromNative and
libxlDomainXMLToNative
Hook the virtual cpu functions to their libxenlight counterparts
* src/libxl/libxl_driver.c: implements libxlDomainSetVcpus,
libxlDomainGetVcpus, libxlDomainSetVcpusFlags,
libxlDomainGetVcpusFlags and libxlDomainPinVcpu
* src/libxl/libxl_conf.h: add the necessary fields to the driver
private structure
* src/libxl/libxl_driver.c: add lifecycle event support and entry
points for event(de)register(any)
New APIs are added allowing streaming of content to/from
storage volumes.
* include/libvirt/libvirt.h.in: Add virStorageVolUpload and
virStorageVolDownload APIs
* src/driver.h, src/libvirt.c, src/libvirt_public.syms: Stub
code for new APIs
* src/storage/storage_driver.c, src/esx/esx_storage_driver.c:
Add dummy entries in driver table for new APIs
The O_NONBLOCK flag doesn't work as desired on plain files
or block devices. Introduce an I/O helper program that does
the blocking I/O operations, communicating over a pipe that
can support O_NONBLOCK
* src/fdstream.c, src/fdstream.h: Add non-blocking I/O
on plain files/block devices
* src/Makefile.am, src/util/iohelper.c: I/O helper program
* src/qemu/qemu_driver.c, src/lxc/lxc_driver.c,
src/uml/uml_driver.c, src/xen/xen_driver.c: Update for
streams API change
Spawn the compressor ourselves, instead of requiring the shell.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Spawn
compression helper process when needed.
SELinux labeling and cgroup ACLs aren't required if we hand a
pre-opened fd to qemu. All the more reason to love fd: migration.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Skip steps
that are irrelevant in fd migration.
This points out that core dumps (still) don't work for root-squash
NFS, since the fd is not opened correctly. This patch should not
introduce any functionality change, it is just a refactoring to
avoid duplicated code.
* src/qemu/qemu_migration.h (qemuMigrationToFile): New prototype.
* src/qemu/qemu_migration.c (qemuMigrationToFile): New function.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag, doCoreDump): Use
it.
Direct access to an open file is so much simpler than passing
everything through a pipe!
* src/qemu/qemu_driver.c (qemudOpenAsUID)
(qemudDomainSaveImageClose): Delete.
(qemudDomainSaveImageOpen): Rename...
(qemuDomainSaveImageOpen): ...and drop read_pid argument. Use
virFileOpenAs instead of qemudOpenAsUID.
(qemudDomainSaveImageStartVM, qemudDomainRestore)
(qemudDomainObjRestore): Rename...
(qemuDomainSaveImageStartVM, qemuDomainRestore)
(qemDomainObjRestore): ...and simplify accordingly.
(qemudDomainObjStart, qemuDriver): Update callers.
This patch intentionally doesn't change indentation, in order to
make it easier to review the real changes.
* src/util/util.h (VIR_FILE_OP_RETURN_FD, virFileOperationHook):
Delete.
(virFileOperation): Rename...
(virFileOpenAs): ...and reduce parameters.
* src/util/util.c (virFileOperationNoFork, virFileOperation):
Rename and simplify.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Adjust caller.
* src/storage/storage_backend.c (virStorageBackendCreateRaw):
Likewise.
* src/libvirt_private.syms: Reflect rename.
Currently, the hook function in virFileOperation is extremely limited:
it must be async-signal-safe, and cannot modify any memory in the
parent process. It is much handier to return a valid fd and operate
on it in the parent than to deal with hook restrictions.
* src/util/util.h (VIR_FILE_OP_RETURN_FD): New flag.
* src/util/util.c (virFileOperationNoFork, virFileOperation):
Honor new flag.
This allows direct saves (no compression, no root-squash NFS) to use
the more efficient fd: migration, which in turn avoids a race where
qemu exec: migration can sometimes fail because qemu does a generic
waitpid() that conflicts with the pclose() used by exec:. Further
patches will solve compression and root-squash NFS.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Use new function
when there is no compression.
Latent bug introduced in commit 2d6a581960 (Aug 2009), but not exposed
until commit 1859939a (Jan 2011). Basically, when virExec creates a
pipe, it always marks libvirt's side as cloexec. If libvirt then
wants to hand that pipe to another child process, things work great if
the fd is dup2()'d onto stdin or stdout (as with stdin: or exec:
migration), but if the pipe is instead used as-is (such as with fd:
migration) then qemu sees EBADF because the fd was closed at exec().
This is a minimal fix for the problem at hand; it is slightly racy,
but no more racy than the rest of libvirt fd handling, including the
case of uncompressed save images. A more invasive fix, but ultimately
safer at avoiding leaking unintended fds, would be to _always and
atomically_ open all fds as cloexec in libvirt (thanks to primitives
like open(O_CLOEXEC), pipe2(), accept4(), ...), then teach virExec to
clear that bit for all fds explicitly marked to be handed to the child
only after forking.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Clear cloexec
flag.
* tests/qemuxml2argvtest.c (testCompareXMLToArgvFiles): Tweak test.
* src/util/logging.c (virLogStartup, virLogSetBufferSize):
Over-allocate, so that a debugger can just print the circular
buffer. Suggested by Daniel Veillard.
* src/Makefile.am (remote_protocol-structs): Flatten tabs.
* src/remote_protocol-structs: Likewise. Also add a hint to emacs
to make it easier to keep spaces in the file.
Otherwise, if something like doStopVcpus fails after the first
restore, a second restore is attempted and throws a useless
warning.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Avoid second
restore of state label.
The Open Nebula driver has been unmaintained since it was first
introduced. The only commits have been for tree-wide cleanups.
It also has a major design flaw, in that it only knows about guests
that it has created itself, which makes it of very limited use.
Discussions wrt evolution of the VMWare ESX driver, concluded that
it should limit itself to single-node ESX operation and not try to
manage the multi-node architecture of VirtualCenter. Open Nebula
is a cluster like Virtual Center, not a single node system, so
the same reasoning applies.
The DeltaCloud project includes an Open Nebula driver and is a much
better fit architecturally, since it is explicitly targetting the
distributed multihost cluster scenario.
Thus this patch deletes the libvirt Open Nebula driver with the
recommendation that people use DeltaCloud for managing it instead.
* configure.ac: Remove probe for xmlrpc & --with-one arg
* daemon/Makefile.am, daemon/libvirtd.c, src/Makefile.am: Remove
ONE driver build
* src/opennebula/one_client.c, src/opennebula/one_client.h,
src/opennebula/one_conf.c, src/opennebula/one_conf.h,
src/opennebula/one_driver.c, src/opennebula/one_driver.c: Delete
files
* autobuild.sh, libvirt.spec.in, mingw32-libvirt.spec.in: Remove
build rules for Open Nebula
* docs/drivers.html.in, docs/sitemap.html.in: Remove reference
to OpenNebula
* docs/drvone.html.in: Delete file
Add missing open curly brace between function declaration of non-linux
variant of qemudDomainInterfaceStats() and its body.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Sometimes, an asynchronous helper is started (such as a compressor
or iohelper program), but a later error means that we want to
abort that child. Make this easier.
Note that since daemons and virCommandRunAsync can't mix, the only
time virCommandFree can reap a process is if someone did
virCommandRunAsync for a non-daemon and didn't stash the pid.
* src/util/command.h (virCommandAbort): New prototype.
* src/util/command.c (_virCommand): Add new field.
(virCommandRunAsync, virCommandWait): Track whether pid was used.
(virCommandFree): Reap child if caller did not request pid.
(virCommandAbort): New function.
* src/libvirt_private.syms (command.h): Export it.
* tests/commandtest.c (test19): New test.
It doesn't make sense to run a daemon without synchronously
waiting for the child process to reply whether the daemon has
been kicked off and pidfile written yet.
* src/util/command.c (VIR_EXEC_RUN_SYNC): New constant.
(virCommandRun): Set temporary flag.
(virCommandRunAsync): Use it to prevent async runs of intermediate
child when spawning asynchronous daemon grandchild.
Child processes don't always reach _exit(); if they die from a
signal, then any messages should still be accurate. Most users
either expect a 0 status (thankfully, if status==0, then
WIFEXITED(status) is true and WEXITSTATUS(status)==0 for all
known platforms) or were filtering on WIFEXITED before printing
a status, but a few were missing this check. Additionally,
nwfilter_ebiptables_driver was making an assumption that works
on Linux (where WEXITSTATUS shifts and WTERMSIG just masks)
but fails on other platforms (where WEXITSTATUS just masks and
WTERMSIG shifts).
* src/util/command.h (virCommandTranslateStatus): New helper.
* src/libvirt_private.syms (command.h): Export it.
* src/util/command.c (virCommandTranslateStatus): New function.
(virCommandWait): Use it to also diagnose status from signals.
* src/security/security_apparmor.c (load_profile): Likewise.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Likewise.
* src/util/util.c (virExecDaemonize, virRunWithHook)
(virFileOperation, virDirCreate): Likewise.
* daemon/remote.c (remoteDispatchAuthPolkit): Likewise.
* src/nwfilter/nwfilter_ebiptables_driver.c (ebiptablesExecCLI):
Likewise.
Hotpluging host usb device by text mode will fail, because the monitor
command 'device_add' outputs 'husb: using...' if it succeeds, but we
think the command should not output anything.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Add the compiler attribute to ensure we don't introduce any more
ref bugs like were just patched in commit 9741f34, then explicitly
mark the remaining places in code that are safe.
* src/qemu/qemu_monitor.h (qemuMonitorUnref): Mark
ATTRIBUTE_RETURN_CHECK.
* src/conf/domain_conf.h (virDomainObjUnref): Likewise.
* src/conf/domain_conf.c (virDomainObjParseXML)
(virDomainLoadStatus): Fix offenders.
* src/openvz/openvz_conf.c (openvzLoadDomains): Likewise.
* src/vmware/vmware_conf.c (vmwareLoadDomains): Likewise.
* src/qemu/qemu_domain.c (qemuDomainObjBeginJob)
(qemuDomainObjBeginJobWithDriver)
(qemuDomainObjExitRemoteWithDriver): Likewise.
* src/qemu/qemu_monitor.c (QEMU_MONITOR_CALLBACK): Likewise.
Suggested by Daniel P. Berrange.
This simplifies several callers that were repeating checks already
guaranteed by util.c, and makes other callers more robust to now
reject directories. remote_driver.c was over-strict - access(,R_OK)
is only needed to execute a script file; a binary only needs
access(,X_OK) (besides, it's unusual to see a file with x but not
r permissions, whether script or binary).
* cfg.mk (sc_prohibit_access_xok): New syntax-check rule.
(exclude_file_name_regexp--sc_prohibit_access_xok): Exempt one use.
* src/network/bridge_driver.c (networkStartRadvd): Fix offenders.
* src/qemu/qemu_capabilities.c (qemuCapsProbeMachineTypes)
(qemuCapsInitGuest, qemuCapsInit, qemuCapsExtractVersionInfo):
Likewise.
* src/remote/remote_driver.c (remoteFindDaemonPath): Likewise.
* src/uml/uml_driver.c (umlStartVMDaemon): Likewise.
* src/util/hooks.c (virHookCheck): Likewise.
Steps to reproduce this bug:
1. virsh attach-disk domain --source diskimage --target sdb --sourcetype file --driver qemu --subdriver qcow2
error: Failed to attach disk
error: operation failed: adding scsi-disk,bus=scsi0.0,scsi-id=1,drive=drive-scsi0-0-1,id=scsi0-0-1 device failed: Property 'scsi-disk.drive' can't find value 'drive-scsi0-0-1'
2. service libvirtd restart
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]
3. virsh attach-disk domain --source diskimage --target sdb --sourcetype file --driver qemu --subdriver raw
error: Failed to attach disk
error: operation failed: adding lsi,id=scsi0,bus=pci.0,addr=0x6 device failed: Duplicate ID 'scsi0' for device
The reason is that we create a new scsi controller but we do not update
/var/run/libvirt/qemu/domain.xml.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
The ref count was assigned to 1 at creation, then never modified again
until it was decremented just before freeing the object.
* src/conf/domain_conf.h (_virDomainSnapshotObj): Delete unused
field.
(virDomainSnapshotObjUnref): Delete unused prototype.
* src/libvirt_private.syms: Likewise.
* src/conf/domain_conf.c (virDomainSnapshotObjNew)
(virDomainSnapshotObjListDataFree): Update users.
(virDomainSnapshotObjUnref): Delete.
Problem:
"parser.head" is not NULL even if it's free'ed by "virJSONValueFree",
returning "parser.head" when "virJSONValueFromString" fails will cause
unexpected errors (libvirtd will crash sometimes), e.g.
In function "qemuMonitorJSONArbitraryCommand":
if (!(cmd = virJSONValueFromString(cmd_str)))
goto cleanup;
if (qemuMonitorJSONCommand(mon, cmd, &reply) < 0)
goto cleanup;
......
cleanup:
virJSONValueFree(cmd);
It will continues to send command to monitor even if "virJSONValueFromString"
is failed, and more worse, it trys to free "cmd" again.
Crash example:
{"error":{"class":"QMPBadInputObject","desc":"Expected 'execute' in QMP input","data":{"expected":"execute"}}}
{"error":{"class":"QMPBadInputObject","desc":"Expected 'execute' in QMP input","data":{"expected":"execute"}}}
error: server closed connection:
error: unable to connect to '/var/run/libvirt/libvirt-sock', libvirtd may need to be started: Connection refused
error: failed to connect to the hypervisor
This fix is to:
1) return NULL for failure of "virJSONValueFromString",
2) and it seems "virJSONValueFree" uses incorrect loop index for type
of "VIR_JSON_TYPE_OBJECT", fix it together.
* src/util/json.c
Steps to reproduce this bug:
# cat usb.xml
<hostdev mode='subsystem' type='usb'>
<source>
<address bus='0x001' device='0x003'/>
</source>
</hostdev>
# virsh attach-device vm1 usb.xml
error: Failed to attach device from usb.xml
error: server closed connection:
The reason of this bug is that we set data.cgroup to NULL, and this will cause
libvirtd crashed.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
A future patch will change reference counting idioms; consolidating
this pattern now makes the next patch smaller (touch only the new
macro rather than every caller).
* src/qemu/qemu_monitor.c (QEMU_MONITOR_CALLBACK): New helper.
(qemuMonitorGetDiskSecret, qemuMonitorEmitShutdown)
(qemuMonitorEmitReset, qemuMonitorEmitPowerdown)
(qemuMonitorEmitStop, qemuMonitorEmitRTCChange)
(qemuMonitorEmitWatchdog, qemuMonitorEmitIOError)
(qemuMonitorEmitGraphics): Use it to reduce duplication.
This patch introduces PREASSOCIATE-RR during incoming VM migration on the
destination host. This is similar to the usage of PREASSOCIATE during
migration in 8021qbg libvirt code today. PREASSOCIATE-RR is a VDP operation.
With the latest at IEEE, 8021qbh will need to support VDP operations.
A corresponding enic driver patch to support PREASSOCIATE-RR for 8021qbh
will be posted for net-next-2.6 inclusion soon.
THe veth setup in LXC had a couple of flaws, first brInit did
not report any error when it failed. Second vethCreate() did
not correctly initialize the variable containing the return
code, so could report failure even when it succeeded.
* src/lxc/lxc_driver.c: Report error when brInit fails
* src/lxc/veth.c: Fix uninitialized variable
Enhance the QEMU migration monitoring loop, so that it can get
a signal to change migration speed on the fly
* src/qemu/qemu_domain.h: Add signal for changing speed on the fly
* src/qemu/qemu_driver.c: Wire up virDomainMigrateSetSpeed driver
* src/qemu/qemu_migration.c: Support signal for changing speed
It is possible to set a migration speed limit when starting
migration. This new API allows the speed limit to be changed
on the fly to adjust to changing conditions
* src/driver.h, src/libvirt.c, src/libvirt_public.syms,
include/libvirt/libvirt.h.in: Add virDomainMigrateSetMaxSpeed
* src/esx/esx_driver.c, src/lxc/lxc_driver.c,
src/opennebula/one_driver.c, src/openvz/openvz_driver.c,
src/phyp/phyp_driver.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c,
src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/vmware/vmware_driver.c, src/xen/xen_driver.c,
src/libxl/libxl_driver.c: Stub new API
Fix for bug https://bugzilla.redhat.com/show_bug.cgi?id=618970
The "prepare" hook is called very early in the VM statup process
before device labeling, so that it can allocate ressources not
managed by libvirt, such as DRBD, or for instance create missing
bridges and vlan interfaces.
* src/util/hooks.c src/util/hooks.h: add definitions for new hooks
VIR_HOOK_QEMU_OP_PREPARE and VIR_HOOK_QEMU_OP_RELEASE
* src/qemu/qemu_process.c: use them in qemuProcessStart and
qemuProcessStop()
With only a single caller to these two monitor commands, I
didn't need to wrap a new WithFds version, but just change
the command itself.
* src/qemu/qemu_monitor.h (qemuMonitorAddNetdev)
(qemuMonitorAddHostNetwork): Add parameters.
* src/qemu/qemu_monitor.c (qemuMonitorAddNetdev)
(qemuMonitorAddHostNetwork): Add support for fd passing.
* src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Use it to
simplify code.
This is also a bug fix - on the error path, qemu_hotplug would
leave the configfd file leaked into qemu. At least the next
attempt to hotplug a PCI device would reuse the same fdname,
and when the qemu getfd monitor command gets a new fd by the
same name as an earlier one, it closes the earlier one, so there
is no risk of qemu running out of fds.
* src/qemu/qemu_monitor.h (qemuMonitorAddDeviceWithFd): New
prototype.
* src/qemu/qemu_monitor.c (qemuMonitorAddDevice): Move guts...
(qemuMonitorAddDeviceWithFd): ...to new function, and add support
for fd passing.
* src/qemu/qemu_hotplug.c (qemuDomainAttachHostPciDevice): Use it
to simplify code.
Suggested by Daniel P. Berrange.
qemu_monitor was already returning -1 and setting errno to EINVAL
on any attempt to send an fd without a unix socket, but this was
a silent failure in the case of qemuDomainAttachHostPciDevice.
Meanwhile, qemuDomainAttachNetDevice was doing some sanity checking
for a better error message; it's better to consolidate that to a
central point in the API.
* src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Move sanity
checking...
* src/qemu/qemu_monitor.c (qemuMonitorSendFileHandle): ...into
central location.
Suggested by Chris Wright.
https://bugzilla.redhat.com/show_bug.cgi?id=684655 points out
a regression introduced in commit 2215050edd - non-root users
can't connect to qemu:///session because libvirtd dies when
it can't use pciaccess initialization.
* src/node_device/node_device_udev.c (udevDeviceMonitorStartup):
Don't abort udev driver (and libvirtd overall) if non-root user
can't use pciaccess.
Valgrind caught that our log wrap-around was going 1 past the end.
Regression introduced in commit b16f47a; previously the
buffer was static and size+1 bytes, but now it is dynamic and
exactly size bytes.
* src/util/logging.c (virLogStr): Don't write past end of log.
We have reported error in the function prepareCall(), and
the error is not only OOM error. So we should not report
OOM error in the function call() when prepareCall() failed.
If virFileIsExecutable is to replace access(file,X_OK), then
errno must be usable on failure.
* src/util/util.c (virFileIsExecutable): Set errno on failure.
The current description suggests that you always have to provide
a valid typeVer pointer. But if you want only the libvirt version
it's also possible to set type and typeVer to NULL to skip the
hypervisor part.
This patch enables cgroup controllers as much as possible by skipping
the creation of blkio controller when running with old kernels that
doesn't support multi-level directory for blkio controller.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
THREADS.txt states that the contents of vm should not be read or
modified while the vm lock is not held, but that the lock must not
be held while performing a monitor command. This fixes all the
offenders that I could find.
* src/qemu/qemu_process.c (qemuProcessStartCPUs)
(qemuProcessInitPasswords, qemuProcessStart): Don't modify or
refer to vm state outside lock.
* src/qemu/qemu_driver.c (qemudDomainHotplugVcpus): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainChangeGraphicsPasswords):
Likewise.
This is detailed in:
https://bugzilla.redhat.com/show_bug.cgi?id=688957
Since radvd is executed by daemonizing it, the attempt to exec the
radvd binary doesn't happen until after libvirtd has already received
an exit code from the intermediate forked process, so no error is
detected or logged by __virExec().
We can't require radvd as a prerequisite for the libvirt package (many
installations don't use IPv6, so they don't need it), so instead we
add in a check to verify there is an executable radvd binary prior to
trying to exec it.
When SASL is active, it was possible that we read and decoded
more data off the wire than we initially wanted. The loop
processing this data terminated after only one message to
avoid delaying the calling thread, but this could delay
event delivery. As long as there is decoded SASL data in
memory, we must process it, before returning to the poll()
event loop.
This is a counterpart to the same kind of issue solved in
commit 68d2c3482f
in a different area of the code
* src/remote/remote_driver.c: Process all pending SASL data
virExec would only resolved the binary to $PATH if no env
variables were being set. Since there is no execvep() API
in POSIX, we use virFindFileInPath to manually resolve
the binary and then use execv() instead of execvp().
Add a new xen driver based on libxenlight [1], which is the primary
toolstack starting with Xen 4.1.0. The driver is stateful and runs
privileged only.
Like the existing xen-unified driver, the libxenlight driver is
accessed with xen:// URI. Driver selection is based on the status
of xend. If xend is running, the libxenlight driver will not load
and xen:// connections are handled by xen-unified. If xend is not
running *and* the libxenlight driver is available, xen://
connections are deferred to the libxenlight driver.
V6:
- Address several code style issues noted by Daniel Veillard
- Make drive work with xen:/// URI
- Hold domain object reference while domain is injected in
libvirt event loop. Race found and fixed by Markus Groß.
V5:
- Ensure events are unregistered when domain private data
is destroyed. Discovered and fixed by Markus Groß.
V4:
- Handle restart of libvirtd, reconnecting to previously
started domains
- Rebased to current master
- Tested against Xen 4.1 RC7-pre (c/s 22961:c5d121fd35c0)
V3:
- Reserve vnc port within driver when autoport=yes
V2:
- Update to Xen 4.1 RC6-pre (c/s 22940:5a4710640f81)
- Rebased to current master
- Plug memory leaks found by Stefano Stabellini and valgrind
- Handle SHUTDOWN_crash domain death event
[1] http://lists.xensource.com/archives/html/xen-devel/2009-11/msg00436.html
Calling most hash APIs is not safe from inside of an iterator callback.
Exceptions are APIs that do not modify the hash table and removing
current hash entry from virHashFroEach callback.
This patch make all APIs which are not safe fail instead of just relying
on the callback being nice not calling any unsafe APIs.
Steps to reproduce this bug:
# cat test.sh
#! /bin/bash -x
virsh start domain
sleep 5
virsh qemu-monitor-command domain 'cpu_set 2 online' --hmp
# while true; do ./test.sh ; done
Then libvirtd will crash.
The reason is that:
we add a reference of obj when we open the monitor. We will reduce this
reference when we free the monitor.
If the reference of monitor is 0, we will free monitor automatically and
the reference of obj is reduced.
But in the function qemuDomainObjExitMonitorWithDriver(), we reduce this
reference again when the reference of monitor is 0.
It will cause the obj be freed in the function qemuDomainObjEndJob().
Then we start the domain again, and libvirtd will crash in the function
virDomainObjListSearchName(), because we pass a null pointer(obj->def->name)
to strcmp().
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
This bug was reported by Shi Jin(jinzishuai@gmail.com):
=============
# virsh attach-disk RHEL6RC /var/lib/libvirt/images/test3.img vdb \
--driver file --subdriver qcow2
Disk attached successfully
# virsh save RHEL6RC /var/lib/libvirt/images/memory.save
Domain RHEL6RC saved to /var/lib/libvirt/images/memory.save
# virsh restore /var/lib/libvirt/images/memory.save
error: Failed to restore domain from /var/lib/libvirt/images/memory.save
error: internal error unsupported driver name 'file'
for disk '/var/lib/libvirt/images/test3.img'
=============
We check the driver name when we start or restore VM, but we do
not check it while attaching a disk. This adds the same check on disk
driverName used in qemuBuildCommandLine to qemudDomainAttachDevice.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
As pointed out, locking the buffer from the signal handler
cannot been guaranteed to be safe, so to avoid any hazard
we prefer the trade off of dumping logs possibly messed up
by concurrent logging activity rather than risk a daemon
crash.
* src/util/logging.c: change virLogEmergencyDumpAll() to not
take any lock on the log buffer but reset buffer content variables
to an empty set before starting the actual dump.
Steps to reproduce this bug:
# virsh qemu-monitor-command domain 'cpu_set 2 online' --hmp
The domain has 2 cpus, and we try to set the third cpu online.
The qemu crashes, and this command will hang.
The reason is that the refs is not 1 when we unwatch the monitor.
We lock the monitor, but we do not unlock it. So virCondWait()
will be blocked.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
* Correct the documentation for cgroup: the swap_hard_limit indicates
mem+swap_hard_limit.
* Change cgroup private apis to: virCgroupGet/SetMemSwapHardLimit
Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
I'm proposing we make use of $PCIDIR/reset in qemu-kvm to reset
devices on VM reset. We need to add it to libvirt's list of
files that get ownership for device assignment.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
xen-unstable c/s 21118:28e5409e3fb3 bumped sysctl version to 8.
xen-unstable c/s 21212:de94884a669c introduced CPU pools feature,
adding another member to xen_domctl_getdomaininfo struct. Add a
corresponding domctl v7 struct in xen hypervisor sub-driver and
detect sysctl v8 during initialization.
The virCond of the remote_thread_call struct was leaked in some
places. This results in leaking the underlying mutex. Which in turn
leaks a handle on Windows.
Reported by Aliaksandr Chabatar and Ihar Smertsin.
A bug in libnl (see https://bugzilla.redhat.com/show_bug.cgi?id=677724
and https://bugzilla.redhat.com/show_bug.cgi?id=677725) makes it very
easy to create a failure to connect to the netlink socket when trying
to open a macvtap network device ("type='direct'" in domain interface
XML). When that error occurred (during a call to libnl's nl_connect()
from libvirt's nlComm(), there was no log message, leading virsh (for
example) to report "unknown error".
There were two other cases in nlComm where an error in a libnl
function might return with failure but no error reported. In all three
cases, this patch logs a message which will hopefully be more useful.
Note that more detailed information about the failure might be
available from libnl's nl_geterror() function, but it calls
strerror(), which is not threadsafe, so we can't use it.
If pool xml has no definition for "port", then "Segmentation fault"
happens when jumping to "cleanup:" to do "VIR_FREE(port)", as "port"
was not initialized in this situation.
* src/conf/storage_conf.c
POSIX states about dd:
If the bs=expr operand is specified and no conversions other than
sync, noerror, or notrunc are requested, the data returned from each
input block shall be written as a separate output block; if the read
returns less than a full block and the sync conversion is not
specified, the resulting output block shall be the same size as the
input block. If the bs=expr operand is not specified, or a conversion
other than sync, noerror, or notrunc is requested, the input shall be
processed and collected into full-sized output blocks until the end of
the input is reached.
Since we aren't using conv=sync, there is no zero-padding, but our
use of bs= means that a short read results in a short write. If
instead we use ibs= and obs=, then short reads are collected and dd
only has to do a single write, which can make dd more efficient.
* src/qemu/qemu_monitor.c (qemuMonitorMigrateToFile):
Avoid 'dd bs=', since it can cause short writes.
The virCommandNewArgs() method would free the virCommandPtr
if it failed to add the args. This meant errors reported in
virCommandAddArgSet() were lost. Simply removing the check
for errors from the constructor means they can be reported
correctly later
The virCommandAddEnvPassCommon() method failed to check for
errors before reallocating the cmd->env array, causing a
potential SEGV if cmd was NULL
The virCommandAddArgSet() method needs to validate that at
least 1 element in 'val's parameter is non-NULL, otherwise
code like
cmd = virCommandNew(binary)
virCommandAddAtg(cmd, "foo")
Would end up trying todo execve("foo"), if binary was
NULL.
The virSetNonBlock() API only allows enabling non-blocking
operations. It doesn't allow turning blocking back on. Add
a new API to allow arbitrary toggling.
* src/libvirt_private.syms, src/util/util.h
src/util/util.c: Add virSetBlocking
This patch fix a simple bug in virDomainSetMemoryFlags function.
The patch sent before lacks the consideration of the case
where the driver doesn't support virDomainSetMemoryFlags API.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
The current LXC I/O controller looks for HUP to detect
when a guest has quit. This isn't reliable as during
initial bootup it is possible that 'init' will close
the console and let mingetty re-open it. The shutdown
of containers was also flakey because it only killed
the libvirt I/O controller and expected container
processes to gracefully follow.
Change the I/O controller such that when it see HUP
or an I/O error, it uses kill($PID, 0) to see if the
process has really quit.
Change the container shutdown sequence to use the
virCgroupKillPainfully function to ensure every
really goes away
This change makes the use of the 'cpu', 'devices'
and 'memory' cgroups controllers compulsory with
LXC
* docs/drvlxc.html.in: Document that certain cgroups
controllers are now mandatory
* src/lxc/lxc_controller.c: Check if PID is still
alive before quitting on I/O error/HUP
* src/lxc/lxc_driver.c: Use virCgroupKillPainfully
This is the part allowing to dynamically resize the debug log
buffer from it's default 64kB size. The buffer is now dynamically
allocated.
It adds a new API virLogSetBufferSize() which resizes the buffer
If passed a zero size, the buffer is deallocated and we do the small
optimization of not formatting messages which are not output anymore.
On the daemon side, it just adds a new option log_buffer_size to
libvirtd.conf and call virLogSetBufferSize() if needed
* src/util/logging.h src/util/logging.c src/libvirt_private.syms:
make buffer dynamic and add virLogSetBufferSize() internal API
* daemon/libvirtd.conf: document the new log_buffer_size option
* daemon/libvirtd.c: read and use the new log_buffer_size option
Outgoing migration still uses a Unix socket and or exec netcat until
the next patch.
* src/qemu/qemu_migration.c (qemuMigrationPrepareTunnel):
Replace Unix socket with simpler pipe.
Suggested by Paolo Bonzini.
The newly added call to qemuAuditNetDevice in qemuPhysIfaceConnect was
assuming that res_ifname (the name of the macvtap device) was always
valid, but this isn't the case. If openMacvtapTap fails, it always
returns NULL, which would result in a segv.
Since the audit log only needs a record of devices that are actually
sent to qemu, and a failure to open the macvtap device means that no
device will be sent to qemu, we can solve this problem by only doing
the audit if openMacvtapTap is successful (in which case res_ifname is
guaranteed valid).
Normally dnsmasq will send a default route (the address of the host in
the network definition) to any client requesting an address via
DHCP. On an isolated network this makes no sense, as we have iptables
to prevent any traffic going out via that interface, so anything sent
that way would be dropped anyway.
This extra/unusable default route becomes problematic if you have
setup a guest with multiple network interfaces, with one connected to
an isolated network and another that provides connectivity to the
outside (example - one interface directly connecting to a physical
interface via macvtap, with a second connected to an isolated network
so that the host and guest can communicate (macvtap doesn't support
guest<->host communication without an external switch that supports
vepa, or reflecting all traffic back)). In this case, if the guest
chooses the default route of the isolated network, the guest will not
be able to get network traffic beyond the host.
To prevent dnsmasq from sending a default route, you can tell it to
send 0 bytes of data for the default route option (option number 3)
with --dhcp-option=3 (normally the data to send for the option would
follow the option number; no extra data means "don't send this option").
I have checked on RHEL5 (a good representative of the oldest supported
libvirt platforms) and its version of dnsmasq (2.45) does support
--dhcp-option, so this shouldn't create any compatibility problems.
As pointed on CVE-2011-1146, some API forgot to check the read-only
status of the connection for entry point which modify the state
of the system or may lead to a remote execution using user data.
The entry points concerned are:
- virConnectDomainXMLToNative
- virNodeDeviceDettach
- virNodeDeviceReAttach
- virNodeDeviceReset
- virDomainRevertToSnapshot
- virDomainSnapshotDelete
* src/libvirt.c: fix the above set of entry points to error on read-only
connections
By default, all dnsmasq processes share the same leases file. libvirt
also uses the --dhcp-lease-max option to control the maximum number of
leases allowed. The problem is that libvirt puts in a number equal to
the number of addresses in the range for the one network handled by a
single instance of dnsmasq, but dnsmasq checks the total number of
leases in the file (which could potentially contain many more).
The solution is to tell each instance of dnsmasq to create and use its
own leases file. (/var/lib/libvirt/network/<net-name>.leases).
This file is created by dnsmasq when it starts, but not deleted when
it exists. This is fine when the network is just being stopped, but if
the leases file was left around when a network was undefined, we could
end up with an ever-increasing number of dead files - instead, we
explicitly unlink the leases file when a network is undefined.
Note that Ubuntu carries a patch against an older version of libvirt for this:
hhttps://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/713071
ttp://bazaar.launchpad.net/~serge-hallyn/ubuntu/maverick/libvirt/bugall/revision/109
I was certain I'd also seen discussion of this on libvir-list or
libvirt-users, but couldn't find it.
This fixes a regression introduced in commit ad48df, and reported on
the libvirt-users list:
https://www.redhat.com/archives/libvirt-users/2011-March/msg00018.html
The problem in that commit was that we began searching a list of ip
address definitions (rather than just having one) to look for a dhcp
range or static host; when we didn't find any, our pointer (ipdef) was
left at NULL, and when ipdef was NULL, we returned without starting up
dnsmasq.
Previously dnsmasq was started even without any dhcp ranges or static
entries, because it's still useful for DNS services.
Another problem I noticed while investigating was that, if there are
IPv6 addresses, but no IPv4 addresses of any kind, we would jump out
at an ever higher level in the call chain.
This patch does the following:
1) networkBuildDnsmasqArgv() = all uses of ipdef are protected from
NULL dereference. (this patch doesn't change indentation, to make
review easier. The next patch will change just the
indentation). ipdef is intended to point to the first IPv4 address
with DHCP info (or the first IPv4 address if none of them have any
dhcp info).
2) networkStartDhcpDaemon() = if the loop looking for an ipdef with
DHCP info comes up empty, we then grab the first IPv4 def from the
list. Also, instead of returning if there are no IPv4 defs, we just
return if there are no IP defs at all (either v4 or v6). This way a
network that is IPv6-only will still get dnsmasq listening for DNS
queries.
3) in networkStartNetworkDaemon() - we will startup dhcp not just if there
are any IPv4 addresses, but also if there are any IPv6 addresses.
Currently a single storage volume with a broken backing file will disable the
whole storage pool. This can happen when the backing file is on some
unavailable network storage or if the backing volume is deleted, while the
storage volumes using it remain.
Since the storage pool can not be re-activated, re-creating the missing
or deleting the now useless volumes using libvirt only is not possible.
Fixing this is a little bit tricky:
1. virStorageBackendProbeTarget() only detects the missing backing file,
if the backing file format is not explicitly specified. If the
backing file is created using
kvm-img create -f qcow2 -o backing_fmt=qcow2,backing_file=... ...
no error is detected at this stage.
The new return code -3 signals that the backing file could not be
opened.
2. The backingStore.format must be >= 0, since values < 0 would break
virStorageVolTargetDefFormat() when dumping the XML data such as
<format type='...'/>
Because of this the format is faked as VIR_STORAGE_FILE_RAW.
3. virStorageBackendUpdateVolTargetInfo() always opens the backing file
and thus always detects a missing backing file.
Since it "only" updates the capacity, allocation, owner, group, mode
and SELinux label, just ignore errors at this stage, print an error
message and continue.
4. Using vol-dump on a broken volume still doesn't work, but at least
vol-destroy and pool-refresh do work now.
To reproduce:
dir=$(mktemp -d)
virsh pool-create-as tmp dir '' '' '' '' "$dir"
virsh vol-create-as --format qcow2 tmp back 1G
virsh vol-create-as --format qcow2 --backing-vol-format qcow2 --backing-vol back tmp cow 1G
virsh vol-delete --pool tmp back
virsh pool-refresh tmp
After the last step, the pool will be gone (because it was not persistent). As
long as the now broken image stays in the directory, you will not be able to
re-create or re-start the pool.
Signed-off-by: Philipp Hahn <hahn@univention.de>
This patch introduces a new libvirt API (virDomainSetMemoryFlags) and
a flag (virDomainMemoryModFlags).
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Opening raw network devices with the intent of passing those fds to
qemu is worth an audit point. This makes a multi-part audit: first,
we audit the device(s) that libvirt opens on behalf of the MAC address
of a to-be-created interface (which can independently succeed or
fail), then we audit whether qemu actually started the network device
with the same MAC (so searching backwards for successful audits with
the same MAC will show which fd(s) qemu is actually using). Note that
it is possible for the fd to be successfully opened but no attempt
made to pass the fd to qemu (for example, because intermediate
nwfilter operations failed) - no interface start audit will occur in
that case; so the audit for a successful opened fd does not imply
rights given to qemu unless there is a followup audit about the
attempt to start a new interface.
Likewise, when a network device is hot-unplugged, there is only one
audit message about the MAC being discontinued; again, searching back
to the earlier device open audits will show which fds that qemu quits
using (and yes, I checked via /proc/<qemu-pid>/fd that qemu _does_
close out the fds associated with an interface on hot-unplug). The
code would require much more refactoring to be able to definitively
state which device(s) were discontinued at that point, since we
currently don't record anywhere in the XML whether /dev/vhost-net was
opened for a given interface.
* src/qemu/qemu_audit.h (qemuAuditNetDevice): New prototype.
* src/qemu/qemu_audit.c (qemuAuditNetDevice): New function.
* src/qemu/qemu_command.h (qemuNetworkIfaceConnect)
(qemuPhysIfaceConnect, qemuOpenVhostNet): Adjust prototype.
* src/qemu/qemu_command.c (qemuNetworkIfaceConnect)
(qemuPhysIfaceConnect, qemuOpenVhostNet): Add audit points and
adjust parameters.
(qemuBuildCommandLine): Adjust caller.
* src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Likewise.
Since libvirt always passes /dev/net/tun to qemu via fd, we should
never trigger the cases where qemu tries to directly open the
device. Therefore, it is safer to deny the cgroup device ACL.
* src/qemu/qemu_cgroup.c (defaultDeviceACL): Remove /dev/net/tun.
* src/qemu/qemu.conf (cgroup_device_acl): Reflect this change.
qemu driver in libvirt gained support for creating domain snapshots
almost a year ago in libvirt 0.8.0. Since then we enabled QMP support
for qemu >= 0.13.0 but QMP equivalents of {save,load,del}vm commands are
not implemented in current qemu (0.14.0) so the domain snapshot support
is not very useful.
This patch detects when the appropriate QMP command is not implemented
and tries to use human-monitor-command (aka HMP passthrough) to run
it's HMP equivalent.
JSON monitor command implementation can now just directly call text
monitor implementation and it will be automatically encapsulated into
QMP's human-monitor-command.
Some qemu monitor event handlers were issuing inadequate warning when
virDomainSaveStatus() failed. They copied the message from I/O error
handler without customizing it to provide better information on why
virDomainSaveStatus() was called.
For newer qemu-img, the help string for "backing file format" is
"[-F backing_fmt]".
Fix the wrong logic error by commit e997c268.
* src/storage/storage_backend.c
Adding audit points showed that we were granting too much privilege
to qemu; it should not need any mknod rights to recreate any
devices. On the other hand, lxc should have all device privileges.
The solution is adding a flag parameter.
This also lets us restrict write access to read-only disks.
* src/util/cgroup.h (virCgroup*Device*): Adjust prototypes.
* src/util/cgroup.c (virCgroupAllowDevice)
(virCgroupAllowDeviceMajor, virCgroupAllowDevicePath)
(virCgroupDenyDevice, virCgroupDenyDeviceMajor)
(virCgroupDenyDevicePath): Add parameter.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Update clients.
* src/lxc/lxc_controller.c (lxcSetContainerResources): Likewise.
* src/qemu/qemu_cgroup.c: Likewise.
(qemuSetupDiskPathAllow): Also, honor read-only disks.
Although the cgroup device ACL controller path can be worked out
by researching the code, it is more efficient to include that
information directly in the audit message.
* src/util/cgroup.h (virCgroupPathOfController): New prototype.
* src/util/cgroup.c (virCgroupPathOfController): Export.
* src/libvirt_private.syms: Likewise.
* src/qemu/qemu_audit.c (qemuAuditCgroup): Use it.
Device names can be manipulated, so it is better to also log
the major/minor device number corresponding to the cgroup ACL
changes that libvirt made. This required some refactoring
of the relatively new qemu cgroup audit code.
Also, qemuSetupChardevCgroup was only auditing on failure, not success.
* src/qemu/qemu_audit.h (qemuDomainCgroupAudit): Delete.
(qemuAuditCgroup, qemuAuditCgroupMajor, qemuAuditCgroupPath): New
prototypes.
* src/qemu/qemu_audit.c (qemuDomainCgroupAudit): Rename...
(qemuAuditCgroup): ...and drop a parameter.
(qemuAuditCgroupMajor, qemuAuditCgroupPath): New functions, to
allow listing device major/minor in audit.
(qemuAuditGetRdev): New helper function.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Adjust callers.
* src/qemu/qemu_cgroup.c (qemuSetupDiskPathAllow)
(qemuSetupHostUsbDeviceCgroup, qemuSetupCgroup)
(qemuTeardownDiskPathDeny): Likewise.
(qemuSetupChardevCgroup): Likewise, fixing missing audit.
* src/qemu/qemu_audit.c (qemuDomainHostdevAudit): Avoid use of
"type", which has a pre-defined meaning.
(qemuDomainCgroupAudit): Likewise, as well as "item".
I noticed these while testing 'make dist'.
Parsing ./../src/util/event.c
Function comment for virEventRegisterDefaultImpl lacks description of return value
Function comment for virEventRunDefaultImpl lacks description of return value
Parsing ./../src/util/virterror.c
Missing comment for function virSetErrorLogPriorityFunc
* src/util/event.c (virEventRegisterDefaultImpl)
(virEventRunDefaultImpl): Document return types.
* src/util/virterror.c (virSetErrorLogPriorityFunc): Provide docs.
virRun gives pretty useful error output, let's not overwrite it unless there
is a good reason. Some places were providing more information about what
the commands were _attempting_ to do, however that's usually less useful from
a debugging POV than what actually happened.
as described at
http://wiki.debian.org/ToolChain/DSOLinkinghttps://fedoraproject.org/wiki/UnderstandingDSOLinkChange
otherwise the build fails on current Debian unstable with:
CCLD libvirtd
/usr/bin/ld: ../src/.libs/libvirt_driver_lxc.a(libvirt_driver_lxc_la-lxc_container.o): undefined reference to symbol 'capng_apply'
/usr/bin/ld: note: 'capng_apply' is defined in DSO //usr/lib/libcap-ng.so.0 so try adding it to the linker command line
CCLD libvirtd
/usr/bin/ld: ../src/.libs/libvirt_driver_storage.a(libvirt_driver_storage_la-storage_backend.o): undefined reference to symbol 'fgetfilecon'
/usr/bin/ld: note: 'fgetfilecon' is defined in DSO //lib/libselinux.so.1 so try adding it to the linker command line
//lib/libselinux.so.1: could not read symbols: Invalid operation
and similar errors.
On cygwin:
CC libvirt_driver_security_la-security_dac.lo
security/security_dac.c: In function 'virSecurityDACSetProcessLabel':
security/security_dac.c:618: warning: format '%d' expects type 'int', but argument 7 has type 'uid_t' [-Wformat]
We've done this before (see src/util/util.c).
* src/security/security_dac.c (virSecurityDACSetProcessLabel): On
cygwin, uid_t is a 32-bit long.
On cygwin:
CC libvirt_util_la-cgroup.lo
util/cgroup.c: In function 'virCgroupKillRecursiveInternal':
util/cgroup.c:1458: warning: implicit declaration of function 'virCgroupNew' [-Wimplicit-function-declaration]
* src/util/cgroup.c (virCgroupKill): Don't build on platforms
where virCgroupNew is unsupported.
Apparently some signals found on Unix are not exposed, this led
to a compilation failure
* src/util/logging.c: make code related to each signal dependant
upon the definition of that signal
The way to detach a USB disk is the same as that to detach a SCSI
disk. Rename this function and we can use it to detach a USB disk.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
When I use newest libvirt to save a domain, libvirtd will be deadlock.
Here is the output of gdb:
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f972a1fc710 (LWP 30265))]#0 0x000000351fe0e034 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
at qemu/qemu_driver.c:2074
ret=0x7f972a1fbbe0) at remote.c:2273
(gdb) thread 7
[Switching to thread 7 (Thread 0x7f9730bcd710 (LWP 30261))]#0 0x000000351fe0e034 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
(gdb) p *(virMutexPtr)0x6fdd60
$2 = {lock = {__data = {__lock = 2, __count = 0, __owner = 30261, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = "\002\000\000\000\000\000\000\000\065v\000\000\001", '\000' <repeats 26 times>, __align = 2}}
(gdb) p *(virMutexPtr)0x1a63ac0
$3 = {lock = {__data = {__lock = 2, __count = 0, __owner = 30265, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}},
__size = "\002\000\000\000\000\000\000\000\071v\000\000\001", '\000' <repeats 26 times>, __align = 2}}
(gdb) info threads
7 Thread 0x7f9730bcd710 (LWP 30261) 0x000000351fe0e034 in __lll_lock_wait () from /lib64/libpthread.so.0
6 Thread 0x7f972bfff710 (LWP 30262) 0x000000351fe0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
5 Thread 0x7f972b5fe710 (LWP 30263) 0x000000351fe0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
4 Thread 0x7f972abfd710 (LWP 30264) 0x000000351fe0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 3 Thread 0x7f972a1fc710 (LWP 30265) 0x000000351fe0e034 in __lll_lock_wait () from /lib64/libpthread.so.0
2 Thread 0x7f97297fb710 (LWP 30266) 0x000000351fe0b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
1 Thread 0x7f9737aac800 (LWP 30260) 0x000000351fe0803d in pthread_join () from /lib64/libpthread.so.0
The reason is that we will try to lock some object in callback function, and we may call event API with locking the same object.
In the function virEventDispatchHandles(), we unlock eventLoop before calling callback function. I think we should
do the same thing in the function virEventCleanupTimeouts() and virEventCleanupHandles().
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Not all applications have an existing event loop they need
to integrate with. Forcing them to implement the libvirt
event loop integration APIs is an undue burden. This just
exposes our simple poll() based implementation for apps
to use. So instead of calling
virEventRegister(....callbacks...)
The app would call
virEventRegisterDefaultImpl()
And then have a thread somewhere calling
static bool quit = false;
....
while (!quit)
virEventRunDefaultImpl()
* daemon/libvirtd.c, tools/console.c,
tools/virsh.c: Convert to public event loop APIs
* include/libvirt/libvirt.h.in, src/libvirt_private.syms: Add
virEventRegisterDefaultImpl and virEventRunDefaultImpl
* src/util/event.c: Implement virEventRegisterDefaultImpl
and virEventRunDefaultImpl using poll() event loop
* src/util/event_poll.c: Add full error reporting
* src/util/virterror.c, include/libvirt/virterror.h: Add
VIR_FROM_EVENTS
The event loop implementation is used by more than just the
daemon, so move it into the shared area.
* daemon/event.c, src/util/event_poll.c: Renamed
* daemon/event.h, src/util/event_poll.h: Renamed
* tools/Makefile.am, tools/console.c, tools/virsh.c: Update
to use new virEventPoll APIs
* daemon/mdns.c, daemon/mdns.c, daemon/Makefile.am: Update
to use new virEventPoll APIs
* src/util/logging.c: fix virLogDumpAllFD() to avoid snprintf, simplify
the code and provide more useful signal descriptions. Also remove an
unused variable.
For qemu names the primary vga as "qxl-vga":
1) if vram is specified for 2nd qxl device:
-vga qxl -global qxl-vga.vram_size=$SIZE \
-device qxl,id=video1,vram_size=$SIZE,...
2) if vram is not specified for 2nd qxl device, (use the default
set by global):
-vga qxl -global qxl-vga.vram_size=$SIZE \
-device qxl,id=video1,...
For qemu names all qxl devices as "qxl":
1) if vram is specified for 2nd qxl device:
-vga qxl -global qxl.vram_size=$SIZE \
-device qxl,id=video1,vram_size=$SIZE ...
2) if vram is not specified for 2nd qxl device:
-vga qxl -global qxl-vga.vram_size=$SIZE \
-device qxl,id=video1,...
"-global" is the only way to define vram_size for the primary qxl
device, regardless of how qemu names it, (It's not good a good
way, as original idea of "-global" is to set a global default for
a driver property, but to specify vram for first qxl device, we
have to use it).
For other qxl devices, as they are represented by "-device", could
specify it directly and seperately for each, and it overrides the
default set by "-global" if specified.
v1 - v2:
* modify "virDomainVideoDefaultRAM" so that it returns 16M as the
default vram_size for qxl device.
* vram_size * 1024 (qemu accepts bytes for vram_size).
* apply default vram_size for qxl device for which vram_size is
not specified.
* modify "graphics-spice" tests (more sensiable vram_size)
* Add an argument of virDomainDefPtr type for qemuBuildVideoDevStr,
to use virDomainVideoDefaultRAM in qemuBuildVideoDevStr).
v2 - v3:
* Modify default video memory size for qxl device from 16M to 24M
* Update codes to be consistent with changes on qemu_capabilities.*
virLogEmergencyDumpAll() allows to dump the content of the
debug buffer from within a signal handler. It saves to all
log file or stderr if none is found
* src/util/logging.h src/util/logging.c: add the new API
and cleanup the old virLogDump code
* src/libvirt_private.syms: exports it as a private symbol
Initially only the log actually written out by libvirt were
saved on the memory buffer, this patch forces all informations
including info and debug to be saved in memory too. This is
useful to get full data in case of crash.
This was also found while investigating
https://bugzilla.redhat.com/show_bug.cgi?id=670848
An EOF on a domain's monitor socket results in an event being queued
to handle the EOF. The handler calls qemuProcessHandleMonitorEOF. If
it is a transient domain, this leads to a call to
virDomainRemoveInactive, which removes the domain from the driver's
hashtable and unref's it. Nowhere in this code is the qemu driver lock
acquired.
However, all modifications to the driver's domain hashtable *must* be
done while holding the driver lock, otherwise the hashtable can become
corrupt, and (even more likely) another thread could call a different
hashtable function and acquire a pointer to the domain that is in the
process of being destroyed.
To prevent such a disaster, qemuProcessHandleMonitorEOF must get the
qemu driver lock *before* it gets the DomainObj's lock, and hold it
until it is finished with the DomainObj. This guarantees that nobody
else modifies the hashtable at the same time, and that anyone who had
already gotten the DomainObj from the hashtable prior to this call has
finished with it before we remove/destroy it.
This was found while researching the root cause of:
https://bugzilla.redhat.com/show_bug.cgi?id=670848
virDomainUnref should only be called with the lock held for the
virDomainObj in question. However, when a transient qemu domain gets
EOF on its monitor socket, it queues an event which frees the monitor,
which unref's the virDomainObj without first locking it. If another
thread has already locked the virDomainObj, the modification of the
refcount could potentially be corrupted. In an extreme case, it could
also be potentially unlocked by virDomainObjFree, thus left open to
modification by anyone else who would have otherwise waited for the
lock (not to mention the fact that they would be accessing freed
data!).
The solution is to have qemuMonitorFree lock the domain object right
before unrefing it. Since the caller to qemuMonitorFree doesn't expect
this lock to be held, if the refcount doesn't go all the way to 0,
qemuMonitorFree must unlock it after the unref.
In virFileOperation, the parent does a fallback to a non-fork
attempt if it detects that the child returned EACCES. However,
the child was calling _exit(-EACCES), which does _not_ appear
as EACCES in the parent.
* src/util/util.c (virFileOperation): Correctly pass EACCES from
child to parent.
virSecurityDAC{Set,Restore}ChardevCallback expect virSecurityManagerPtr,
but are passed virDomainObjPtr instead. This makes
virSecurityDACSetChardevLabel set a wrong uid/gid on chardevs. This
patch fixes this behaviour.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
This fixes a possible crash of libvirtd during its startup. When qemu
driver reconnects to running domains, it iterates over all domain
objects in a hash. When reconnecting to an associated qemu monitor
fails and the domain is transient, it's immediately removed from the
hash. Despite the fact that it's explicitly forbidden to do so. If
libvirtd is lucky enough, virHashForEach will access random memory when
the callback finishes and the deamon will crash.
Since it's trivial to fix virHashForEach to allow removal of hash
entries while iterating through them, I went this way instead of fixing
qemuReconnectDomain callback (and possibly others) to avoid deleting the
entries.
qemudDomainSaveImageStartVM was evil - it closed the incoming fd
argument on some, but not all, code paths, without informing the
caller about that action. No wonder that this resulted in
double-closes: https://bugzilla.redhat.com/show_bug.cgi?id=672725
* src/qemu/qemu_driver.c (qemudDomainSaveImageStartVM): Alter
signature, to avoid double-close.
(qemudDomainRestore, qemudDomainObjRestore): Update callers.
When a SPICE or VNC graphics controller is present, and sound is
piggybacked over a channel to the graphics device rather than
directly accessing host hardware, then there is no need to grant
host hardware access to that qemu process.
* src/qemu/qemu_cgroup.c (qemuSetupCgroup): Prevent sound with
spice, and with vnc when vnc_allow_host_audio is 0.
Reported by Daniel Berrange.
The kill() function doesn't exist on Win32, so it needs to be
checked for at build time & code disabled in cgroups
* configure.ac: Check for kill()
* src/util/cgroup.c: Stub out virCGroupKill* functions
when kill() isn't available
this is the patch to add support for multiple serial ports to the
libvirt Xen driver. It support both old style (serial = "pty") and
new style (serial = [ "/dev/ttyS0", "/dev/ttyS1" ]) definition and
tests for xml2sexpr, sexpr2xml and xmconfig have been added as well.
Written and tested on RHEL-5 Xen dom0 and working as designed but
the Xen version have to have patch for RHBZ #614004 but this patch
is for upstream version of libvirt.
Also, this patch is addressing issue described in RHBZ #670789.
Signed-off-by: Michal Novotny <minovotn@redhat.com>
this is the patch to fix the virDomainChrDefParseTargetXML() functionality
to parse the target port from XML if available. This is necessary for
multiple serial port support which is the second part of this patch.
Signed-off-by: Michal Novotny <minovotn@redhat.com>
The virCgroupKill method kills all PIDs found in a cgroup
The virCgroupKillRecursively method does this recursively
for child cgroups.
The virCgroupKillPainfully method does a recursive kill
several times in a row until everything has really died
Relax the restriction that the hash table key must be a string
by allowing an arbitrary hash code generator + comparison func
to be provided
* util/hash.c, util/hash.h: Allow any pointer as a key
* internal.h: Include stdbool.h as standard.
* conf/domain_conf.c, conf/domain_conf.c,
conf/nwfilter_params.c, nwfilter/nwfilter_gentech_driver.c,
nwfilter/nwfilter_gentech_driver.h, nwfilter/nwfilter_learnipaddr.c,
qemu/qemu_command.c, qemu/qemu_driver.c,
qemu/qemu_process.c, uml/uml_driver.c,
xen/xm_internal.c: s/char */void */ in hash callbacks
Since the deallocator is passed into the constructor of
a hash table it is not desirable to pass it into each
function again. Remove it from all functions, but provide
a virHashSteal to allow a item to be removed from a hash
table without deleteing it.
* src/util/hash.c, src/util/hash.h: Remove deallocator
param from all functions. Add virHashSteal
* src/libvirt_private.syms: Add virHashSteal
* src/conf/domain_conf.c, src/conf/nwfilter_params.c,
src/nwfilter/nwfilter_learnipaddr.c,
src/qemu/qemu_command.c, src/xen/xm_internal.c: Update
for changed hash API
Using the 'personality(2)' system call, we can make a container
on an x86_64 host appear to be i686. Likewise for most other
Linux 64bit arches.
* src/lxc/lxc_conf.c: Fill in 32bit capabilities for x86_64 hosts
* src/lxc/lxc_container.h, src/lxc/lxc_container.c: Add API to
check if an arch has a 32bit alternative
* src/lxc/lxc_controller.c: Set the process personality when
starting guest
This is done for two reasons:
- we are getting very close to 64 flags which is the maximum we can use
with unsigned long long
- by using LL constants in enum we already violates C99 constraint that
enum values have to fit into int
When spawning 'init' in the container, set
LIBVIRT_LXC_UUID=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
LIBVIRT_LXC_NAME=YYYYYYYYYYYY
to allow guest software to detect & identify that they
are in a container
* src/lxc/lxc_container.c: Set LIBVIRT_LXC_UUID and
LIBVIRT_LXC_NAME env vars
The virFileAbsPath was not taking into account the '/' directory
separator when allocating memory for combining cwd + path. Convert
to use virAsprintf to avoid this type of bug completely.
* src/util/util.c: Convert virFileAbsPath to use virAsprintf
Normal practice for /dev/pts is to have it mode=620,gid=5
but LXC was leaving mode=000,gid=0 preventing unprivilegd
users in the guest use of PTYs
* src/lxc/lxc_controller.c: Fix /dev/pts setup
Current code does an IFF_UP on a 8021Qbh interface immediately after a port
profile set. This is ok in most cases except when its the migration prepare
stage. During migration we want to postpone IFF_UP'ing the interface on the
destination host until the source host has disassociated the interface.
This patch moves IFF_UP of the interface to the final stage of migration.
The motivation for this change is to postpone any addr registrations on the
destination host until the source host has done the addr deregistrations.
While at it, for symmetry with associate move ifDown of a 8021Qbh interface
to before disassociate
Steps to reproduce this bug:
1. virsh attach-disk domain --source imagefile --target sdb --sourcetype file --driver qemu --subdriver raw
2. virsh detach-device controller.xml # remove scsi controller 0
3. virsh detach-disk domain sdb
error: Failed to detach disk
error: operation failed: detaching scsi0-0-1 device failed: Device 'scsi0-0-1' not found
I think we should not detach a controller when it is used by some other device.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Done mechanically with:
$ git grep -l '\bDEBUG0\? *(' | xargs -L1 sed -i 's/\bDEBUG0\? *(/VIR_&/'
followed by manual deletion of qemudDebug in daemon/libvirtd.c, along
with a single 'make syntax-check' fallout in the same file, and the
actual deletion in src/util/logging.h.
* src/util/logging.h (DEBUG, DEBUG0): Delete.
* daemon/libvirtd.h (qemudDebug): Likewise.
* global: Change remaining clients over to VIR_DEBUG counterpart.
The virConnectPtr struct will cache instances of all other
objects. APIs like virDomainLookupByUUID will return a
cached object, so if you do virDomainLookupByUUID twice in
a row, you'll get the same exact virDomainPtr instance.
This does not have any performance benefit, since the actual
logic in virDomainLookupByUUID (and other APIs returning
virDomainPtr, etc instances) is not short-circuited. All
it does is to ensure there is only one single virDomainPtr
in existance for any given UUID.
The caching has a number of downsides though, all relating
to stale data. If APIs aren't careful to always overwrite
the 'id' field in virDomainPtr it may become out of data.
Likewise for the name field, if a guest is renamed, or if
a guest is deleted, and then a new one created with the
same UUID but different name.
This has been an ongoing, endless source of bugs for all
applications using libvirt from languages with garbage
collection, causing them to get virDomainPtr instances
from long ago with stale data.
The caching is also a waste of memory resources, since
both applications, and language bindings often maintain
their own hashtable caches of object instances.
This patch removes all the hash table caching, so all
APIs return brand new virDomainPtr (etc) object instances.
* src/datatypes.h: Delete all hash tables.
* src/datatypes.c: Remove all object caching code
This patch adds the possibility to not just drop packets, but to also have them rejected where iptables at least sends an ICMP msg back to the originator. On ebtables this again maps into dropping packets since rejecting is not supported.
I am adding 'since 0.8.9' to the docs assuming this will be the next version of libvirt.
Two regressions:
Commit df1011ca broke builds for systems that lack devmapper
(non-Linux, as well as Linux with ./autogen.sh --without-libvirtd
and without the libraries present).
Commit ce6fd650 broke cross-compilation, due to a gnulib bug.
* .gnulib: Update to latest, for cross-compilation fix.
* src/util/util.c (virIsDevMapperDevice): Provide stub for
platforms not using storage driver.
* configure.ac (devmapper): Arrange to define HAVE_LIBDEVMAPPER_H.
devmapper issue reported by Wen Congyang.
Etienne Gosset reported that libvirt fails to connect to his ESX
server because it failed to parse its malformed host UUID, that
contains an additional space and lacks one hexdigit in the last
group:
xxxxxxxx-xxxx-xxxx-xxxx- xxxxxxxxxxx
Don't treat this as a fatal error, just ignore it.
Virsh freecell --all was not only getting wrong NUMA nodes count, but
even the NUMA nodes IDs. They doesn't have to be continuous, as I've
found out during testing this. Therefore a modification of
nodeGetCellsFreeMemory() error message.
$ ./configure
...
$ make
...
GEN libvirt.syms
...
$ ./configure --with-driver-modules
...
$ make
...
libvirt.syms doesn't get regenerated but it should as it should
contain virDriverLoadModule now.
* src/Makefile.am (libvirt.syms): Depend on configure changes.
Reported by Matthias Bolte.
There were several occurrences of an extra space inserted between
a function name and the ( opening the argument list in
datatypes.c. This is not consistent with the coding style used in
the rest of this file so removing this extra space makes the
code slightly more readable.
Now that the virHash handling functions call virReportOOMError by
themselves when needed, users of the virHash API no longer need to
do it by themselves. Since users of the virHash API were not
consistently calling virReportOOMError after memory failures from
the virHash code, this has the added benefit of making OOM
reporting from this code more consistent and reliable.
The only difference between these 2 functions is that one errors
out when the entry is already present while the other modifies
the existing entry. Add an helper function with a boolean argument
indicating whether existing entries should be updated or not, and
use this helper in both functions.
The code in virHashUpdateEntry and virHashAddEntry is really
similar. However, the latter rebalances the hash table when
one of its buckets contains too many elements while the former
does not. Fix this discrepancy.
When we attach a disk, but we specify a wrong format of disk image,
qemu monitor command drive_add will fail, but libvirt does not detect
this error.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Followup to commit 17e19add, and would have prevented the bug
independently fixed in commit 76c57a7c.
* src/util/logging.c (virLogMessage): Preserve errno, since
logging should be as unintrusive as possible.
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=609463
The problem was that, since a bridge always acquires the MAC address
of the connected interface with the numerically lowest MAC, as guests
are started and stopped, it was possible for the MAC address to change
over time, and this change in the network was being detected by
Windows 7 (it sees the MAC of the default route change), so on each
reboot it would bring up a dialog box asking about this "new network".
The solution is to create a dummy tap interface with a MAC guaranteed
to be lower than any guest interface's MAC, and attach that tap to the
bridge as soon as it's created. Since all guest MAC addresses start
with 0xFE, we can just generate a MAC with the standard "0x52, 0x54,
0" prefix, and it's guaranteed to always win (physical interfaces are
never connected to these bridges, so we don't need to worry about
competing numerically with them).
Note that the dummy tap is never set to IFF_UP state - that's not
necessary in order for the bridge to take its MAC, and not setting it
to UP eliminates the clutter of having an (eg) "virbr0-nic" displayed
in the output of the ifconfig command.
I chose to not auto-generate the MAC address in the network XML
parser, as there are likely to be consumers of that API that don't
need or want to have a MAC address associated with the
bridge.
Instead, in bridge_driver.c when the network is being defined, if
there is no MAC, one is generated. To account for virtual network
configs that already exist when upgrading from an older version of
libvirt, I've added a %post script to the specfile that searches for
all network definitions in both the config directory
(/etc/libvirt/qemu/networks) and the state directory
(/var/lib/libvirt/network) that are missing a mac address, generates a
random address, and adds it to the config (and a matching address to
the state file, if there is one).
docs/formatnetwork.html.in: document <mac address.../>
docs/schemas/network.rng: add nac address to schema
libvirt.spec.in: %post script to update existing networks
src/conf/network_conf.[ch]: parse and format <mac address.../>
src/libvirt_private.syms: export a couple private symbols we need
src/network/bridge_driver.c:
auto-generate mac address when needed,
create dummy interface if mac address is present.
tests/networkxml2xmlin/isolated-network.xml
tests/networkxml2xmlin/routed-network.xml
tests/networkxml2xmlout/isolated-network.xml
tests/networkxml2xmlout/routed-network.xml: add mac address to some tests
An upcoming patch has a use for a tap device to be created that
doesn't need to be actually put into the "up" state, and keeping it
"down" keeps the output of ifconfig from being unnecessarily cluttered
(ifconfig won't show down interfaces unless you add "-a").
bridge.[ch]: add "up" as an arg to brAddTap()
uml_conf.c, qemu_command.c: add "up" (set to "true") to brAddTap() call.
This is in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=629662
Explanation
qemu's virtio-net-pci driver allows setting the algorithm used for tx
packets to either "bh" or "timer". This is done by adding ",tx=bh" or
",tx=timer" to the "-device virtio-net-pci" commandline option.
'bh' stands for 'bottom half'; when this is set, packet tx is all done
in an iothread in the bottom half of the driver. (In libvirt, this
option is called the more descriptive "iothread".)
'timer' means that tx work is done in qemu, and if there is more tx
data than can be sent at the present time, a timer is set before qemu
moves on to do other things; when the timer fires, another attempt is
made to send more data. (libvirt retains the name "timer" for this
option.)
The resulting difference, according to the qemu developer who added
the option is:
bh makes tx more asynchronous and reduces latency, but potentially
causes more processor bandwidth contention since the cpu doing the
tx isn't necessarily the cpu where the guest generated the
packets.
Solution
This patch provides a libvirt domain xml knob to change the option on
the qemu commandline, by adding a new attribute "txmode" to the
<driver> element that can be placed inside any <interface> element in
a domain definition. It's use would be something like this:
<interface ...>
...
<model type='virtio'/>
<driver txmode='iothread'/>
...
</interface>
I chose to put this setting as an attribute to <driver> rather than as
a sub-element to <tune> because it is specific to the virtio-net
driver, not something that is generally usable by all network drivers.
(note that this is the same placement as the "driver name=..."
attribute used to choose kernel vs. userland backend for the
virtio-net driver.)
Actually adding the tx=xxx option to the qemu commandline is only done
if the version of qemu being used advertises it in the output of
qemu -device virtio-net-pci,?
If a particular txmode is requested in the XML, and the option isn't
listed in that help output, an UNSUPPORTED_CONFIG error is logged, and
the domain fails to start.
When the <driver> element (and its "name" attribute) was added to the
domain XML's interface element, a "backend" enum was simply added to
the toplevel of the virDomainNetDef struct.
Ignoring the naming inconsistency ("name" vs. "backend"), this is fine
when there's only a single item contained in the driver element of the
XML, but doesn't scale well as we add more attributes that apply to
the backend of the virtio-net driver, or add attributes applicable to
other drivers.
This patch changes virDomainNetDef in two ways:
1) Rename the item in the struct from "backend" to "name", so that
it's the same in the XML and in the struct, hopefully avoiding
confusion for someone unfamiliar with the function of the
attribute.
2) Create a "driver" union within virDomainNetDef, and a "virtio"
struct in that struct, which contains the "name" enum value.
3) Move around the virDomainNetParse and virDomainNetFormat functions
to allow for simple plugin of new attributes without disturbing
existing code. (you'll note that this results in a seemingly
redundant if() in the format function, but that will no longer be
the case as soon as a 2nd attribute is added).
In the future, new attributes for the virtio driver backend can be
added to the "virtio" struct, and any other network device backend that
needs an attribute will have its own struct added to the "driver"
union.
The introduction of the v3 migration protocol, along with
support for migration cookies, will significantly expand
the size of the migration code. Move it all to a separate
file to make it more manageable
The functions are not moved 100%. The API entry points
remain in the main QEMU driver, but once the public
virDomainPtr is resolved to the internal virDomainObjPtr,
all following code is moved.
This will allow the new v3 API entry points to call into the
same shared internal migration functions
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Add
qemuDomainFormatXML helper method
* src/qemu/qemu_driver.c: Remove all migration code
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Add
all migration code.
Move the qemudStartVMDaemon and qemudShutdownVMDaemon
methods into a separate file, renaming them to
qemuProcessStart, qemuProcessStop. All helper methods
called by these are also moved & renamed to match
* src/Makefile.am: Add qemu_process.c/.h
* src/qemu/qemu_command.c: Add qemuDomainAssignPCIAddresses
* src/qemu/qemu_command.h: Add VNC port min/max
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Add
domain event queue helpers
* src/qemu/qemu_driver.c, src/qemu/qemu_driver.h: Remove
all QEMU process startup/shutdown functions
* src/qemu/qemu_process.c, src/qemu/qemu_process.h: Add
all QEMU process startup/shutdown functions
The name convention of device mapper disk is different, and 'parted'
can't be used to delete a device mapper disk partition. e.g.
Name Path
-----------------------------------------
3600a0b80005ad1d7000093604cae912fp1 /dev/mapper/3600a0b80005ad1d7000093604cae912fp1
Error: Expecting a partition number.
This patch introduces 'dmsetup' to fix it.
Changes:
- New function "virIsDevMapperDevice" in "src/utils/utils.c"
- remove "is_dm_device" in "src/storage/parthelper.c", use
"virIsDevMapperDevice" instead.
- Requires "device-mapper" for 'with-storage-disk" in "libvirt.spec.in"
- Check "dmsetup" in 'configure.ac' for "with-storage-disk"
- Changes on "src/Makefile.am" to link against libdevmapper
- New entry for "virIsDevMapperDevice" in "src/libvirt_private.syms"
Changes from v1 to v3:
- s/virIsDeviceMapperDevice/virIsDevMapperDevice/g
- replace "virRun" with "virCommand"
- sort the list of util functions in "libvirt_private.syms"
- ATTRIBUTE_NONNULL(1) for virIsDevMapperDevice declaration.
e.g.
Name Path
-----------------------------------------
3600a0b80005ad1d7000093604cae912fp1 /dev/mapper/3600a0b80005ad1d7000093604cae912fp1
Vol /dev/mapper/3600a0b80005ad1d7000093604cae912fp1 deleted
Name Path
-----------------------------------------
"qemudDomainSaveFlag" goto wrong label "endjob", which will cause
error when security manager trying to restore label (regression).
As it's more reasonable to check if vm is shutoff immediately, and
return right away if it is, remove the checking in "qemudDomainSaveFlag",
and add checking in "qemudDomainSave".
* src/qemu/qemu_driver.c
* src/util/cgroup.c (virCgroupSetValueStr, virCgroupGetValueStr)
(virCgroupRemoveRecursively): VIR_DEBUG can clobber errno.
(virCgroupRemove): Use VIR_DEBUG rather than DEBUG.
The code expected that host CPU architecture matches the architecture on
which libvirt runs. This is normally true but not in tests, where host
CPU is faked to produce consistent results.
clang had 5 reports against virCommand; three were false positives
(a NULL deref in ProcessIO solved by sa_assert, and two uninitialized
memory operations solved by adding an initializer), but two were real.
* src/util/command.c (virCommandProcessIO): Fix real bug of
possible NULL dereference. Teach clang that buf is never NULL.
(virCommandRun): Teach clang that infd is only ever accessed when
initialized.
The processWatchdogEvent fix is real, although it can only trigger
on OOM, since bad things happen if doCoreDump is called with a NULL
pathname argument. The other fixes silence clang, but aren't a real
bug because virReportErrorHelper tolerates a NULL format string even
though *printf does not.
* src/qemu/qemu_driver.c (processWatchdogEvent): Exit on OOM.
(qemuDomainIsActive, qemuDomainIsPersistent, qemuDomainIsUpdated):
Provide valid message.
The SCSI storage backend leaks a string containing the pathname
for each block device it discovers
* src/storage/storage_backend_scsi.c: Free the device name
When creating the virDomain::snapshots hash table, virGetDomain
wasn't checking if the creation was successful. This would then
lead to failures in the vir*DomainSnapshot functions. Better to
report this error early and make virGetDomain fail if the
snapshots hash couldn't be created.
* src/datatypes.c: report failure to make a hash table
A couple of allocation were not calling virReportOOMError on allocation
errors
* src/util/hash.c: add the needed call in virHashCreate and
virHashAddOrUpdateEntry
"virStorageBackendCreateVols":
"names->next" serves as condition expression for "do...while",
however, "names" was shifted before, it then results in one less
loop, and thus, one less volume will be created for mpath pool,
the patch is to fix it.
* src/storage/storage_backend_mpath.c
clang complained that STREQ(group->controllers[i].mountPoint,...) was
a NULL dereference when i==VIR_CGROUP_CONTROLLER_CPUSET, because it
assumes the worst about virCgroupPathOfController. Marking the
argument const doesn't yet have an effect, per this clang bug:
http://llvm.org/bugs/show_bug.cgi?id=7758
So, we use sa_assert, which was designed to shut up false positives
from tools like clang.
* src/util/cgroup.c (virCgroupMakeGroup): Teach clang that there
is no NULL dereference.
This patch reorders the connlimit and comment match extensions relative to the state match (-m state); connlimit being most useful if found after a -m state --state NEW and not before it.
When formatting XML for smartcard device with mode=host, libvirt
generates invalid XML if the device has address info associated:
<smartcard mode='host' <address type='ccid' controller='0' slot='1'/>
Commit 9962e406c6 introduced a
problem where if the VM failed to startup, it would not be
correctly cleaned up. Amongst other things the SELinux
security label would not be removed, which prevents the VM
from ever starting again.
The virDomainIsActive() check at the start of qemudShutdownVMDaemon
checks for vm->def->id not being -1. By moving the assignment of the
VM id to the start of qemudStartVMDaemon, we can ensure cleanup will
occur on failure
* src/qemu/qemu_driver.c: Move initialization of 'vm->def->id'
so that qemudShutdownVMDaemon() will process the shutdown
When attaching a device that already exists, xend driver updates
the device with "device_configure", it causes problems (e.g. for
disk device, 'device_configure' only can be used to update device
like CDROM), on the other hand, we provide additional API
(virDomainUpdateDevice) to update device, this fix is to raise up
errors instead of updating the existed device which is not CDROM
device.
Changes from v1 to v2:
- allow to update CDROM
* src/xen/xend_internal.c
Building the 0.8.8 release candidate on cygwin produced this compiler
warning, which is indicative of catastrophic failure on any attempt to
print an error message with errno turned to a string:
CC strerror_r.lo
strerror_r.c: In function 'rpl_strerror_r':
strerror_r.c:67: warning: assignment makes integer from pointer without a cast
This has been fixed in gnulib.
* .gnulib: Update to latest, for strerror_r fix.
* src/util/memory.c (includes): Satisfy 'make syntax-check'.
Suspending a VM which contains shell meta characters doesn't work with
libvirt-0.8.7:
/var/log/libvirt/qemu/andreas_231-ne\ doch\ nicht.log:
sh: -c: line 0: syntax error near unexpected token `doch'
sh: -c: line 0: `cat | { dd bs=4096 seek=1 if=/dev/null && dd bs=1048576; }
Although target="andreas_231-ne doch nicht" contains shell meta
characters (here: blanks), they are not properly escaped by
src/qemu/qemu_monitor_{json,text}.c#qemuMonitor{JSON,Text}MigrateToFile()
First, the filename needs to be properly escaped for the shell, than
this command line has to be properly escaped for qemu again.
For this to work, remove the old qemuMonitorEscapeArg() wrapper, rename
qemuMonitorEscape() to it removing the handling for shell=TRUE, and
implement a new qemuMonitorEscapeShell() returning strings using single
quotes.
Using double quotes or escaping special shell characters with backslashes
would also be possible, but the set of special characters heavily
depends on the concrete shell (dsh, bash, zsh) and its setting (history
expansion, interactive use, ...)
Signed-off-by: Philipp Hahn <hahn@univention.de>
The logging functions are enhanced so that immediately prior to
the first log message being printed to any output channel, the
libvirt package version will be printed.
eg
$ LIBVIRT_DEBUG=1 virsh
18:13:28.013: 17536: info : libvirt version: 0.8.7
18:13:28.013: 17536: debug : virInitialize:361 : register drivers
...
The 'configure' script gains two new arguments which can be
used as
--with-packager="Fedora Project, x86-01.phx2.fedoraproject.org, 01-27-2011-18:00:10"
--with-packager-version="1.fc14"
to allow distros to append a custom string with package specific
data.
The RPM specfile is modified so that it appends the RPM version,
the build host, the build date and the packager name.
eg
$ LIBVIRT_DEBUG=1 virsh
18:14:52.086: 17551: info : libvirt version: 0.8.7, package: 1.fc13 (Fedora Project, x86-01.phx2.fedoraproject.org, 01-27-2011-18:00:10)
18:14:52.086: 17551: debug : virInitialize:361 : register drivers
Thus when distro packagers receive bug reports they can clearly
see what version was in use, even if the bug reporter mistakenly
or intentionally lies about version/builds
* src/util/logging.c: Output version data prior to first log message
* libvirt.spec.in: Include RPM release, date, hostname & packager
* configure.ac: Add --with-packager & --with-packager-version args
QEMUD_CMD_FLAG_PCI_MULTIBUS should be set in the function
qemuCapsExtractVersionInfo()
The flag QEMUD_CMD_FLAG_PCI_MULTIBUS is used in the function
qemuBuildDeviceAddressStr(). All callers get qemuCmdFlags
by the function qemuCapsExtractVersionInfo() except that
testCompareXMLToArgvFiles() in qemuxml2argvtest.c.
So we should set QEMUD_CMD_FLAG_PCI_MULTIBUS in the function
qemuCapsExtractVersionInfo() instead of qemuBuildCommandLine()
because the function qemuBuildCommandLine() does not be called
when we attach a pci device.
tests: set QEMUD_CMD_FLAG_PCI_MULTIBUS in testCompareXMLToArgvFiles()
set QEMUD_CMD_FLAG_PCI_MULTIBUS before calling qemuBuildCommandLine()
as the flags is not set by qemuCapsExtractVersionInfo().
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Quite a few hosts don't have cgroups mounted and so see warnings
from libvirt logged, which then cause bug reports, etc. Reduce
the log level to INFO so they're not visible by default
* src/qemu/qemu_driver.c: Reduce log level for cgroups
When run non-root the nwfilter driver logs error messages about
being unable to find iptables/ebtables commands (they are in
/sbin which isn't in $PATH). The nwfilter driver can't ever work
as non-root, so simply skip it entirely thus avoiding the error
messages
* src/conf/nwfilter_conf.h, src/nwfilter/nwfilter_driver.c,
src/nwfilter/nwfilter_gentech_driver.c,
src/nwfilter/nwfilter_gentech_driver.h: Pass 'bool privileged'
flag down to final driver impl
* src/nwfilter/nwfilter_ebiptables_driver.c: Skip initialization
if not privileged
A typo s/spice/vnc/ caused parsing of the spice 'auth' data
to write into the wrong part of the struct, blowing away
other unrelated data.
* src/conf/domain_conf.c: s/vnc/spice/ in parsing spice auth
Most of te VIR_INFO calls in the udev driver are only relevant
to developers so can switch to VIR_DEBUG. Failure to initialize
libpciaccess though is a fatal error
* src/node_device/node_device_udev.c: Adjust log levels
When probing machine types if the QEMU binary does not exist
we get a hard to diagnose error, due to the execve() in the
child failing
error: internal error Child process exited with status 1.
Add an explicit check so that we get
error: Cannot find QEMU binary /usr/libexec/qem3u-kvm: No such file or directory
* src/qemu/qemu_capabilities.c: Check for QEMU binary
To make it easier to investigate problems with async event
delivery, add two more debugging lines
* daemon/remote.c: Debug when an event is queued for dispatch
* src/remote/remote_driver.c: Debug when an event is received
for processing
When built as modules, the connection drivers live
in $LIBDIR/libvirt/drivers. Now we add lock manager
drivers, we need to distinguish. So move the existing
modules to 'connection-driver'
* src/Makefile.am: Move module install dir
* src/driver.c: Move module search dir
Some functionality run in virExec hooks may do I/O which
can trigger SIGPIPE. Renable SIGPIPE blocking around the
hook function
* src/util/util.c: Block SIGPIPE around hooks
The Linux kernel headers don't have a value for SCSI type 12,
but HAL source code shows this to be a 'raid'. Add workaround
for this type. Lower log level for unknown types since
this is not a fatal error condition. Include the device sysfs
path in the log output to allow identification of which device
has problems.
* src/node_device/node_device_udev.c: Add SCSI RAID type
libpciaccess has many bugs in its pci_system_init/cleanup
functions that makes calling them multiple times unwise.
eg it will double close() FDs, and leak other FDs.
* src/node_device/node_device_udev.c: Only initialize
libpciaccess once
Until now, user namespaces have not done much, but (for that
reason) have been innocuous to glob in with other CLONE_
flags. Upcoming userns development, however, will make tasks
cloned with CLONE_NEWUSER far more restricted. In particular,
for some time they will be unable to access files with anything
other than the world access perms.
This patch assumes that noone really needs the user namespaces
to be enabled. If that is wrong, then we can try a more
baroque patch where we create a file owned by a test userid with
700 perms and, if we can't access it after setuid'ing to that
userid, then return 0. Otherwise, assume we are using an
older, 'harmless' user namespace implementation.
Comments appreciated. Is it ok to do this?
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
The new virConnectGetSysinfo() API allows one to get the system
information associated to a connection host, providing the same
data as a guest that uses <os><smbios mode='host'/></os>, and in
a format that can be pasted into the guest and edited when using
<os><smbios mode='sysinfo'/></os>.
* include/libvirt/libvirt.h.in (virConnectGetSysinfo): Declare.
* src/libvirt_public.syms: Export new symbol.
qemu 0.13.0 (at least as built for Fedora 14, and also backported to
RHEL 6.0 qemu) supported an older syntax for a spicevmc channel; it's
not as flexible (it has an implicit name and hides the chardev
aspect), but now that we support spicevmc, we might as well target
both variants.
* src/qemu/qemu_capabilities.h (QEMUD_CMD_FLAG_DEVICE_SPICEVMC):
New flag.
* src/qemu/qemu_capabilities.c (qemuCapsParseDeviceStr): Set it
correctly.
* src/qemu/qemu_command.h (qemuBuildVirtioSerialPortDevStr): Drop
declaration.
* src/qemu/qemu_command.c (qemuBuildVirtioSerialPortDevStr): Alter
signature, check flag.
(qemuBuildCommandLine): Adjust caller and check flag.
* tests/qemuhelptest.c (mymain): Update test.
* tests/qemuxml2argvtest.c (mymain): New test.
* tests/qemuxml2argvdata/qemuxml2argv-channel-spicevmc-old.xml:
New file.
* tests/qemuxml2argvdata/qemuxml2argv-channel-spicevmc-old.args:
Likewise.
Adds <smartcard mode='passthrough' type='spicevmc'/>, which uses the
new <channel name='smartcard'/> of <graphics type='spice'>.
* docs/schemas/domain.rng: Support new XML.
* docs/formatdomain.html.in: Document it.
* src/conf/domain_conf.h (virDomainGraphicsSpiceChannelName): New
enum value.
(virDomainChrSpicevmcName): New enum.
(virDomainChrSourceDef): Distinguish spicevmc types.
* src/conf/domain_conf.c (virDomainGraphicsSpiceChannelName): Add
smartcard.
(virDomainSmartcardDefParseXML): Parse it.
(virDomainChrDefParseXML, virDomainSmartcardDefParseXML): Set
spicevmc name.
(virDomainChrSpicevmc): New enum conversion functions.
* src/libvirt_private.syms: Export new functions.
* src/qemu/qemu_command.c (qemuBuildChrChardevStr): Conditionalize
name.
* tests/qemuxml2argvtest.c (domain): New test.
* tests/qemuxml2argvdata/qemuxml2argv-smartcard-passthrough-spicevmc.args:
New file.
* tests/qemuxml2argvdata/qemuxml2argv-smartcard-passthrough-spicevmc.xml:
Likewise.
Inspired by https://bugzilla.redhat.com/show_bug.cgi?id=615757
Add a new character device backend for virtio serial channels that
activates the QEMU spice agent on the main channel using the vdagent
spicevmc connection. The <target> must be type='virtio', and supports
an optional name that specifies how the guest will see the channel
(for now, name must be com.redhat.spice.0).
<channel type='spicevmc'>
<target type='virtio'/>
<address type='virtio-serial' controller='1' bus='0' port='3'/>
</channel>
* docs/schemas/domain.rng: Support new XML.
* docs/formatdomain.html.in: Document it.
* src/conf/domain_conf.h (virDomainChrType): New enum value.
* src/conf/domain_conf.c (virDomainChr): Add spicevmc.
(virDomainChrDefParseXML, virDomainChrSourceDefParseXML)
(virDomainChrDefParseTargetXML): Parse and enforce proper use.
(virDomainChrSourceDefFormat, virDomainChrDefFormat): Format.
* src/qemu/qemu_command.c (qemuBuildChrChardevStr)
(qemuBuildCommandLine): Add qemu support.
* tests/qemuxml2argvtest.c (domain): New test.
* tests/qemuxml2argvdata/qemuxml2argv-channel-spicevmc.xml: New
file.
* tests/qemuxml2argvdata/qemuxml2argv-channel-spicevmc.args:
Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Qemu smartcard/spicevmc support exists on branches (such as
http://cgit.freedesktop.org/~alon/qemu/commit/?h=usb_ccid.v15&id=024a37b)
but is not yet upstream. The added -help output matches a scratch build
that will be close to the RHEL 6.1 qemu-kvm.
* src/qemu/qemu_capabilities.h (QEMUD_CMD_FLAG_CCID_EMULATED)
(QEMUD_CMD_FLAG_CCID_PASSTHRU, QEMUD_CMD_FLAG_CHARDEV_SPICEVMC):
New flags.
* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags)
(qemuCapsParseDeviceStr): Check for smartcard capabilities.
(qemuCapsExtractVersionInfo): Tweak comment.
* tests/qemuhelptest.c (mymain): New test.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel61: New file.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel61-device: Likewise.
* src/conf/domain_conf.h (virDomainSmartcardType): New enum.
(virDomainSmartcardDef, virDomainDeviceCcidAddress): New structs.
(virDomainDef): Include smartcards.
(virDomainSmartcardDefIterator): New typedef.
(virDomainSmartcardDefFree, virDomainSmartcardDefForeach): New
prototypes.
(virDomainControllerType, virDomainDeviceAddressType): Add ccid
enum values.
(virDomainDeviceInfo): Add ccid address type.
* src/conf/domain_conf.c (virDomainSmartcard): Convert between
enum and string.
(virDomainSmartcardDefParseXML, virDomainSmartcardDefFormat)
(virDomainSmartcardDefFree, virDomainDeviceCcidAddressParseXML)
(virDomainDefMaybeAddSmartcardController): New functions.
(virDomainDefParseXML): Parse the new XML.
(virDomainDefFormat): Convert back to XML.
(virDomainDefFree): Clean up.
(virDomainDeviceInfoIterate): Iterate over passthrough aliases.
(virDomainController, virDomainDeviceAddress)
(virDomainDeviceInfoParseXML, virDomainDeviceInfoFormat)
(virDomainDefAddImplicitControllers): Support new values.
* src/libvirt_private.syms (domain_conf.h): New exports.
* cfg.mk (useless_free_options): List new function.
Currently users who want to use virDomainQemuMonitorCommand() API or
it's virsh equivalent has to use the same protocol as libvirt uses for
communication to qemu. Since the protocol is QMP with current qemu and
HMP much more usable for humans, one ends up typing something like the
following:
virsh qemu-monitor-command DOM \
'{"execute":"human-monitor-command","arguments":{"command-line":"info kvm"}}'
which is not a very convenient way of debugging qemu.
This patch introduces --hmp option to qemu-monitor-command, which says
that the provided command is in HMP. If libvirt uses QMP to talk with
qemu, the command will automatically be converted into QMP. So the
example above is simplified to just
virsh qemu-monitor-command --hmp DOM "info kvm"
Also the result is converted from
{"return":"kvm support: enabled\r\n"}
to just plain HMP:
kvm support: enabled
If libvirt talks to qemu in HMP, --hmp flag is obviously a noop.
This patch fixes 2 occurrences of nla_put expression with a '!' in
front of them that basically prevented the detection that the buffer
is too small. However, code further below would then detect that the
buffer is too small when further parts are added to the netlink message.
* src/qemu/qemu_driver.c (qemudShutdownVMDaemon): Check that vm is
still active.
Reported by Wen Congyang as follows:
Steps to reproduce this bug:
1. use gdb to debug libvirtd, and set breakpoint in the function
qemuConnectMonitor()
2. start a vm, and the libvirtd will be stopped in qemuConnectMonitor()
3. kill -STOP $(cat /var/run/libvirt/qemu/<domain>.pid)
4. continue to run libvirtd in gdb, and libvirtd will be blocked in the
function qemuMonitorSetCapabilities()
5. kill -9 $(cat /var/run/libvirt/qemu/<domain>.pid)
Here is log of the qemu:
=========
LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin ...
char device redirected to /dev/pts/3
2011-01-27 09:38:48.101: shutting down
2011-01-27 09:41:26.401: shutting down
=========
The vm is shut down twice. I do not know whether this behavior has
side effect, but I think we should shutdown the vm only once.
When compiling libvirt with GCC 3.4.6 the following warning is being triggered quite a lot:
util/memory.h:60: warning: declaration of 'remove' shadows a global declaration
/usr/include/stdio.h:175: warning: shadowed declaration is here
Fix this by renaming the parameter to 'toremove'.
Depending if the qemu binary supports multiple pci-busses, the device
options will contain "bus=pci" or "bus=pci.0".
Only x86_64 and i686 seem to have support for multiple PCI-busses. When
a guest of these architectures is started, set the
QEMUD_CMD_FLAG_PCI_MULTIBUS flag.
Signed-off-by: Niels de Vos <ndevos@redhat.com>
In the SASL codepath we typically read far more data off the
wire than we immediately need. When using a connection from a
single thread this isn't a problem, since only our reply will
be pending (or an event we can handle directly). When using a
connection from multiple threads though, we may read the data
from replies from other threads. If those replies occur after
our own reply, they'll not be processed. The other thread will
then go into poll() and wait for its reply which has already
been received and decoded. The solution is to set poll() timeout
to 0 if there is pending SASL data.
* src/remote/remote_driver.c: Don't sleep in poll() if SASL
data exists
Command line building for incoming tunneled migration is missed,
as a result, all the tunneled migration will fail with "unknown
migration protocol".
* src/qemu/qemu_command.c
commit f1fe9671e was supposed to make sure we use files.h
macros to avoid double close, but it didn't work.
Meanwhile, virCommand is vastly superior to system(), fork(),
and popen() (also to virExec, but we haven't completed that
conversion), so enforce that, too.
* cfg.mk (sc_prohibit_close): Fix typo that excluded close, and
add pclose.
(sc_prohibit_fork_wrappers): New rule, for fork, system, and popen.
* .x-sc_prohibit_close: More exemptions.
* .x-sc_prohibit_fork_wrappers: New file.
* Makefile.am (syntax_check_exceptions): Ship new file.
* src/datatypes.c (virReleaseConnect): Tweak comment to avoid
false positive.
* src/util/files.h (VIR_CLOSE): Likewise.
* src/fdstream.c (virFDStreamOpenFile, virFDStreamCreateFile):
Use VIR_FORCE_CLOSE instead of close.
* tests/commandtest.c (mymain): Likewise.
* tools/virsh.c (editFile): Use virCommand instead of system.
* src/util/util.c (__virExec): Special case preservation of std
file descriptors to child.
Use it in all places where a memory or storage request size is converted
to a larger granularity. This avoids requesting too small memory or storage
sizes that could result from the truncation done by a simple division.
This extends the round up fix in 6002e0406c
to the whole codebase.
Instead of reporting errors for odd values in the VMX code round them up.
Update the QEMU Argv tests accordingly as the original memory size 219200
isn't a even multiple of 1024 and is rounded up to 215 megabyte now. Change
it to 219100 and 219136. Use two different values intentionally to make
sure that rounding up works.
Update virsh.pod accordingly, as rounding down and rejecting are replaced
by rounding up.
Fixes test failure that was overlooked after commit 1e1f7a8950.
* daemon/Makefile.am (check-local): Let 'make check' fail on error.
* daemon/test_libvirtd.aug: Move qemu-specific option...
* src/qemu/test_libvirtd_qemu.aug: ...into correct test.
* src/qemu/libvirtd_qemu.aug: Parse new option.
qemu allows the user to choose what io storage api should be used,
either the default (threads) or native (linux aio) which in the latter
case can result in better performance.
Based on a patch originally by Matthias Dahl.
Red Hat Bugzilla #591703
Signed-off-by: Eric Blake <eblake@redhat.com>
The refactoring of QEMU command startup was comitted with
a couple of VIR_WARN lines left in from debugging.
* src/qemu/qemu_driver.c: Remove log warning lines
When qemuMonitorSetCapabilities() fails, there is no need to
call qemuMonitorClose(), because the caller will already see
the error code and tear down the entire VM. The extra call to
qemuMonitorClose resulted in a double-free due to it removing
a ref count prematurely.
* src/qemu/qemu_driver.c: Remove premature close of monitor
Regression in commit caa805ea let a lot of bad messages slip in.
* cfg.mk (msg_gen_function): Fix function name.
* src/qemu/qemu_cgroup.c (qemuRemoveCgroup): Fix fallout from
'make syntax-check'.
* src/qemu/qemu_driver.c (qemudDomainGetInfo)
(qemuDomainWaitForMigrationComplete, qemudStartVMDaemon)
(qemudDomainSaveFlag, qemudDomainAttachDevice)
(qemuDomainUpdateDeviceFlags): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainAttachHostUsbDevice)
(qemuDomainDetachPciDiskDevice, qemuDomainDetachSCSIDiskDevice):
Likewise.
When attaching device from a xml file and the device is mis-configured,
virsh gives mis-leading message "out of memory". This patch fixes this.
Signed-off-by: Eric Blake <eblake@redhat.com>
* src/qemu/qemu_command.c (qemuBuildChrChardevStr): Alter the
chardev alias.
(qemuBuildCommandLine): Output an id for the chardev counterpart.
* tests/qemuxml2argvdata/*: Update tests to match.
Reported by Daniel P. Berrange.
Steps to reproduce this bug:
1. service libvirtd start
2. virsh start <domain>
3. kill -STOP $(cat /var/run/libvirt/qemu/<domain>.pid)
4. service libvirtd restart
5. kill -9 $(cat /var/run/libvirt/qemu/<domain>.pid)
Then libvirtd will core dump or be in deadlock state.
Make sure that json is built into libvirt and the version
of qemu is newer than 0.13.0.
The reason of libvirtd cores dump is that:
We add vm->refs when we alloc the memory, and decrease it
in the function qemuHandleMonitorEOF() in other thread.
We add vm->refs in the function qemuConnectMonitor() and
decrease it when the vm is inactive.
The libvirtd will block in the function qemuMonitorSetCapabilities()
because the vm is stopped by signal SIGSTOP. Now the vm->refs is 2.
Then we kill the vm by signal SIGKILL. The function
qemuMonitorSetCapabilities() failed, and then we will decrease vm->refs
in the function qemuMonitorClose().
In another thread, mon->fd is broken and the function
qemuHandleMonitorEOF() is called.
If qemuHandleMonitorEOF() decreases vm->refs before qemuConnectMonitor()
returns, vm->refs will be decrease to 0 and the memory is freed.
We will call qemudShutdownVMDaemon() as qemuConnectMonitor() failed.
The memory has been freed, so qemudShutdownVMDaemon() is too dangerous.
We will reference NULL pointer in the function virDomainConfVMNWFilterTeardown():
=============
void
virDomainConfVMNWFilterTeardown(virDomainObjPtr vm) {
int i;
if (nwfilterDriver != NULL) {
for (i = 0; i < vm->def->nnets; i++)
virDomainConfNWFilterTeardown(vm->def->nets[i]);
}
}
============
vm->def->nnets is not 0 but vm->def->nets is NULL(We don't set vm->def->nnets
to 0 when we free vm).
We should add an extra reference of vm to avoid vm to be deleted if
qemuConnectMonitor() failed.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
This new parameter allows user specifies where the client
cerficate, client key, CA certificate of x509 is, instead of
hardcoding it. If 'pkipath' is not specified, and the user
is not root, try to find files in $HOME/.pki/libvirt, as long
as one of client cerficate, client key, CA certificate can
not be found, use default global location (LIBVIRT_CACERT,
LIBVIRT_CLIENTCERT, LIBVIRT_CLIENTKEY, see
src/remote/remote_driver.h)
Example of use:
[root@Osier client]# virsh -c qemu+tls://10.66.93.111/system?pkipath=/tmp/pki/client
error: Cannot access CA certificate '/tmp/pki/client/cacert.pem': No such file
or directory
error: failed to connect to the hypervisor
[root@Osier client]# ls -l
total 24
-rwxrwxr-x. 1 root root 6424 Jan 24 21:35 a.out
-rw-r--r--. 1 root root 1245 Jan 23 19:04 clientcert.pem
-rw-r--r--. 1 root root 132 Jan 23 19:04 client.info
-rw-r--r--. 1 root root 1679 Jan 23 19:04 clientkey.pem
[root@Osier client]# cp /tmp/cacert.pem .
[root@Osier client]# virsh -c qemu+tls://10.66.93.111/system?pkipath=/tmp/pki/client
Welcome to virsh, the virtualization interactive terminal.
Type: 'help' for help with commands
'quit' to quit
virsh #
* src/remote/remote_driver.c: adds support for the new pkipath URI parameter
If vol->capacity is odd, the capacity will be rounded down
by devision, this patch is to round it up instead of rounding
down, to be safer in case of one writes to the volume with the
size he used to create.
- src/storage/storage_backend_logical.c: make sure size is not rounded down
If a guest image is saved in compressed format, and the restore fails
in some way after the intermediate process used to uncompress the
image has been started, but before qemu has been started to hook up to
the uncompressor, libvirt will endlessly wait for the uncompressor to
finish, but it never will because it's still waiting to have something
hooked up to drain its output.
The solution is to close the pipes on both sides of the uncompressor,
then send a SIGTERM before calling waitpid on it (only if the restore
has failed, of course).
Add a hook to the error reporting APIs to allow specific
error messages to be filtered out. Wire up libvirtd to
remove VIR_ERR_NO_DOMAIN & similar error codes from the
logs. They are still logged at DEBUG level.
* daemon/libvirtd.c: Filter VIR_ERR_NO_DOMAIN and friends
* src/libvirt_private.syms, src/util/virterror.c,
src/util/virterror_internal.h: Hook for changing error
reporting level
This reverts the additions in commit
abff683f78
taking us back to state where all errors are fully logged
in both libvirtd and normal clients.
THe intent was to stop VIR_ERR_NO_DOMAIN (No such domain
with UUID XXXX) messages from client apps polluting syslog
The change affected all error codes, but more seriously,
it also impacted errors from internal libvirtd infrastructure
For example guest autostart no longer logged errors. The
libvirtd network code no longer logged some errors. This
makes debugging incredibly hard
* daemon/libvirtd.c: Remove error log priority filter
* src/util/virterror.c, src/util/virterror_internal.h: Remove
callback for overriding log priority
This patch is a partial resolution to the following bug:
https://bugzilla.redhat.com/show_bug.cgi?id=667756
(to complete the fix, an updated selinux-policy package is required,
to add the policy that allows libvirt to set the context of a fifo,
which was previously not allowed).
Explanation : When an incoming migration is over a pipe (for example,
if the image was compressed and is being fed through gzip, or was on a
root-squash nfs server, so needed to be opened by a child process
running as a different uid), qemu cannot read it unless the selinux
context label for the pipe has been set properly.
The solution is to check the fd used as the source of the migration
just before passing it to qemu; if it's a fifo (implying that it's a
pipe), we call the newly added virSecurityManagerSetFDLabel() function
to set the context properly.
A need was found to set the SELinux context label on an open fd (a
pipe, as a matter of fact). This patch adds a function to the security
driver API that will set the label on an open fd to secdef.label. For
all drivers other than the SELinux driver, it's a NOP. For the SElinux
driver, it calls fsetfilecon().
If the return is a failure, it only returns error up to the caller if
1) the desired label is different from the existing label, 2) the
destination fd is of a type that supports setting the selinux context,
and 3) selinux is in enforcing mode. Otherwise it will return
success. This follows the pattern of the existing function
SELinuxSetFilecon().
The problem was introduced by commit 4303c91, which removed the checking
of domain state, this patch is to fix it.
Otherwise, improper error will be thrown, e.g.
error: Failed to save domain rhel6 state
error: cannot resolve symlink /var/lib/libvirt/qemu/save/rhel6.save: No such
file or directory
In QEMU, the card itself is a PCI device, but it requires a codec
(either -device hda-output or -device hda-duplex) to actually output
sound. Specifying <sound model='ich6'/> gives us -device intel-hda
-device hda-duplex I think it's important that a simple <sound model='ich6'/>
sets up a useful codec, to have consistent behavior with all other sound cards.
This is basically Dan's proposal of
<sound model='ich6'>
<codec type='output' slot='0'/>
<codec type='duplex' slot='3'/>
</sound>
without the codec bits implemented.
The important thing is to keep a consistent API here, we don't want some
<sound> devs require tweaking codecs but not others. Steps I see to
accomplishing this:
- every <sound> device has a <codec type='default'/> (unless codecs are
manually specified)
- <codec type='none'/> is required to specify 'no codecs'
- new audio settings like mic=on|off could then be exposed in
<sound> or <codec> in a consistent manner for all sound models
v2:
Use model='ich6'
v3:
Use feature detection, from eblake
Set codec id, bus, and cad values
v4:
intel-hda isn't supported if -device isn't available
v5:
Comment spelling fixes
If vnc_auto_unix_socket is enabled, any VNC devices without a hardcoded
listen or socket value will be setup to serve over a unix socket in
/var/lib/libvirt/qemu/$vmname.vnc.
We store the generated socket path in the transient VM definition at
CLI build time.
QEMU supports serving VNC over a unix domain socket rather than traditional
TCP host/port. This is specified with:
<graphics type='vnc' socket='/foo/bar/baz'/>
This provides better security access control than VNC listening on
127.0.0.1, but will cause issues with tools that rely on the lax security
(virt-manager in fedora runs as regular user by default, and wouldn't be
able to access a socket owned by 'qemu' or 'root').
Also not currently supported by any clients, though I have patches for
virt-manager, and virt-viewer should be simple to update.
v2:
schema: Make listen vs. socket a <choice>
This will allow us to record transient runtime state in vm->def, like
default VNC parameters. Accomplish this by adding an extra 'live' parameter
to SetDefTransient, with similar semantics to the 'live' flag for
AssignDef.
When restoring a saved qemu instance via JSON monitor, the vm is
left in a paused state. Turns out the 'cont' cmd was failing with
"MigrationExpected" error class and "An incoming migration is
expected before this command can be executed" error description
due to migration (restore) not yet complete.
Detect if 'cont' cmd fails with "MigrationExpecte" error class and
retry 'cont' cmd.
V2: Fix potential double-free noted by Laine Stump
Report VIR_ERR_CONFIG_UNSUPPORTED instead of VIR_ERR_INTERNAL_ERROR,
as it's valid in our domain schema, just unsupported by hypervisor
here.
* src/qemu/qemu_command.c
The code which set VNC passwords correctly had fallback for
the set_password command, but was lacking it for the
expire_password command. This made it impossible to start
a guest. It also failed to check whether QEMU was still
running after the initial 'set_password' command completed
* src/qemu/qemu_hotplug.c: Fix error handling when
password expiry fails
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_text.c: Fix
return code for missing expire_password command
Avoid overwriting the real error message with a generic
OOM failure message, when machine type probe fails
* src/qemu/qemu_driver.c: Don't overwrite error
If the XML security model is NULL, it is assumed that the current
model will be used with dynamic labelling. The verify step is
meaningless and potentially crashes if dereferencing NULL
* src/security/security_manager.c: Skip NULL model on verify
The function virUnrefConnect() may call virReleaseConnect() to release
the dest connection, and the function virReleaseConnect() will call
conn->driver->close().
So the function virUnrefConnect() should be surrounded by
qemuDomainObjEnterRemoteWithDriver() and
qemuDomainObjExitRemoteWithDriver() to prevent possible deadlock between
two communicating libvirt daemons.
See commit f0c8e1cb37 for further details.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
In some circumstances, libvirtd would issue two STOPPED events after it
stopped a domain. This was because an EOF event can arrive after a qemu
process is killed but before qemuMonitorClose() is called.
qemuHandleMonitorEOF() should ignore EOF when the domain is not running.
I wasn't able to reproduce this bug directly, only after adding an
artificial sleep() into qemudShutdownVMDaemon().
A large number of return values used 'return (0)' instead
of simply 'return 0'. Remove all these redundant brackets
so the style is consistent throughout the file
* src/libvirt.c: Remove redundant brackets
The driver table only has 10 slots, but there are potentially
11 drivers that need activating. Improve the error message
when driver registration fails
* src/libvirt.c: Increase driver table size & improve errors
The virLibConnError() function (and related ones) do not correctly
report line number info. Turn them all into macros so line numbers
are reported correctly. Drop the connection object in all of them
since it is no longer used.
Also from the virLibConnWarning() equivalents completely. Now
that the Xen driver is running 100% inside libvirtd, those
codepaths for secondary drivers cannot be reached.
* src/libvirt.c: Replace error functions with macros
* .gnulib: Update to latest, for sigpipe and sigaction modules.
* bootstrap.conf (gnulib_modules): Add siaction, sigpipe, strerror_r.
* tools/virsh.c (vshSetupSignals) [!SIGPIPE]: Delete, now that
gnulib guarantees it.
(SA_SIGINFO): Define for mingw fallback.
* src/util/virterror.c (virStrerror): Simplify, now that gnulib
guarantees the POSIX interface.
* configure.ac (AC_CHECK_FUNCS_ONCE): Drop redundant check.
(AM_PROG_CC_STDC): Move earlier, to keep autoconf happy.
The public object is called NWFilter but the corresponding private
object is called NWFilterPool. I don't see compelling reasons for this
Pool suffix. One might argue that an NWFilter is a "pool" of rules, etc.
Remove the Pool suffix from NWFilterPool. No functional change included.
Fixes regression introduced in commit 2211518, where all qemu 0.12.x
fails to start, as does qemu 0.13.x lacking the pci-assign device.
Prior to 2211518, the code was just ignoring a non-zero exit status
from the qemu child, but the virCommand code checked this to avoid
masking any other issues, which means the real bug of provoking
non-zero exit status has been latent for a longer time.
* src/qemu/qemu_capabilities.c (qemuCapsExtractVersionInfo): Check
for -device driver,? support.
(qemuCapsExtractDeviceStr): Avoid failure if all probed devices
are unsupported.
Reported by Ken Congyang.
https://bugzilla.redhat.com/show_bug.cgi?id=620363
When using -incoming stdio or -incoming exec:, qemu keeps the
stdin fd open long after the migration is complete. Not to
mention that exec:cat is horribly inefficient, by doubling the
I/O and going through a popen interface in qemu.
The new -incoming fd: of qemu 0.12.0 closes the fd after using
it, and allows us to bypass an intermediary cat process for
less I/O.
* src/qemu/qemu_command.h (qemuBuildCommandLine): Add parameter.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Support
migration via fd: when possible. Consolidate migration handling
into one spot, now that it is more complex.
* src/qemu/qemu_driver.c (qemudStartVMDaemon): Update caller.
* tests/qemuxml2argvtest.c (mymain): Likewise.
* tests/qemuxml2argvdata/qemuxml2argv-restore-v2-fd.args: New file.
* tests/qemuxml2argvdata/qemuxml2argv-restore-v2-fd.xml: Likewise.
Currently, boot order can be specified per device class but there is no
way to specify exact disk/NIC device to boot from.
This patch adds <boot order='N'/> element which can be used inside
<disk/> and <interface/>. This is incompatible with the older os/boot
element. Since not all hypervisors support per-device boot
specification, new deviceboot flag is included in capabilities XML for
hypervisors which understand the new boot element. Presence of the flag
allows (but doesn't require) users to use the new style boot order
specification.
Display or set unlimited values for memory parameters. Unlimited is
represented by INT64_MAX in memory cgroup.
Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
Reported-by: Justin Clift <jclift@redhat.com>
virLibConnError already includes __FUNCTION__ in its output, so we
were redundant. Furthermore, clang warns that __FUNCTION__ is not
a string literal (at least __FUNCTION__ will never contain %, so
it was not a security risk).
* src/datatypes.c: Replace __FUNCTION__ with a descriptive string.
This is in response to a request in:
https://bugzilla.redhat.com/show_bug.cgi?id=665293
In short, under heavy load, it's possible for qemu's networking to
lock up due to the tap device's default 1MB sndbuf being
inadequate. adding "sndbuf=0" to the qemu commandline -netdevice
option will alleviate this problem (sndbuf=0 actually sets it to
0xffffffff).
Because we must be able to explicitly specify "0" as a value, the
standard practice of "0 means not specified" won't work here. Instead,
virDomainNetDef also has a sndbuf_specified, which defaults to 0, but
is set to 1 if some value was given.
The sndbuf value is put inside a <tune> element of each <interface> in
the domain. The intent is that further tunable settings will also be
placed inside this element.
<interface type='network'>
...
<tune>
<sndbuf>0</sndbuf>
...
</tune>
</interface>
This patch is in response to
https://bugzilla.redhat.com/show_bug.cgi?id=643050
The existing libvirt support for the vhost-net backend to the virtio
network driver happens automatically - if the vhost-net device is
available, it is always enabled, otherwise the standard userland
virtio backend is used.
This patch makes it possible to force whether or not vhost-net is used
with a bit of XML. Adding a <driver> element to the interface XML, eg:
<interface type="network">
<model type="virtio"/>
<driver name="vhost"/>
will force use of vhost-net (if it's not available, the domain will
fail to start). if driver name="qemu", vhost-net will not be used even
if it is available.
If there is no <driver name='xxx'/> in the config, libvirt will revert
to the pre-existing automatic behavior - use vhost-net if it's
available, and userland backend if vhost-net isn't available.
We try to use that command first when setting a VNC/SPICE password. If
that doesn't work we fallback to the legacy VNC only password
Allow an expiry time to be set, if that doesn't work, throw an error
if they try to use SPICE.
Change since v1:
- moved qemuInitGraphicsPasswords to qemu_hotplug, renamed
to qemuDomainChangeGraphicsPasswords.
- updated what looks like a typo (that appears to work anyway) in
initial patch from Daniel:
- ret = qemuInitGraphicsPasswords(driver, vm,
- VIR_DOMAIN_GRAPHICS_TYPE_SPICE,
- &vm->def->graphics[0]->data.vnc.auth,
- driver->vncPassword);
+ ret = qemuInitGraphicsPasswords(driver, vm,
+ VIR_DOMAIN_GRAPHICS_TYPE_SPICE,
+ &vm->def->graphics[0]->data.spice.auth,
+ driver->spicePassword);
Based on patch by Daniel P. Berrange <berrange@redhat.com>.
I broke 'make check' with commit 04197350 by unconditionally
emitting 'hap=' in xen xm driver. Only emit 'hap=' if
xendConfigVersion >= 3. I've tested sending 'hap=' to a Xen 3.2
machine without support for hap setting and verified that xend
silently drops the unrecognized setting.
* src/qemu/qemu_capabilities.h (qemuCapsParseDeviceStr): New
prototype.
* src/qemu/qemu_capabilities.c (qemuCapsParsePCIDeviceStrs)
Rename and split...
(qemuCapsExtractDeviceStr, qemuCapsParseDeviceStr): ...to make it
easier to add and test device-specific checks.
(qemuCapsExtractVersionInfo): Update caller.
* tests/qemuhelptest.c (testHelpStrParsing): Also test parsing of
device-related flags.
(mymain): Update expected flags.
* tests/qemuhelpdata/qemu-0.12.1-device: New file.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel60-device: New file.
* tests/qemuhelpdata/qemu-kvm-0.12.3-device: New file.
* tests/qemuhelpdata/qemu-kvm-0.13.0-device: New file.
It was awkward having only int conversion in the virStrToLong family,
but only long conversion in the virXPath family. Make both families
support both types.
* src/util/util.h (virStrToLong_l, virStrToLong_ul): New
prototypes.
* src/util/xml.h (virXPathInt, virXPathUInt): Likewise.
* src/util/util.c (virStrToLong_l, virStrToLong_ul): New
functions.
* src/util/xml.c (virXPathInt, virXPathUInt): Likewise.
* src/libvirt_private.syms (util.h, xml.h): Export them.
* src/qemu/qemu_capabilities.c (qemuCapsProbeMachineTypes)
(qemuCapsProbeCPUModels, qemuCapsParsePCIDeviceStrs)
(qemuCapsExtractVersionInfo): Use virCommand rather than virExec.
xen-unstable c/s 16931 introduced a per-domain setting for hvm
guests to enable/disable hardware assisted paging. If disabled,
software techniques such as shadow page tables are used. If enabled,
and the feature exists in underlying hardware, hardware support for
paging is used.
Xen does not provide a mechanism to discover the HAP capability, so
we advertise its availability for hvm guests on Xen >= 3.3.
xen-unstable c/s 16931 introduced a per-domain setting for hvm
guests to enable/disable hardware assisted paging. If disabled,
software techniques such as shadow page tables are used. If enabled,
and the feature exists in underlying hardware, hardware support for
paging is used.
This provides implementation for mapping HAP setting to/from
domxml/native formats in xen drivers.
Extend the virDomainFeature enumeration to include HAP (hardware
assisted paging) feature.
Hardware features such as Extended Page Table and Nested Page
Table augment hypervisor software techniques such as shadow
page table. Adding HAP to the virDomainFeature enumeration
allows users to select between hardware and software memory
management mechanisms for their guests.
Without this patch, at least tests/daemon-conf (which sticks
$builddir/src in the PATH) tries to execute the directory
$builddir/src/qemu rather than a real qemu binary.
* src/util/util.h (virFileExists): Adjust prototype.
(virFileIsExecutable): New prototype.
* src/util/util.c (virFindFileInPath): Reject non-executables and
directories. Avoid huge stack allocation.
(virFileExists): Use lighter-weight syscall.
(virFileIsExecutable): New function.
* src/libvirt_private.syms (util.h): Export new function.
When we do peer2peer migration, the dest uri is an address of the
target host as seen from the source machine. So we must specify
the ip or hostname of target host in dest uri. If we do not specify
it, report an error to the user.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
If the emulator doesn't support SDL graphic, we should reject
the use of SDL graphic xml with error messages, but not ignore
it silently, and pretend things are fine.
"-sdl" flag was exposed explicitly by qemu since 0.10.0, more detail:
http://www.redhat.com/archives/libvir-list/2011-January/msg00442.html
And we already have capability flag "QEMUD_CMD_FLAG_0_10", which
could be used to prevent the patch affecting the older versions
of QEMU.
* src/qemu/qemu_command.c
Don't report an error when the VirtualBox registry key is missing,
as this just indicates that VirtualBox is not installed in general.
This matches the behavior of the XPCOM glue that silently ignores
a missing VBoxXPCOMC.so.
Skip IB700 when assigning PCI slots.
Note: the I6300ESB watchdog _is_ a PCI device.
To test this: I applied this patch to libvirt-0.8.3-2.fc14 (rebasing
it slightly: qemu_command.c didn't exist in that version) and
installed this on my machine, then tested that I could successfully
add an ib700 watchdog device to a guest, start the guest, and the
ib700 was available to the guest. I also added an i6300esb (PCI)
watchdog to another guest, and verified that libvirt assigned a PCI
device to it, that the guest could be started, and that i6300esb was
present in the guest.
Note that if you previously had a domain with a ib700 watchdog, it
would have had an <address type='pci' .../> clause added to it in the
libvirt configuration. This patch does not attempt to remove this.
You cannot start such a domain -- qemu gives an error if you try.
With this patch you are able to remove the bogus address element
without libvirt adding it back.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
* src/util/network.c (virSocketAddrMask): Zero out port, so that
iptables can initialize just the netmask then call
virSocketFormatAddr without an uninitialized read in getnameinfo.
After the remote driver runs an event callback, it unconditionally disables the
loop timer, thinking it just flushed every queued event. This doesn't work
correctly though if an event is queued while a callback is running.
The events actually aren't being lost, it's just that the event loop didn't
think there was anything that needed to be dispatched. So all those 'lost
events' should actually get re-triggered if you manually kick the loop by
generating a new event (like creating a new guest).
The solution is to disable the dispatch timer _before_ we invoke any event
callbacks. Events queued while a callback is running will properly reenable the
timer.
More info at https://bugzilla.redhat.com/show_bug.cgi?id=624252
The current security driver usage requires horrible code like
if (driver->securityDriver &&
driver->securityDriver->domainSetSecurityHostdevLabel &&
driver->securityDriver->domainSetSecurityHostdevLabel(driver->securityDriver,
vm, hostdev) < 0)
This pair of checks for NULL clutters up the code, making the driver
calls 2 lines longer than they really need to be. The goal of the
patchset is to change the calling convention to simply
if (virSecurityManagerSetHostdevLabel(driver->securityDriver,
vm, hostdev) < 0)
The first check for 'driver->securityDriver' being NULL is removed
by introducing a 'no op' security driver that will always be present
if no real driver is enabled. This guarentees driver->securityDriver
!= NULL.
The second check for 'driver->securityDriver->domainSetSecurityHostdevLabel'
being non-NULL is hidden in a new abstraction called virSecurityManager.
This separates the driver callbacks, from main internal API. The addition
of a virSecurityManager object, that is separate from the virSecurityDriver
struct also allows for security drivers to carry state / configuration
information directly. Thus the DAC/Stack drivers from src/qemu which
used to pull config from 'struct qemud_driver' can now be moved into
the 'src/security' directory and store their config directly.
* src/qemu/qemu_conf.h, src/qemu/qemu_driver.c: Update to
use new virSecurityManager APIs
* src/qemu/qemu_security_dac.c, src/qemu/qemu_security_dac.h
src/qemu/qemu_security_stacked.c, src/qemu/qemu_security_stacked.h:
Move into src/security directory
* src/security/security_stack.c, src/security/security_stack.h,
src/security/security_dac.c, src/security/security_dac.h: Generic
versions of previous QEMU specific drivers
* src/security/security_apparmor.c, src/security/security_apparmor.h,
src/security/security_driver.c, src/security/security_driver.h,
src/security/security_selinux.c, src/security/security_selinux.h:
Update to take virSecurityManagerPtr object as the first param
in all callbacks
* src/security/security_nop.c, src/security/security_nop.h: Stub
implementation of all security driver APIs.
* src/security/security_manager.h, src/security/security_manager.c:
New internal API for invoking security drivers
* src/libvirt.c: Add missing debug for security APIs
If invalid type is specified, e.g.
<serial type='foo'>
<target port='0'/>
</serial>
We replace 'foo' with "null" type implicitly, without reporting an
error message to tell the user, and "start" or "edit" the domain
will be success.
It's not good to guess what the user wants, This patch is to fix
the problem.
* src/conf/domain_conf.c
Add VM name/UUID in log for domain related APIs.
Format: "dom=%p, (VM: name=%s, uuid=%s), param0=%s, param1=%s
*src/libvirt.c (introduce two macros: VIR_DOMAIN_DEBUG, and
VIR_DOMAIN_DEBUG0)
I added a host definition to a network definition:
<network>
<name>Lokal</name>
<uuid>2074f379-b82c-423f-9ada-305d8088daaa</uuid>
<bridge name='virbr1' stp='on' delay='0' />
<ip address='192.168.180.1' netmask='255.255.255.0'>
<dhcp>
<range start='192.168.180.128' end='192.168.180.254' />
<host mac='23:74:00:03:42:02' name='somevm' ip='192.168.180.10' />
</dhcp>
</ip>
</network>
But due to the wrong if-statement the argument --dhcp-hostsfile doesn't get
added to the dnsmasq command. The patch below fixes it for me.
When dynamic_ownership=0, saved images must be owned by the same uid
as is used to run the qemu process, otherwise restore won't work. To
accomplish this, qemuSecurityDACRestoreSavedStateLabel() needs to
simply return when it's called.
This fix is in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=661720
Although the upper-layer code protected against it, it was possible to
call iptablesForwardMasquerade() with an IPv6 address and have it
attempt to add a rule to the MASQUERADE chain of ip6tables (which
doesn't exist).
This patch changes that function to check the protocol of the given
address, generate an error log if it's not IPv4 (AF_INET), and finally
hardcodes all the family parameters sent down to lower-level functions.
This is partially in response to
https://bugzilla.redhat.com/show_bug.cgi?id=653300
The crash in that report was coincidentally fixed when we switched
from using inet_pton() to using virSocketParseAddr(), but the absence
of an ip address in a dhcp static host definition was still silently
ignored (and that entry discarded from the saved XML). This patch
turns that into a logged failure; likewise if the entry has neither a
mac address nor a name attribute (the entry is useless without at
least one of those, plus an ip address).
Since the network name is now pulled into this function in order for
those error logs to be more informative, the other error messages in
the function have also been changed to take advantage.
While doing some testing with Qemu and creating huge logfiles I encountered the case where the VM could not start anymore due to the lseek() to the end of the Qemu VM's log file failing. The patch below fixes the problem by replacing the previously used 'int' with 'off_t'.
To reproduce this error, you could do the following:
dd if=/dev/zero of=/var/log/libvirt/qemu/<name of VM>.log bs=1024 count=$((1024*2048))
and you should get an error like this:
error: Failed to start domain <name of VM>
error: Unable to seek to -2147482651 in /var/log/libvirt/qemu/<name of VM>.log: Success
Detected on cygwin:
util/util.c: In function 'virSetUIDGID':
util/util.c:2824: warning: format '%d' expects type 'int', but argument 7 has type 'gid_t' [-Wformat]
(and three other lines)
* src/util/util.c (virSetUIDGID): Cast, as is done elsewhere in
this file, to avoid printf type mismatch warnings.
The udev driver does not update a PCI device with its SR-IOV capabilities,
when applicable, the way the hal driver does. As a result, dumping the
device's XML will not include the relevant physical or virtual function
information.
With this patch, the XML is correct:
# virsh nodedev-dumpxml pci_0000_09_00_0
<device>
<name>pci_0000_09_00_0</name>
<parent>pci_0000_00_1c_0</parent>
<driver>
<name>vxge</name>
</driver>
<capability type='pci'>
<domain>0</domain>
<bus>9</bus>
<slot>0</slot>
<function>0</function>
<product id='0x5833'>X3100 Series 10 Gigabit Ethernet PCIe</product>
<vendor id='0x17d5'>Neterion Inc.</vendor>
<capability type='virt_functions'>
<address domain='0x0000' bus='0x0a' slot='0x00' function='0x1'/>
<address domain='0x0000' bus='0x0a' slot='0x00' function='0x2'/>
<address domain='0x0000' bus='0x0a' slot='0x00' function='0x3'/>
</capability>
</capability>
</device>
# virsh nodedev-dumpxml pci_0000_0a_00_1
<device>
<name>pci_0000_0a_00_1</name>
<parent>pci_0000_00_1c_0</parent>
<driver>
<name>vxge</name>
</driver>
<capability type='pci'>
<domain>0</domain>
<bus>10</bus>
<slot>0</slot>
<function>1</function>
<product id='0x5833'>X3100 Series 10 Gigabit Ethernet PCIe</product>
<vendor id='0x17d5'>Neterion Inc.</vendor>
<capability type='phys_function'>
<address domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
</capability>
</capability>
</device>
Cc: Dave Allan <dallan@redhat.com>
Signed-off-by: Chris Wright <chrisw@redhat.com>
As pointed out in https://bugzilla.redhat.com/show_bug.cgi?id=659855#c9,
commit c3568ec2 introduced a regression where we no longer close any
fd's beyond FD_SETSIZE.
* src/util/util.c (__virExec): Continue to close fd's beyond
keepfd range.
Reported by Stefan Praszalowicz.
The original version of these functions would modify the address sent
in, meaning that the caller would usually need to copy the address
first. This change makes the original a const, and puts the resulting
masked address into a new arg (which could point to the same
virSocketAddr as the original, if the caller really wants to modify
it).
This also makes the API consistent with virSocketAddrBroadcast[ByPrefix].
Previously we used ioctl() to set the IP address and netmask of the
bridges used for virtual networks, and apparently the SIOCSIFNETMASK
ioctl implicitly set the broadcast address for the interface. The new
method of using the "ip" command requires broadcast address to be
explicitly specified though.
These functions work only for IPv4, becasue IPv6 doesn't have the same
concept of "broadcast address" as IPv4. They merely OR the inverse of
the netmask with the given host address, thus turning on all the host
bits.
Add vboxArrayGetWithUintArg to handle new signature variations. Also
refactor vboxArrayGet* implementation to use a common helper function.
Deal with the incompatible changes in the VirtualBox 4.0 API. This
includes major changes in virtual machine and storage medium lookup,
in RDP server property handling, in session/lock handling and other
minor areas.
VirtualBox 4.0 also dropped the old event API and replaced it with a
completely new one. This is not fixed yet and will be addressed in
another patch. Therefore, currently the domain events are supported
for VirtualBox 3.x only.
Based on initial work from Jean-Baptiste Rouault.
On Windows IID's are represented as GUID by value, instead of nsID
by reference on non-Windows platforms.
Patch the vbox_CAPI_v2_2.h header to deal with this difference.
Rewrite vboxIID abstraction that deals with the different IID
representations. Add support for the GUID representation. Also unify
the four context dependent free functions for vboxIIDs
vboxIIDUnalloc, vboxIIDFree, vboxIIDUtf8Free, vboxIIDUtf16Free
into vboxIIDUnalloc that is now safe to be called (even multiple
times) on a vboxIID independent of the source and context of the
vboxIID.
The new vboxIID is designed to be used as a stack allocated variable.
It has a value member that represents the actual IID value.
This patch fixes https://bugzilla.redhat.com/show_bug.cgi?id=664406
If qemu is run as a different uid, it has been unable to access mode
0660 files that are owned by a different user, but with a group that
the qemu is a member of (aside from the one group listed in the passwd
file), because initgroups() is not being called prior to the
exec. initgroups will change the group membership of the process (and
its children) to match the new uid.
To make this happen, the setregid()/setreuid() code in
qemuSecurityDACSetProcessLabel has been replaced with a call to
virSetUIDGID(), which does both of those, plus calls initgroups.
Similar, but not identical, code in qemudOpenAsUID() has been replaced
with virSetUIDGID(). This not only consolidates the functionality to a
single location, but also potentially fixes some as-yet unreported
bugs.
virSetUIDGID() sets both the real and effective group and user of the
process, and additionally calls initgroups() to assure that the
process joins all the auxiliary groups that the given uid is a member
of.
There are cases when we want log an error message, and possibly free
some memory as part of the cleanup, while still preserving errno for a
caller, but the functions that log errors, and virFree (VIR_FREE) make
system calls that will clear errno. This patch preserves errno during
those most basic functions (corresponding to virReportSystemError(),
virReportOOMError(), networkReportError(), etc, as well as
virStrError()). It does *not preserve errno across calls to higher
level items such as virDispatchError(), as it's assumed the caller is
all finished with any need for errno by the time it dispatches the
error.
Running an instance of the router advertisement daemon (radvd) allows
guests using the virtual network to automatically acquire an IPv6
address and default route. Note that acquiring an address only works
for networks with a prefix length of exactly 64 - radvd is still run
in other circumstances, and still advertises routes, but autoconf will
not work because it requires exactly 64 bits of address info from the
network prefix.
This patch avoids a race condition with the pidfile by manually
daemonizing radvd rather than allowing it to daemonize itself, then
creating our own pidfile (in addition to radvd's own file, which is
unnecessary, but there is no way to tell radvd to not create it). This
is accomplished by exec'ing it with "--debug 1" in the commandline,
and using virCommand's features to fork, create a pidfile, and detach
from the newly forked process.
At this point everything is already in place to make IPv6 happen, we just
need to add a few rules, remove some checks for IPv4-only, and document
the changes to the XML on the website.
All of the iptables functions eventually call down to a single
bottom-level function, and fortunately, ip6tables syntax (for all the
args that we use) is identical to iptables format (except the
addresses), so all we need to do is:
1) Get an address family down to the lowest level function in each
case, either implied through an address, or explicitly when no
address is in the parameter list, and
2) At the lowest level, just decide whether to call "iptables" or
"ip6tables" based on the family.
The location of the ip6tables binary is determined at build time by
autoconf. If a particular target system happens to not have ip6tables
installed, any attempts to run it will generate an error, but that
won't happen unless someone tries to define an IPv6 address for a
network. This is identical behavior to IPv4 addresses and iptables.
This patch reorganizes the code in bridge_driver.c to account for the
concept of a single network with multiple IP addresses, without adding
in the extra variable of IPv6. A small bit of code has been
temporarily added that checks all given addresses to verify they are
IPv4 - this will be removed when full IPv6 support is turned on.
This commit adds support for IPv6 parsing and formatting to the
virtual network XML parser, including moving around data definitions
to allow for multiple <ip> elements on a single network, but only
changes the consumers of this API to accommodate for the changes in
API/structure, not to add any actual IPv6 functionality. That will
come in a later patch - this patch attempts to maintain the same final
functionality in both drivers that use the network XML parser - vbox
and "bridge" (the Linux bridge-based driver used by the qemu
hypervisor driver).
* src/libvirt_private.syms: Add new private API functions.
* src/conf/network_conf.[ch]: Change C data structure and
parsing/formatting.
* src/network/bridge_driver.c: Update to use new parser/formatter.
* src/vbox/vbox_tmpl.c: update to use new parser/formatter
* docs/schemas/network.rng: changes to the schema -
* there can now be more than one <ip> element.
* ip address is now an ip-addr (ipv4 or ipv6) rather than ipv4-addr
* new optional "prefix" attribute that can be used in place of "netmask"
* new optional "family" attribute - "ipv4" or "ipv6"
(will default to ipv4)
* define data types for the above
* tests/networkxml2xml(in|out)/nat-network.xml: add multiple <ip> elements
(including IPv6) to a single network definition to verify they are being
correctly parsed and formatted.
brSetInetAddress can only set a single IP address on the bridge, and
uses a method (ioctl(SIOCSETIFADDR)) that only works for IPv4. Replace
it and brSetInetNetmask with a single function that uses the external
"ip addr add" command to add an address/prefix to the interface - this
supports IPv6, and allows adding multiple addresses to the interface.
Although it isn't currently used in the code, we also add a
brDelInetAddress for completeness' sake.
Also, while we're modifying bridge.c, we change brSetForwardDelay and
brSetEnableSTP to use the new virCommand API rather than the
deprecated virRun, and also log an error message in bridge_driver.c if
either of those fail (previously the failure would be completely
silent).
When a netmask isn't specified for an IPv4 address, one can be implied
based on what network class range the address is in. The
virNetworkDefPrefix function does this for us, so netmask isn't
required.
IPv6 will use prefix exclusively, and IPv4 will also optionally be
able to use it, and the iptables functions really need a prefix
anyway, so use the new virNetworkDefPrefix() function to send prefixes
into iptables functions instead of netmasks.
Also, in a couple places where a netmask is actually needed, use the
new private API function for it rather than getting it directly. This
will allow for cases where no netmask or prefix is specified (it
returns the default for the current class of network.)
Some functions in this file were returning 1 on success and 0 on
failure, and others were returning 0 on success and -1 on
failure. Switch them all to return the libvirt-preferred 0/-1.
The functions in iptables.c all return -1 on failure, but all their
callers (which all happen to be in bridge_driver.c) assume that they
are returning an errno, and the logging is done accordingly. This
patch fixes all the error checking and logging to assume < 0 is an
error, and nothing else.
Later patches will add the possibility to define a network's netmask
as a prefix (0-32, or 0-128 in the case of IPv6). To make it easier to
deal with definition of both kinds (prefix or netmask), add two new
functions:
virNetworkDefNetmask: return a copy of the netmask into a
virSocketAddr. If no netmask was specified in the XML, create a
default netmask based on the network class of the virNetworkDef's IP
address.
virNetworkDefPrefix: return the netmask as numeric prefix (or the
default prefix for the network class of the virNetworkDef's IP
address, if no netmask was specified in the XML)
virSocketPrefixToNetmask: Given a 'prefix', which is the number of 1
bits in a netmask, fill in a virSocketAddr object with a netmask as an
IP address (IPv6 or IPv4).
virSocketAddrMask: Mask off the host bits in one virSocketAddr
according to the netmask in another virSocketAddr.
virSocketAddrMaskByPrefix, Mask off the host bits in a virSocketAddr
according to a prefix (number of 1 bits in netmask).
VIR_SOCKET_FAMILY: return the family of a virSocketAddr
Shorten qemuDomainSnapshotWriteSnapshotMetadata function name
and make it take a snapshot pointer instead of dealing with
the current snapshot. Update other functions accordingly.
Add a qemuDomainSnapshotReparentChildren hash iterator to
reparent the children of a snapshot that is being deleted. Use
qemuDomainSnapshotWriteMetadata to write updated metadata
to disk.
This fixes a problem where outdated parent information breaks
the snapshot tree and hinders the deletion of child snapshots.
Reported by Philipp Hahn.
I began noticing a race when reserving VNC ports as described here
https://www.redhat.com/archives/libvir-list/2010-November/msg00379.html
Turns out that we were not initializing the size field of bitmap
struct when allocating the bitmap. This subsequently caused
virBitmapSetBit() to fail since bitmap->size is 0, hence we never
actually reserved the port.
Fix glitch in commit cddd2a06 (thankfully post-0.8.6, so no
released version has the glitch).
Document and try to workaround glitch in commit 46e9b0f (in 0.8.0),
which invalidated 6 virErrorNumber values dating as far back as 0.7.1.
My audit did not find any other glitches until pre-0.1.0 days. I'm
not sure how to add a syntax-check off the top of my head, but
hopefully the explicit numbering will make people think twice about
renumbering in the future.
* include/libvirt/virterror.h (virErrorDomain): Avoid inserting
new values in the middle, and add explicit numbering to help avoid
this in the future.
(virErrorNumber): Add explicit numbering, and document the snafu.
* src/remote/remote_driver.c (remoteIO): Compensate for the snafu.
This fixes the build from a tarball and makes autobuild.sh
work again.
This should actually have been part of this earlier commit:
esx: Move VMX handling code out of the driver directory
42b2f35d36
Reported by Eric Blake.
All other drivers are explicitly linked to gnulib. The VMware
driver lacked this, resulting in mdir_name being an undefine
symbol.
Explicitly link the VMware driver to gnulib to fix this.
Now the VMware driver doesn't depend on the ESX driver anymore.
Add a WITH_VMX option that depends on WITH_ESX and WITH_VMWARE.
Also add a libvirt_vmx.syms file.
Move some escaping functions from esx_util.c to vmx.c.
Adapt the test suite, ESX and VMware driver to the new code layout.
Connecting to a ESX(i) server that is part of a cluster failed
when the connection also involved a vCenter.
Accept ClusterComputeResource type in addition to ComputeResource
type in the object lookup function.
Reported by Guillaume Le Louët.
If there is a dangling symbolic link in filesystem pool, the pool
will fail to start or refresh, this patch is to fix it by ignoring
it with a warning log.
Network disks are accessed by qemu directly, and have no
associated file on the host, so checking for file ownership etc.
is unnecessary.
Signed-off-by: Josh Durgin <joshd@hq.newdream.net>
* configure.ac (dlopen): Cygwin dlopen is in libc; avoid spurious
failure.
(XDR_CFLAGS): Define when needed.
* src/Makefile.am (libvirt_driver_remote_la_CFLAGS): Use it.
When running 'make check' under a multi-cpu Dom0 xen machine,
nodeinfotest had a spurious failure it was reading from
/sys/devices/system/cpu, but xen has no notion of topology. The test
was intended to be isolated from reading any real system files; the
regression was introduced in Mar 2010 with commit aa2f6f96dd.
Fix things by allowing an early exit for the testsuite.
* src/nodeinfo.c (linuxNodeInfoCPUPopulate): Add parameter.
(nodeGetInfo): Adjust caller.
* tests/nodeinfotest.c (linuxTestCompareFiles): Likewise.
* configure.ac (with_selinux): Check for <selinux/label.h>.
* src/security/security_selinux.c (getContext): New function.
(SELinuxRestoreSecurityFileLabel): Use it to restore compilation
when using older libselinux.
While not technically a double free (since VIR_FREE NULLs the
pointer), this is unnecessary extra code.
This crept in when the function was converted from virRun to virCommand.
The AUTHORS file has also been updated.
XPCOM returns an array as a pointer to an array of pointers to the
actual items. When the array isn't needed anymore the items are
released, but the actual array containing the pointers to the items
was not freed and leaked.
Free the actual array using ComUnallocMem.
This doesn't affect MSCOM as SafeArrayDestroy releases all items
and frees the array.
Don't require dlopen, but link to ole32 and oleaut32 on Windows.
Don't expose g_pVBoxFuncs anymore. It was only used to get the
version of the API. Make VBoxCGlueInit return the version instead.
This simplifies the implementation of the MSCOM glue layer.
Get the VirtualBox version from the registry.
Add a dummy implementation of the nsIEventQueue to the MSCOM glue
as there seems to be no direct equivalent with MSCOM. It might be
implemented using the normal window message loop. This requires
additional investigation.
The QEMU driver file is far too large. Move all the hotplug
helper code out into a separate file. No functional change.
* src/qemu/qemu_hotplug.c, src/qemu/qemu_hotplug.h,
src/Makefile.am: Add hotplug helper file
* src/qemu/qemu_driver.c: Delete hotplug code
To allow the APIs to be used from separate files, move the domain
lock / job helper code into qemu_domain.c
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Add domain lock
/ job code
* src/qemu/qemu_driver.c: Remove domain lock / job code
To allow their use from other source files, move qemuDriverLock
and qemuDriverUnlock to qemu_conf.h and make them non-static
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Add qemuDriverLock
qemuDriverUnlock
* src/qemu/qemu_driver.c: Remove qemuDriverLock and qemuDriverUnlock
The QEMU driver file is far too large. Move all the hostdev
helper code out into a separate file. No functional change.
* src/qemu/qemu_hostdev.c, src/qemu/qemu_hostdev.h,
src/Makefile.am: Add hostdev helper file
* src/qemu/qemu_driver.c: Delete hostdev code
The QEMU driver file is far too large. Move all the cgroup
helper code out into a separate file. No functional change.
* src/qemu/qemu_cgroup.c, src/qemu/qemu_cgroup.h,
src/Makefile.am: Add cgroup helper file
* src/qemu/qemu_driver.c: Delete cgroup code
The QEMU driver file is far too large. Move all the audit
helper code out into a separate file. No functional change.
* src/qemu/qemu_audit.c, src/qemu/qemu_audit.h,
src/Makefile.am: Add audit helper file
* src/qemu/qemu_driver.c: Delete audit code
Move the code for handling the QEMU virDomainObjPtr private
data, and custom XML namespace into a separate file
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: New file
for private data & namespace code
* src/qemu/qemu_driver.c, src/qemu/qemu_driver.h: Remove
private data & namespace code
* src/qemu/qemu_driver.h, src/qemu/qemu_command.h: Update
includes
* src/Makefile.am: Add src/qemu/qemu_domain.c
The qemu_conf.c code is doing three jobs, driver config file
loading, QEMU capabilities management and QEMU command line
management. Move the command line code into its own file
* src/qemu/qemu_command.c, src/qemu/qemu_command.h: New
command line management code
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Delete command
line code
* src/qemu/qemu_conf.h, src/qemu_conf.c: Adapt for API renames
* src/Makefile.am: add src/qemu/qemu_command.c
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_text.c: Add
import of qemu_command.h
The qemu_conf.c code is doing three jobs, driver config file
loading, QEMU capabilities management and QEMU command line
management. Move the capabilities code into its own file
* src/qemu/qemu_capabilities.c, src/qemu/qemu_capabilities.h: New
capabilities management code
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Delete capabilities
code
* src/qemu/qemu_conf.h: Adapt for API renames
* src/Makefile.am: add src/qemu/qemu_capabilities.c
So far, CPUID data were stored in two different data structures. First
of them was a structure allowing direct access for CPUID data according
to function number and the second was a plain array of struct
cpuX86cpuid. This was a silly design which resulted in converting data
from one type to the other and back again or implementing similar
functionality for both data structures.
The patch leaves only the direct access structure. This makes the code
both smaller and more maintainable since operations on different objects
can use common low-level operations.
All 57 tests for cpu subsystem still pass after this rewrite.
Allows compilation, but no creation of child processes yet. Take it
one step at a time.
* src/util/util.c (virExecWithHook) [WIN32]: New dummy function.
* src/libvirt_private.syms: Export it.
* .gnulib: Update to latest.
* bootstrap.conf (gnulib_modules): Import pipe-posix and waitpid
for mingw.
* src/remote/remote_driver.c (pipe) [WIN32]: Drop dead macro.
* daemon/event.c (pipe) [WIN32]: Drop dead function.
Currently, all of domain "save/dump/managed save/migration"
use the same function "qemudDomainWaitForMigrationComplete"
to wait the job finished, but the error messages are all
about "migration", e.g. when a domain saving job is canceled
by user, "migration was cancled by client" will be throwed as
an error message, which will be confused for user.
As a solution, intoduce two new job types(QEMU_JOB_SAVE,
QEMU_JOB_DUMP), and set "priv->jobActive" to "QEMU_JOB_SAVE"
before saving, to "QEMU_JOB_DUMP" before dumping, so that we
could get the real job type in
"qemudDomainWaitForMigrationComplete", and give more clear
message further.
And as It's not important to figure out what's the exact job
is in the DEBUG and WARN log, also we don't need translated
string in logs, simply repace "migration" with "job" in some
statements.
* src/qemu/qemu_driver.c
Current code does not pass VM mac address to a 802.1Qbh direct attach
interface using IFLA_VF_MAC. This patch adds support in macvtap code to
send IFLA_VF_MAC netlink request during port profile association on a
802.1Qbh interface.
Stefan Cc'ed for comments because this patch changes a condition for
802.1Qbg
802.1Qbh support for IFLA_VF_MAC in enic driver has been posted and is
pending acceptance at http://marc.info/?l=linux-netdev&m=129185244410557&w=2
This is pretty straightforward - even though dnsmasq gets daemonized
and uses a pid file, those things are both handled by the dnsmasq
binary itself. And libvirt doesn't need any of the output of the
dnsmasq command either, so we just setup the args and call
virRun(). Mainly it was just a (mostly) mechanical job of replacing
the APPEND_ARG() macro (and some other *printfs()) with
virCommandAddArg*().
Instead of just reporting that a task failed get the
localized message from the TaskInfo error and include
it in the reported error message.
Implement minimal deserialization support for the
MethodFault type in order to obtain the actual fault
type.
For example, this changes the reported error message
when trying to create a volume with zero size from
Could not create volume
to
Could not create volume: InvalidArgument - A specified parameter was not correct.
Not perfect yet, but better than before.
Changes common to all network disks:
-Make source name optional in the domain schema, since NBD doesn't use it
-Add a hostName type to the domain schema, and use it instead of genericName, which doesn't include .
-Don't leak host names or ports
-Set the source protocol in qemuParseCommandline
Signed-off-by: Josh Durgin <joshd@hq.newdream.net>
This patch adds network disk support to libvirt/QEMU. The currently
supported protocols are nbd, rbd, and sheepdog. The XML syntax is like
this:
<disk type="network" device="disk">
<driver name="qemu" type="raw" />
<source protocol='rbd|sheepdog|nbd' name="...some image identifier...">
<host name="mon1.example.org" port="6000">
<host name="mon2.example.org" port="6000">
<host name="mon3.example.org" port="6000">
</source>
<target dev="vda" bus="virtio" />
</disk>
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
`dump' watchdog action lets libvirtd to dump the guest when receives a
watchdog event (which probably means a guest crash)
Currently only qemu is supported.
When we get an EOF event on monitor connection, it may be a result of
either crash or graceful shutdown. QEMU which supports async events
(i.e., we are talking to it using JSON monitor) emits SHUTDOWN event on
graceful shutdown. In case we don't get this event by the time monitor
connection is closed, we assume the associated domain crashed.
Currently libvirt doesn't confirm whether the guest has responded to the
disk removal request. In some cases this can leave the guest with
continued access to the device while the mgmt layer believes that it has
been removed. With a recent qemu monitor command[1] we can
deterministically revoke a guests access to the disk (on the QEMU side)
to ensure no futher access is permitted.
This patch adds support for the drive_del() command and introduces it
in the disk removal paths. If the guest is running in a QEMU without this
command we currently explicitly check for unknown command/CommandNotFound
and log the issue.
If QEMU supports the command we issue the drive_del command after we attempt
to remove the device. The guest may respond and remove the block device
before we get to attempt to call drive_del. In that case, we explicitly check
for 'Device not found' from the monitor indicating that the target drive
was auto-deleted upon guest responds to the device removal notification.
1. http://thread.gmane.org/gmane.comp.emulators.qemu/84745
Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
Currently libvirt doesn't confirm whether the guest has responded to the
disk removal request. In some cases this can leave the guest with
continued access to the device while the mgmt layer believes that it has
been removed. With a recent qemu monitor command[1] we can
deterministically revoke a guests access to the disk (on the QEMU side)
to ensure no futher access is permitted.
This patch adds support for the drive_unplug() command and introduces it
in the disk removal paths. There is some discussion to be had about how
to handle the case where the guest is running in a QEMU without this
command (and the fact that we currently don't have a way of detecting
what monitor commands are available).
Changes since v2:
- use VIR_ERROR to report when unplug command not found
Changes since v1:
- return > 0 when command isn't present, < 0 on command failure
- detect when drive_unplug command isn't present and log error
instead of failing entire command
Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
- qemudDomainAttachPciControllerDevice: Don't build "devstr"
if "-device" of qemu is not available, as "devstr" will only
be used by "qemuMonitorAddDevice", which depends on "-device"
argument of qemu is supported.
- "qemudDomainSaveImageOpen": Fix indent problem.
* src/qemu/qemu_driver.c
Commit febc591683 introduced -vga none in
case no video card is included in domain XML. However, old qemu
versions do not support this and such domain cannot be successfully
started.
* .gnulib: Update to latest, for at least a stdint.h fix
* src/storage/storage_driver.c (storageVolumeZeroSparseFile)
(storageWipeExtent): Use better type, although it still triggers
spurious -Wformat warning on MacOS's gcc.
popen must be matched with pclose (not fclose), or it will leak
resources. Furthermore, it is a lousy interface when it comes to
signal handling. We're much better off using our decent command
wrapper. Note that virCommand guarantees that VIR_FREE(outbuf) is
both required and safe to call, whether virCommandRun succeeded or
failed.
* src/openvz/openvz_conf.c (openvzLoadDomains, openvzGetVEID):
Replace popen with virCommand usage.
Guarantee that outbuf/errbuf are allocated on success, even if to the
empty string. Caller always has to free the result, and empty output
check requires checking if *outbuf=='\0'. Makes the API easier to use
safely. Failure is best effort allocation (some paths, like
out-of-memory, cannot allocate a buffer, but most do), so caller must
free buffer on failure.
* docs/internals/command.html.in: Update documentation.
* src/util/command.c (virCommandSetOutputBuffer)
(virCommandSetErrorBuffer, virCommandProcessIO) Guarantee empty
string on no output.
* tests/commandtest.c (test17): New test.
* docs/internals/command.html.in: Better documentation of buffer
vs. fd considerations.
* src/util/command.c (virCommandRunAsync): Reject raw execution
with string io.
(virCommandRun): Reject execution with user-specified fds not
visiting a regular file.
* src/conf/domain_conf.c (virDomainDefParseXML): Prefer sysinfo
uuid over generating one, and if both uuids are present, require
them to be identical.
* src/qemu/qemu_conf.c (qemuBuildSmbiosSystemStr): Allow skipping
the uuid.
(qemudBuildCommandLine): Adjust caller; <smbios mode=host/> must
not use host uuid in place of guest uuid.
The log lists things like -smbios type=1,vendor="Red Hat", which
is great for shell parsing, but not so great when you realize that
execve() then passes those literal "" on as part of the command
line argument, such that qemu sets SMBIOS with extra literal quotes.
The eventual addition of virCommand is needed before we have the API
to shell-quote a string representation of a command line, so that the
log can still be pasted into a shell, but without inserting extra
bytes into the execve() arguments.
* src/qemu/qemu_conf.c (qemuBuildSmbiosBiosStr)
(qemuBuildSmbiosSystemStr): Qemu doesn't like quotes around uuid
arguments, and the remaining quotes are passed literally to
smbios, making <smbios mode='host'/> inaccurate. Removing the
quotes makes the log harder to parse, but that can be fixed later
with virCommand improvements.
* tests/qemuxml2argvdata/qemuxml2argv-smbios.args: 'Fix' test; it
will need fixing again once virCommand learns how to shell-quote a
potential command line.
Humans consider January as month #1, while gmtime_r(3) calls it month #0.
While fixing it, render qemu's rtc parameter with leading zeros, as is more
commonplace.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=660194
* src/util/threads.h (virThreadID): New prototype.
* src/util/threads-pthread.c (virThreadID): New function.
* src/util/threads-win32.c (virThreadID): Likewise.
* src/libvirt_private.syms (threads.h): Export it.
* daemon/event.c (virEventInterruptLocked): Use it to avoid
warning on BSD systems.
"virCommandRun": if "cmd->outbuf" or "cmd->errbuf" is NULL,
libvirtd will be crashed when trying to start a qemu domain
(which invokes "virCommandRun"), it caused by we try to use
"*cmd->outbuf" and "*cmd->errbuf" regardless of cmd->outbuf
or cmd->errbuf is NULL.
* src/util/command.c (virCommandRun)
Two more calls to remote libvirtd have to be surrounded by
qemuDomainObjEnterRemoteWithDriver() and
qemuDomainObjExitRemoteWithDriver() to prevent possible deadlock between
two communicating libvirt daemons.
See commit f0c8e1cb37 for further details.
virDrvSupportsFeature API is allowed to return -1 on error while all but
one uses of VIR_DRV_SUPPORTS_FEATURE only check for (non)zero return
value. Let's make this macro return zero on error, which is what
everyone expects anyway.
This patch adds a mode_t parameter to virFileWriteStr().
If mode is different from 0, virFileWriteStr() will try
to create the file if it doesn't exist.
* src/util/util.h (virFileWriteStr): Alter signature.
* src/util/util.c (virFileWriteStr): Allow file creation.
* src/network/bridge_driver.c (networkEnableIpForwarding)
(networkDisableIPV6): Adjust clients.
* src/node_device/node_device_driver.c
(nodeDeviceVportCreateDelete): Likewise.
* src/util/cgroup.c (virCgroupSetValueStr): Likewise.
* src/util/pci.c (pciBindDeviceToStub, pciUnBindDeviceFromStub):
Likewise.
* src/qemu/qemu_conf.c (qemudExtractVersionInfo): Check for file
before executing it here, rather than in callers.
(qemudBuildCommandLine): Rewrite with virCommand.
* src/qemu/qemu_conf.h (qemudBuildCommandLine): Update signature.
* src/qemu/qemu_driver.c (qemuAssignPCIAddresses)
(qemudStartVMDaemon, qemuDomainXMLToNative): Adjust callers.
This proof of concept shows how two existing uses of virExec
and virRun can be ported to the new virCommand APIs, and how
much simpler the code becomes
This introduces a new set of APIs in src/util/command.h
to use for invoking commands. This is intended to replace
all current usage of virRun and virExec variants, with a
more flexible and less error prone API.
* src/util/command.c: New file.
* src/util/command.h: New header.
* src/Makefile.am (UTIL_SOURCES): Build it.
* src/libvirt_private.syms: Export symbols internally.
* tests/commandtest.c: New test.
* tests/Makefile.am (check_PROGRAMS): Run it.
* tests/commandhelper.c: Auxiliary program.
* tests/commanddata/test2.log - test15.log: New expected outputs.
* cfg.mk (useless_free_options): Add virCommandFree.
(msg_gen_function): Add virCommandError.
* po/POTFILES.in: New translation.
* .x-sc_avoid_write: Add exemption.
* tests/.gitignore: Ignore new built file.
* src/qemu/qemu_driver.c (though MACROS QEMU_VNC_PORT_MAX, and
QEMU_VNC_PORT_MIN are defined at the beginning, numbers (65535, 5900)
are still used, replace them)
The arguments passed to the thread function must be allocated on
the heap, rather than the stack, since it is possible for the
spawning thread to continue before the new thread runs at all.
In such a case, it is possible that the area of stack where the
thread args were stored is overwritten.
* src/util/threads-pthread.c, src/util/threads-win32.c: Allocate
thread arguments on the heap
Use macvtap specific functions depending on WITH_MACVTAP.
Use #if instead of #ifdef to check for WITH_MACVTAP, because
WITH_MACVTAP is always defined with value 0 or 1.
Also export virVMOperationType{To|From}String unconditional,
because they are used unconditional in the domain config code.
When dumping a domain, it's reasonable to save dump-file in raw format
if dump format is misconfigured or the corresponding compress program
is not available rather then fail dumping.
This patch introduces the usage of the pre-associate state of the IEEE 802.1Qbg standard on incoming VM migration on the target host. It is in response to bugzilla entry 632750.
https://bugzilla.redhat.com/show_bug.cgi?id=632750
For being able to differentiate the exact reason as to why a macvtap device is being created, either due to a VM creation or an incoming VM migration, I needed to pass that reason as a parameter from wherever qemudStartVMDaemon is being called in order to determine whether to send an ASSOCIATE (VM creation) or a PRE-ASSOCIATE (incoming VM migration) towards lldpad.
I am also fixing a problem with the virsh domainxml-to-native call on the way.
Gerhard successfully tested the patch with a recent blade network 802.1Qbg-compliant switch.
The patch should not have any side-effects on the 802.1Qbh support in libvirt, but Roopa (cc'ed) may want to verify this.
We currently use the next free veid although there's one given in the
domain xml. This currently breaks defining new domains since vmdef->name
and veid don't match leading to the following error later on:
error: Failed to define domain from 110.xml
error: internal error Could not set UUID
Since silently ignoring vmdef->name is not nice respect it instead. We
avoid veid collisions in the upper levels already.
This reverts commit
Log all errors at level INFO to stop polluting syslog
04bd0360f3.
and makes virRaiseErrorFull() log errors at debug priority
when called from inside libvirtd. This stops libvirtd from
polluting it's own log with client errors at error priority
that'll be reported and logged on the client side anyway.
When we set migrate_speed by json, we receive the following
error message:
libvirtError: internal error unable to execute QEMU command
'migrate_set_speed': Invalid parameter type, expected: number
The reason is that: the arguments of migrate_set_speed
by json is json number, not json string.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
The nodeinfo structure includes
nodes : the number of NUMA cell, 1 for uniform mem access
sockets : number of CPU socket per node
cores : number of core per socket
threads : number of threads per core
which does not work well for NUMA topologies where each node does not
consist of integral number of CPU sockets.
We also have VIR_NODEINFO_MAXCPUS macro in public libvirt.h which
computes maximum number of CPUs as (nodes * sockets * cores * threads).
As a result, we can't just change sockets to report total number of
sockets instead of sockets per node. This would probably be the easiest
since I doubt anyone is using the field directly. But because of the
macro, some apps might be using sockets indirectly.
This patch leaves sockets to be the number of CPU sockets per node (and
fixes qemu driver to comply with this) on machines where sockets can be
divided by nodes. If we can't divide sockets by nodes, we behave as if
there was just one NUMA node containing all sockets. Apps interested in
NUMA should consult capabilities XML, which is what they probably do
anyway.
This way, the only case in which apps that care about NUMA may break is
on machines with funky NUMA topology. And there is a chance libvirt
wasn't able to start any guests on those machines anyway (although it
depends on the topology, total number of CPUs and kernel version).
Nothing changes at all for apps that don't care about NUMA.
security_context_t happens to be a typedef for char*, and happens to
begin with a string usable as a raw context string. But in reality,
it is an opaque type that may or may not have additional information
after the first NUL byte, where that additional information can
include pointers that can only be freed via freecon().
Proof is from this valgrind run of daemon/libvirtd:
==6028== 839,169 (40 direct, 839,129 indirect) bytes in 1 blocks are definitely lost in loss record 274 of 274
==6028== at 0x4A0515D: malloc (vg_replace_malloc.c:195)
==6028== by 0x3022E0D48C: selabel_open (label.c:165)
==6028== by 0x3022E11646: matchpathcon_init_prefix (matchpathcon.c:296)
==6028== by 0x3022E1190D: matchpathcon (matchpathcon.c:317)
==6028== by 0x4F9D842: SELinuxRestoreSecurityFileLabel (security_selinux.c:382)
800k is a lot of memory to be leaking.
* src/storage/storage_backend.c
(virStorageBackendUpdateVolTargetInfoFD): Avoid leak on error.
* src/security/security_selinux.c
(SELinuxReserveSecurityLabel, SELinuxGetSecurityProcessLabel)
(SELinuxRestoreSecurityFileLabel): Use correct function to free
security_context_t.
Making this change makes it easier to spot the memory leaks
that will be fixed in the next patch.
* cfg.mk (sc_prohibit_xmlGetProp): New rule.
* .x-sc_prohibit_xmlGetProp: New exception.
* Makefile.am (EXTRA_DIST): Ship exception file.
* tools/virsh.c (cmdDetachInterface, cmdDetachDisk): Adjust
offenders.
* src/conf/storage_conf.c (virStoragePoolDefParseSource):
Likewise.
* src/conf/network_conf.c (virNetworkDHCPRangeDefParseXML)
(virNetworkIPParseXML): Likewise.
virConnectClose calls virUnrefConnect which in turn closes
all open drivers when the refcount of that connection dropped
to zero. This works fine when you free all other objects that
hold a ref to the connection before you close it, because in
this case virUnrefConnect is the one that removes the last
ref to the connection.
But it doesn't work when you close the connection first before
freeing the other objects. This is because the other virUnref*
functions call virReleaseConnect when they detect that the
connection's refcount dropped to zero. In this case another
virUnref* function (different from virUnrefConnect) removes the
last ref to the connection. This results in not closing the
open drivers and leaking things that should have been cleaned
up in the driver close functions.
To fix this move the driver close calls to virReleaseConnect.
Except LXC and UML driver, implementations of all other drivers
simply return 0, because these drivers doesn't have config both
in memory and on disk, no need to track if the domain of these
drivers updated or not.
Rename "xenUnifiedDomainisPersistent" to "xenUnifiedDomainIsPersistent"
* esx/esx_driver.c
* lxc/lxc_driver.c
* opennebula/one_driver.c
* openvz/openvz_driver.c
* phyp/phyp_driver.c
* test/test_driver.c
* uml/uml_driver.c
* vbox/vbox_tmpl.c
* xen/xen_driver.c
* xenapi/xenapi_driver.c
introduce new public API "virDomainIsUpdated"
* src/conf/domain_conf.h (new member "updated" for "virDomainObj")
* src/libvirt_public.syms
* include/libvirt/libvirt.h.in
gnulib wraps Windows' SOCKET handle based send() and recv() functions
into file descriptor based ones that are used in libvirt.
Even though GnuTLS is using gnulib too, it explicitly doesn't use
gnulib's replacement functions on Windows. By default GnuTLS uses the
SOCKET handle based send() and recv(). This makes gnutls_handshake()
fail internally with a WSAENOTSOCK error because libvirt passes a
file descriptor; GnuTLS needs the SOCKET handle.
To avoid this mismatch make sure that GnuTLS uses gnulib's replacment
functions, by setting custom pull() and push() functions for GnuTLS.
The stdio.h header has a function called 'remove' declared. This
clashes with the 'remove' parameter in virShrinkN
* src/util/memory.c: Rename 'remove' to 'toremove'
The SCSI volumes currently get a name like '17:0:0:1' based
on $host:$bus:$target:$lun. The names are intended to be unique
per pool and stable across pool restarts. The inclusion of the
$host component breaks this, because the $host number for iSCSI
pools is dynamically allocated by the kernel at time of login.
This changes the name to be 'unit:0:0:1', ie removes the leading
host component. The 'unit:' prefix is just to ensure the volume
name doesn't start with a number and make it clearer when seen
out of context.
* src/storage/storage_backend_scsi.c: Improve volume name
field value stability and uniqueness
Many operations are not valid on inactive storage pools. The
storage driver is currently returning VIR_ERR_INTERNAL_ERROR
in these cases, rather than the more suitable error code
VIR_ERR_OPERATION_INVALID
* src/storage/storage_driver.c: Fix error code when pool
is not active
When libvirt starts up all storage pools default to the inactive
state, even if the underlying storage is already active on the
host. This introduces a new API into the internal storage backend
drivers that checks whether a storage pool is already active. If
the pool is active at libvirtd startup, the volume list will be
immediately populated.
* src/storage/storage_backend.h: New internal API for checking
storage pool state
* src/storage/storage_driver.c: Check whether a pool is active
upon driver startup
* src/storage/storage_backend_fs.c, src/storage/storage_backend_iscsi.c,
src/storage/storage_backend_logical.c, src/storage/storage_backend_mpath.c,
src/storage/storage_backend_scsi.c: Add checks for pool state
Since the previous patch added support for parsing the output of
the 'sendtargets' command, it is now trivial to support the
storage pool discovery API.
Given a hostname and optional portnumber and initiator IQN,
the code can return a full list of storage pool source docs,
each one representing a iSCSI target.
* src/storage/storage_backend_iscsi.c: Wire up target
auto-discovery
The Linux iSCSI initiator toolchain has the dubious feature that
if you ever run the 'sendtargets' command to merely query what
targets are available from a server, the results will be recorded
in /var/lib/iscsi. Any time the '/etc/init.d/iscsi' script runs
in the future, it will then automatically login to all those
targets. /etc/init.d/iscsi is automatically run whenever a NIC
comes online.
So from the moment you ask a server what targets are available,
your client will forever more automatically try to login to all
targets without ever asking if you actually want it todo this.
To stop this stupid behaviour, we need to run
iscsiadm --portal $PORTAL --target $TARGET
--op update --name node.startup --value manual
For every target on the server.
* src/storage/storage_backend_iscsi.c: Disable automatic login
for targets found as a result of a 'sendtargets' command
The following series of patches are adding significant
extra functionality to the iSCSI driver. THe current
internal helper methods are not sufficiently flexible
to cope with these changes. This patch refactors the
code to avoid needing to have a virStoragePoolObjPtr
instance as a parameter, instead passing individual
target, portal and initiatoriqn parameters.
It also removes hardcoding of port 3260 in the portal
address, instead using the XML value if any.
* src/storage/storage_backend_iscsi.c: Refactor internal
helper methods
The XML docs describe a 'port' attribute for the
storage source <host> element, but the parser never
handled it.
* docs/schemas/storagepool.rng: Define port attribute
* src/conf/storage_conf.c: Add missing parsing/formatting
of host port number
* src/conf/storage_conf.h: Remove bogus/unused 'protocol' field
When running non-root, the QEMU log file is usually opened with
truncation, since there is no logrotate for non-root usage.
This means that when libvirt logs the shutdown timestamp, the
log is accidentally truncated
* src/qemu/qemu_driver.c: Never truncate log file with shutdown
message
The QEMU logger appends a ':' to the timestamp when it deems
it neccessary, so the virTimestamp API should not duplicate
this
* src/util/util.c: Remove trailing ':' from timestamp
Everytime a public API returns an error, libvirtd pollutes
syslog with that error message. Reduce the error logging
level to INFO so these don't appear by default.
* src/util/virterror.c: Log all errors at INFO
The virFork call resets all logging handlers that may have been
set. Re-enable them after fork in virExec, so that env variables
fir LIBVIRT_LOG_OUTPUTS and LIBVIRT_LOG_FILTERS take effect
until the execve()
* src/util/util.c: Preserve logging in child in virExec
To allow messages from different threads to be untangled,
include an integer thread identifier in log messages.
* src/util/logging.c: Include thread ID
* src/util/threads.h, src/util/threads.h, src/util/threads-pthread.c:
Add new virThreadSelfID() function
* configure.ac: Check for sys/syscall.h
Do this by adding a helper function to get the persistent domain config. This
should be useful for other functions that may eventually want to alter
the persistent domain config (attach/detach device). Also make similar changes
to the test drivers setvcpus command.
A caveat is that the function will return the running config for a transient
domain, rather than error. This simplifies callers, as long as they use
other methods to ensure the guest is persistent.
Doing 'virsh setvcpus $vm --config 10' doesn't check the value against the
domains maxvcpus value. A larger value for example will prevent the guest
from starting.
Also make a similar change to the test driver.
The current semantics of non-persistent hotplug/update are confusing: the
changes will persist as long as the in memory domain definition isn't
overwritten. This means hotplug changes stay around until the domain is
redefined or libvirtd is restarted.
Call virDomainObjSetDefTransient at VM startup, so that we properly discard
hotplug changes when the VM is shutdown.
This function sets the running domain definition as transient, by reparsing
the persistent config and assigning it to newDef. This ensures that any
changes made to the running definition and not the persistent config are
discarded when the VM is shutdown.
This patch makes two corrections to the newly-added QED support patch series:
- Correct the QED header field offsets
- Remove XML parsing for VIR_STORAGE_FILE_AUTO_SAFE
Signed-off-by: Adam Litke <agl@us.ibm.com>
The IP address learning thread was causing a deadlock when it instantiated a filter while a filter update/change was ongoing. The reason for this was the ordering of locks due to the following calls
virNWFilterUnlockFilterUpdates()
virNWFilterPoolObjFindByName()
The below patch now puts the order of the locks in the above shown order when instantiating the filter from the IP address learning thread.
Implement getBackingStore() for QED images. The header format is defined in
the QED spec: http://wiki.qemu.org/Features/QED .
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Stefan Hajnoczi <stefan.hajnoczi@uk.ibm.com>
Cc: Anthony Liguori <aliguori@linux.vnet.ibm.com>
Add an entry in fileTypeInfo for QED image files.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Stefan Hajnoczi <stefan.hajnoczi@uk.ibm.com>
Cc: Anthony Liguori <aliguori@linux.vnet.ibm.com>
Disk image formats that wish to opt-out of version validation are supposed to
set versionOffset to -1 in their fileTypeInfo entry.
By unconditionally returning False for these formats,
virStorageFileMatchesVersion() incorrectly reports a version mismatch when the
test was actually skipped. The correct behavior is to return True so these
formats can be successfully probed using the magic bytes alone.
Signed-off-by: Adam Litke <agl@us.ibm.com>
The code in SELinuxRestoreSecurityChardevLabel() was trying to
use SELinuxSetFilecon directly for devices or file types while
it should really use SELinuxRestoreSecurityFileLabel encapsulating
routine, which avoid various problems like resolving symlinks,
making sure he file exists and work around NFS problems
Include locale.h for setlocale().
Revert the usage string back to it's original form.
Use puts() instead of fputs(), as fputs() expects a FILE*.
Add closing parenthesis to some vah_error() calls.
Use argv[0] instead of an undefined argv0.
These messages are visible to the user, so they should be
consistently translated.
* cfg.mk (msg_gen_function): Add vah_error, vah_warning.
* src/security/virt-aa-helper.c: Translate messages.
(catchXMLError): Fix capitalization.
Per the gettext developer:
http://lists.gnu.org/archive/html/bug-gnu-utils/2010-10/msg00019.htmlhttp://lists.gnu.org/archive/html/bug-gnu-utils/2010-10/msg00021.html
gettext() doesn't work correctly on all platforms unless you have
called setlocale(). Furthermore, gnulib's gettext.h has provisions
for setting up a default locale, which is the preferred method for
libraries to use gettext without having to call textdomain() and
override the main program's default domain (virInitialize already
calls bindtextdomain(), but this is insufficient without the
setlocale() added in this patch; and a redundant bindtextdomain()
in this patch doesn't hurt, but serves as a good example for other
packages that need to bind a second translation domain).
This patch is needed to silence a new gnulib 'make syntax-check'
rule in the next patch.
* daemon/libvirtd.c (main): Setup locale and gettext.
* src/lxc/lxc_controller.c (main): Likewise.
* src/security/virt-aa-helper.c (main): Likewise.
* src/storage/parthelper.c (main): Likewise.
* tools/virsh.c (main): Fix exit status.
* src/internal.h (DEFAULT_TEXT_DOMAIN): Define, for gettext.h.
(_): Simplify definition accordingly.
* po/POTFILES.in: Add src/storage/parthelper.c.
I am replacing the last instances of close() I found with VIR_CLOSE() / VIR_FORCE_CLOSE respectively.
The first part patches virsh, which I missed out on previously.
The 2nd patch I had left out intentionally to look at it more carefully:
The 'closed' variable could be easily removed since it wasn't used anywhere else. The possible race condition that could result from the filedescriptor being closed and not set to -1 (and possibly let us write into 'something' totally different if the fd was allocated by another thread) seems to be prevented by the qemuMonitorLock() already placed around the code that reads from or writes to the fd. So the change of this code as shown in the patch should not have any side-effects.
Rather than only cleaning any remaining ebtables rules, also clean those applied to iptables and ip6tables when detecting the IP address of an interface. Previous applied iptables rules may hinder DHCP packets.
Similarly to deprecating close(), I am now deprecating fclose() and
introduce VIR_FORCE_FCLOSE() and VIR_FCLOSE(). Also, fdopen() is replaced with
VIR_FDOPEN().
Most of the files are opened in read-only mode, so usage of
VIR_FORCE_CLOSE() seemed appropriate. Others that are opened in write
mode already had the fclose()< 0 check and I converted those to
VIR_FCLOSE()< 0.
I did not find occurrences of possible double-closed files on the way.
Currently only support domain start and shutdown, for domain start,
record timestamp before the qemu command line, and for domain shutdown,
just say it's shutting down with timestamp.
* src/qemu/qemu_driver.c (qemudStartVMDaemon, qemudShutdownVMDaemon
introduced two macros - START_POSTFIX, SHUTDOWN_POSTFIX)
* src/nwfilter/nwfilter_ebiptables_driver.c
(ebiptablesWriteToTempFile): Use /bin/sh.
(bash_cmd_path): Delete.
(ebiptablesDriverInit, ebiptablesDriverShutdown): No need to
search for bash.
(CMD_EXEC): Prefer $() over ``, since we can assume POSIX.
(iptablesSetupVirtInPost): Use portable 'test' syntax.
(iptablesLinkIPTablesBaseChain): Use POSIX $(()) syntax.
This is more flexible regarding the location of the python binary
but doesn't allow to pass the -u flag. The -i flag can be passed
from inside the script using the PYTHONINSPECT env variable.
This fixes a problem with the esx_vi_generator.py on FreeBSD.
This makes the storage driver fail when the connection is
opened with the VIR_CONNECT_RO flag, resulting in a read-only
connection with no storage driver.
In a first step I am converting the netlink message construction in
macvtap code to use libnl. It's pretty much a 1:1 conversion except that
now the message needs to be allocated and deallocated.
When <uuid> is not in the XML, a virUUIDGenerate() ends up being called which
is unnecessary and can lead to crashes if /dev/urandom isn't available
because virRandomInitialize() is not called within virt-aa-helper. This patch
adds verify_xpath_context() and updates caps_mockup() to use it.
Bug-Ubuntu: https://launchpad.net/bugs/672943
If virDomainAttachDevice() was called with an image that was located
on a root-squashed NFS server, and in a directory that was unreadable
by root on the machine running libvirtd, the attach would fail due to
an attempt to change the selinux label of the image with EACCES (which
isn't covered as an ignore case in SELinuxSetFilecon())
NFS doesn't support SELinux labelling anyway, so we mimic the failure
handling of commit 93a18bbafa, which
just ignores the errors if the target is on an NFS filesystem (in
SELinuxSetSecurityAllLabel() only, though.)
This can be seen as a follow-on to commit
347d266c51, which ignores file open
failures of files on NFS that occur directly in
virDomainDiskDefForeachPath() (also necessary), but does not ignore
failures in functions that are called from there (eg
SELinuxSetSecurityFileLabel()).
Introduce implementations of the virDomainOpenConsole() API
for LXC, Xen and UML drivers.
* src/lxc/lxc_driver.c, src/lxc/lxc_driver.c,
src/xen/xen_driver.c: Wire up virDomainOpenConsole
The util/threads.c/h code already has APIs for mutexes,
condition variables and thread locals. This commit adds
in code for actually creating threads.
* src/libvirt_private.syms: Export new symbols
* src/util/threads.h: Define APIs virThreadCreate, virThreadSelf,
virThreadIsSelf and virThreadJoin
* src/util/threads-win32.c, src/util/threads-win32.h: Win32
impl of threads
* src/util/threads-pthread.c, src/util/threads-pthread.h: POSIX
impl of threads
This provides an implementation of the virDomainOpenConsole
API with the QEMU driver. For the streams code, this reuses
most of the code previously added for the tunnelled migration
streams since it is generic.
* src/qemu/qemu_driver.c: Support virDomainOpenConsole
To avoid the need for duplicating implementations of virStream
drivers, provide a generic implementation that can handle any
FD based stream. This code is copied from the existing impl
in the QEMU driver, with the locking moved into the stream
impl, and addition of a read callback
The FD stream code will refuse to operate on regular files or
block devices, since those can't report EAGAIN properly when
they would block on I/O
* include/libvirt/virterror.h, include/libvirt/virterror.h: Add
VIR_FROM_STREAM error domain
* src/qemu/qemu_driver.c: Remove code obsoleted by the new
generic streams driver.
* src/fdstream.h, src/fdstream.c, src/fdstream.c,
src/libvirt_private.syms: Generic reusable FD based streams
Now that bi-directional, non-blocking streams are supported
in the remote driver, some of the VIR_WARN statements need
to be reduced to VIR_DEBUG.
* src/remote/remote_driver.c: Lower logging level
This provides an implementation of the virDomainOpenConsole
API for the remote driver client and server.
* daemon/remote.c: Server side impl
* src/remote/remote_driver.c: Client impl
* src/remote/remote_protocol.x: Wire definition
To enable virsh console (or equivalent) to be used remotely
it is necessary to provide remote access to the /dev/pts/XXX
pseudo-TTY associated with the console/serial/parallel device
in the guest. The virStream API provide a bi-directional I/O
stream capability that can be used for this purpose. This
patch thus introduces a virDomainOpenConsole API that uses
the stream APIs.
* src/libvirt.c, src/libvirt_public.syms,
include/libvirt/libvirt.h.in, src/driver.h: Define the
new virDomainOpenConsole API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c,
src/opennebula/one_driver.c, src/openvz/openvz_driver.c,
src/phyp/phyp_driver.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c,
src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c, src/xenapi/xenapi_driver.c: Stub
API entry point
The current remote driver code for streams only supports
blocking I/O mode. This is fine for the usage with migration
but is a problem for more general use cases, in particular
bi-directional streams.
This adds supported for the stream callbacks and non-blocking
I/O. with the minor caveat is that it doesn't actually do
non-blocking I/O for sending stream data, only receiving it.
A future patch will try to do non-blocking sends, but this is
quite tricky to get right.
* src/remote/remote_driver.c: Allow non-blocking I/O for
streams and support callbacks
The /dev/console device inside the container must NOT map
to the real /dev/console device node, since this allows the
container control over the current host console. A fun side
effect of this is that starting a container containing a
real Fedora OS will kill off your X server.
Remove the /dev/console node, and replace it with a symlink
to the primary console TTY
* src/lxc/lxc_container.c: Replace /dev/console with a
symlink to /dev/pty/0
* src/lxc/lxc_controller.c: Remove /dev/console from cgroups
ACL
QEMU allows forcing a CDROM eject even if the guest has locked the device.
Expose this via a new UpdateDevice flag, VIR_DOMAIN_DEVICE_MODIFY_FORCE.
This has been requested for RHEV:
https://bugzilla.redhat.com/show_bug.cgi?id=626305
v2: Change flag name, bool cleanups
I am trying to use a qcow image with libvirt where the backing 'file' is a
qemu-nbd server. Unfortunately virDomainDiskDefForeachPath() assumes that
backingStore is always a real file so something like 'nbd:0:3333' is rejected
because a file with that name cannot be accessed. Note that I am not worried
about directly using nbd images. That would require a new disk type with XML
markup, etc. I only want it to be permitted as a backingStore
The following patch implements danpb's suggestion:
> I think I'm inclined to push the logic for skipping NBD one stage higher.
> I'd rather expect virStorageFileGetMetadata() to return all backing
> stores, even if not files. The virDomainDiskDefForeachPath() method
> should definitely ignore non-file backing stores though.
>
> So what I'm thinking is to extend the virStorageFileMetadata struct and
> just add a 'bool isFile' field to it. Default this field to true, unless
> you see the prefix of nbd: in which case set it to false. The
> virDomainDiskDefForeachPath() method can then skip over any backing
> store with isFile == false
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
xencapstest calls xenHypervisorMakeCapabilitiesInternal with conn == NULL
which calls xenDaemonNodeGetTopology with conn == NULL when a recent
enough Xen was detected (sys_interface_version >= SYS_IFACE_MIN_VERS_NUMA).
But xenDaemonNodeGetTopology insists in having conn != NULL and fails,
because it expects to be able to talk to an actual xend.
We cannot do that in a 'make check' test. Therefore, only call the xend
subdriver function when conn isn't NULL.
Reported by Andy Howell and Jim Fehlig.
Using automated replacement with sed and editing I have now replaced all
occurrences of close() with VIR_(FORCE_)CLOSE() except for one, of
course. Some replacements were straight forward, others I needed to pay
attention. I hope I payed attention in all the right places... Please
have a look. This should have at least solved one more double-close
error.
This extends the SPICE XML to allow channel security options
<graphics type='spice' port='-1' tlsPort='-1' autoport='yes'>
<channel name='main' mode='secure'/>
<channel name='record' mode='insecure'/>
</graphics>
Any non-specified channel uses the default, which allows both
secure & insecure usage
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/libvirt_private.syms: Add XML syntax for specifying per
channel security options for spice.
* src/qemu/qemu_conf.c: Configure channel security with spice
QEMU crashes & burns if you try multiple Cirrus video cards, but
QXL copes fine. Adapt QEMU config code to allow multiple QXL
video cards
* src/qemu/qemu_conf.c: Support multiple QXL video cards
This extends the XML syntax for <graphics> to allow a password
expiry time to be set
eg
<graphics type='vnc' port='5900' autoport='yes' keymap='en-us' passwd='12345' passwdValidTo='2010-04-09T15:51:00'/>
The timestamp is in UTC.
* src/conf/domain_conf.h: Pull passwd out into separate struct
virDomainGraphicsAuthDef to allow sharing between VNC & SPICE
* src/conf/domain_conf.c: Add parsing/formatting of new passwdValidTo
argument
* src/opennebula/one_conf.c, src/qemu/qemu_conf.c, src/qemu/qemu_driver.c,
src/xen/xend_internal.c, src/xen/xm_internal.c: Update for changed
struct containing VNC password
In common with VNC, the QEMU driver configuration file is used
specify the host level TLS certificate location and a default
password / listen address
* src/qemu/qemu.conf: Add spice_listen, spice_tls,
spice_tls_x509_cert_dir & spice_password config params
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Parsing of
spice config parameters and updating -spice arg generation
to use them
* tests/qemuxml2argvdata/qemuxml2argv-graphics-spice-rhel6.args,
tests/qemuxml2argvtest.c: Expand test case to cover driver
level configuration
This supports the -spice argument posted for review against
the latest upstream QEMU/KVM. This supports the bare minimum
config with port, TLS port & listen address. The x509 bits are
added in a later patch.
* src/qemu_conf.c, src/qemu_conf.h: Add SPICE flag. Check for
-spice availability. Format -spice arg for command line
* qemuhelptest.c: Add SPICE flag
* qemuxml2argvdata/qemuxml2argv-graphics-spice.args: Add <graphics>
for spice
* qemuxml2argvdata/qemuxml2argv-graphics-spice.xml: Add -spice arg
* qemuxml2argvtest.c: Add SPICE flag
This supports the '-vga qxl' parameter in upstream QEMU/KVM
which has SPICE support added. This isn't particularly useful
until you get the next patch for -spice support. Also note that
while the libvirt XML supports multiple video devices, this
patch only supports a single one. A later patch can add support
for 2nd, 3rd, etc PCI devices for QXL
* src/qemu/qemu_conf.h: Flag for QXL support
* src/qemu/qemu_conf.c: Probe for '-vga qxl' support and implement it
* tests/qemuxml2argvtest.c, tests/qemuxml2xmltest.c,
tests/qemuxml2argvdata/qemuxml2argv-graphics-spice.args,
tests/qemuxml2argvdata/qemuxml2argv-graphics-spice.xml: Test
case for generating spice args with RHEL6 kvm
This adds an element
<graphics type='spice' port='5903' tlsPort='5904' autoport='yes' listen='127.0.0.1'/>
This is the bare minimum that should be exposed in the guest
config for SPICE. Other parameters are better handled as per
host level configuration tunables
* docs/schemas/domain.rng: Define the SPICE <graphics> schema
* src/domain_conf.h, src/domain_conf.c: Add parsing and formatting
for SPICE graphics config
* src/qemu_conf.c: Complain about unsupported graphics types
* src/qemu_conf.c: Add dummy entry in enumeration
* docs/schemas/domain.rng: Add 'qxl' as a type for the <video> tag
* src/domain_conf.c, src/domain_conf.h: Add QXL to video type
enumerations
There is no point in trying to fill params beyond the first error,
because when lxcDomainGetMemoryParameters returns -1 then the caller
cannot detect which values in params are valid.
The patch is based on the possiblity in the QEmu command line to
add -smbios options allowing to override the default values picked
by QEmu. We need to detect this first from QEmu help output.
If the domain is defined with smbios to be inherited from host
then we pass the values coming from the Host own SMBIOS, but
if the domain is defined with smbios to come from sysinfo, we
use the ones coming from the domain definition.
* src/qemu/qemu_conf.h: add the QEMUD_CMD_FLAG_SMBIOS_TYPE enum
value
* src/qemu/qemu_conf.c: scan the help output for the smbios support,
and if available add support based on the domain definitions,
and host data
* tests/qemuhelptest.c: add the new enum in the outputs
Move existing routines about virSysinfoDef to an util module,
add a new entry point virSysinfoRead() to read the host values
with dmidecode
* src/conf/domain_conf.c src/conf/domain_conf.h src/util/sysinfo.c
src/util/sysinfo.h: move to a new module, add virSysinfoRead()
* src/Makefile.am: handle the new module build
* src/libvirt_private.syms: new internal symbols
* include/libvirt/virterror.h src/util/virterror.c: defined a new
error code for that module
* po/POTFILES.in: add new file for translations
the element has a mode attribute allowing only 3 values:
- emulate: use the smbios emulation from the hypervisor
- host: try to use the smbios values from the node
- sysinfo: grab the values from the <sysinfo> fields
* docs/schemas/domain.rng: extend the schemas
* src/conf/domain_conf.h: add the flag to the domain config
* src/conf/domain_conf.h: parse and serialize the smbios if present
* src/conf/domain_conf.h: defines a new internal type added to the
domain structure
* src/conf/domain_conf.c: parsing and serialization of that new type
During a shutdown/restart cycle libvirtd forgot the macvtap device name that it had created on behalf of a VM so that a stale macvtap device remained on the host when the VM terminated. Libvirtd has to actively tear down a macvtap device and it uses its name for identifying which device to tear down.
The solution is to not blank out the <target dev='...'/> completely, but only blank it out on VMs that are not active. So, if a VM is active, the device name makes it into the XML and is also being parsed. If a VM is not active, the device name is discarded.
virPipeReadUntilEOF is used to read the stdout of exec'ed
and this could fail to capture the full output and read only
1024 bytes.
The problem is that this is based on a poll loop, and in the
loop we read at most 1024 bytes per file descriptor, but we also
note in the loop if poll indicates that the process won't output
more than that on that fd by setting finished[i] = 1.
The simplest way is that if we read a full buffer make sure
finished[i] is still 0 because we will need another pass in the
loop.
The remoteIO() method has wierd calling conventions, where
it is passed a pre-allocated 'struct remote_call *' but
then free()s it itself, instead of letting the caller free().
This fixes those weird semantics
* src/remote/remote_driver.c: Sanitize semantics of remoteIO
method wrt to memory release
A couple of places in the text monitor were overwriting the
'ret' variable with a >= 0 value before success was actually
determined. So later error paths would not correctly return
the -1 value. The drive_add code was not checking for errors
like missing command
* src/qemu/qemu_monitor_text.c: Misc error handling fixes
NFS in root squash mode may prevent opening disk images to
determine backing store. Ignore errors in this scenario.
* src/security/security_selinux.c: Ignore open failures on disk
images
NFS does not support file labelling, so ignore this error
for stdin_path when on NFS.
* src/security/security_selinux.c: Ignore failures on labelling
stdin_path on NFS
* src/util/storage_file.c, src/util/storage_file.h: Refine
virStorageFileIsSharedFS() to allow it to check for a
specific FS type.
Commit 06f81c63eb attempted to make
QEMU driver ignore the failure to relabel 'stdin_path' if it was
on NFS. The actual result was that it ignores *all* failures to
label any aspect of the VM, unless stdin_path is non-NULL and
is not on NFS.
* src/qemu/qemu_driver.c: Treat all relabel failures as terminal
* src/xen/xend_internal.c (xenDaemonFormatSxpr): Hoist verify
outside of function to avoid a -Wnested-externs warning.
* src/xen/xm_internal.c (xenXMDomainConfigFormat): Likewise.
Reported by Daniel P. Berrange.
* src/xen/xen_hypervisor.c (MAX_VIRT_CPUS): Move...
* src/xen/xen_driver.h (MAX_VIRT_CPUS): ...so all xen code can see
same value.
* src/xen/xend_internal.c (sexpr_to_xend_domain_info)
(xenDaemonDomainGetVcpusFlags, xenDaemonParseSxpr)
(xenDaemonFormatSxpr): Work if MAX_VIRT_CPUS is 64 on a platform
where long is 64-bits.
* src/xen/xm_internal.c (xenXMDomainConfigParse)
(xenXMDomainConfigFormat): Likewise.
Add dump_image_format[] to qemu.conf and support compressed dump
at virsh dump. coredump compression is important for saving disk space
in an environment where multiple guests run.
In general, "disk space for dump" is specially allocated and will be
a dead space in the system. It's used only at emergency. So, it's better
to have both of save_image_format and dump_image_format. "save" is done
in scheduled manner with enough calculated disk space for it.
This code reuses some of save_image_format[] and supports the same format.
Changelog:
- modified libvirtd_qemu.aug
- modified test_libvirtd_qemu.aug
- fixed error handling of qemudSaveCompressionTypeFromString()
When we mount any cgroup without "-o devices", we will fail to start vms:
error: Failed to start domain vm1
error: Unable to deny all devices for vm1: No such file or directory
When we mount any cgroup without "-o cpu", we will fail to get schedinfo:
Scheduler : posix
error: unable to get cpu shares tunable: No such file or directory
We should only use the cgroup controllers which are mounted on host.
So I add virCgroupMounted() for qemuCgroupControllerActive()
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
This partly reverts df90ca7661.
Don't disable the VirtualBox driver when configure can't find
VBoxXPCOMC.so, rely on detection at runtime again instead.
Keep --with-vbox=/path/to/virtualbox intact, added to for:
https://bugzilla.redhat.com/show_bug.cgi?id=609185
Detection order for VBoxXPCOMC.so:
1. VBOX_APP_HOME environment variable
2. configure provided location
3. hardcoded list of known locations
4. dynamic linker search path
Also cleanup the glue code and improve error reporting.
fix warning
CC libvirt_util_la-virtaudit.lo
cc1: warnings being treated as errors
util/virtaudit.c: In function 'virAuditEncode':
util/virtaudit.c:146: error: implicit declaration of function 'virAsprintf' [-Wimplicit-function-declaration]
util/virtaudit.c:146: error: nested extern declaration of 'virAsprintf' [-Wnested-externs]
The 2nd and 3rd hunk show the only double-closed file descriptor code part that I found while trying to clean up close(). The first hunk seems a harmless cleanup in that same file.
https://bugzilla.redhat.com/show_bug.cgi?id=638285 - when migrating
a guest, it was very easy to provoke a race where an application
could query block information on a VM that had just been migrated
away. Any time qemu code obtains a job lock, it must also check
that the VM was not taken down in the time where it was waiting
for the lock.
* src/qemu/qemu_driver.c (qemudDomainSetMemory)
(qemudDomainGetInfo, qemuDomainGetBlockInfo): Check that vm still
exists after obtaining job lock, before starting monitor action.
During virtual network startup, the iptables rule that allows tftp
traffic is only added if network->def->tftproot is non-empty, but when
the virtual network is destroyed, we had been unconditionally trying
to delete the rule. This was harmless, except that it created a bogus
error message.
This patch conditionalizes the delete command in the same manner that
the insert command is already conditionalized.
Commit 9bd3cce0d2 added virFork and
virDriverLoadModule to libvirt_private.syms, but virFork didn't have
a body on Win32 and virDriverLoadModule was already correctly
exported conditional via libvirt_driver_modules.syms.
Add auditing of all initial disk/net assignments to QEMU guests
at startup. Add auditing for all hotplug & unplug events and
disk media changes.
* src/qemu/qemu_driver.c: Add disk/net resource auditing
Add a helper API for ecscaping the value in audit log
messages
* src/util/virtaudit.h, src/util/virtaudit.c,
src/libvirt_private.syms: Add virAuditEncode
Revert most of commit a8b5f9bd27.
The audit hooks will be re-added directly in the QEMU driver code
in a future commit
* daemon/remote.c: Remove all audit logging hooks
* src/qemu/qemu_driver.c: Remove all audit logging hooks
When using 0-prefixed numbers, QEmu will interpret them as octal numbers
(as C convention says); this means that if you attach a device that has
addr > 10 (decimal) you're going to attach a different device.
Older dash mistakenly truncates regular files when using <> redirection;
this kills our use of double dd to reduce storage overhead when
saving qemu images. But qemu insists on running a command through
/bin/sh, so we work around it by having qemu run $sh -c 'real command'
when we have a replacement $sh in mind.
* configure.ac (VIR_WRAPPER_SHELL): Define to a replacement shell,
if /bin/sh is broken on <> redirection.
* src/qemu/qemu_monitor.h (VIR_WRAPPER_SHELL_PREFIX)
(VIR_WRAPPER_SHELL_SUFFIX): New macros.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextMigrateToFile): Use
them.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMigrateToFile):
Likewise.
When failing to start a virtual network, we have to cleanup,
tearing down any iptables rules. If the iptables rules were
not present yet though, this raises an error, which squashes
the original error we were handling.
* src/network/bridge_driver.c: When failing to start a virtual
network, don't squash the original error in cleanup
The network address was being set to 192.168.122.0 instead
of 192.168.122.0/24. Fix this by removing the unneccessary
'network' field from virNetworkDef and just pass the
network address and netmask into the iptables APIs directly.
* src/conf/network_conf.h, src/conf/network_conf.c: Remove
the 'network' field from virNEtworkDef.
* src/network/bridge_driver.c: Update for iptables API changes
* src/util/iptables.c, src/util/iptables.h: Require the
network address + netmask pair to be passed in
So far, readonly=on option is used when qemu supports -device. However,
there are qemu versions which support readonly option with -drive
although they don't have support for -device.
The boot server IP address is optional, so it needs to be
checked before attempting to parse it.
* src/conf/network_conf.c: Don't parse NULL ip address for
boot server
Instead of storing the IP address string in virNetwork related
structs, store the parsed virSocketAddr. This will make it
easier to add IPv6 support in the future, by letting driver
code directly check what address family is present
* src/conf/network_conf.c, src/conf/network_conf.h,
src/network/bridge_driver.c: Convert to use virSocketAddr
in virNetwork, instead of char *.
* src/util/bridge.c, src/util/bridge.h,
src/util/dnsmasq.c, src/util/dnsmasq.h,
src/util/iptables.c, src/util/iptables.h: Convert to
take a virSocketAddr instead of char * for any IP
address parameters
* src/util/network.h: Add macros to determine if an address
is set, and what address family is set.
It is useful to know where the client is connecting from,
so include the socket address in probe data.
* daemon/libvirtd.h: Use virSocketAddr for storing client
address and keep printable address handy for logging
* daemon/libvirtd.c: Include socket address in client
connect/disconnect probes
* daemon/probes.d: Add socket address to probes
* examples/systemtap/client.stp: Print socket address
* src/util/network.h: Add sockaddr_un to virSocketAddr union
The inet_pton and inet_ntop functions are obsolete, replaced
by getaddrinfo+getnameinfo with the AI_NUMERICHOST flag set.
These can be accessed via the virSocket APIs.
The bridge.c code had methods for fetching the IP address of
a bridge which used inet_ntop. Aside from the use of inet_ntop
these methods are broken, because a NIC can have multiple
addresses and this only returns one address. Since the methods
are never used, just remove them.
* src/conf/network_conf.c, src/nwfilter/nwfilter_learnipaddr.c:
Replace inet_pton and inet_ntop with virSocket APIs
* src/util/bridge.c, src/util/bridge.h: Remove unused methods
which called inet_ntop.
The addrToString functionality is now available via the
virSocketFormatAddrFull method.
* daemon/remote.c, src/remote/remote_driver.c: Remove
addrToString methods
The virSocketParse method was not doing any error reporting
which meant the true cause of the problem was lost. Remove
all error reporting from callers, and push it into virSocketParse
* src/util/network.c: Add error reporting to virSocketParse
* src/conf/domain_conf.c, src/conf/network_conf.c,
src/network/bridge_driver.c: Remove error reporting in
callers of virSocketParse
The getnameinfo() function is more flexible than inet_ntop()
avoiding the need to if/else the code based on socket family.
Also make it support UNIX socket addrs and allow inclusion
of a port (service) address. Finally do proper error reporting
via normal APIs.
* src/conf/domain_conf.c, src/nwfilter/nwfilter_ebiptables_driver.c,
src/qemu/qemu_conf.c: Fix error handling with virSocketFormat
* src/util/network.c: Rewrite virSocketFormat to use getnameinfo
and cope with UNIX socket addrs.
The nwIPAddress was simply a wrapper about virSocketAddr.
Just use the latter directly, removing all the extra field
de-references from code & helper APIs for parsing/formatting.
Also remove all the redundant casts from strong types to
void * and then immediately back to strong types.
* src/conf/nwfilter_conf.h: Remove nwIPAddress
* src/conf/nwfilter_conf.c, src/nwfilter/nwfilter_ebiptables_driver.c:
Update to use virSocketAddr and remove void * casts.
There was a typo in the IPv6 path of virSocketCheckNetmask which
caused it to never execute.
* src/util/network.c: s/AF_INET/AF_INET6/ in virSocketCheckNetmask
The virSocketParseAddr function was accepting any AF_* constant
and using that to set the ai_flags field in struct addrinfo.
This is invalid, since address families must go in the ai_family
field of the struct.
* src/util/network.c: Fix handling of address family
* src/conf/network_conf.c, src/network/bridge_driver.c: Pass
AF_UNSPEC instead of relying on it being 0.
Some operations on socket addresses need to know the length of
the sockaddr struct for the particular address family. This
info was being discarded when passing around virSocketAddr
instances. Turn it from a union into a struct containing
union+socklen_t fields, so length is always kept around.
* src/util/network.h: Add socklen_t field to virSocketAddr
* src/util/network.c, src/network/bridge_driver.c,
src/conf/domain_conf.c: Update to take account of new
struct definition.
If getnameinfo() with NI_NUMERICHOST set fails, there are no
grounds to expect inet_ntop to succeed, since these calls
are functionally equivalent. Remove useless inet_ntop code
in the getnameinfo() error path.
* daemon/remote.c, src/remote/remote_driver.c: Remove
calls to inet_ntop
* src/libvirt_private.syms: Sort by header name, then within
header, and drop duplicate virNetworkDefParseNode,
virFileLinkPointsTo and virXPathBoolean.
The QEMU 0.13 release is finally out and from testing in RHEL-6
we know that its JSON and netdev features are now good enough
for us to use by default.
* src/qemu/qemu_conf.c: Enable JSON + netdev for QEMU >= 0.13
There is no point in trying to fill params beyond the first error,
because when qemuDomainGetMemoryParameters returns -1 then the caller
cannot detect which values in params are valid.
This sets the process name to the same value as the Windows title,
but since the name is limited to 16 chars only this is kept as a
configuration option and turned off by default
* src/qemu/qemu.conf src/qemu/qemu_conf.[ch]: hceck for support in the
QEmu help output, add the option in qemu conf file and augment
qemudBuildCommandLine to add it if switched on
* src/qemu/libvirtd_qemu.aug src/qemu/test_libvirtd_qemu.aug: augment
the augeas lenses accordingly
* tests/qemuhelptest.c: cope with the extra flag being detected now
The libvirt_util.la library was mistakenly linked into libvirtd
directly. Since libvirt_util.la is already linked to libvirt.so,
this resulted in libvirtd getting two copies of the code and
more critically 2 copies of static global variables.
Testing in turn exposed a issue with loadable modules. The
gnulib replacement functions are not exported to loadable
modules. Rather than trying to figure out the name sof all
gnulib functions & export them, just linkage all loadable
modules against libgnu.la statically.
* daemon/Makefile.am: Remove linkage of libvirt_util.la
and libvirt_driver.la
* src/Makefile.am: Link driver modules against libgnu.la
* src/libvirt.c: Don't try to load modules which were
compiled out
* src/libvirt_private.syms: Export all other internal
symbols that are required by drivers
A more natural auditing point would perhaps be
SELinuxSetSecurityProcessLabel, but this happens in the child after root
permissions are dropped, so the kernel would refuse the audit record.
Most operations are audited at the libvirtd level; auditing in
src/libvirt.c would result in two audit entries per operation (one in
the client, one in libvirtd).
The only exception is a domain stopping of its own will (e.g. because
the user clicks on "shutdown" inside the interface). There can often be
no client connected at the time the domain stops, so libvirtd does not
have any virConnectPtr object on which to attach an event watch. This
patch therefore adds auditing directly inside the qemu driver (other
drivers are not supported).
Integrate with libaudit.so for auditing of important operations.
libvirtd gains a couple of config entries for auditing. By
default it will enable auditing, if its enabled on the host.
It can be configured to force exit if auditing is disabled
on the host. It will can also send audit messages via libvirt
internal logging API
Places requiring audit reporting can use the VIR_AUDIT
macro to report data. This is a no-op unless auditing is
enabled
* autobuild.sh, mingw32-libvirt.spec.in: Disable audit
on mingw
* configure.ac: Add check for libaudit
* daemon/libvirtd.aug, daemon/libvirtd.conf,
daemon/test_libvirtd.aug, daemon/libvirtd.c: Add config
options to enable auditing
* include/libvirt/virterror.h, src/util/virterror.c: Add
VIR_FROM_AUDIT source
* libvirt.spec.in: Enable audit
* src/util/virtaudit.h, src/util/virtaudit.c: Simple internal
API for auditing messages
This patch series focuses on xendConfigVersion 2 (xm_internal) and 3
(xend_internal), but leaves out changes for xenapi drivers.
See this link for more details about vcpu_avail for xm usage.
http://lists.xensource.com/archives/html/xen-devel/2009-11/msg01061.html
This relies on the fact that def->maxvcpus can be at most 32 with xen.
* src/xen/xend_internal.c (xenDaemonParseSxpr)
(sexpr_to_xend_domain_info, xenDaemonFormatSxpr): Use vcpu_avail
when current vcpus is less than maximum.
* src/xen/xm_internal.c (xenXMDomainConfigParse)
(xenXMDomainConfigFormat): Likewise.
* tests/xml2sexprdata/xml2sexpr-pv-vcpus.sexpr: New file.
* tests/sexpr2xmldata/sexpr2xml-pv-vcpus.sexpr: Likewise.
* tests/sexpr2xmldata/sexpr2xml-pv-vcpus.xml: Likewise.
* tests/xmconfigdata/test-paravirt-vcpu.cfg: Likewise.
* tests/xmconfigdata/test-paravirt-vcpu.xml: Likewise.
* tests/xml2sexprtest.c (mymain): New test.
* tests/sexpr2xmltest.c (mymain): Likewise.
* tests/xmconfigtest.c (mymain): Likewise.
* src/qemu/qemu_conf.c (qemuParseCommandLineSmp): Distinguish
between vcpus and maxvcpus, for new enough qemu.
* tests/qemuargv2xmltest.c (mymain): Add new test.
* tests/qemuxml2argvtest.c (mymain): Likewise.
* tests/qemuxml2xmltest.c (mymain): Likewise.
* tests/qemuxml2argvdata/qemuxml2argv-smp.args: New file.
Although this patch adds a distinction between maximum vcpus and
current vcpus in the XML, the values should be identical for all
drivers at this point. Only in subsequent per-driver patches will
a distinction be made.
In general, virDomainGetInfo should prefer the current vcpus.
* src/conf/domain_conf.h (_virDomainDef): Adjust vcpus to unsigned
short, to match virDomainGetInfo limit. Add maxvcpus member.
* src/conf/domain_conf.c (virDomainDefParseXML)
(virDomainDefFormat): parse and print out vcpu details.
* src/xen/xend_internal.c (xenDaemonParseSxpr)
(xenDaemonFormatSxpr): Manage both vcpu numbers, and require them
to be equal for now.
* src/xen/xm_internal.c (xenXMDomainConfigParse)
(xenXMDomainConfigFormat): Likewise.
* src/phyp/phyp_driver.c (phypDomainDumpXML): Likewise.
* src/openvz/openvz_conf.c (openvzLoadDomains): Likewise.
* src/openvz/openvz_driver.c (openvzDomainDefineXML)
(openvzDomainCreateXML, openvzDomainSetVcpusInternal): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainDumpXML, vboxDomainDefineXML):
Likewise.
* src/xenapi/xenapi_driver.c (xenapiDomainDumpXML): Likewise.
* src/xenapi/xenapi_utils.c (createVMRecordFromXml): Likewise.
* src/esx/esx_vmx.c (esxVMX_ParseConfig, esxVMX_FormatConfig):
Likewise.
* src/qemu/qemu_conf.c (qemuBuildSmpArgStr)
(qemuParseCommandLineSmp, qemuParseCommandLine): Likewise.
* src/qemu/qemu_driver.c (qemudDomainHotplugVcpus): Likewise.
* src/opennebula/one_conf.c (xmlOneTemplate): Likewise.
Note - this wrapping is completely mechanical; the old API will
function identically, since the new API validates that the exact
same flags are provided by the old API. On a per-driver basis,
it may make sense to have the old API pass a different set of flags,
but that should be done in the per-driver patch that implements
the full range of flag support in the new API.
* src/esx/esx_driver.c (esxDomainSetVcpus, escDomainGetMaxVpcus):
Move guts...
(esxDomainSetVcpusFlags, esxDomainGetVcpusFlags): ...to new
functions.
(esxDriver): Trivially support the new API.
* src/openvz/openvz_driver.c (openvzDomainSetVcpus)
(openvzDomainSetVcpusFlags, openvzDomainGetMaxVcpus)
(openvzDomainGetVcpusFlags, openvzDriver): Likewise.
* src/phyp/phyp_driver.c (phypDomainSetCPU)
(phypDomainSetVcpusFlags, phypGetLparCPUMAX)
(phypDomainGetVcpusFlags, phypDriver): Likewise.
* src/qemu/qemu_driver.c (qemudDomainSetVcpus)
(qemudDomainSetVcpusFlags, qemudDomainGetMaxVcpus)
(qemudDomainGetVcpusFlags, qemuDriver): Likewise.
* src/test/test_driver.c (testSetVcpus, testDomainSetVcpusFlags)
(testDomainGetMaxVcpus, testDomainGetVcpusFlags, testDriver):
Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainSetVcpus)
(vboxDomainSetVcpusFlags, virDomainGetMaxVcpus)
(virDomainGetVcpusFlags, virDriver): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainSetVcpus)
(xenUnifiedDomainSetVcpusFlags, xenUnifiedDomainGetMaxVcpus)
(xenUnifiedDomainGetVcpusFlags, xenUnifiedDriver): Likewise.
* src/xenapi/xenapi_driver.c (xenapiDomainSetVcpus)
(xenapiDomainSetVcpusFlags, xenapiDomainGetMaxVcpus)
(xenapiDomainGetVcpusFlags, xenapiDriver): Likewise.
(xenapiError): New helper macro.
Factors common checks (such as nonzero vcpu count) up front, but
drivers will still need to do additional flag checks.
* src/libvirt.c (virDomainSetVcpusFlags, virDomainGetVcpusFlags):
New functions.
(virDomainSetVcpus, virDomainGetMaxVcpus): Refer to new API.
API agreed on in
https://www.redhat.com/archives/libvir-list/2010-September/msg00456.html,
but modified for enum names to be consistent with virDomainDeviceModifyFlags.
* include/libvirt/libvirt.h.in (virDomainVcpuFlags)
(virDomainSetVcpusFlags, virDomainGetVcpusFlags): New
declarations.
* src/libvirt_public.syms: Export new symbols.
In the table built for traffic coming from the VM going to the host make the following changes:
- don't ACCEPT the packets but do a 'RETURN' and let the host-specific firewall rules in subsequent rules evaluate whether the traffic is allowed to enter
- use the '-m state' in the rules as everywhere else
ESX(i) uses UTF-8, but a Windows based GSX server writes
Windows-1252 encoded VMX files.
Add a test case to ensure that libxml2 provides Windows-1252
to UTF-8 conversion.
Since bugs due to double-closed file descriptors are difficult to track down in a multi-threaded system, I am introducing the VIR_CLOSE(fd) macro to help avoid mistakes here.
There are lots of places where close() is being used. In this patch I am only cleaning up usage of close() in src/conf where the problems were.
I also dare to declare close() as being deprecated in libvirt code base (HACKING).
Over root-squashing nfs, when virFileOperation() is called as uid==0,
it may fail with EACCES, but also with EPERM, due to
virFileOperationNoFork()'s failed attemp to chown a writable file.
qemudDomainSaveFlag() should expect this case, too.
qemudOpenAsUID is intended to open a file with the credentials of a
specified uid. Current implementation fails if the file is accessible to
one of uid's groups but not owned by uid.
This patch replaces the supplementary group list that the child process
inherited from libvirtd with the default group list of uid.
* docs/formatdomain.html.in: Add memtune element details, added min_guarantee
* src/libvirt.c: Update virDomainGetMemoryParameters api description, make
it more clear that the user first needs to call the api to get the number
of parameters supported and then call again to get the values.
* tools/virsh.pod: Add usage of new command memtune in virsh manpage
This introduces new attribute to filesystem element
to support customizable access mode for mount type.
Valid accessmode are: passthrough, mapped and squash.
Usage:
<filesystem type='mount' accessmode='passthrough'>
<source dir='/export/to/guest'/>
<target dir='mount_tag'/>
</filesystem>
passthrough is the default model if not specified, that's
also the current behaviour.
The following filter transition from a filter allowing incoming TCP connections
<rule action='accept' direction='in' priority='401'>
<tcp/>
</rule>
<rule action='accept' direction='out' priority='500'>
<tcp/>
</rule>
to one that does not allow them
<rule action='drop' direction='in' priority='401'>
<tcp/>
</rule>
<rule action='accept' direction='out' priority='500'>
<tcp/>
</rule>
did previously not cut off existing (ssh) connections but only prevented newly initiated ones. The attached patch allows to cut off existing connections as well, thus enforcing what the filter is showing.
I had only tested with a configuration where the physical interface is connected to the bridge where the filters are applied. This patch now also solves a filtering problem where the physical interface is not connected to the bridge, but the bridge is given an IP address and the host routes between bridge and physical interface. Here the filters drop non-allowed traffic on the outgoing side on the host.
Explicitly raising a nice error in the case user tries to migrate a
guest with assigned host devices is much better than waiting for a
mysterious error with no clue for the reason.
When only some host CPUs given to cpuBaseline contain <vendor> element,
baseline CPU should not contain it. Otherwise the result would not be
compatible with the host CPUs without vendor. CPU vendors are still
taken into account when computing baseline CPU, it's just removed from
the result.
Recent CPU models were specified using invalid vendor element
<vendor>NAME</vendor>, which was silently ignored due to a bug in the
code which was parsing it.
'make -C src rpcgen' is supposed to be idempotent. But commit
f928f43b7b mistakently manually edited a generated file rather
than fixing the upstream file.
* src/remote/remote_protocol.x (remote_memory_param_value): Use
correct spelling of enum values.
* src/remote/remote_protocol.c: Regenerate.
This enables support for nested SVM using the regular CPU
model/features block. If the CPU model or features include
'svm', then the '-enable-nesting' flag will be added to the
QEMU command line. Latest out of tree patches for nested
'vmx', no longer require the '-enable-nesting' flag. They
instead just look at the cpu features. Several of the models
already include svm support, but QEMU was just masking out
the svm bit silently. So this will enable SVM on such
models
* src/qemu/qemu_conf.h: flag for -enable-nesting
* src/qemu/qemu_conf.c: Use -enable-nesting if VMX or SVM are in
the CPUID
* src/cpu/cpu.h, src/cpu/cpu.c: API to check for a named feature
* src/cpu/cpu_x86.c: x86 impl of feature check
* src/libvirt_private.syms: Add cpuHasFeature
* src/qemuhelptest.c: Add nesting flag where required
* src/xen/sexpr.c: Ensure () are escaped in sexpr2string
* tests/sexpr2xmldata/sexpr2xml-boot-grub.sexpr,
tests/sexpr2xmldata/sexpr2xml-boot-grub.xml,
tests/xml2sexprdata/xml2sexpr-boot-grub.sexpr,
tests/xml2sexprdata/xml2sexpr-boot-grub.xml: Data files to
check escaping
* tests/sexpr2xmltest.c, tests/xml2sexprtest.c: Add boot-grub
escaping test case
This is from a bug report and conversation on IRC where Soren reported that while a filter update is occurring on one or more VMs (due to a rule having been edited for example), a deadlock can occur when a VM referencing a filter is started.
The problem is caused by the two locking sequences of
qemu driver, qemu domain, filter # for the VM start operation
filter, qemu_driver, qemu_domain # for the filter update operation
that obviously don't lock in the same order. The problem is the 2nd lock sequence. Here the qemu_driver lock is being grabbed in qemu_driver:qemudVMFilterRebuild()
The following solution is based on the idea of trying to re-arrange the 2nd sequence of locks as follows:
qemu_driver, filter, qemu_driver, qemu_domain
and making the qemu driver recursively lockable so that a second lock can occur, this would then lead to the following net-locking sequence
qemu_driver, filter, qemu_domain
where the 2nd qemu_driver lock has been ( logically ) eliminated.
The 2nd part of the idea is that the sequence of locks (filter, qemu_domain) and (qemu_domain, filter) becomes interchangeable if all code paths where filter AND qemu_domain are locked have a preceding qemu_domain lock that basically blocks their concurrent execution
So, the following code paths exist towards qemu_driver:qemudVMFilterRebuild where we now want to put a qemu_driver lock in front of the filter lock.
-> nwfilterUndefine() [ locks the filter ]
-> virNWFilterTestUnassignDef()
-> virNWFilterTriggerVMFilterRebuild()
-> qemudVMFilterRebuild()
-> nwfilterDefine()
-> virNWFilterPoolAssignDef() [ locks the filter ]
-> virNWFilterTriggerVMFilterRebuild()
-> qemudVMFilterRebuild()
-> nwfilterDriverReload()
-> virNWFilterPoolLoadAllConfigs()
->virNWFilterPoolObjLoad()
-> virNWFilterPoolAssignDef() [ locks the filter ]
-> virNWFilterTriggerVMFilterRebuild()
-> qemudVMFilterRebuild()
-> nwfilterDriverStartup()
-> virNWFilterPoolLoadAllConfigs()
->virNWFilterPoolObjLoad()
-> virNWFilterPoolAssignDef() [ locks the filter ]
-> virNWFilterTriggerVMFilterRebuild()
-> qemudVMFilterRebuild()
Qemu is not the only driver using the nwfilter driver, but also the UML driver calls into it. Therefore qemuVMFilterRebuild() can be exchanged with umlVMFilterRebuild() along with the driver lock of qemu_driver that can now be a uml_driver. Further, since UML and Qemu domains can be running on the same machine, the triggering of a rebuild of the filter can touch both types of drivers and their domains.
In the patch below I am now extending each nwfilter callback driver with functions for locking and unlocking the (VM) driver (UML, QEMU) and introduce new functions for locking all registered callback drivers and unlocking them. Then I am distributing the lock-all-cbdrivers/unlock-all-cbdrivers call into the above call paths. The last shown callpath starting with nwfilterDriverStart() is problematic since it is initialize before the Qemu and UML drives are and thus a lock in the path would result in a NULL pointer attempted to be locked -- the call to virNWFilterTriggerVMFilterRebuild() is never called, so we never lock either the qemu_driver or the uml_driver in that path. Therefore, only the first 3 paths now receive calls to lock and unlock all callback drivers. Now that the locks are distributed where it matters I can remove the qemu_driver and uml_driver lock from qemudVMFilterRebuild() and umlVMFilterRebuild() and not requiring the recursive locks.
For now I want to put this out as an RFC patch. I have tested it by 'stretching' the critical section after the define/undefine functions each lock the filter so I can (easily) concurrently execute another VM operation (suspend,start). That code is in this patch and if you want you can de-activate it. It seems to work ok and operations are being blocked while the update is being done.
I still also want to verify the other assumption above that locking filter and qemu_domain always has a preceding qemu_driver lock.
* include/libvirt/libvirt.h.in: some of the function type description
were broken so they could not be automatically documented
* src/util/event.c docs/apibuild.py: event.c exports one public API
so it needs to be scanned too, avoid a few warnings
Make use of the existing <filesystem> element to support plan9fs
filesystem passthrough in the QEMU driver
<filesystem type='mount'>
<source dir='/export/to/guest'/>
<target dir='/import/from/host'/>
</filesystem>
NB, the target is not actually a directory, it is merely a arbitrary
string tag that is exported to the guest as a hint for where to mount
it.
Add proper documentation to the new VIR_DOMAIN_MEMORY_* macros in
libvirt.h.in to placate apibuild.py.
Mark args as unused in for libvirt_virDomain{Get,Set}MemoryParameters
in the Python bindings and add both to the libvirtMethods array.
Update remote_protocol-structs to placate make syntax-check.
Undo unintended modifications in vboxDomainGetInfo.
Update the function table of the VirtualBox and XenAPI drivers.
Adding parsing code for memory tunables in the domain xml file
also change the internal define structures used for domain memory
informations
Adds a new specific test
Public api to set/get memory tunables supported by the hypervisors.
dv:
* some cleanups in libvirt.c
* adding extra checks in libvirt.c new entry points
v4:
* Move exporting public API to this patch
* Add unsigned int flags to the public api for future extensions
v3:
* Add domainGetMemoryParamters and NULL in all the driver interface
v2:
* Initialize domainSetMemoryParameters to NULL in all the driver
interface structure.
Some features provided by the recently added CPU models were mentioned
twice for each model. This was a result of automatic generation of the
XML from qemu's CPU configuration file without noticing this redundancy.
To enable the CPU XML from the capabilities to be pasted directly
into the guest XML with no editing, pick a sensible default for
match and feature policy. The CPU match will be exact and the
feature policy will be require. This should ensure safety for
migration and give DWIM semantics for users
* src/conf/cpu_conf.c: Default to exact match and require policy
* docs/formatdomain.html.in: Document new defaults
According to API documentation virDomain{At,De}tachDevice calls are
supposed to only work on active guests for device hotplug. For anything
beyond that, their *Flags variants have to be used.
Despite the variant which was acked on libvirt mailing list
(https://www.redhat.com/archives/libvir-list/2010-January/msg00385.html)
commit ed9c14a7ef (by Jim Fehlig)
introduced automagic behavior of these API calls for xen driver. Since
January, these calls always change persistent configuration of a guest
and if the guest is currently active, they also hot(un)plug the device.
That change didn't follow API documentation and also broke device
hot(un)plug for older xend implementations which do not support changing
persistent configuration of a guest and hot(un)plugging in one step.
This patch should not break anything for active guests. On the other
hand, changing inactive guests is not supported any more.
When a user calls to virDomain{Attach,Detach,Update}DeviceFlags() with
flags == VIR_DOMAIN_DEVICE_MODIFY_LIVE on an inactive guest running on
an old Xen hypervisor (such as RHEL-5) xend_internal driver reports:
Xend version does not support modifying persistent config
which is pretty confusing since no-one requested to modify persistent
config.
In this patch I am extending the rule instantiator to create the state
match according to the state attribute in the XML. Only one iptables
rule in the incoming or outgoing direction will be created for a rule
in direction 'in' or 'out' respectively. A rule in direction 'inout' does
get iptables rules in both directions.
The xm internal xen driver only supports disk and network devices to be
added to a guest. On an attempt to attach any other device the xm driver
used VIR_ERR_XML_ERROR which resulted in a completely bogus error
message:
error: Failed to attach device from pci.xml
error: XML description for unknown device is not well formed or invalid
Since version 4.1 ESX(i) can expose virtual serial devices over TCP.
Add support in the VMX handling code for this, add test cases to cover
it and add links to some documentation.
ESX supports two additional protocols: TELNETS and TLS. Add them to
the list of serial-over-TCP protocols.
The <vcpu cpuset=...> attribute has been available since commit
e193b5dd, but without documentation or RNG validation.
* docs/schemas/domain.rng (vcpu): Further validate cpuset.
* docs/formatdomain.html.in: Document it.
* src/conf/domain_conf.c: Fix typos.
Description: Implement AppArmorSetSecurityHostdevLabel() and
AppArmorRestoreSecurityHostdevLabel() for hostdev and pcidev attach.
virt-aa-helper also has to be adjusted because *FileIterate() is used for pci
and usb devices and the corresponding XML for hot attached hostdev and pcidev
is not in the XML passed to virt-aa-helper. The new '-F filename' option is
added to append a rule to the profile as opposed to the existing '-f
filename', which rewrites the libvirt-<uuid>.files file anew. This new '-F'
option will append a rule to an existing libvirt-<uuid>.files if it exists,
otherwise it acts the same as '-f'.
load_profile() and reload_profile() have been adjusted to add an 'append'
argument, which when true will use '-F' instead of '-f' when executing
virt-aa-helper.
All existing calls to load_profile() and reload_profile() have been adjusted
to use the old behavior (ie append==false) except AppArmorSetSavedStateLabel()
where it made sense to use the new behavior.
This patch also adds tests for '-F'.
Bug-Ubuntu: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/640993
In this patch I am extending the rule instantiator to create the comment
node where supported, which is the case for iptables and ip6tables.
Since commands are written in the format
cmd='iptables ...-m comment --comment \"\" '
certain characters ('`) in the comment need to be escaped to
prevent comments from becoming commands themselves or cause other
forms of (bash) substitutions. I have tested this with various input and in
my tests the input made it straight into the comment. A test case for TCK
will be provided separately that tests this.
When creating a new gust, the function phypBuildLpar() was not
checking for NULL values
src/phyp/phyp_driver.c: check the definition arguments to avoid a segmentation
fault in phypBuildLpar()
This reverses commit 04c3704, which added a define to nwfilter to
allow libvirtd compilation on Mac OS X. Stefan Bergers commit, 2e7294d,
is the proper solution, removing the requirement for nwfilter on non-Linux.
The patch below reports a warning in the log if the generated ip(6)tables rules would not be effective due to the proc filesystem entries
/proc/sys/net/bridge/bridge-nf-call-iptables
/proc/sys/net/bridge/bridge-nf-call-ip6tables
containing a '0'. The warning tells the user what to do. I am rate-limiting the warning message to appear only every 10 seconds.
Description: Check for VIR_DOMAIN_CHR_TYPE in serial ports and add 'rw' for
defined serial ports, parallel ports and channels
Bug-Ubuntu: LP: #578527, LP: #609055
pciFindStubDriver currently returns 0 in one of the error cases.
While it's correct...NULL is more readable.
Signed-off-by: Chris Wright <chrisw@redhat.com>
The addrToString methods were not coping with UNIX domain sockets
which have no normal host+port address. Hardcode special handling
for these so that SASL routines can work over UNIX sockets. Also
fix up SSF logic in remote client so that it presumes that a UNIX
socket is secure
* daemon/remote.c: Fix addrToString for UNIX sockets.
* src/remote/remote_driver.c: Fix addrToString for UNIX sockets
and fix SSF logic to work for TLS + UNIX sockets in the same
manner
When nwfilter support was added to UML, I didn't realise the UML driver
needed instrumentation to make updating nwfilters on the fly work. This
patch adds this bit of glue.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
* src/Makefile.am (libvirt.def, libvirt_qemu.def): '\}' and '\t'
are not required by POSIX. Use '}' and literal tab instead.
(install-data-local): Avoid sed -i.
* tests/read-bufsiz: Likewise.
Reported by Mitchell Hashimoto.
The current code will go into an infinite loop if the printf generated
string is >= 1000, AND exactly 1 character smaller than the amount of free
space in the buffer. When this happens, we are dropped into the loop body,
but nothing will actually change, because count == (buf->size - buf->use - 1),
and virBufferGrow returns unchanged if count < (buf->size - buf->use)
Fix this by removing the '- 1' bit from 'size'. The *nprintf functions handle
the NULL byte for us anyways, so we shouldn't need to manually accommodate
for it.
Here's a bug where we are actually hitting this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=602772
v2: Eric's improvements: while -> if (), remove extra va_list variable,
make sure we report buffer error if snprintf fails
v3: Add tests/virbuftest which reproduces the infinite loop before this
patch, works correctly after
Apparently the xen block device statistics moved from
"/sys/devices/xen-backend/vbd-%d-%d/statistics/%s"
to
"/sys/bus/xen-backend/devices/vbd-%d-%d/statistics/%s"
* src/xen/block_stats.c: try the extra path in case of failure to
find the statistics in /sys
A QEMU guest can have upto VIR_DOMAIN_BOOT_LAST boot entries
defined. When building the QEMU arg, each entry takes a
single byte. This means the array must be declared to be
VIR_DOMAIN_BOOT_LAST+1 bytes in length to allow for the
trailing null
* src/qemu/qemu_conf.c: Fix off-by-1 boot arg array size
For static-only DHCP, i.e. with no <range> but at least one <host>
element within <dhcp> element, we have to add "--dhcp-range IP,static"
option to dnsmasq to actually enable the service. Without this option,
dnsmasq will not respond to DHCP requests.
QMP in QEMU 0.13 has been fixed to enforce type correctness,
this means that boolean types must be true or false, not
integers.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
QMP in QEMU 0.13 has been fixed to enforce type correctness,
this means that boolean types must be true or false, not
integers.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Before this commit SessionIsActive was not used because ESX(i)
doesn't implement it. vCenter supports SessionIsActive, so use
it here, but keep the fall back mechanism for ESX(i) and GSX.
QueryVirtualDiskUuid is only available on an ESX(i) server. vCenter
returns an NotImplemented fault and a GSX server is missing the
VirtualDiskManager completely. Therefore only use QueryVirtualDiskUuid
with an ESX(i) server and fall back to path as storage volume key for
vCenter and GSX server.
VirtualDisks are .vmdk file based. Other files in a datastore
like .iso or .flp files don't have a UUID attached, fall back
to the path as key for them.
This patch adds support for ethernet interface type to OpenVZ domains
as stated in this previous message: http://www.redhat.com/archives/libvir-
list/2010-July/msg00658.html
Instead of splitting the path part of a datastore path into
directory and file name, keep this in one piece. An example:
"[datastore] directory/file"
was split into this before:
datastoreName = "datastore"
directoryName = "directory"
fileName = "file"
Now it's split into this:
datastoreName = "datastore"
directoryName = "directory"
directoryAndFileName = "directory/file"
This simplifies code using esxUtil_ParseDatastorePath, because
directoryAndFileName is used more often than fileName. Also the
old approach expected the datastore path to reference an actual
file, but this isn't always correct, especially when listing
volumes. In that case esxUtil_ParseDatastorePath is used to parse
a path that references a directory. This fails for a vpx://
connection because the vCenter returns directory paths with a
trailing '/'. The new approach is robust against this and the
actual decision if the datastore path should reference a file or
a directory is up to the caller of esxUtil_ParseDatastorePath.
Update the tests accordingly.
For privileged UML connections (uml:///system), we shouldn't use root's
home dir, but rather somewhere in /var/run/libvirt/uml-guest.
https://bugzilla.redhat.com/show_bug.cgi?id=499536
Signed-off-by: Soren Hansen <soren@linux2go.dk>
uml_dir overrides user-mode-linux's default of ~/.uml. This is needed
for a couple of different reasons:
libvirt expects this to default to virGetUserDirectory(geteuid()) +
'/.uml'. However, user-mode-linux actually uses the HOME environment
variable to determine where to look for the uml sockets, but if running
libvirtd under sudo (which I routinely do during development), $HOME is
pointing at my user's homedir, while my euid is 0, so libvirt looks in
/root.
Also (and this was my actual motivation for this patch), if HOME isn't
set at all, user-mode-linux utterly fails. Looking at the code, it seems
it's meant to emit a warning, but alas, it doesn't for some reason.
If running libvirtd from upstart, HOME is not set, so any system using
upstart will need this change.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
Xen4.0 includes a new blktap2 implementation, which is specified
with 'tap2' prefix. AFAICT it's configuration syntax is identical
to blktap, with exception of 'tap2' vs 'tap' prefix. This patch
takes the simple approach of accepting and generating sexp
containing 'tap2' prefix.
When creating a new domain from XML, the check for an existing
domain name should compare the return of the function to a valid
LPAR ID (!= -1) and not to error (== -1).
The check was altered in 8c48743b97
and got too strict, I've no clue how that snuck in. This check
makes every try to open a connection using the ESX driver fail
with an invalid argument error.
Revert the change to the check and add a comment to prevent future
mistakes with this check.
UML supports hot plugging and unplugging of various devices. This patch
exposes this functionality for disks.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
Other drivers will need this same functionality, so move it to up to
conf/domain_conf.c and give it a more general name.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
When finding a sparse NUMA topology, libnuma will return ENOENT
the first time it is invoked. On subsequent invocations it
will return success, but with an all-1's CPU mask. Check for
this, to avoid polluting the capabilities XML with 4096 bogus
CPUs
* src/nodeinfo.c: Check for all-1s CPU mask
Enabling debug doesn't show the capabilities XML for a connection.
Add an extra debug statement for the return value
* src/libvirt.c: Enable debug logging of capabilities XML
Like the comment suggested, we just open the file and pass the file
descriptor to uml. The input "stream" is set to "null", since I couldn't
find any useful way to actually use a file for input for a chardev and
this also mimics what e.g. QEmu does internally.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
Instead of using one big traversal spec for lookup use a set of
more fine grained traversal specs that are selected based on the
actual needs of the lookup.
This gives up to 20% speedup for certain operations like domain
listing due to less HTTP(S) traffic.
* src/xenapi/xenapi_driver.c (xenapiDomainGetInfo): Avoid using
XEN_VM_POWER_STATE_UNKNOWN, which disappeared in newer xenapi.
* src/xenapi/xenapi_utils.c (mapPowerState): Likewise.
Previously QEMU enabled KQEMU by default and had -no-kqemu.
0.11.x switched to requiring -enable-kqemu. 0.12.x dropped
kqemu entirely. This patch adds support for -enable-kqemu
so 0.11.x works. It replaces a huge set of if() with a
switch() to make the code a bit more readable.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Support
-enable-kqemu
With the previous storage pool UUID source not all storage pools
had a proper UUID, especially GSX storage pools. The mount path
is unique per host and cannot change during the lifetime of the
datastore. Therefore, it's MD5 sum can be used as UUID.
Use gnulib's crypto/md5 module to generate the MD5 sum.
Xen supports on_crash actions coredump-{destroy,restart}. libvirt
cannot parse config returned by xend that contains either of these
actions
xen52 # xm li -l test | grep on_crash
(on_crash coredump-restart)
xen52 # virsh dumpxml test
error: internal error unknown lifecycle type coredump-restart
This patch adds a new virDomainLifecycleCrash enum and appends
the new options to existing destroy, restart, preserve, and
rename-restart options.
We already filled the PCI address structure when we checked whether it's
free or not, so let's just use the structure here instead of filling it
again.
I wrote a patch to add support for listing the Vendor and Model of a
storage pool in the storage pool XML. This would allow vendor
extensions of specific devices. The patch includes a test for the new
attributes as well.
Patrick Dignan
* src/uml/uml_driver.c (umlMonitorCommand): Validate that enough
bytes were read to dereference both res.length, and that many
bytes from res.data.
Reported by Soren Hansen.
node_device/node_device_driver.c: In function 'nodeDeviceVportCreateDelete':
node_device/node_device_driver.c:423: error: implicit declaration of function 'stat' [-Wimplicit-function-declaration]
* src/node_device/node_device_driver.c (includes): Add <sys/stat.h>.
Doing `virsh schedinfo rhel5u3 --cap 65535' the hypervisor does the
call, but does not change the value nor raise an error. Best is just to
consider it's not in the allowed values. The problem is that the error
won't be output since the xend driver will then be called and raise an
error
error: this function is not supported by the hypervisor: unsupported
in xendConfigVersion < 4
which will override the useful information from
xenUnifiedDomainSetSchedulerParameters(). So best is to also invert the
order in which the xen sub-drivers are called.
* src/xen/xen_hypervisor.c: mark 65535 cap value as out of bound
* src/xen/xen_hypervisor.c: reverse the order of the calls to the xen
sub drivers to get the error message if needed
Some kernels, such as the one used in RHEL-5, have vport_create and
vport_delete operation files in /sys/class/scsi_host/hostN directory
instead of /sys/class/fc_host/hostN. Let's check both paths for
compatibility reasons.
This also removes unnecessary '/' characters from sysfs paths containing
LINUX_SYSFS_FC_HOST_PREFIX.
NodeDeviceCreateXML and NodeDeviceDestroy methods added for NPIV were
using the wrong privateData field for the remote driver. This doesn't
impact KVM, since the remote driver handles everything, thus
privateData == devMonPrivateData. It does impact Xen though, because
the remote driver only handles a subset of methods and thus
privateData != devMonPrivateData.
In case an optional object cannot be found the lookup function is
left early and the cleanup code is not executed.
This pattern occurs in some other functions too.
The current version of the qemu managed save implementation
is subject to a race where the domain shuts down between
the time that we start the command and the time that we
actually try to do the save. Close this race by making
qemuDomainSaveFlags() expect both the driver and the passed-in
vm object to be locked before executing.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When reconnecting to existing VMs, we re-reserved only those PCI
addresses which were explicitly mentioned in domain XML. Since some
addresses are always reserved (e.g., 0:0:0 and 0:0:1), we need to handle
those too.
Also all this should only be done if device flag is supported by qemu.
In this patch I am extending and fixing the nwfilter module's reload support to stop all ongoing threads (for learning IP addresses of interfaces) and rebuild the filtering rules of all interfaces of all VMs when libvirt is started. Now libvirtd rebuilds the filters upon the SIGHUP signal and libvirtd restart.
About the patch: The nwfilter functions require a virConnectPtr. Therefore I am opening a connection in qemudStartup, which later on needs to be closed outside where the driver lock is held since otherwise it ends up in a deadlock due to virConnectClose() trying to lock the driver as well.
I have tested this now for a while with several machines running and needing the IP address learner thread(s). The rebuilding of the firewall rules seems to work fine following libvirtd restart or a SIGHUP. Also the termination of libvirtd worked fine.
floppy0.present defaults to true. Therefore, it needs to be
explicitly set to false when the XML config doesn't specify the
corresponding floppy device.
Also update tests accordingly.
I changed virStorage[Open|Close] to virVIOSDriver[Open|Close] so
the network driver can use it - since the network driver deals
with Open/Close in the same way.
This patch does two things:
* It makes umlConnectTapDevice ask brAddTap for a persistent tap by
passing it a NULL tapfd argument.
* Stops umlConnectTapDevice from immediately dismantling the bridge
it just set up.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
When passing a NULL tapfd argument to brAddTap, we need to close the fd
of the tap device. If we don't, libvirt will keep the fd open
indefinitely and renders the the guest unable to configure its side of
the tap device.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
If umlBuildCommandLineChr fails (e.g. due to an unsupported chardev
type), it returns NULL. umlBuildCommandLine does not check for this and
sets this as an argument on the comand line, effectively ending the
argument list. This patch checks for this case and sets the chardev to
"none".
Signed-off-by: Soren Hansen <soren@linux2go.dk>
When sniffing the network traffic, discard class D and E IP addresses when sniffing traffic. This was a reason why filters were not correctly rebuilt on VMs on the local 192.* network when libvirt was restarted and those VMs did not use a DHCP request to get its IP address.
While testing the SIGHUP handling and reloading of the nwfilter driver, I found that when the filters are rebuilt and mutlipe threads handled the individual interfaces, concurrently running multiple external bash scripts causes strange failures even though the executed ebtables commands are working on different tables for different interfaces. I cannot say for sure where the concurrency problems are caused, but introducing this lock definitely helps.
Since the qemu process is running as qemu:qemu, it can't actually
look at the unix socket in /var/run/libvirt/qemu which is owned by
root and has permission 700. Move the unix socket to
/var/lib/libvirt/qemu, which is already owned by qemu:qemu.
Thanks to Justin Clift for test this out for me.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The problem is that on the source of the migration, libvirtd
is responsible for creating the unix socket over which the data
will flow. Since libvirtd is running as root, this file will
be created as root. When the qemu process running as qemu:qemu
goes to access the unix file to write data to it, it will get
permission denied and fail. Make sure to change the owner
of the unix file to qemu:qemu.
Thanks to Justin Clift for testing this patch out for me.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
This patch fixes a couple of complaints from valgrind when tickling libvirtd with SIGHUP.
The first two files contain fixes for memory leaks. The 3rd one initializes an uninitialized variable. The 4th one is another memory leak.
Basically a followup of the previous patch about balloon desactivation
if desactivated, to not ask for balloon information to qemu as we will
just get an error back.
This can make a huge difference in the time needed for domain
information or list when a machine is loaded, and balloon has been
desactivated in the guests.
* src/qemu/qemu_driver.c: do not get the balloon info if the balloon
suppor is disabled
--dhcp-no-override description from dnsmasq man page:
Disable re-use of the DHCP servername and filename fields as
extra option space. If it can, dnsmasq moves the boot server and
filename information (from dhcp-boot) out of their dedicated
fields into DHCP options. This make extra space available in the
DHCP packet for options but can, rarely, confuse old or broken
clients. This flag forces "simple and safe" behaviour to avoid
problems in such a case.
It seems some virtual network card ROMs are this old/buggy so let's add
--dhcp-no-override as a workaround for them. We don't use extra DHCP
options so this should be safe. The option was added in dnsmasq-2.41,
which becomes the minimum required version.
For parsing try to match by datastore mount path first, if that
fails fallback to /vmfs/volumes/<datastore>/<path> parsing. This
also fixes problems with GSX on Windows. Because GSX on Windows
doesn't use /vmfs/volumes/ style file names.
For formatting use the datastore mount path too, instead of using
/vmfs/volumes/<datastore>/<path> as fixed format.
We add --dhcp-lease-max=xxx argument when network->def->nranges > 0 but
we only allocate space for in the opposite case :-) I guess we are lucky
enough to miscount somewhere else so that we actually allocate more
space than we need since no-one has hit this bug so far.
Introduce esxVMX_Context containing functions pointers to
glue both parts together in a generic way.
Move the ESX specific part to esx_driver.c.
This is a step towards making the VMX code reusable in a
potential VMware Workstation and VMware Player driver.
The balloon device is automatically added to qemu guests if supported,
but it may be useful to desactivate it. The simplest to not change the
existing behaviour is to allow
<memballoon type="none"/>
as an extra option to desactivate it (it is automatically added if the
memballoon construct is missing for the domain).
The following simple patch just adds the extra option and does not
change the default behaviour but avoid creating a balloon device if
type="none" is used.
* docs/schemas/domain.rng: add the extra type attribute value
* src/conf/domain_conf.c src/conf/domain_conf.h: add the extra enum
value
* src/qemu/qemu_conf.c: if enum is NONE, don't activate the device,
i.e. don't pass the args to qemu/kvm
Added a more detailed error message when adding a tap devices fails and
the kernel is missing tun support.
Signed-off-by: Doug Goldstein <cardoe@gentoo.org>
Fix the error checking to use the return value from brAddTap() instead
of checking the current errno value which might have been changed by
clean up calls inside of brAddTap().
Signed-off-by: Doug Goldstein <cardoe@gentoo.org>
https://bugzilla.redhat.com/622515 - When hot-unplugging CPUs,
libvirt failed to start a guest that had been pinned to CPUs that
were still online.
Tested on a dual-core laptop, where I also discovered that, per
http://www.cyberciti.biz/files/linux-kernel/Documentation/cpu-hotplug.txt,
/sys/devices/system/cpu/cpu0/online does not exist on systems where it
cannot be hot-unplugged.
* src/nodeinfo.c (linuxNodeInfoCPUPopulate): Ignore CPUs that are
currently offline. Detect readdir failure.
(parse_socket): Move guts...
(get_cpu_value): ...to new function, shared with...
(cpu_online): New function.
device_del command is not synchronous for PCI devices, it merely asks
the guest to release the device and returns. If the host wants to use
that device before the guest actually releases it, we are in big
trouble. To avoid this, we already added a loop which waits up to 10
seconds until the device is actually released before we do anything else
with that device. But we only added this loop for managed PCI devices
before we try reattach them back to the host.
However, we need to wait even for non-managed devices. We don't reattach
them automatically, but we still want to prevent the host from using it.
This was revealed thanks to sVirt: when we relabel sysfs files
corresponding to the PCI device before the guest finished releasing the
device, qemu is no longer allowed to access those files and if it wants
(as a result of guest's request) to write anything to them, it just
exits, which kills the guest.
This is not a proper fix and needs some further work both on libvirt and
qemu side in the future.
virDiskNameToIndex has a list of disk name prefixes that it uses in the
process of finding the disk's index. This list is missing "ubd" which
is the disk prefix used for UML domains.
Signed-off-by: Soren Hansen <soren@linux2go.dk>
That way it can be used to verify a numeric address without storing
the details
* src/util/network.c: change virSocketParseAddr to allow a null @addr
parameter
According to <xen-3.4.3/tools/python/xen/xm/create.py:158>
gopts.var('bootargs', val='NAME',
fn=set_value, default=None,
use="Arguments to pass to boot loader")
the "bootloader_args" parameter needs to be translated into "bootargs"
when using "virsh domxml-to-native xen-xm".
The reverse direction (domxml-from-native) is already okay.
This patch fixes domxml-to-native and adds two test files to catch this
problem.
Signed-off-by: Philipp Hahn <hahn@univention.de>
src/phyp/phyp_driver.c:phypListDomainsGeneric was crashing due to a buffer
overflow if any line returned from virRun wasn't <=10 characters.
Since virStrToLong_i recognizes any non-numeric as a terminator (not
just NULL), there actually is no need to copy the number into a
separate string anyway, so this patch eliminates that copy, the fixed
length buffer, and therefore the potential to overflow.
This change also provided the oppurtunity to eliminate the character
counting loop, instead using the return from virStrToLong_i to point
past the end of the number, then simply skip the \n to get to the
next.
Fix the error checking to use the return value from brAddTap() instead
of checking the current errno value which might have been changed by
clean up calls inside of brAddTap().
Signed-off-by: Doug Goldstein <cardoe@gentoo.org>
Added a more detailed error message when adding a tap devices fails and
the kernel is missing tun support.
Signed-off-by: Doug Goldstein <cardoe@gentoo.org>
the followup on the boot=on problem, basically it's not needed to
specify it when booting out of IDE devices when using KVM
* src/qemu/qemu_conf.c: do not use boot=on for IDE devices
* tests/qemuxml2argvdata/qemuxml2argv*.args: this changes the output
for 5 of the tests
Patch version revamped by Eric Blake <eblake@redhat.com> of Jiri
Denemark <jdenemar@redhat.com> original patch
When attaching a PCI device which doesn't explicitly set its PCI
address, libvirt allocates the address automatically. The problem is
that when checking which PCI address is unused, we only check for those
with slot number higher than the highest slot number ever used.
Thus attaching/detaching such device several times in a row (31 is the
theoretical limit, less then 30 tries are enough in practise) makes any
further device attachment fail. Furthermore, attaching a device with
predefined PCI address to 0:0:31 immediately forbids attachment of any
PCI device without explicit address.
This patch changes the logic so that we always check all PCI addresses
before we say there is no PCI address available.
Modifications from v1: revert back to remembering the last slot
reserved, but allow wraparound to not be limited by the end.
In this way, slots are still assigned in the same order as
before the patch, rather than filling in the gaps closest to
0 and risking making windows guests mad.
* src/qemu/qemu_conf.c: fix pci reservation code to do a round-robbin
check of all available PCI splot availability before failing.
Don't rely on summary.url anymore, because its value is different
between an esx:// and vpx:// connection. Use host.mountInfo.path
instead.
Don't fallback to lookup by UUID (actually lookup by absolute path)
in esxVI_LookupDatastoreByName when lookup by name fails. Add a
seperate function for this: esxVI_LookupDatastoreByAbsolutePath
Now a vpx:// connection has an explicitly specified host. This
allows to enabled several functions for a vpx:// connection
again, like host UUID, hostname, general node info, max vCPU
count, free memory, migration and defining new domains.
Lookup datacenter, compute resource, resource pool and host
system once and cache them. This simplifies the rest of the
code and reduces overall HTTP(S) traffic a bit.
esx:// and vpx:// can be mixed freely for a migration.
Ensure that migration source and destination refer to the
same vCenter. Also directly encode the resource pool and
host system object IDs into the migration URI in the prepare
function. Then directly build managed object references in
the perform function instead of re-looking up already known
information.
This patch attempts to take advantage of a newly added netfilter
module to correct for a problem with some guest DHCP client
implementations when used in conjunction with a DHCP server run on the
host systems with packet checksum offloading enabled.
The problem is that, when the guest uses a RAW socket to read the DHCP
response packets, the checksum hasn't yet been fixed by the IP stack,
so it is incorrect.
The fix implemented here is to add a rule to the POSTROUTING chain of
the mangle table in iptables that fixes up the checksum for packets on
the virtual network's bridge that are destined for the bootpc port (ie
"dhcpc", ie port 68) port on the guest.
Only very new versions of iptables will have this support (it will be
in the next upstream release), so a failure to add this rule only
results in a warning message. The iptables patch is here:
http://patchwork.ozlabs.org/patch/58525/
A corresponding kernel module patch is also required (the backend of
the iptables patch) and that will be in the next release of the
kernel.
When trying to assign a PCI device to a guest, we have
to check that all bridges upstream of that device support
ACS. That means that we have to find the parent bridge of
the current device, check for ACS, then find the parent bridge
of that device, check for ACS, etc. As it currently stands,
the code to do this iterates through all PCI devices on the
system, looking for a device that has a range of busses that
included the current device's bus.
That check is not restrictive enough, though. Depending on
how we iterated through the list of PCI devices, we could first
find the *topmost* bridge in the system; since it necessarily had
a range of busses including the current device's bus, we
would only ever check the topmost bridge, and not check
any of the intermediate bridges.
Note that this also caused a fairly serious bug in the
secondary bus reset code, where we could erroneously
find and reset the topmost bus instead of the inner bus.
This patch changes pciGetParentDevice() so that it first
checks if a bridge device's secondary bus exactly matches
the bus of the device we are looking for. If it does, we've
found the correct parent bridge and we are done. If it does not,
then we check to see if this bridge device's busses *include* the
bus of the device we care about. If so, we mark this bridge device
as best, and go on. If we later find another bridge device whose
busses include this device, but is more restrictive, then we
free up the previous best and mark the new one as best. This
algorithm ensures that in the normal case we find the direct
parent, but in the case that the parent bridge secondary bus
is not exactly the same as the device, we still find the
correct bridge.
This patch was tested by me on a 4-port NIC with a
bridge without ACS (where assignment failed), a 4-port
NIC with a bridge with ACS (where assignment succeeded),
and a 2-port NIC with no bridges (where assignment
succeeded).
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When parsing hostdev, the following message would be emitted:
10:17:19.052: error : virDomainHostdevDefParseXML:3748 : internal error unknown node alias
However, alias is appropriately parsed in
virDomainDeviceInfoParseXML anyway. Disable the error message
in the initial XML parsing loop.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When an openvz domain is defined with virDomainDefineXML,
domain id is set to -1. A call to virDomainGetInfo after
starting the domain would then fail because this invalid
id is passed to openvzGetProcessInfo.
Found by clang. Clang complained that virStorageBackendProbeTarget
could dereference NULL if backingStoreFormat was NULL, but since all
callers passed a valid pointer, I added attributes instead of null
checks.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Kill dead store.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise. Skip null checks, by adding attributes.
valgrind was complaining that virUUIDParse was depending on
an uninitialized value. Indeed it was; virSetHostUUIDStr()
didn't initialize the dmiuuid buffer to 0's, meaning that
anything after the string read from /sys was uninitialized.
Clear out the dmiuuid buffer before use, and make sure to
always leave a \0 at the end.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Basically the 'boot=on' boot selection device is something present in
KVM but not in upstream QEmu, as a result if we boot a QEmu domain
without KVM acceleration we must disable boot=on ... even if the front
end kvm binary expose that capability in the help page.
* src/qemu/qemu_conf.c: in qemudBuildCommandLine if -no-kvm
is passed, then deactivate QEMUD_CMD_FLAG_DRIVE_BOOT
ADD_ARG_LIT should only be used for literal arguments,
since it duplicates the memory. Since virBufferContentAndReset
is already allocating memory, we should only use ADD_ARG.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
esxVI_WaitForTaskCompletion can take a UUID to lookup the
corresponding domain and check if the current task for it
is blocked by a question. It calls another function to do
this: esxVI_LookupAndHandleVirtualMachineQuestion looks up
the VirtualMachine and checks for a question. If there is
a question it calls esxVI_HandleVirtualMachineQuestion to
handle it.
If there was no question or it has been answered the call
to esxVI_LookupAndHandleVirtualMachineQuestion returns 0.
If any error occurred during the lookup and answering
process -1 is returned. The problem with this is, that -1
is also returned when there was no error but the question
could not be answered. So esxVI_WaitForTaskCompletion cannot
distinguish between this two situations and reports that a
question is blocking the task even when there was actually
another problem.
This inherent problem didn't surface until vSphere 4.1 when
you try to define a new domain. The driver tries to lookup
the domain that is just in the process of being registered.
There seems to be some kind of race condition and the driver
manages to issue a lookup command before the ESX server was
able to register the domain. This used to work before.
Due to the return value problem described above the driver
reported a false error message in that case.
To solve this esxVI_WaitForTaskCompletion now takes an
additional occurrence parameter that describes whether or
not to expect the domain to be existent. Also add a new
parameter to esxVI_LookupAndHandleVirtualMachineQuestion
that allows to distinguish if the call returned -1 because
of an actual error or because the question could not be
answered.
'./autobuild.sh' with lcov installed discovered that our
coverage support has been bit-rotting for a while. This
restores it back to a successful state, although I have
not yet spent any time looking through the resulting files to
look for low-hanging fruit in the unit test coverage front.
* configure.ac: Clear COMPILER_FLAGS at right place.
* Makefile.am (cov): Newer genhtml no longer likes plain -s.
* m4/compiler-flags.m4 (gl_COMPILER_FLAGS): Don't AC_SUBST
COMPILER_FLAGS; it is a shell variable for use in configure only.
* src/Makefile.am (AM_CFLAGS, AM_LDFLAGS): New variables, to make
it easier to provide global flag additions. Use throughout, to
uniformly apply coverage flags.
* .gitignore: Globally ignore gcov output.
* daemon/.gitignore: Simplify.
* src/.gitignore: Likewise.
* tests/.gitignore: Likewise.
The recent switch to enable -Wlogical-op paid off again.
gcc 4.5.0 (rawhide) is smarter than 4.4.4 (Fedora 13).
* src/xen/xend_internal.c (xenDaemonAttachDeviceFlags)
(xenDaemonUpdateDeviceFlags, xenDaemonDetachDeviceFlags): Use
correct operator.
Signed-off-by: Eric Blake <eblake@redhat.com>
src/lxc/veth.c:150: VIR_DEBUG(_("Failed to delete '%s' (%d)"),
src/lxc/veth.c:188: VIR_DEBUG(_("Failed to disable '%s' (%d)"),
maint.mk: do not mark these strings for translation
* src/lxc/veth.c (vethDelete, vethInterfaceUpOrDown): Don't
translate VIR_DEBUG.
Previously, the functions in src/lxc/veth.c could sometimes return
positive values on failure rather than -1. This made accurate error
reporting difficult, and led to one failure to catch an error in a
calling function.
This patch makes all the functions in veth.c consistently return 0 on
success, and -1 on failure. It also fixes up the callers to the veth.c
functions where necessary.
Note that this patch may be related to the bug:
https://bugzilla.redhat.com/show_bug.cgi?id=607496.
It will not fix the bug, but should unveil what happens.
* po/POTFILES.in - add veth.c, which previously had no translatable strings
* src/lxc/lxc_controller.c
* src/lxc/lxc_container.c
* src/lxc/lxc_driver.c - fixup callers to veth.c, and remove error logs,
as they are now done in veth.c
* src/lxc/veth.c - make all functions consistently return -1 on error.
* src/lxc/veth.h - use ATTRIBUTE_NONNULL to protect against NULL args.
This fixes a leak described in
https://bugzilla.redhat.com/show_bug.cgi?id=590073
xenUnifiedDomainInfoList has a pointer to a list of pointers to
xenUnifiedDomain. We were freeing up all the domains, but neglecting
to free the list.
This was found by Paolo Bonzini <pbonzini@redhat.com>.
If detecting the FLR flag of a pci device fails, then we
could run into the situation of trying to close a file
descriptor twice, once in pciInitDevice() and once in pciFreeDevice().
Fix that by removing the pciCloseConfig() in pciInitDevice() and
just letting pciFreeDevice() handle it.
Thanks to Chris Wright for pointing out this problem.
While we are at it, fix an error check. While it would actually
work as-is (since success returns 0), it's still more clear to
check for < 0 (as the rest of the code does).
Signed-off-by: Chris Lalancette <clalance@redhat.com>
All <console> devices now export a <target> type attribute. QEMU defaults
to 'serial', UML defaults to 'uml, xen can be either 'serial' or 'xen'
depending on fullvirt. Understandably there is lots of test fallout.
This will be used to differentiate between a serial vs. virtio console for
QEMU.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
targetType only tracks the actual <target> format we are parsing. Currently
we only fill abide this value for channel devices.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
There is actually a difference between the character device type (serial,
parallel, channel, ...) and the target type (virtio, guestfwd). Currently
they are awkwardly conflated.
Start to pull them apart by renaming targetType -> deviceType. This is
an entirely mechanical change.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
During function test of the 802.1Qbg implementation in lldpad we came
across a small problem in the handling of the netlink message
corresponding to PORT_PROFILE_RESPONSE_INPROGRESS. This should not
result in returning the default rc=1.
- src/util/macvtap.c: fix getPortProfileStatus() to return 0 in that
case and also fix an indentation problem
QEMU has had two different syntax for disk cache options
Old: on|off
New: writeback|writethrough|none
QEMU recently added another 'unsafe' option which broke the
libvirt check. We can avoid this & future breakage, if we
do a negative check for the old syntax, instead of a positive
check for the new syntax
* src/qemu/qemu_conf.c: Invert cache option check
Add a new element to the <os> block:
<bootmenu enable="yes|no"/>
Which maps to -boot,menu=on|off on the QEMU command line.
I decided to use an explicit 'enable' attribute rather than just make the
bootmenu element boolean. This allows us to treat lack of a bootmenu element
as 'use hypervisor default'.
Some buggy PCI devices actually support FLR, but
forget to advertise that fact in their PCI config space.
However, Virtual Functions on SR-IOV devices are
*required* to support FLR by the spec, so force has_flr
on if this is a virtual function.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When doing a PCI secondary bus reset, we must be sure that there are no
active devices on the same bus segment. The active device tracking is
designed to only track host devices that are active in use by guests.
This ignores host devices that are actively in use by the host. So the
current logic will reset host devices.
Switch this logic around and allow sbus reset when we are assigning all
devices behind a bridge to the same guest at guest startup or as a result
of a single attach-device command.
* src/util/pci.h: change signature of pciResetDevice to add an
inactive devices list
* src/qemu/qemu_driver.c src/xen/xen_driver.c: use (or not) the new
functionality of pciResetDevice() depending on the place of use
* src/util/pci.c: implement the interface and logic changes
- src/qemu/qemu_driver.c: Eliminate code duplication by using the new
helpers qemuPrepareHostdevPCIDevices and qemuDomainReAttachHostdevDevices.
This reduces the number of open coded calls to pciResetDevice.
- src/qemu/qemu_driver.c: These new helpers take hostdev list and count
directly rather than getting them indirectly from domain definition.
This will allow reuse for the attach-device case.
- src/qemu/qemu_driver.c: Update qemuGetPciHostDeviceList to take a
hostdev list and count directly, rather than getting this indirectly
from domain definition. This will allow reuse for the attach-device case.
Add a pointer to the primary context of a connection and use it in all
driver functions that don't dependent on the context type. This includes
almost all functions that deal with a virDomianPtr. Therefore, using
a vpx:// connection allows you to perform all the usual domain related
actions like start, destroy, suspend, resume, dumpxml etc.
Some functions that require an explicitly specified ESX server don't work
yet. This includes the host UUID, the hostname, the general node info, the
max vCPU count and the free memory. Also not working yet are migration and
defining new domains.
Since 070f61002f the vcenter query
parameter has been ignored, because the refactoring to use
esxUtil_ParseQuery was incomplete. This effectively broke migration,
because the vcenter query parameter is essential for a migration.
Commit 68719c4bdd added the
p option to control disk format probing, but it wasn't added
to the getopt_long optstring parameter.
Add the p option to the getopt_long optstring parameter.
Commit a885334499 added this
function and wrapped vah_add_file in it. vah_add_file may
return -1, 0, 1. It returns 1 in case the call to valid_path
detects a restricted file. The original code treated a return
value != 0 as error. The refactored code treats a return
value < 0 as error. This triggers segfault in virt-aa-helper
and breaks virt-aa-helper-test for the restricted file tests.
Make sure that add_file_path returns -1 on error.
virt-aa-helper used to ignore errors when opening files.
Commit a885334499 refactored
the related code and changed this behavior. virt-aa-helper
didn't ignore open errors anymore and virt-aa-helper-test
fails.
Make sure that virt-aa-helper ignores open errors again.
Thanks to DV for knocking together the Relax-NG changes
quickly for me.
Changes since v1:
- Change the domain.rng to correspond to the new schema
- Don't allocate caps->ns in testQemuCapsInit since it is a static table
Changes since v2:
- Change domain.rng to add restrictions on allowed environment names
Changes since v3:
- Remove a bogus comment in the tests
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Since we are adding a new "per-hypervisor" protocol, we
make it so that the qemu remote protocol uses a new
PROTOCOL and PROGRAM number. This allows us to easily
distinguish it from the normal REMOTE protocol.
This necessitates changing the proc in remote_message_header
from a "remote_procedure" to an "unsigned", which should
be the same size (and thus preserve the on-wire protocol).
Changes since v1:
- Fixed up a couple of script problems in remote_generate_stubs.pl
- Switch an int flag to a bool in dispatch.c
Changes since v2:
- None
Changes since v3:
- Change unsigned proc to signed proc, to conform to spec
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Implement the qemu driver's virDomainQemuMonitorCommand
and hook it into the API entry point.
Changes since v1:
- Rename the (external) qemuMonitorCommand to qemuDomainMonitorCommand
- Add virCheckFlags to qemuDomainMonitorCommand
Changes since v2:
- Drop ATTRIBUTE_UNUSED from the flags
Changes since v3:
- Add a flag to priv so we only print out monitor command warning once. Note
that this has not been plumbed into qemuDomainObjPrivateXMLFormat or
qemuDomainObjPrivateXMLParse, which means that if you run a monitor command,
restart libvirtd, and then run another monitor command, you may get an
an erroneous VIR_INFO. It's a pretty minor matter, and I didn't think it
warranted the additional code.
- Add BeginJob/EndJob calls around EnterMonitor/ExitMonitor
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Add the library entry point for the new virDomainQemuMonitorCommand()
entry point. Because this is not part of the "normal" libvirt API,
it gets its own header file, library file, and will eventually
get its own over-the-wire protocol later in the series.
Changes since v1:
- Go back to using the virDriver table for qemuDomainMonitorCommand, due to
linking issues
- Added versioning information to the libvirt-qemu.so
Changes since v2:
- None
Changes since v3:
- Add LGPL header to libvirt-qemu.c
- Make virLibConnError and virLibDomainError macros instead of function calls
Changes since v4:
- Move exported symbols to libvirt_qemu.syms
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Now that we have the ability to specify arbitrary qemu
command-line parameters in the XML, use it to handle unknown
command-line parameters when doing a native-to-xml conversion.
Changes since v1:
- Rename num_extra to num_args
- Fix up a memory leak on an error path
Changes since v2:
- Add a VIR_WARN when adding the argument via qemu:arg
Changes since v3:
- None
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Implement the qemu hooks for XML namespace data. This
allows us to specify a qemu XML namespace, and then
specify:
<qemu:commandline>
<qemu:arg value='arg'/>
<qemu:env name='name' value='value'/>
</qemu:commandline>
In the domain XML.
Changes since v1:
- Change the <qemu:arg>arg</qemu:arg> XML to <qemu:arg value='arg'/> XML
- Fix up some memory leaks in qemuDomainDefNamespaceParse
- Rename num_extra and extra to num_args and args, respectively
- Fixed up some error messages
- Make sure to escape user-provided data in qemuDomainDefNamespaceFormatXML
Changes since v2:
- Add checking to ensure environment variable names are valid
- Invert the logic in qemuDomainDefNamespaceFormatXML to return early
Changes since v3:
- Change strspn() to c_isalpha() check of first letter of environment variable
Signed-off-by: Chris Lalancette <clalance@redhat.com>
This patch adds namespace XML parsers to be hooked into
the main domain parser. This allows for individual hypervisor
drivers to add per-namespace XML into the main domain XML.
Changes since v1:
- Use a statically declared table for caps->ns, removing the need to
allocate/free it.
Changes since v2:
- None
Changes since v3:
- None
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The first conditional is always true which means the iterator will
never find another device on the same bus.
if (dev->domain != check->domain ||
dev->bus != check->bus ||
----> (check->slot == check->slot &&
check->function == check->function)) <-----
The goal of that check is to verify that the device is either:
in a different pci domain
on a different bus
is the same identical device
This means libvirt may issue a secondary bus reset when there are
devices
on that bus that actively in use by the host or another guest.
* src/util/pci.c: fix a bogus test in pciSharesBusWithActive()
The remote driver is using the wrong privateData field in
a couple of functions. THis is harmless for stateful
drivers like QEMU/UML/LXC, but will crash with Xen
* src/remote/remote_driver.c: Fix use of privateData field
A Linux software bridge will assume the MAC address of the enslaved
interface with the numerically lowest MAC addr. When the bridge
changes MAC address there is a period of network blackout, so a
change should be avoided. The kernel gives TAP devices a completely
random MAC address. Occassionally the random TAP device MAC is lower
than that of the physical interface (eth0, eth1etc) that is enslaved,
causing the bridge to change its MAC.
This change sets an explicit MAC address for all TAP devices created
using the configured MAC from the XML, but with the high byte set
to 0xFE. This should ensure TAP device MACs are higher than any
physical interface MAC.
* src/qemu/qemu_conf.c, src/uml/uml_conf.c: Pass in a MAC addr
for the TAP device with high byte set to 0xFE
* src/util/bridge.c, src/util/bridge.h: Set a MAC when creating
the TAP device to override random MAC
The PCI slot 1 must be reserved at all times, since PIIX3 is
always present, even if no IDE device is in use for guest disks
* src/qemu/qemu_conf.c: Always reserve slot 1 for PIIX3
Init process may remain after sending SIGTERM for some reason.
For example, if original init program is used, it is definitely
not killed by SIGTERM.
* src/lxc/lxc_controller.c: kill with SIGKILL if SIGTERM wasn't
sufficient
One error exit in virStorageBackendCreateBlockFrom was setting the
return value to errno. The convention for volume build functions is to
return 0 on success or -1 on failure. Not only was it not necessary to
set the return value (it defaults to -1, and is set to 0 when
everything has been successfully completed), in the case that some
caller were checking for < 0 rather than != 0, they would incorrectly
believe that it completed successfully.
virDirCreate also previously returned 0 on success and errno on
failure. This makes it fit the recommended convention of returning 0
on success, -errno (ie a negative number) on failure.
Previously virStorageBackendCopyToFD would simply return -1 on
error. This made the error return from one of its callers inconsistent
(createRawFileOpHook is supposed to return -errno, but if
virStorageBackendCopyToFD failed, createRawFileOpHook would just
return -1). Since there is a useful errno in every case of error
return from virStorageBackendCopyToFD, and since the other uses of
that function ignore the return code (beyond simply checking to see if
it is < 0), this is a safe change.
virFileOperation previously returned 0 on success, or the value of
errno on failure. Although there are other functions in libvirt that
use this convention, the preferred (and more common) convention is to
return 0 on success and -errno (or simply -1 in some cases) on
failure. This way the check for failure is always (ret < 0).
* src/util/util.c - change virFileOperation and virFileOperationNoFork to
return -errno on failure.
* src/storage/storage_backend.c, src/qemu/qemu_driver.c
- change the hook functions passed to virFileOperation to return
-errno on failure.
To try and ensure that people upgrading from old QEMU get guests
with the same PCI device ordering, change the way we assign addrs
to match QEMU's default order. This should make Windows less
annoyed.
* src/qemu/qemu_conf.c: Follow QEMU's default PCI ordering
logic when assigning addresses
* tests/*.args: Update for changed PCI addresses
To allow compatibility with older QEMU PCI device slot assignment
it is necessary to explicitly track the balloon device in the
XML. This introduces a new device
<memballoon model='virtio|xen'/>
It can also have a PCI address, auto-assigned if necessary.
The memballoon will be automatically added to all Xen and QEMU
guests by default.
* docs/schemas/domain.rng: Add <memballoon> element
* src/conf/domain_conf.c, src/conf/domain_conf.h: parsing
and formatting for memballoon device. Always add a memory
balloon device to Xen/QEMU if none exists in XML
* src/libvirt_private.syms: Export memballoon model APIs
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Honour the
PCI device address in memory balloon device
* tests/*: Update to test new functionality
The first VGA and IDE devices need to have fixed PCI address
reservations. Currently this is handled inline with the other
non-primary VGA/IDE devices. The fixed virtio balloon device
at slot 3, ensures auto-assignment skips the slots 1/2. The
virtio address will shortly become configurable though. This
means the reservation of fixed slots needs to be done upfront
to ensure that they don't get re-used for other devices.
This is more or less reverting the previous changeset:
commit 83acdeaf17
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Wed Feb 3 16:11:29 2010 +0000
Fix restore of QEMU guests with PCI device reservation
The difference is that this time, instead of unconditionally
reserving the address, we only reserve the address if it was
initially type=none. Addresses of type=pci were handled
earlier in process by qemuDomainPCIAddressSetCreate(). This
ensures restore step doesn't have problems
* src/qemu/qemu_conf.c: Reserve first VGA + IDE address
upfront
The VIR_ERR_NO_SUPPORT refers to an API which is not implemented.
There is a separate VIR_ERR_CONFIG_UNSUPPORTED for XML config
options that are not available with the current hypervisor.
* src/qemu/qemu_conf.c, src/qemu/qemu_driver.c: Remove
many VIR_ERR_NO_SUPPORT replace with VIR_ERR_CONFIG_UNSUPPORTED
If you try to execute two concurrent migrations p2p
from A->B and B->A, the two libvirtd's will deadlock
trying to perform the migrations. The reason for this is
that in p2p migration, the libvirtd's are responsible for
making the RPC Prepare, Migrate, and Finish calls. However,
they are currently holding the driver lock while doing so,
which basically guarantees deadlock in this scenario.
This patch fixes the situation by adding
qemuDomainObjEnterRemoteWithDriver and
qemuDomainObjExitRemoteWithDriver helper methods. The Enter
take an additional object reference, then drops both the
domain object lock and the driver lock. The Exit takes
both the driver and domain object lock, then drops the
reference. Adding calls to these Enter and Exit helpers
around remote calls in the various migration methods
seems to fix the problem for me in testing.
This should make the situation safe. The additional domain
object reference ensures that the domain object won't disappear
while this operation is happening. The BeginJob that is called
inside of qemudDomainMigratePerform ensures that we can't execute a
second migrate (or shutdown, or save, etc) job while the
migration is active. Finally, the additional check on the state
of the vm after we reacquire the locks ensures that we can't
be surprised by an external event (domain crash, etc).
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Originally the storage volume files were opened with O_DSYNC to make
sure they were flushed to disk immediately. It turned out that this
was extremely slow in some cases, so the O_DSYNC was removed in favor
of just calling fsync() after all the data had been written. However,
this call to fsync was inside the block that is executed to zero-fill
the end of the volume file. In cases where the new volume is copied
from an old volume, and they are the same length, this fsync would
never take place.
Now the fsync is *always* done, unless there is an error (in which
case it isn't important, and is most likely inappropriate.
A missing set of braces around an error condition caused us to skip
zero'ing out the remainder of a new volume file if the new volume was
longer than the original (the goto was supposed to be taken only in
the case of error, but was always being taken).
The storage volume lookup code was probing for the backing store
format, instead of using the format extracted from the file
itself. This meant it could report in accurate information. If
a format is included in the file, then use that in preference,
with probing as a fallback.
* src/storage/storage_backend_fs.c: Use extracted backing store
format
When creating qcow2 files with a backing store, it is important
to set an explicit format to prevent QEMU probing. The storage
backend was only doing this if it found a 'kvm-img' binary. This
is wrong because plenty of kvm-img binaries don't support an
explicit format, and plenty of 'qemu-img' binaries do support
a format. The result was that most qcow2 files were not getting
a backing store format.
This patch runs 'qemu-img -h' to check for the two support
argument formats
'-o backing_format=raw'
'-F raw'
and use whichever option it finds
* src/storage/storage_backend.c: Query binary to determine
how to set the backing store format
Record a default driver name/type in capabilities struct. Use this
when parsing disks if value is not set in XML config.
* src/conf/capabilities.h: Record default driver name/type for disks
* src/conf/domain_conf.c: Fallback to default driver name/type
when parsing disks
* src/qemu/qemu_driver.c: Set default driver name/type to raw
Disk format probing is now disabled by default. A new config
option in /etc/qemu/qemu.conf will re-enable it for existing
deployments where this causes trouble
The implementation of security driver callbacks often needs
to access the security driver object. Currently only a handful
of callbacks include the driver object as a parameter. Later
patches require this is many more places.
* src/qemu/qemu_driver.c: Pass in the security driver object
to all callbacks
* src/qemu/qemu_security_dac.c, src/qemu/qemu_security_stacked.c,
src/security/security_apparmor.c, src/security/security_driver.h,
src/security/security_selinux.c: Add a virSecurityDriverPtr
param to all security callbacks
Update the QEMU cgroups code, QEMU DAC security driver, SELinux
and AppArmour security drivers over to use the shared helper API
virDomainDiskDefForeachPath().
* src/qemu/qemu_driver.c, src/qemu/qemu_security_dac.c,
src/security/security_selinux.c, src/security/virt-aa-helper.c:
Convert over to use virDomainDiskDefForeachPath()
There is duplicated code which iterates over disk backing stores
performing some action. Provide a convenient helper for doing
this to eliminate duplication & risk of mistakes with disk format
probing
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/libvirt_private.syms: Add virDomainDiskDefForeachPath()
Require the disk image to be passed into virStorageFileGetMetadata.
If this is set to VIR_STORAGE_FILE_AUTO, then the format will be
resolved using probing. This makes it easier to control when
probing will be used
* src/qemu/qemu_driver.c, src/qemu/qemu_security_dac.c,
src/security/security_selinux.c, src/security/virt-aa-helper.c:
Set VIR_STORAGE_FILE_AUTO when calling virStorageFileGetMetadata.
* src/storage/storage_backend_fs.c: Probe for disk format before
calling virStorageFileGetMetadata.
* src/util/storage_file.h, src/util/storage_file.c: Remove format
from virStorageFileMeta struct & require it to be passed into
method.
The virStorageFileGetMetadataFromFD did two jobs in one. First
it probed for storage type, then it extracted metadata for the
type. It is desirable to be able to separate these jobs, allowing
probing without querying metadata, and querying metadata without
probing.
To prepare for this, split out probing code into a new pair of
methods
virStorageFileProbeFormatFromFD
virStorageFileProbeFormat
* src/util/storage_file.c, src/util/storage_file.h,
src/libvirt_private.syms: Introduce virStorageFileProbeFormat
and virStorageFileProbeFormatFromFD
Instead of including a field in FileTypeInfo struct for the
disk format, rely on the array index matching the format.
Use verify() to assert the correct number of elements in the
array.
* src/util/storage_file.c: remove type field from FileTypeInfo
When QEMU opens a backing store for a QCow2 file, it will
normally auto-probe for the format of the backing store,
rather than assuming it has the same format as the referencing
file. There is a QCow2 extension that allows an explicit format
for the backing store to be embedded in the referencing file.
This closes the auto-probing security hole in QEMU.
This backing store format can be useful for libvirt users
of virStorageFileGetMetadata, so extract this data and report
it.
QEMU does not require disk image backing store files to be in
the same format the file linkee. It will auto-probe the disk
format for the backing store when opening it. If the backing
store was intended to be a raw file this could be a security
hole, because a guest may have written data into its disk that
then makes the backing store look like a qcow2 file. If it can
trick QEMU into thinking the raw file is a qcow2 file, it can
access arbitrary files on the host by adding further backing
store links.
To address this, callers of virStorageFileGetMeta need to be
told of the backing store format. If no format is declared,
they can make a decision whether to allow format probing or
not.
IPtables will seek to preserve the source port unchanged when
doing masquerading, if possible. NFS has a pseudo-security
option where it checks for the source port <= 1023 before
allowing a mount request. If an admin has used this to make the
host OS trusted for mounts, the default iptables behaviour will
potentially allow NAT'd guests access too. This needs to be
stopped.
With this change, the iptables -t nat -L -n -v rules for the
default network will be
Chain POSTROUTING (policy ACCEPT 95 packets, 9163 bytes)
pkts bytes target prot opt in out source destination
14 840 MASQUERADE tcp -- * * 192.168.122.0/24 !192.168.122.0/24 masq ports: 1024-65535
75 5752 MASQUERADE udp -- * * 192.168.122.0/24 !192.168.122.0/24 masq ports: 1024-65535
0 0 MASQUERADE all -- * * 192.168.122.0/24 !192.168.122.0/24
* src/network/bridge_driver.c: Add masquerade rules for TCP
and UDP protocols
* src/util/iptables.c, src/util/iptables.c: Add source port
mappings for TCP & UDP protocols when masquerading.
There are many naming conventions for partitions associated with a
block device. Some of the major ones are:
/dev/foo -> /dev/foo1
/dev/foo1 -> /dev/foo1p1
/dev/mapper/foo -> /dev/mapper/foop1
/dev/disk/by-path/foo -> /dev/disk/by-path/foo-part1
The universe of possible conventions isn't clear. Rather than trying
to understand all possible conventions, this patch divides devices
into two groups, device mapper devices and everything else. Device
mapper devices seem always to follow the convention of device ->
devicep1; everything else is canonicalized.
* src/uml/uml_driver.c (umlMonitorCommand): Correct flaw that would
cause unconditional "incomplete reply ..." failure, since "nbytes"
was always 0 or 1.
* src/qemu/qemu_driver.c (qemuConnectMonitor): Correct erroneous
parenthesization in two expressions. Without this fix, failure
to set or clear SELinux security context in the monitor would go
undiagnosed. Also correct a diagnostic and split some long lines.
When comparing a CPU without <model> element, such as
<cpu>
<topology sockets='1' cores='1' threads='1'/>
</cpu>
libvirt would happily crash without warning.
When autodetecting whether XML describes guest or host CPU, the presence
of <arch> element is checked. If it's present, we treat the XML as host
CPU definition. Which is right, since guest CPU definitions do not
contain <arch> element. However, if at the same time the root <cpu>
element contains `match' attribute, we would silently ignore it and
still treat the XML as host CPU. We should rather refuse such invalid
XML.
When a CPU to be compared with host CPU describes a host CPU instead of
a guest CPU, the result is incorrect. This is because instead of
treating additional features in host CPU description as required, they
were treated as if they were mentioned with all possible policies at the
same time.
In case qemu supports -nodefconfig, libvirt adds uses it when launching
new guests. Since this option may affect CPU models supported by qemu,
we need to use it when probing for available models.
An indentation mistake meant that a check for return status
was not properly performed in all cases. This could result
in a crash on NULL pointer in a following line.
* src/qemu/qemu_monitor_json.c: Fix check for return status
when processing JSON for blockstats
By specifying <vendor> element in CPU requirements a guest can be
restricted to run only on CPUs by a given vendor. Host CPU vendor is
also specified in capabilities XML.
The vendor is checked when migrating a guest but it's not forced, i.e.,
guests configured without <vendor> element can be freely migrated.
All features in the baseline CPU definition were always created with
policy='require' even though an arch driver returned them with different
policy settings.
This allows the user to give an explicit path to configure
./configure --with-vbox=/path/to/virtualbox
instead of having the VirtualBox driver probe a set of possible
paths at runtime. If no explicit path is specified then configure
probes the set of "known" paths.
https://bugzilla.redhat.com/show_bug.cgi?id=609185
We only use libpciaccess for resolving device product/vendor. If
initializing the library fails (say if using qemu:///session), don't
warn so loudly, and carry on as usual.
Any error message raised after the process has forked needs
to be followed by virDispatchError, otherwise we have no chance of
ever seeing it. This was selectively done for hook functions in the past,
but really applies to all post-fork errors.
As pointed out by Eric Blake, using dirent->d_type breaks
compilation on MinGW. This patch addresses this by using
'#if defined' as same as doing for virCgroupForDriver.
Some, but not all, codepaths in the qemuMonitorOpen() method
would trigger the destroy callback. The caller does not expect
this to be invoked if construction fails, only during normal
release of the monitor. This resulted in a possible double-unref
of the virDomainObjPtr, because the caller explicitly unrefs
the virDomainObjPtr if qemuMonitorOpen() fails
* src/qemu/qemu_monitor.c: Don't invoke destroy callback from
qemuMonitorOpen() failure paths
ENOENT happens normally when a subsystem is enabled with any other
subsystems and the directory of the target group has already removed
in a prior loop. In that case, the function should just return without
leaving an error message.
NB this is the same behavior as before introducing virCgroupRemoveRecursively.
Make sure to *not* call qemuDomainPCIAddressReleaseAddr if
QEMUD_CMD_FLAG_DEVICE is *not* set (for older qemu). This
prevents a crash when trying to do device detachment from
a qemu guest.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
In the current libvirt PCI code, there is no checking whether
a PCI device is in use by a guest when doing node device
detach or reattach. This causes problems when a device is
assigned to a guest, and the administrator starts issuing
nodedevice commands. Make it so that we check the list
of active devices when trying to detach/reattach, and only
allow the operation if the device is not assigned to a guest.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Fix regression introduced in commit a4a287242 - basically, the
phyp storage driver should only accept the same URIs that the
main phyp driver is willing to accept. Blindly accepting all
URIs meant that the phyp storage driver was being consulted for
'virsh -c qemu:///session pool-list --all', rather than the
qemu storage driver, then since the URI was not for phyp, attempts
to then use the phyp driver crashed because it was not initialized.
* src/phyp/phyp_driver.c (phypStorageOpen): Only accept connections
already open to a phyp driver.
This code was just recently added (by me) and didn't account for the
fact that stdin_path is sometimes NULL. If it's NULL, and
SetSecurityAllLabel fails, a segfault would result.
Because tty path is unexpectedly not saved in the live configuration
file of a domain, libvirtd cannot get the console of the domain back
after restarting.
The reason why the tty path isn't saved is that, to save the tty path,
the save function, virDomainSaveConfig, requires that the target domain
is running (pid != -1), however, lxc driver calls the function before
starting the domain to pass the configuration to controller.
To ensure to save the tty path, the patch lets lxc driver call the save
function again after starting the domain.
The function is expected to return negative value on failure,
however, it returns positive value when either setInterfaceName
or vethInterfaceUpOrDown fails. Because the function returns
the return value of either as is, however, the two functions
may return positive value on failure.
The patch fixes the defects and add error messages.
When the saved domain image is on an NFS share, at least some part of
domainSetSecurityAllLabel will fail (for example, selinux labels can't
be modified). To allow domain restore to still work in this case, just
ignore the errors.
virStorageFileIsSharedFS would previously only work if the entire path
in question was stat'able by the uid of the libvirtd process. This
patch changes it to crawl backwards up the path retrying the statfs
call until it gets to a partial path that *can* be stat'ed.
This is necessary to use the function to learn the fstype for files
stored as a different user (and readable only by that user) on a
root-squashed remote filesystem.
Also restore the label to its original value after qemu is finished
with the file.
Prior to this patch, qemu domain restore did not function properly if
selinux was set to enforce.
If an active migration operation fails, or is cancelled by the
admin, the QEMU on the destination is shutdown and the one on
the source continues running. It is important in shutting down
the QEMU on the destination, the security drivers don't reset
the file labelling/permissions.
* src/qemu/qemu_driver.c: Don't reset labelling/permissions
on migration abort
Minor speedups by using the full power of sed.
* src/phyp/phyp_driver.c (phypGetVIOSFreeSCSIAdapter)
(phypDiskType, phypListDefinedDomains): Use fewer processes, by
folding other work into sed.
(phypGetVIOSPartitionID): Likewise. Also avoid non-portable use
of 'sed -s'.
Add the storage management driver to the Power Hypervisor driver.
This is a big but simple patch, it's just a new set of functions.
This patch includes:
* Storage driver: The set of pool-* and vol-* functions.
* attach-disk function.
* Support for IVM on the new functions.
Signed-off-by: Eric Blake <eblake@redhat.com>
* src/phyp/phyp_driver.c (phypStorageDriver): New driver.
(phypStorageOpen, phypStorageClose): New functions.
(phypRegister): Register it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Several phyp functions are not namespace clean, and had no reason
to be exported since no one outside the phyp driver needed to use
them. Rather than do lots of forward declarations, I was able
to topologically sort the file. So, this patch looks huge, but
is really just a matter of marking things static and dealing with
the compiler fallout.
* src/phyp/phyp_driver.h (PHYP_DRIVER_H): Add include guard.
(phypCheckSPFreeSapce): Delete unused declaration.
(phypGetSystemType, phypGetVIOSPartitionID, phypCapsInit)
(phypBuildLpar, phypUUIDTable_WriteFile, phypUUIDTable_ReadFile)
(phypUUIDTable_AddLpar, phypUUIDTable_RemLpar, phypUUIDTable_Pull)
(phypUUIDTable_Push, phypUUIDTable_Init, phypUUIDTable_Free)
(escape_specialcharacters, waitsocket, phypGetLparUUID)
(phypGetLparMem, phypGetLparCPU, phypGetLparCPUGeneric)
(phypGetRemoteSlot, phypGetBackingDevice, phypDiskType)
(openSSHSession): Move declarations to phyp_driver.c and make static.
* src/phyp/phyp_driver.c: Rearrange file contents to provide
topological sorting of newly-static funtions (no semantic changes
other than reduced scope).
(phypGetBackingDevice, phypDiskType): Mark unused, for now.
The patches for shared storage migration were not correctly written
for json mode. Thus the 'blk' and 'inc' parameters were never being
set. In addition they didn't set the QEMU_MONITOR_MIGRATE_BACKGROUND
so migration was synchronous. Due to multiple bugs in QEMU's JSON
impl this wasn't noticed because it treated the sync migration requst
as asynchronous anyway. Finally 'background' parameter was converted
to take arbitrary flags but not renamed, and not all uses were changed
to unsigned int.
* src/qemu/qemu_driver.c: Set QEMU_MONITOR_MIGRATE_BACKGROUND in
doNativeMigrate
* src/qemu/qemu_monitor_json.c: Process QEMU_MONITOR_MIGRATE_NON_SHARED_DISK
and QEMU_MONITOR_MIGRATE_NON_SHARED_INC flags
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.h, src/qemu/qemu_monitor_text.c,
src/qemu/qemu_monitor_text.h: change 'int background' to
'unsigned int flags' in migration APIs. Add logging of flags
parameter
During incoming migration the QEMU monitor is not able to be
used. The incoming migration code did not keep hold of the
job lock because migration is split across multiple API calls.
This meant that further monitor commands on the guest would
hang until migration finished with no timeout.
In this change the qemuDomainMigratePrepare method sets the
job flag just before it returns. The qemuDomainMigrateFinish
method checks for this job flag & clears it once done. This
prevents any use of the monitor between prepare+finish steps.
The qemuDomainGetJobInfo method is also updated to refresh
the job elapsed time. This means that virsh domjobinfo can
return time data during incoming migration
* src/qemu/qemu_driver.c: Keep a job active during incoming
migration. Refresh job elapsed time when returning job info
When configuring serial, parallel, console or channel devices
with a file, dev or pipe backend type, it is necessary to label
the file path in the security drivers. For char devices of type
file, it is neccessary to pre-create (touch) the file if it does
not already exist since QEMU won't be allowed todo so itself.
dev/pipe configs already require the admin to pre-create before
starting the guest.
* src/qemu/qemu_security_dac.c: set file ownership for character
devices
* src/security/security_selinux.c: Set file labeling for character
devices
* src/qemu/qemu_driver.c: Add character devices to cgroup ACL
The parallel, serial, console and channel devices are all just
character devices. A lot of code needs todo the same thing to
all these devices. This provides an convenient API for iterating
over all of them.
* src/conf/domain_conf.c, src/conf/domain_conf.c,
src/libvirt_private.syms: Add virDomainChrDefForeach
We previously assumed that if the -device option existed in qemu, that
-nodefconfig would also exist. It turns out that isn't the case, as
demonstrated by qemu-kvm-0.12.3 in Fedora 13.
*/src/qemu/qemu_conf.[hc] - add a new QEMUD_CMD_FLAG, set it via the
help output, and check it before adding
-nodefconfig to the qemu commandline.
Also don't abuse the disk driver name to specify the SCSI controller
model anymore:
<driver name='buslogic'/>
Use the newly added model attribute of the controller element for this:
<controller type='scsi' index='0' model='buslogic'/>
The disk driver name approach is deprecated now, but still works for
backward compatibility reasons.
Update the documentation and tests accordingly.
Fix usage of the words controller and id in the VMX handling code. Use
controller, bus and unit properly.
The domain XML parsing code autogenerates disk address and
controller elements when they are not explicitly specified.
The code assumes a narrow SCSI bus (7 units per bus). ESX
uses a wide SCSI bus (16 units per bus).
This is a step towards controller support for the ESX driver.
Move libnl to libvirt_util.la, because macvtap.c requires it.
Add GnuTLS to libvirt_driver.la, because libvirt.c calls gcrypt functions.
When built without loadable driver modules, then the remote driver pulls
in GnuTLS.
Move libgnu.la from libvirt_parthelper_CFLAGS to libvirt_parthelper_LDADD.
Through conversation with Kumar L Srikanth-B22348, I found
that the function of getting memory usage (e.g., virsh dominfo)
doesn't work for lxc with ns subsystem of cgroup enabled.
This is because of features of ns and memory subsystems.
Ns creates child cgroup on every process fork and as a result
processes in a container are not assigned in a cgroup for
domain (e.g., libvirt/lxc/test1/). For example, libvirt_lxc
and init (or somewhat specified in XML) are assigned into
libvirt/lxc/test1/8839/ and libvirt/lxc/test1/8839/8849/,
respectively. On the other hand, memory subsystem accounts
memory usage within a group of processes by default, i.e.,
it does not take any child (and descendant) groups into
account. With the two features, virsh dominfo which just
checks memory usage of a cgroup for domain always returns
zero because the cgroup has no process.
Setting memory.use_hierarchy of a group allows to account
(and limit) memory usage of every descendant groups of the group.
By setting it of a cgroup for domain, we can get proper memory
usage of lxc with ns subsystem enabled. (To be exact, the
setting is required only when memory and ns subsystems are
enabled at the same time, e.g., mount -t cgroup none /cgroup.)
As same as normal directories, a cgroup cannot be removed if it
contains sub groups. This patch changes virCgroupRemove to remove
all descendant groups (subdirectories) of a target group before
removing the target group.
The handling is required when we run lxc with ns subsystem of cgroup.
Ns subsystem automatically creates child cgroups on every process
forks, but unfortunately the groups are not removed on process exits,
so we have to remove them by ourselves.
With this patch, such child (and descendant) groups are surely removed
at lxc shutdown, i.e., lxcVmCleanup which calls virCgroupRemove.
add iptables rules to allow TFTP from the virtual network if <tftp>
element is defined in the network definition.
Fedora bz#580215
* src/network/bridge_driver.c: open UDP port 69 for TFTP traffic if
tftproot is defined
We already use the '-nodefaults' command line arg with QEMU to stop
it adding any default devices to guests. Unfortunately, QEMU will
load global config files from /etc/qemu that may also add default
devices. These aren't blocked by '-nodefaults', so we need to also
add the '-nodefconfig' arg to prevent that.
Unfortunately these global config files are also used to define
custom CPU models. So in blocking global hardware device addition
we also block definitions of new CPU models. Libvirt doesn't know
about these custom CPU models though, so it would never make use
of them anyway. Thus blocking them via -nodefconfig isn't a show
stopping problem. We would need to expand libvirt's own CPU model
XML database to support these instead.
* src/qemu/qemu_conf.c: Add '-nodefconfig' if available
* tests/qemuxml2argvdata/: Add '-nodefconfig' to all data files which
have '-nodefaults' present
The current code pattern requires that callers of qemuMonitorClose
check for the return value == 0, and if so, set priv->mon = NULL
and release the reference held on the associated virDomainObjPtr
The change d84bb6d6a3 violated that
requirement, meaning that priv->mon never gets set to NULL, and
a reference count is leaked on virDomainObjPtr.
This design was a bad one, so remove the need to check the return
valueof qemuMonitorClose(). Instead allow registration of a
callback that's invoked just when the last reference on qemuMonitorPtr
is released.
Finally there was a potential reference leak in qemuConnectMonitor
in the failure path.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add a destroy
callback invoked from qemuMonitorFree
* src/qemu/qemu_driver.c: Use the destroy callback to release the
reference on virDomainObjPtr when the monitor is freed. Fix other
potential reference count leak in connecting to monitor
Before issuing monitor commands it is neccessary to check whether
the guest is still running. Most places use virDomainIsActive()
correctly, but a few relied on 'priv->mon != NULL'. In theory
these should be equivalent, but the release of the last reference
count on priv->mon can be delayed a small amount of time until
the event handler is finally deregistered. A further ref counting
bug also means that priv->mon might be never released. In such a
case, code could mistakenly issue a monitor command and wait for
a response that will never arrive, effectively leaving the QEMU
driver waiting on virCondWait() forever..
To protect against these possibilities, make sure all code uses
virDomainIsActive(), not 'priv->mon != NULL'
* src/qemu/qemu_driver.c: Replace 'priv->mon != NULL' with
calls to 'priv->mon != NULL'()
If there is no driver for a URI we report
"no hypervisor driver available"
This is bad because not all virt drivers are hypervisors (ie container
based virt).
If there is no driver support for an API we report
"this function is not supported by the hypervisor"
This is bad for the same reason, and additionally because it is
also used for the network, interface & storage drivers.
* src/util/virterror.c: Improve error messages
Following Daniel Berrange's multiple helpful suggestions for improving
this patch and introducing another driver interface, I now wrote the
below patch where the nwfilter driver registers the functions to
instantiate and teardown the nwfilters with a function in
conf/domain_nwfilter.c called virDomainConfNWFilterRegister. Previous
helper functions that were called from qemu_driver.c and qemu_conf.c
were move into conf/domain_nwfilter.h with slight renaming done for
consistency. Those functions now call the function expored by
domain_nwfilter.c, which in turn call the functions of the new driver
interface, if available.
- Fix documentation for virGetStorageVol: it has 'key' argument instead
of 'uuid'.
- Remove TODO comment from virReleaseStorageVol: we use volume key as an
identifier instead of UUID.
- Print human-readable UUID string in debug message in virReleaseSecret.
Per-connection hashes for domains, networks, storage pools and network
filter pools were indexed by names which was not the best choice. UUIDs
are better identifiers, so lets use them.
If VM startup fails early enough (can't find a referenced USB device),
libvirtd will crash trying to clear the VNC port bit, since port = 0,
which overflows us out of the bitmap bounds.
Fix this by being more defensive in the bitmap operations, and only
clearing a previously set VNC port.
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Followup to https://bugzilla.redhat.com/show_bug.cgi?id=599091,
commit 20206a4b, to reduce disk waste in padding.
* src/qemu/qemu_monitor.h (QEMU_MONITOR_MIGRATE_TO_FILE_BS): Drop
back to 4k.
(QEMU_MONITOR_MIGRATE_TO_FILE_TRANSFER_SIZE): New macro.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Update comment.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextMigrateToFile): Use
two invocations of dd to output non-aligned large blocks.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMigrateToFile):
Likewise.
This patch adds an optional XML attribute to a nwfilter rule to give the user control over whether the rule is supposed to be using the iptables state match or not. A rule may now look like shown in the XML below with the statematch attribute either having value '0' or 'false' (case-insensitive).
[...]
<rule action='accept' direction='in' statematch='false'>
<tcp srcmacaddr='1:2:3:4:5:6'
srcipaddr='10.1.2.3' srcipmask='32'
dscp='33'
srcportstart='20' srcportend='21'
dstportstart='100' dstportend='1111'/>
</rule>
[...]
I am also extending the nwfilter schema and add this attribute to a test case.
Use virBuffer* API to conditionally keep the portion of the command
line specific to HMC, so that IVM can work.
Signed-off-by: Eric Blake <eblake@redhat.com>
This patch works around a recent extension of the netlink driver I had made use of when building the netlink messages. Unfortunately older kernels don't accept IFLA_IFNAME + name of interface as a replacement for the interface's index, so this patch now gets the interface index ifindex if it's not provided (ifindex <= 0).
Match earlier change for qemu pause support with virDomainCreateXML.
* src/qemu/qemu_driver.c (qemudDomainObjStart): Add parameter; all
callers changed.
(qemudDomainStartWithFlags): Implement flag support.
Define the wire format for the new virDomainCreateWithFlags
API, and implement client and server side of marshaling code.
* daemon/remote.c (remoteDispatchDomainCreateWithFlags): Add
server side dispatch for virDomainCreateWithFlags.
* src/remote/remote_driver.c (remoteDomainCreateWithFlags)
(remote_driver): Client side serialization.
* src/remote/remote_protocol.x
(remote_domain_create_with_flags_args)
(remote_domain_create_with_flags_ret)
(REMOTE_PROC_DOMAIN_CREATE_WITH_FLAGS): Define wire format.
* daemon/remote_dispatch_args.h: Regenerate.
* daemon/remote_dispatch_prototypes.h: Likewise.
* daemon/remote_dispatch_table.h: Likewise.
* src/remote/remote_protocol.c: Likewise.
* src/remote/remote_protocol.h: Likewise.
* src/remote_protocol-structs: Likewise.
Daniel's patch works with gcc and CFLAGS containing -O (the
autoconf default), but fails with non-gcc or with other
CFLAGS (such as -g), since c-ctype.h declares c_isdigit as
a macro only for certain compilation settings.
* src/Makefile.am (libvirt_parthelper_LDFLAGS): Add gnulib
library, for when c_isdigit is not a macro.
* src/storage/parthelper.c (main): Avoid out-of-bounds
dereference, noticed by Jim Meyering.
Disks with a trailing digit in their path (eg /dev/loop0 or
/dev/dm0) have an extra 'p' appended before the partition
number (eg, to form /dev/loop0p1 not /dev/loop01). Fix the
partition lookup to append this extra 'p' when required
* src/storage/parthelper.c: Add a 'p' before partition
number if required
Otherwise, a malicious packet could cause a DoS via spurious
out-of-memory failure.
* src/uml/uml_driver.c (umlMonitorCommand): Validate that incoming
data is reliable before using it to allocate/dereference memory.
Don't report bogus errno on short read.
Reported by Jim Meyering.
* src/util/threads.c (includes) [WIN32]: On mingw, favor native
threading over pthreads-win32 library.
* src/util/thread.h [WIN32] Likewise.
Suggested by Daniel P. Berrange.
When a disk is on a root squashed NFS server, it may not be
possible to stat() the disk file in virCgroupAllowDevice.
The virStorageFileGetMeta method may also fail to extract
the parent backing store. Both of these errors have to be
ignored to avoid breaking NFS deployments
* src/qemu/qemu_driver.c: Ignore errors in cgroup setup to
keep root squash NFS happy
https://bugzilla.redhat.com/show_bug.cgi?id=589465
Some guests (eg with badly configured grub, or Windows' installation cd)
require quick response from the console user. That's why we have a
"launchPaused" option in vdsm.
To implement it via libvirt, we need to ask libvirt not to call
qemuMonitorStartCPUs() after starting qemu. Calling virDomainStop
immediately after the domain is up is inherently raceful.
* src/qemu/qemu_driver.c (qemudStartVMDaemon): Add new parameter;
all callers adjusted.
(qemudDomainCreate): Implement support for new flag.
* This patch is a modification of a patch submitted by Nigel Jones.
It fixes several memory leaks on device addition/removal:
1. Free the virNodeDeviceDefPtr in udevAddOneDevice if the return
value is non-zero
2. Always release the node device reference after the device has been
processed.
* Refactored for better readability per the suggestion of clalance
A look at the QEMU source revealed the missing bits of info about
the VPC file format, so we can enable this now
* src/util/storage_file.c: Enable VPC format, providing version
and disk size offset fields
When an attempt to hotplug a PCI device to a guest fails,
the device was left attached to pci-stub. It is neccessary
to reset the device and then attach it to the host driver
again.
* src/qemu/qemu_driver.c: Reattach PCI device to host if
hotadd fails
The restore code is done in places where errors cannot be
raised, since they will overwrite over pre-existing errors.
* src/security/security_selinux.c: Only warn about failures
in label restore, don't report errors
Any output at all from device_add indicates an error in the
command execution. Thus it needs to check for reply != ""
* src/qemu/qemu_monitor_text.c: Fix reply check for errors
to treat any output as an error
HAL is deprecated and UDEV is the future. Thus if both
options are compiled, we should prefer use of UDEV over
HAL
* src/node_device/node_device_driver.c: Switch init
order to try UDEV first, then HAL
When SELinux is running in MLS mode, libvirtd will have a
different security level to the VMs. For libvirtd to be
able to connect to the monitor console, the client end of
the UNIX domain socket needs a different label. This adds
infrastructure to set the socket label via the security
driver framework
* src/qemu/qemu_driver.c: Call out to socket label APIs in
security driver
* src/qemu/qemu_security_stacked.c: Wire up socket label
drivers
* src/security/security_driver.h: Define security driver
entry points for socket labelling
* src/security/security_selinux.c: Set socket label based on
VM label
The network driver is not doing correct checking for
duplicate UUID/name values. This introduces a new method
virNetworkObjIsDuplicate, based on the previously
written virDomainObjIsDuplicate.
* src/conf/network_conf.c, src/conf/network_conf.c,
src/libvirt_private.syms: Add virNetworkObjIsDuplicate,
* src/network/bridge_driver.c: Call virNetworkObjIsDuplicate
for checking uniqueness of uuid/names
The storage pool driver is mistakenly using the error code
VIR_ERR_INVALID_STORAGE_POOL which is for diagnosing invalid
pointers. This patch switches it to use VIR_ERR_NO_STORAGE_POOL
which is the correct code for cases where the storage pool does
not exist
* src/storage/storage_driver.c: Replace VIR_ERR_INVALID_STORAGE_POOL
with VIR_ERR_NO_STORAGE_POOL
The storage pool driver is not doing correct checking for
duplicate UUID/name values. This introduces a new method
virStoragePoolObjIsDuplicate, based on the previously
written virDomainObjIsDuplicate.
* src/conf/storage_conf.c, src/conf/storage_conf.c,
src/libvirt_private.syms: Add virStoragePoolObjIsDuplicate,
* src/storage/storage_driver.c: Call virStoragePoolObjIsDuplicate
for checking uniqueness of uuid/names
The domain parsing code would auto-add a virtio serial controller
if it saw any virtio serial channel defined. Unfortunately it
always added a controller with index=0, even if the channel address
specified an index != 0. It only added one controller, even if
multiple controllers were referenced by channels. Finally, it let
the ports+vectors parameters initialize to zero instead of -1, which
prevented the controllers accepting any ports.
* src/conf/domain_conf.c: Initialize ports+vectors when adding
virtio serial controllers. Add all neccessary virtio serial
controllers, instead of hardcoding controller 0
* qemuxml2argvdata/qemuxml2argv-channel-virtio.args,
qemuxml2argvdata/qemuxml2argv-channel-virtio.xml: Expand to
test controller auto-add behaviour
To ensure that the device addressing scheme is stable across
hotplug/unplug, all virtio serial channels needs to have an
associated port number in their address. This is then specified
to QEMU using the nr=NNN parameter
* src/conf/domain_conf.c, src/conf/domain_conf.h: Parsing
for port number in vioserial address types.
* src/qemu/qemu_conf.c: Set 'nr=NNN' parameter with virtio
serial port number
* tests/qemuxml2argvdata/qemuxml2argv-channel-virtio.args,
tests/qemuxml2argvdata/qemuxml2argv-channel-virtio.xml: Expand
data set to ensure coverage of port addressing
QEMU upstream decided against adding a 'reason' field to
the block IO event in QMP. Disable this code to remove a
annoying warning message. It will be renabled when the
error string reason is re-introduced in QEMU
Adjust args to qemudStartVMDaemon() to also specify path to stdin_fd,
so this can be passed to the AppArmor driver via SetSecurityAllLabel().
This updates all calls to qemudStartVMDaemon() as well as setting up
the non-AppArmor security driver *SetSecurityAllLabel() declarations
for the above. This is required for the following
"apparmor-fix-save-restore" patch since AppArmor resolves the passed
file descriptor to the pathname given to open().
See https://bugzilla.redhat.com/show_bug.cgi?id=599091
Saving a paused 512MB domain took 3m47s with the old block size of 512
bytes. Changing the block size to 1024*1024 decreased the time to 56
seconds. (Doubling again to 2048*1024 yielded 0 improvement; lowering
to 512k increased the save time to 1m10s, about 20%)
The pointer to the xml describing the domain is saved into an object
prior to calling VIR_REALLOC_N() to make the size of the memory it
points to a multiple of QEMU_MONITOR_MIGRATE_TO_FILE_BS. If that
operation needs to allocate new memory, the pointer that was saved is
no longer valid.
To avoid this situation, adjust the size *before* saving the pointer.
(This showed up when experimenting with very large values of
QEMU_MONITOR_MIGRATE_TO_FILE_BS).
Fixes for issues in commit 211dd1e9 noted by by Jim Meyering.
1. Allocate content buffer of size content_length + 1 to ensure
NUL-termination.
2. Limit content buffer size to 64k
3. Fix whitespace issue
V2:
- Add comment to clarify allocation of content buffer
- Add ATTRIBUTE_NONNULL where appropriate
- User NULLSTR macro
There are cases when a response from xend can exceed 4096 bytes, in
which case anything beyond 4096 is ignored. This patch changes the
current fixed-size, stack-allocated buffer to a dynamically allocated
buffer based on Content-Length in HTTP header.
* It appears that the udev event for HBA creation arrives before the
associated sysfs data is fully populated, resulting in bogus data
for the nodedev entry until the entry is refreshed. This problem is
particularly troublesome when creating NPIV vHBAs because it results
in libvirt failing to find the newly created adapter and waiting for
the full timeout period before erroneously failing the create
operation. This patch forces an update before any attempt to use
any scsi_host nodedev entry.
This patch that adds support for configuring 802.1Qbg and 802.1Qbh
switches. The 802.1Qbh part has been successfully tested with real
hardware. The 802.1Qbg part has only been tested with a (dummy)
server that 'behaves' similarly to how we expect lldpad to 'behave'.
The following changes were made during the development of this patch:
- Merging Scott's v13-pre1 patch
- Fixing endptr related bug while using virStrToLong_ui() pointed out
by Jim Meyering
- Addressing Jim Meyering's comments to v11
- requiring mac address to the vpDisassociateProfileId() function to
pass it further to the 802.1Qbg disassociate part (802.1Qbh untouched)
- determining pid of lldpad daemon by reading it from /var/run/libvirt.pid
(hardcode as is hardcode alson in lldpad sources)
- merging netlink send code for kernel target and user space target
(lldpad) using one function nlComm() to send the messages
- adding a select() after the sending and before the reading of the
netlink response in case lldpad doesn't respond and so we don't hang
- when reading the port status, in case of 802.1Qbg, no status may be
received while things are 'in progress' and only at the end a status
will be there.
- when reading the port status, use the given instanceId and vf to pick
the right IFLA_VF_PORT among those nested under IFLA_VF_PORTS.
- never sending nor parsing IFLA_PORT_SELF type of messages in the
802.1Qbg case
- iterating over the elements in a IFLA_VF_PORTS to pick the right
IFLA_VF_PORT by either IFLA_PORT_PROFILE and given profileId
(802.1Qbh) or IFLA_PORT_INSTANCE_UUID and given instanceId (802.1Qbg)
and reading the current status in IFLA_PORT_RESPONSE.
- recycling a previous patch that adds functionality to interface.c to
- get the vlan identifier on an interface
- get the flags of an interface and some convenience function to
check whether an interface is 'up' or not (not currently used here)
- adding function to determine the root physical interface of an
interface. For example if a macvtap is linked to eth0.100, it will
find eth0. Also adding a function that finds the vlan on the 'way to
the root physical interface'
- conveying the root physical interface name and index in case of 802.1Qbg
- conveying mac address of macvlan device and vlan identifier in
IFLA_VFINFO_LIST[ IFLA_VF_INFO[ IFLA_VF_MAC(mac), IFLA_VF_VLAN(vlan) ] ]
to (future) lldpad via netlink
- To enable build with --without-macvtap rename the
[dis|]associatePortProfileId functions, prepend 'vp' before their
name and make them non-static functions.
- Renaming variable multicast to nltarget_kernel and inverting
the logic
- Addressing Jim Meyering's comments; this also touches existing
code for example for correcting indentation of break statements or
simplification of switch statements.
- Renamed occurrencvirVirtualPortProfileDef to virVirtualPortProfileParamses
- 802.1Qbg part prepared for sending a RTM_SETLINK and getting
processing status back plus a subsequent RTM_GETLINK to
get IFLA_PORT_RESPONSE.
Note: This interface for 802.1Qbg may still change
- [David Allan] move getPhysfn inside IFLA_VF_PORT_MAX to avoid
compiler
warning when latest if_link.h isn't available
- move from Stefan's 802.1Qb{g|h} XML v8 to v9
- move hostuuid and vf index calcs to inside doPortProfileOp8021Qbh
- remove debug fprintfs
- use virGetHostUUID (thanks Stefan!)
- fix compile issue when latest if_link.h isn't available
- change poll timeout to 10s, at 1/8 intervals
- if polling times out, log msg and return -ETIMEDOUT
- Add Stefan's code for getPortProfileStatus
- Poll for up to 2 secs for port-profile status, at 1/8 sec intervals:
- if status indicates error, abort openMacvtapTap
- if status indicates success, exit polling
- if status is "in-progress" after 2 secs of polling, exit
polling loop silently, without error
My patch finishes out the 802.1Qbh parts, which Stefan had mostly complete.
I've tested using the recent kernel updates for VF_PORT netlink msgs and
enic for Cisco's 10G Ethernet NIC. I tested many VMs, each with several
direct interfaces, each configured with a port-profile per the XML. VM-to-VM,
and VM-to-external work as expected. VM-to-VM on same host (using same NIC)
works same as VM-to-VM where VMs are on diff hosts. I'm able to change
settings on the port-profile while the VM is running to change the virtual
port behaviour. For example, adjusting a QoS setting like rate limit. All
VMs with interfaces using that port-profile immediatly see the effect of the
change to the port-profile.
I don't have a SR-IOV device to test so source dev is a non-SR-IOV device,
but most of the code paths include support for specifing the source dev and
VF index. We'll need to complete this by discovering the PF given the VF
linkdev. Once we have the PF, we'll also have the VF index. All this info-
mation is available from sysfs.
Fedora bug https://bugzilla.redhat.com/show_bug.cgi?id=598272
Some files under /sys/bus/usb/devices/ have the format 'usbX', where
X is the USB bus number. Use STRPREFIX to correctly parse the bus numbers.
If a directory pool contains pipes or sockets, a pool start can fail or hang:
https://bugzilla.redhat.com/show_bug.cgi?id=589577
We already try to avoid these special files, but only attempt after
opening the path, which is where the problems lie. Unify volume opening
into helper functions, which use the proper open() flags to avoid error,
followed by fstat to validate storage mode.
Previously, virStorageBackendUpdateVolTargetInfoFD attempted to enforce the
storage mode check, but allowed callers to detect this case and silently
continue. In practice, only the FS backend was using this feature, the rest
were treating unknown mode as an error condition. Unfortunately the InfoFD
function wasn't raising an error message here, so error reporting was
busted.
This patch adds 2 functions: virStorageBackendVolOpen, and
virStorageBackendVolOpenModeSkip. The latter retains the original opt out
semantics, the former now throws an explicit error.
This patch maintains the previous volume mode checks: allowing specific
modes for specific pool types requires a bit of surgery, since VolOpen
is called through several different helper functions.
v2: Use ATTRIBUTE_NONNULL. Drop stat check, just open with
O_NONBLOCK|O_NOCTTY.
v3: Move mode check logic back to VolOpen. Use 2 VolOpen functions with
different error semantics.
v4: Make second VolOpen function more extensible. Didn't opt to change
FS backend defaults, this can just be to fix the original bug.
v5: Prefix default flags with VIR_, use ATTRIBUTE_RETURN_CHECK
Since the macvtap device needs active tear-down and the teardown logic
is based on the interface name, it can happen that if for example 1 out
of 3 interfaces was successfully created, that during the failure path
the macvtap's target device name is used to tear down an interface that
is doesn't own (owned by another VM).
So, in this patch, the target interface name is reset so that there is
no target interface name and the interface name is always cleared after
a tear down.
* If a nodedev has a parent that we don't want to display, we should
continue walking up the udev device tree to see if any of its
earlier ancestors are devices that we display. It makes the tree
much nicer looking than having a whole lot of devices hanging off
the root node.
These files are borrowed from upstream release versions, and should
not need further edits in the context of libvirt (instead, a new
upstream vbox release would entail adding a new header file). We do
not re-generate these files as part of libvirt, nor do we want to lose
our minor edits (such as cppi cleanups).
* src/vbox/vbox_CAPI_v2_2.h: Clarify file origins.
* src/vbox/vbox_CAPI_v3_0.h: Likewise.
* src/vbox/vbox_CAPI_v3_1.h: Likewise.
* src/vbox/vbox_CAPI_v3_2.h: Likewise. Reindent with cppi.
Fedora bug https://bugzilla.redhat.com/show_bug.cgi?id=235961
If using the default virtual network, an easy way to lose guest network
connectivity is to install libvirt inside the VM. The autostarted
default network inside the guest collides with host virtual network
routing. This is a long standing issue that has caused users quite a
bit of pain and confusion.
On network startup, parse /proc/net/route and compare the requested
IP+netmask against host routing destinations: if any matches are found,
refuse to start the network.
v2: Drop sscanf, fix a comment typo, comment that function could use
libnl instead of /proc
v3: Consider route netmask. Compare binary data rather than convert to
string.
v4: Return to using sscanf, drop inet functions in favor of virSocket,
parsing safety checks. Don't make parse failures fatal, in case
expected format changes.
v5: Try and continue if we receive unexpected. Delimit parsed lines to
prevent scanning past newline
'listen' isn't a valid qemu-dm option, as reported a long time ago here:
https://bugzilla.redhat.com/show_bug.cgi?id=492958
Matches the near identical logic in qemu_conf.c
v2: When parsing sexpr, only match on ",server", rather than
full ',server,nowait'.
* Incorporated Jim's feedback (v1 & v2)
* Moved case of DEVTYPE == "wlan" up as it's definitive that we have a network interface.
* Made comment more detailed about the wired case to explain better
how it differentiates between wired network interfaces and USB
devices.
Eliminate almost all backward jumps by replacing this common pattern:
int
some_random_function(void)
{
int result = 0;
...
cleanup:
<unconditional cleanup code>
return result;
failure:
<cleanup code in case of an error>
result = -1;
goto cleanup
}
with this simpler pattern:
int
some_random_function(void)
{
int result = -1;
...
result = 0;
cleanup:
if (result < 0) {
<cleanup code in case of an error>
}
<unconditional cleanup code>
return result;
}
Add a bool success variable in functions that don't have a int result
that can be used for the new pattern.
Also remove some unnecessary memsets in error paths.
The hotplug methods still had the qemuCmdFlags variable declared
as an int, instead of unsigned long long. This caused flag checks
to be incorrect for flags > 31
* src/qemu/qemu_driver.c: Fix integer overflow in hotplug
This allows libvirt to open the PCI device sysfs config file prior
to dropping privileges so qemu can access the full config space.
Without this, a de-privileged qemu can only access the first 64
bytes of config space.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Detect support
for pci-assign.configfd option. Use this option when formatting
PCI device string if possible
* src/qemu/qemu_driver.c: Pre-open PCI sysfs config file and pass
to QEMU
We've been running into a lot of situations where
virGetHostname() is returning "localhost", where a plain
gethostname() would have returned the correct thing. This
is because virGetHostname() is *always* trying to canonicalize
the name returned from gethostname(), even when it doesn't
have to.
This patch changes virGetHostname so that if the value returned
from gethostname() is already FQDN or localhost, it returns
that string directly. If the value returned from gethostname()
is a shortened hostname, then we try to canonicalize it. If
that succeeds, we returned the canonicalized hostname. If
that fails, and/or returns "localhost", then we just return
the original string we got from gethostname() and hope for
the best.
Note that after this patch it is up to clients to check whether
"localhost" is an allowed return value. The only place
where it's currently not is in qemu migration.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The virVirtualPortProfileFormat just went below the
virVirtualPortProfileParamsParseXML function and got inside the
The attached patch moves virVirtualPortProfileFormat below the #ifndef
PROXY block.
Allows listing existing pools and requesting information about them.
Alter the esxVI_ProductVersion enum in a way that allows to check for
product type by masking.
This patch parses the following two XML descriptions, one for
802.1Qbg and one for 802.1Qbh, and stores the data internally.
The actual triggering of the switch setup protocol has not been
implemented here but the relevant code to do that should go into
the functions associatePortProfileId() and disassociatePortProfileId().
<interface type='direct'>
<source dev='eth0.100' mode='vepa'/>
<model type='virtio'/>
<virtualport type='802.1Qbg'>
<parameters managerid='12' typeid='0x123456' typeidversion='1'
instanceid='fa9b7fff-b0a0-4893-8e0e-beef4ff18f8f'/>
</virtualport>
<filterref filter='clean-traffic'/>
</interface>
<interface type='direct'>
<source dev='eth0.100' mode='vepa'/>
<model type='virtio'/>
<virtualport type='802.1Qbh'>
<parameters profileid='my_profile'/>
</virtualport>
</interface>
I'd suggest to use this patch as a base for triggering the setup
protocol with the 802.1Qb{g|h} switch.
Several rounds of changes were made to this patch. The
following is a list of these changes.
- Renamed structure virVirtualPortProfileDef to virVirtualPortProfileParams
as per Daniel Berrange's request
- Addressing Daniel Berrange's comments:
- removing macvtap.h's dependency on domain_conf.h by
moving the virVirtualPortProfileDef structure into macvtap.h
and not passing virtDomainNetDefPtr to any functions in
macvtap.c
- Addressed most of Chris Wright's comments:
- indicating error in case virtualport XML node cannot be parsed
properly
- parsing hex and decimal numbers using virStrToLong_ui() with
parameter '0' for base
- tgifname (target interface name) variable wasn't necessary
to pass to openMacvtapTap function anymore
- assigning the virtual port data structure to the virDomainNetDef
only if it was previously parsed
- make sure that the error code returned by openMacvtapTap() is a negative n
in case the associatePortProfileId() function failed.
- renaming vsi in the XML to virtualport
- replace all occurrences of vsi in the source as well
- removing mode and MAC address parameters from the functions that
will communicate with the hareware diretctly or indirectly
- moving the associate and disassociate functions to the end of the
file for subsequent patches to easier make them generally available
for export
- passing the macvtap interface name rather than the link device since
this otherwise gives funny side effects when using netlink messages
where IFLA_IFNAME and IFLA_ADDRESS are specified and the link dev
all of a sudden gets the MAC address of the macvtap interface.
- Removing rc = -1 error indications in the case of 802.1Qbg|h setup in case
we wanted to use hook scripts for the setup and so the setup doesn't fail
here.
- if instance ID UUID is not supplied it will automatically be generated
- adapted schema to make instance ID UUID optional
- added test case
- parser and XML generator have been separated into their own
functions so they can be re-used elsewhere (passthrough case
for example)
- Adapted XML parser and generator support the above shown type
(802.1Qbg, 802.1Qbh).
- Adapted schema to above XML
- Adapted test XML to above XML
- Passing through the VM's UUID which seems to be necessary for
802.1Qbh -- sorry no host UUID
- adding virtual function ID to association function, in case it's
necessary to use (for SR-IOV)
This patch introduces a dependency on libnl, which subsequent patches
will then use.
Changes from V1 to V2:
- added diffstats
- following changes in tree
Spurious / in a pool target path makes life difficult for apps using the
GetVolByPath, and doing other path based comparisons with pools. This
has caused a few issues for virt-manager users:
https://bugzilla.redhat.com/show_bug.cgi?id=494005https://bugzilla.redhat.com/show_bug.cgi?id=593565
Add a new util API which removes spurious /, virFileSanitizePath. Sanitize
target paths when parsing pool XML, and for paths passed to GetVolByPath.
v2: Leading // must be preserved, properly sanitize path=/, sanitize
away /./ -> /
v3: Properly handle starting ./ and ending /.
v4: Drop all '.' handling, just sanitize / for now.
Allow for a host UUID in the capabilities XML. Local drivers
will initialize this from the SMBIOS data. If a sanity check
shows SMBIOS uuid is invalid, allow an override from the
libvirtd.conf configuration file
* daemon/libvirtd.c, daemon/libvirtd.conf: Support a host_uuid
configuration option
* docs/schemas/capability.rng: Add optional host uuid field
* src/conf/capabilities.c, src/conf/capabilities.h: Include
host UUID in XML
* src/libvirt_private.syms: Export new uuid.h functions
* src/lxc/lxc_conf.c, src/qemu/qemu_driver.c,
src/uml/uml_conf.c: Set host UUID in capabilities
* src/util/uuid.c, src/util/uuid.h: Support for host UUIDs
* src/node_device/node_device_udev.c: Use the host UUID functions
* tests/confdata/libvirtd.conf, tests/confdata/libvirtd.out: Add
new host_uuid config option to test
The cgroups ACL code was only allowing the primary disk image.
It is possible to chain images together, so we need to search
for backing stores and add them to the ACL too. Since the ACL
only handles block devices, we ignore the EINVAL we get from
plain files. In addition it was missing code to teardown the
cgroup when hot-unplugging a disk
* src/qemu/qemu_driver.c: Allow backing stores in cgroup ACLs
and add missing teardown code in unplug path
Basic live migration was broken by the commit that added
non-shared block support in two ways:
1) It added a virCheckFlags() to doNativeMigrate(). Besides
the fact that typical usage of virCheckFlags() is in driver
entry points, and doNativeMigrate() is not an entry point,
it was missing important flags like VIR_MIGRATE_LIVE. Move
the virCheckFlags to the top-level qemuDomainMigratePrepare2
and friends.
2) It also added a memory leak in qemuMonitorTextMigrate()
by not freeing the memory used by virBufferContentAndReset().
This is fixed by storing the pointer in a temporary variable
and freeing it at the end.
With this patch in place, normal live migration works again.
v3: Instead of the churn for virCheckFlagsUI and UL, instead
always promote flags to an unsigned long and always use %lx
for the fprintf.
v2: Add back flags check, which required adding virCheckFlagsUI
and virCheckFlagsUL
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Currently all host audio backends are disabled if a VM is using VNC, in
favor of the QEMU VNC audio extension. Unfortunately no released VNC
client supports this extension, so users have no way of getting audio
to work if using VNC.
Add a new config option in qemu.conf which allows changing libvirt's
behavior, but keep the default intact.
v2: Fix doc typos, change name to vnc_allow_host_audio
The device path doesn't make use of guestAddr, so the memcpy corrupts
the guest info struct.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The virDomainGetBlockInfo API allows query physical block
extent and allocated block extent. These are normally the
same value unless storing a special format like qcow2
inside a block device. In this scenario we can query QEMU
to get the actual allocated extent.
Since last time:
- Return fatal error in text monitor
- Only invoke monitor command for block devices
- Fix error handling JSON code
* src/qemu/qemu_driver.c: Fill in block aloction extent when VM
is running
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
API to query the highest block extent via info blockstats
* src/lxc/lxc_driver.c (lxcSetSchedulerParameters): Ensure that
"->field" is "cpu_shares" before possibly giving a diagnostic about
a type for a "cpu_shares" value.
Also, virCgroupSetCpuShares could fail without evoking a diagnostic.
Add one.
Volume detection in the scsi backend was duplicating code already
present in storage_backend.c. Let's drop the duplicate code.
Also, change the shared function name to be less generic, and remove
some error squashing in the other call site.
We were squashing error messages in a few cases. Recode to follow common
ret = -1 convention.
v2: Handle more error squashing issues further up in MakeNewVol and
CreateVols. Use ret = -1 convention in MakeVols.
The qemu driver contains a subtle race in the logic to find next
available vnc port. Currently it iterates through all available ports
and returns the first for which bind(2) succeeds. However it is possible
that a previously issued port has not yet been bound by qemu, resulting
in the same port used for a subsequent domain.
This patch addresses the race by using a simple bitmap to "reserve" the
ports allocated by libvirt.
V2:
- Put port bitmap in struct qemud_driver
- Initialize bitmap in qemudStartup
V3:
- Check for failure of virBitmapGetBit
- Additional check for port != -1 before calling virbitmapClearBit
V4:
- Check for failure of virBitmap{Set,Clear}Bit
V2:
- Move bitmap impl to src/util/bitmap.[ch]
- Use CHAR_BIT instead of explicit '8'
- Use size_t instead of unsigned int
- Fix calculation of bitmap size in virBitmapAlloc
- Ensure bit is within range of map in the set, clear, and get
operations
- Use bool in virBitmapGetBit
- Add virBitmapFree to free-like funcs in cfg.mk
V3:
- Check for overflow in virBitmapAlloc
- Fix copy and paste bug in virBitmapAlloc
- Use size_t in prototypes
- Add ATTRIBUTE_NONNULL in prototypes where appropriate
and remove NULL check from impl
V4:
- Add ATTRIBUTE_RETURN_CHECK in prototypes where appropriate.
We shouldn't be checking validity in domain_conf, since
it can be used by multiple different hosts and hypervisors.
Remove the check completely.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
We need a common internal function for starting managed domains to be
used during autostart. This patch factors out relevant code from
qemudDomainStart into qemudDomainObjStart and makes it use the
refactored code for domain restore instead of calling qemudDomainRestore
API directly.
We need to be able to assign new def to an existing virDomainObj which
is already locked. This patch factors out the relevant code from
virDomainAssignDef into virDomainObjAssignDef.
We need to be able to restore a domain which we already locked and
started a job for it without undoing these steps. This patch factors
out internals of qemudDomainRestore into separate functions which work
for locked objects.
* src/qemu/qemu_conf.c (qemudParseHelpStr): Fix errors that made
it impossible to diagnose invalid minor and micro version number
components.
Signed-off-by: Chris Wright <chrisw@redhat.com>
The current cleanup: in StartVMDaemon path is a poor duplication.
qemuShutdownVMDaemon can handle teardown for inactive VMs, so let's use it.
v2: Remove old abort: label, only use cleanup:
* src/qemu/qemu_conf.c (QEMU_VERSION_STR_1, QEMU_VERSION_STR_2):
Define these instead of...
(QEMU_VERSION_STR): ... this. Remove definition.
(qemudParseHelpStr): Check first for the new, shorter prefix,
"QEMU emulator version", and then for the old one,
"QEMU PC emulator version" when trying to parse the version number.
Based on a patch by Chris Wright.
* src/lxc/lxc_controller.c (ignorable_epoll_accept_errno): New function.
(lxcControllerMain): Handle a failed accept carefully:
most errno values indicate legitimate failure and must be fatal.
However, ignore a special case: that in which an incoming client quits
between the poll() indicating its presence, and our accept() which
is trying to process it.
There doesn't seem to be anything specific to tap devices for this
array of file descriptors which need to stay open of the guest to use.
Rename then for others to make use of.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Chris Lalancette <clalance@redhat.com>
These files may be useful for anyone making modifications to
source files in a tarball distribution.
* src/Makefile.am (EXTRA_DIST): Add THREADS.txt.
* daemon/Makefile.am (EXTRA_DIST): Add THREADING.txt.
This test was failing on systems using pdwtags from dwarves-1.3.
Reported by Matthias Bolte.
Two-pronged fix:
- use --verbose to work also with dwarves-1.3; adapt regular
expressions to handle now-varying separators
- require a minimum number of post-split clauses, in order to
skip upon any future format change.
Currently there are 318; if there are 300 or fewer,
give a warning similar to when pdwtags is missing.
* src/Makefile.am (remote_protocol-structs): Use pdwtags' --verbose
option to make 1.3 emit member sizes and offsets.
Consistently output WARNING messages to stderr.
Do not require each caller of virStorageFileGetMetadata and
virStorageFileGetMetadataFromFD to first clear the storage of the
"meta" buffer. Instead, initialize that storage in
virStorageFileGetMetadataFromFD.
* src/util/storage_file.c (virStorageFileGetMetadataFromFD): Clear
"meta" here, not before each of the following callers.
* src/qemu/qemu_driver.c (qemuSetupDiskCgroup): Don't clear "meta" here.
(qemuTeardownDiskCgroup): Likewise.
* src/qemu/qemu_security_dac.c (qemuSecurityDACSetSecurityImageLabel):
Likewise.
* src/security/security_selinux.c (SELinuxSetSecurityImageLabel):
Likewise.
* src/security/virt-aa-helper.c (get_files): Likewise.
Approximately 60 messages were marked. Since these diagnostics are
intended solely for developers and maintainers, encouraging translation
is deemed to be counterproductive:
http://thread.gmane.org/gmane.comp.emulators.libvirt/25050/focus=25052
Run this command:
git grep -l VIR_WARN|xargs perl -pi -e \
's/(VIR_WARN0?)\s*\(_\((".*?")\)/$1($2/'
There were three very similar uses of qemuMonitorAddDrive.
This change makes the three 17-line sequences identical.
* src/qemu/qemu_driver.c (qemudDomainAttachPciDiskDevice): Detect
failure. Add VIR_WARN and braces.
(qemudDomainAttachSCSIDisk): Add VIR_WARN and braces.
(qemudDomainAttachUsbMassstorageDevice): Likewise.
History has shown that there are frequent bugs in the QEMU driver
code leading to the monitor being invoked with a NULL pointer.
Although the QEMU driver code should always report an error in
this case before invoking the monitor, as a safety net put in a
generic check in the monitor code entry points.
* src/qemu/qemu_monitor.c: Safety net to check for NULL monitor
object
Any method which intends to invoke a monitor command must have
a check for virDomainObjIsActive() before using the monitor to
ensure that priv->mon != NULL.
There is one subtle edge case in this though. If a method invokes
multiple monitor commands, and calls qemuDomainObjExitMonitor()
in between two of these commands then there is no guarentee that
priv->mon != NULL anymore. This is because the QEMU process may
exit or die at any time, and because qemuDomainObjEnterMonitor()
releases the lock on virDomainObj, it is possible for the background
thread to close the monitor handle and thus qemuDomainObjExitMonitor
will release the last reference allowing priv->mon to become NULL.
This affects several methods, most notably migration but also some
hotplug methods. This patch takes a variety of approaches to solve
the problem, depending on the particular usage scenario. Generally
though it suffices to add an extra virDomainObjIsActive() check
if qemuDomainObjExitMonitor() was called during the method.
* src/qemu/qemu_driver.c: Fix multiple potential NULL pointer flaws
in usage of the monitor
* src/qemu/qemu_driver.c (qemudDomainSetVcpus): Upon look-up failure,
i.e., vm==NULL, goto cleanup, rather than to "endjob", superficially
since the latter would dereference vm, but more fundamentally because
we certainly don't want to call qemuDomainObjEndJob before we've
even attempted qemuDomainObjBeginJob.
(gdb) p/x QEMUD_CMD_FLAG_VNET_HOST
$7 = 0xffffffff80000000
Oops - that meant we were incorrectly setting QEMU_CMD_FLAG_RTC_TD_HACK
for qemu-kvm-0.12.3 (and probably botching a few other settings as well).
Fixes Red Hat BZ#592070
* src/qemu/qemu_conf.h (QEMUD_CMD_FLAG_VNET_HOST): Avoid sign
extension.
* tests/qemuhelpdata/qemu-kvm-0.12.3: New file.
* tests/qemuhelptest.c (mymain): Add another case.
A fedora translator filed:
https://bugzilla.redhat.com/show_bug.cgi?id=580816
Pointing out these two error messages as unclear: "write save" sounds
like a typo without context, and lack of a colon made the second message
difficult to parse.
virFileResolveLink was returning a positive value on error,
thus confusing callers that assumed failure was < 0. The
confusion is further evidenced by callers that would have
ended up calling virReportSystemError with a negative value
instead of a valid errno.
Fixes Red Hat BZ #591363.
* src/util/util.c (virFileResolveLink): Live up to documentation.
* src/qemu/qemu_security_dac.c
(qemuSecurityDACRestoreSecurityFileLabel): Adjust callers.
* src/security/security_selinux.c
(SELinuxRestoreSecurityFileLabel): Likewise.
* src/storage/storage_backend_disk.c
(virStorageBackendDiskDeleteVol): Likewise.
Fix the cygwin regression introduced in commit 48445ccff, but
without repeating the fresh build regression of commit
2d550542e.
* src/Makefile.am (libvirt_test_la_LIBADD): Split out subset of
locally-built libraries...
(libvirt_test_la_BUILT_LIBADD): ...into new variable.
(libvirt_test_la_DEPENDENCIES): Depend only on the subset that
automake would have given us for free if we didn't have to add our
own extra file.
Matthias noted that the line:
virt_aa_helper_LDFLAGS = $(WARN_CFLAGS)
looks inconsistent, so I did an audit.
Currently, the set of compiler warning flags passed to gcc as $CC are
equally permitted as the set of linker flags passed to gcc as $LD, so
there was no problem with that usage. But if we ever get in a
situation where $CC and $LD treat particular flags differently, using
the right variable form will make it easier.
In the process, I spotted a couple of typos that were omitting useful
flags, as well as specifying a -l under the wrong variable.
* acinclude.m4 (LIBVIRT_COMPILE_WARNINGS): Define WARN_LDFLAGS as
an alias for WARN_CFLAGS.
* tools/Makefile.am (virsh_LDFLAGS): Use more canonical spelling.
* proxy/Makefile.am (libvirt_proxy_LDFLAGS): Likewise. Move
library...
(libvirt_proxy_LDADD): ...here.
* src/Makefile.am (virt_aa_helper_LDFLAGS): Use more canonical
spelling of WARN_LDFLAGS.
(libvirt_parthelper_LDFLAGS, libvirt_lxc_LDFLAGS): Likewise. Use
correct spelling of COVERAGE_LDFLAGS.
Reported by Matthias Bolte.
* configure.ac: Check for <linux/magic.h>.
* src/util/storage_file.c: Include <linux/magic.h> only if present.
Linux kernels prior to 2.6.19 lacked it.
[__linux__] (NFS_SUPER_MAGIC): Define if not already defined.
qemuReadLogOutput early VM death detection is racy and won't always work.
Startup then errors when connecting to the VM monitor. This won't report
the emulator cmdline output which is typically the most useful diagnostic.
Check if the VM has died at the very end of the monitor connection step,
and if so, report the cmdline output.
See also: https://bugzilla.redhat.com/show_bug.cgi?id=581381
* src/qemu/qemu_driver.c (qemudDomainSetVcpus): Avoid NULL-deref
upon unknown UUID. Call qemuDomainObjBeginJob(vm) only after
ensuring that vm != NULL, not before. This potential NULL-deref
was introduced by commit 2c555d87b0.
This reverts commit 2d550542ee.
The patch worked for incremental builds, but broke fresh
builds, because it interfered with automake's automatic
dependency generation. Until I figure out how to make
automake do what we want, I'd rather leave cygwin broken
but fresh Linux builds working.
make[3]: *** No rule to make target `-lxml2', needed by `libvirt.la'. Stop.
Due to treating the wrong string as a dependency.
* src/Makefile.am (libvirt_la_DEPENDENCIES): Depend only on
locally-built file, not on strings that might resolve as '-lxml2'.
The code specifies driver->cacheDir as the format string,
but it usually doesn't contain '%s', so the subsequent
argument, "/qemu.mem.XXXXXX", is always ignored.
The patch fixes the misuse.
Setting dynamic_ownership=0 in /etc/libvirt/qemu.conf prevents
libvirt's DAC security driver from setting uid/gid on disk
files when starting/stopping QEMU, allowing the admin to manage
this manually. As a side effect it also stopped setting of
uid/gid when saving guests to a file, which completely breaks
save when QEMU is running non-root. Thus saved state labelling
code must ignore the dynamic_ownership parameter
* src/qemu/qemu_security_dac.c: Ignore dynamic_ownership=0 when
doing save/restore image labelling
When QEMU runs with its disk on NFS, and as a non-root user, the
disk is chownd to that non-root user. When migration completes
the last step is shutting down the QEMU on the source host. THis
normally resets user/group/security label. This is bad when the
VM was just migrated because the file is still in use on the dest
host. It is thus neccessary to skip the reset step for any files
found to be on a shared filesystem
* src/libvirt_private.syms: Export virStorageFileIsSharedFS
* src/util/storage_file.c, src/util/storage_file.h: Add a new
method virStorageFileIsSharedFS() to determine if a file is
on a shared filesystem (NFS, GFS, OCFS2, etc)
* src/qemu/qemu_driver.c: Tell security driver not to reset
disk labels on migration completion
* src/qemu/qemu_security_dac.c, src/qemu/qemu_security_stacked.c,
src/security/security_selinux.c, src/security/security_driver.h,
src/security/security_apparmor.c: Add ability to skip disk
restore step for files on shared filesystems.
The cgroups ACL code was only allowing the primary disk image.
It is possible to chain images together, so we need to search
for backing stores and add them to the ACL too. Since the ACL
only handles block devices, we ignore the EINVAL we get from
plain files. In addition it was missing code to teardown the
cgroup when hot-unplugging a disk
* src/qemu/qemu_driver.c: Allow backing stores in cgroup ACLs
and add missing teardown code in unplug path
If the IO error event does not include a reason, then there
is a possible crash dispatching the event
* src/conf/domain_event.c: Missing check for a NULL reason before
strduping allows for a crash
QEMU is gaining a new monitor command netdev_add for hotplugging
NICs using the netdev backend code. We already support this on
the command this, though it is disabled. This adds support for
hotplug too, also to remain disabled until 0.13 QEMU is released
* src/qemu/qemu_driver.c: Support netdev hotplug for NICs
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
support for netdev_add and netdev_remove commands
When closing a monitor using qemuMonitorClose(), we are aware of
the possibility the monitor is still being used somewhere:
/* NB: ordinarily one might immediately set mon->watch to -1
* and mon->fd to -1, but there may be a callback active
* that is still relying on these fields being valid. So
* we merely close them, but not clear their values and
* use this explicit 'closed' flag to track this state */
but since we call virEventAddHandle() on that monitor without increasing
its ref counter, the monitor is still freed which makes possible users
of it quite unhappy. The unhappiness can lead to a hang if qemuMonitorIO
tries to lock mutex which no longer exists.
First calling REMOTE_PROC_CLOSE and then removing watches might lead to
a hang as HANGUP event can be triggered before the watches are actually
removed but after virConnectPtr is already freed. As a result of that
remoteDomainEventFired() would try to lock uninitialized mutex, which
would hang for ever.
Allow debugging of GNUTLS interactions by setting
LIBVIRT_GNUTLS_DEBUG=10 LIBVIRT_DEBUG=1 virsh
* src/remote/remote_driver.c: Use LIBVIRT_GNUTLS_DEBUG to
enable gnutls debugging
Some shells warn about missing programs before redirection;
the idiomatic way to silence them is to run the program check
inside a subshell, with the redirections outside the subshell.
But a subshell is only needed in places where it is reasonable
to expect the use of such a noisy shell in the first place.
* src/Makefile.am (remote_protocol-structs): Use subshell, for
FreeBSD 8.0 /bin/sh.
* cfg.mk (sc_preprocessor_indentation): Avoid subshell, since the
only users running cfg.mk can be assumed to have decent tools.
For printf("%*s",foo,bar), clang complains if foo is not int:
warning: field width should have type 'int', but argument has
type 'unsigned int' [-Wformat]
* src/conf/storage_encryption_conf.c
(virStorageEncryptionSecretFormat, virStorageEncryptionFormat):
Use correct type.
* src/conf/storage_encryption_conf.h (virStorageEncryptionFormat):
Likewise.
Now, if you update remote_protocol.x without also updating
remote_protocol-structs to match, then "make check" will fail.
* src/Makefile.am (remote_protocol-structs): Extract list of
structs and member names from remote_protocol.o.
(check-local): Depend on it.
* src/remote_protocol-structs: New file.
This reverts commit b5b8a6db69.
That commit was not necessary. The problem is fixed by commit
0e9b3a269b, but I didn't rebuild
it properly after pulling in the commit and didn't notice it.
Per automake, LDFLAGS is used early in the line, and LIBADD
(libraries) or LDADD (programs) is used late. On platforms like
cygwin, without lazy linking, this order matters. Therefore, libtool
commands, -L, and similar should be in LDFLAGS, but -l should be in
L*ADD.
* src/Makefile.am (*_LDFLAGS): Move libraries...
(*_LIBADD): ...to their LIBADD counterpart.
Change 965466c1 added a new field to struct remote_error, which broke
the RPC protocol. Fortunately the new field is unused, so this change
simply removes it again.
* src/remote/remote_protocol.(c|h|x): Remove remote_nwfilter from struct
remote_error
With the introduction of the generic qemu device model, unplugging
SCSI disks works like a charm, so support it in libvirt.
* src/qemu/qemu_driver.c: Add qemudDomainDetachSCSIDiskDevice() to do the
unplugging, extend qemudDomainDetachDeviceAdd().
Signed-off-by: Wolfgang Mauerer <wolfgang.mauerer@siemens.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Gnulib can guarantee that pthread.h exists, but for now, it is a dummy
header with no support for most pthread_* functions. Modify our
use of pthread to use function checks, rather than header checks,
to determine how much pthread support is present.
* bootstrap.conf (gnulib_modules): Add pthread.
* configure.ac: Drop all pthread.h checks. Optimize function
checks. Add check for pthread functions.
* src/Makefile.am (libvirt_lxc_LDADD): Ensure proper link.
* src/remote/remote_driver.c (remoteIOEventLoop): Depend on
pthread_sigmask, now that gnulib guarantees pthread.h.
* src/util/util.c (virFork): Likewise.
* src/util/threads.c (threads-pthread.c): Depend on
pthread_mutexattr_init, as a witness of full pthread support.
* src/util/threads.h (threads-pthread.h): Likewise.
Detected by clang. POSIX requires that the second argument to
va_start be the name of the last variable; and in some implementations,
passing *path instead of path would dereference bogus memory instead
of pulling arguments off the stack.
* src/util/util.c (virBuildPathInternal): Use correct argument to
va_start.
Support for live migration between hosts that do not share storage was
added to qemu-kvm release 0.12.1.
It supports two flags:
-b migration without shared storage with full disk copy
-i migration without shared storage with incremental copy (same base image
shared between source and destination).
I tested the live migration without shared storage (both flags) for native
and p2p with and without tunnelling. I also verified that the fix doesn't
affect normal migration with shared storage.
Add an empty body for virCondWaitUntil and move virPipeReadUntilEOF
out of the '#ifndef WIN32' block, because it compiles fine with MinGW
in combination with gnulib.
Necessary on cygwin, where uid_t and gid_t are 4-byte long rather
than int, causing gcc -Wformat warnings.
* src/util/util.c (virFileOperationNoFork, virDirCreateNoFork)
(virFileOperation, virDirCreate, virGetUserEnt): Cast uid_t and
gid_t before passing to printf.
* .gitignore: Ignore Windows executables.
When a filter is updated, only those interfaces must have their old
rules cleared that either reference the filter directly or indirectly
through another filter. Remember between the different steps of the
instantiation of the filters which interfaces must be skipped. I am
using a hash map to remember the names of the interfaces and store a
bogus pointer to ~0 into it that need not be freed.
For the decision on whether to instantiate the rules, the check for a
pending IP address learn request is not sufficient since then only the
thread could instantiate the rules. So, a boolean needs to be passed
when the thread instantiates the filter rules late and the IP address
learn request is still pending in order to override the check for the
pending learn request. If the rules are to be updated while the thread
is active, this will not be done immediately but the thread will do that
later on.
WIN32 is always defined when __MINGW32__ is defined, but the
converse is not true. WIN32 is more generic, if someone were
to ever attempt porting to a microsoft compiler. This does
not affect Cygwin, which intentionally does not define WIN32.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Use more
generic flag macro.
* src/storage/storage_backend.c
(virStorageBackendUpdateVolTargetInfoFD)
(virStorageBackendRunProgRegex): Likewise.
* tools/console.h (vshRunConsole): Likewise.
[Error message]
error: Failed to start domain lxc_test1
error: internal error Failed to create veth device pair: 512
The reason of the failure is that lxc driver unexpectedly re-uses
an auto-assigned veth name and tries to create the created veth
again. The failure will happen when a domain has multiple network
interfaces and the names of those are not specified in XML.
The patch fixes the problem by resetting buffers of veth names
in every iteration of creating veth.
* src/lxc/lxc_driver.c: prevent re-using auto-assigned veth name
Reported by Kumar L Srikanth-B22348.
<hostdev> address parsing previously attempted to detect the number
base: currently it is hardcoded to base 16, which can break PCI
assignment via virt-manager. Revert to the previous behavior.
* src/conf/domain_conf.c: virDomainDevicePCIAddressParseXML, switch to
virStrToLong_ui(bus, NULL, 0, ...) to autodetect base
This introduces a new event type
VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON
This event is the same as the previous VIR_DOMAIN_ID_IO_ERROR
event, but also includes a string describing the cause of
the event.
Thus there is a new callback definition for this event type
typedef void (*virConnectDomainEventIOErrorReasonCallback)(virConnectPtr conn,
virDomainPtr dom,
const char *srcPath,
const char *devAlias,
int action,
const char *reason,
void *opaque);
This is currently wired up to the QEMU block IO error events
* daemon/remote.c: Dispatch IO error events to client
* examples/domain-events/events-c/event-test.c: Watch for
IO error events
* include/libvirt/libvirt.h.in: Define new IO error event ID
and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle IO error events
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for block IO errors and emit a libvirt IO error event
* src/remote/remote_driver.c: Receive and dispatch IO error
events to application
* src/remote/remote_protocol.x: Wire protocol definition for
IO error events
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for BLOCK_IO_ERROR event
from QEMU monitor
Introduce a function to notify the IP address learning
thread to terminate and thus release the lock on the interface.
Notify the thread before grabbing the lock on the interface
and tearing down the rules. This prevents a 'virsh destroy' to
tear down the rules that the IP address learning thread has
applied.
The functions invoked by the IP address learning thread
that apply some basic filtering rules did not clean up
any previous filtering rules that may still be there
(due to a libvirt restart for example). With the
patch below all the rules are cleaned up first.
Also, I am introducing a function to drop all traffic
in case the IP address learning thread could not apply
the rules.
The local DHCP server on virtbr0 sends DHCP ACK messages when a VM is
started and requests an IP address while the initial DHCP lease on the
VM's MAC address hasn't expired. So, also pick the IP address of the VM
if that type of message is seen.
Thanks to Gerhard Stenzel for providing a test case for this.
Changes from V1 to V2:
- cleanup: replacing DHCP option numbers through constants
When using -device syntax, the IO event will have a different
prefix, 'drive-' that needs to be skipped over before matching
against the libvirt disk alias
* src/qemu/qemu_driver.c: Skip QEMU_DRIVE_HOST_PREFIX in IO event
* daemon/remote.c: Server side dispatcher
* daemon/remote_dispatch_args.h, daemon/remote_dispatch_prototypes.h,
daemon/remote_dispatch_ret.h, daemon/remote_dispatch_table.h: Update
with new API
* src/remote/remote_driver.c: Client side dispatcher
* src/remote/remote_protocol.c, src/remote/remote_protocol.h: Update
* src/remote/remote_protocol.x: Define new wire protocol
This defines the internal driver API and stubs out each driver
* src/driver.h: Define virDrvDomainGetBlockInfo signature
* src/libvirt.c, src/libvirt_public.syms: Glue public API to drivers
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c,
src/test/test_driver.c, src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c, src/xenapi/xenapi_driver.c: Stub out driver
qemuDomainPCIAddressSetFree was freeing up the hash
table for the pci addresses, but not freeing up the addr
structure. Looking over the callers of this function, it
seems like they expect it to also free up the structure,
so do that here.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
We were over-writing a pointer without freeing it in
case of a disk device, leading to a memory leak.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
When building on Ubuntu with make -j3 (or more), it would always
fail when trying to build virt-aa-helper. I'm not an expert in
automake by any means, but I think the entry for virt-aa-helper
is mis-using LDADD; it shouldn't be putting direct paths to
libvirt_conf.la and libvirt_util.la, but instead referencing those
names. With this patch in place, I'm able to successfully build
on Ubuntu 9.04 with make -j3.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
The previous commit changes a goto from 'endjob' to 'cleanup',
leaving the endjob label unused. Remove it to avoid compile
warning.
* src/qemu/qemu_driver.c: Remove 'endjob' label
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): When setting
"vm" to NULL, jump over vm-dereferencing code to "cleanup".
(qemuDomainRevertToSnapshot): Likewise.
use /var/lib/libvirt/dnsmasq since /var/lib/libvirt/network is
unreadable by the dnsmasq binary
* src/network/bridge_driver.c: update DNSMASQ_STATE_DIR
* src/Makefile.am: create it on make install
* libvirt.spec.in: take the new directory into account
In cases where the security driver failed to restore a label after a
guest has saved, we mistakenly jumped to the error cleanup paths.
This is not good, because the operation has in fact completed and
cannot be rolled back completely. Label restore is non-critical, so
just log the problem instead. Also add a missing restore call in
the error cleanup path
* src/qemu/qemu_driver.c: Fix handling of security driver
restore failures in QEMU domain save
When cgroups is enabled, access to block devices is likely to be
restricted to a whitelist. Prior to saving a guest to a block device,
it is necessary to add the block device to the whitelist. This is
not required upon restore, since QEMU reads from stdin
* src/qemu/qemu_driver.c: Add block device to cgroups whitelist
if neccessary during domain save.
The save process was relying on use of the shell >> append
operator to ensure the save data was placed after the libvirt
header + XML. This doesn't work for block devices though.
Replace this code with use of 'dd' and its 'seek' parameter.
This means that we need to pad the header + XML out to a
multiple of dd block size (in this case we choose 512).
The qemuMonitorMigateToCommand() monitor API is used for both
save/coredump, and migration via UNIX socket. We can't simply
switch this to use 'dd' since this causes problems with the
migration usage. Thus, create a dedicated qemuMonitorMigateToFile
which can accept an filename + offset, and remove the filename
from the current qemuMonitorMigateToCommand() API
* src/qemu/qemu_driver.c: Switch to qemuMonitorMigateToFile
for save and core dump
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Create
a new qemuMonitorMigateToFile, separate from the existing
qemuMonitorMigateToCommand to allow handling file offsets
It is possible to use block devices with domain save/restore. Upon
failure QEMU unlinks the path being saved to. This isn't good when
it is a block device !
* src/qemu/qemu_driver.c: Don't unlink block devices if save fails
If a transient QEMU crashes during save attempt, then the virDomainPtr
object may be freed. If a persistent QEMU crashes during save, then
the 'priv->mon' field is no longer valid since it will be inactive.
* src/qemu/qemu_driver.c: Fix two crashes when QEMU exits
during a save attempt
In particular I was forgetting to take the qemuMonitorPrivatePtr
lock (via qemuDomainObjBeginJob), which would cause problems
if two users tried to access the same domain at the same time.
This patch also fixes a problem where I was forgetting to remove
a transient domain from the list of domains.
Thanks to Stephen Shaw for pointing out the problem and testing
out the initial patch.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
This patch adds support for the RARP protocol. This may be needed due to
qemu sending out a RARP packet (at least that's what it seems to want to
do even though the protocol id is wrong) when migration finishes and
we'd need a rule to let the packets pass.
Unfortunately my installation of ebtables does not understand -p RARP
and also seems to otherwise depend on strings in /etc/ethertype
translated to protocol identifiers. Therefore I need to pass -p 0x8035
for RARP. To generally get rid of the dependency of that file I switch
all so far supported protocols to use their protocol identifier in the
-p parameter rather than the string.
I am also extending the schema and added a test case.
changes from v1 to v2:
- added test case into patch
With JSON qemu monitor, we get a STOP event from qemu whenever qemu
stops guests CPUs. The downside of it is that vm->state is changed to
PAUSED and a new generic paused event is send to applications. However,
when we ask qemu to stop the CPUs we are not really interested in qemu
event and we usually want to issue a more specific event.
By setting vm->status to PAUSED before actually sending the request to
qemu (and resetting it back if the request fails) we can ignore the
event since the event handler does nothing when the guest is already
paused. This solution is quite hacky but unfortunately it's the best
solution which I was able to come up with and it doesn't introduce a
race condition.
* virStorageEncryptionFormat is called from both
virDomainDiskDefFormat and virStorageVolTargetDefFormat. The proper
indentation in the generated XML depends on the caller. My earlier
patch to fix the incorrect indentation for the domain XML broke the
indentation for the storage XML. This patch adopts Laine's
suggestion of requring the caller of virStorageEncryptionFormat to
provide an unsigned int with the number of spaces the output should
be indented. The patch modifies both callers to provide the
additional argument.
* Add a regression test for the domain XML
* src/conf/domain_conf.c src/conf/storage_conf.c
src/conf/storage_encryption_conf.c src/conf/storage_encryption_conf.h:
change the indentation code
* tests/qemuxml2xmltest.c
tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.args
tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.xml: add a regression test
Cygwin's XDR implementation defines xdr_u_int64_t instead of
xdr_uint64_t and lacks IXDR_PUT_INT32/IXDR_GET_INT32.
Alter the IXDR_GET_LONG regex in rpcgen_fix.pl so it doesn't destroy
the #define IXDR_GET_INT32 IXDR_GET_LONG in remote_protocol.x.
Also fix the remote_protocol.h regex in rpcgen_fix.pl.
With this patch I want to enable hex number inputs in the filter XML. A
number that was entered as hex is also printed as hex unless a string
representing the meaning can be found.
I am also extending the schema and adding a test case. A problem with
the DSCP value is fixed on the way as well.
Changes from V1 to V2:
- using asHex boolean in all printf type of functions to select the
output format in hex or decimal format
This patch makes libvirtd start the dnsmasq daemon with a
--dhcp-hostsfile option instead of --dhcp-host options for each
'//ip/dhcp/host' entries defined in network xml file.
the dnsmasq host file is stored into /var/lib/libvirt/network
* src/network/bridge_driver.c: define the directory for the hostfiles
and save/delete them to be used by dnsmasq
* po/POTFILES.in: the new module contains translatable strings
* src/Makefile.am: include the files in the utils set
* src/libvirt_private.syms: exports the symbols internally
It implements an idea to save dhcp hosts' macaddr vs. ipaddr mappings to
static file and make dnsmasq loading it with "--dhcp-hostsfile" option,
originally suggested by Dan, and can address the problem that too
many "--dhcp-host" args hitting ARG_MAX limit
* src/util/dnsmasq.h src/util/dnsmasq.c: adds the 2 new files
While doing some testing of the snapshot code I noticed that
if qemuDomainSnapshotLoad failed, it would print a NULL as
part of the error. That's not desirable, so leave the
full_path variable around until after we are done printing
errors.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
We were freeing the virDomainSnapshotDefPtr, but not
the virDomainSnapshotObjPtr in virDomainSnapshotObjFree.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
It's not needed and is currently ignored, but this is a bug.
It will get fixed soon and QMP will return an error for keys
it doesn't know about, this will break libvirt.
* src/qemu/qemu_monitor_json.c: remove qemuMonitorJSONCommandAddTimestamp()
and the place where it's invoked in qemuMonitorJSONMakeCommand()
The nwfilterDriverActive() could de-reference a NULL pointer
if it hadn't be started at the point it was called. It was
also not thread safe, since it lacked locking around data
accesses.
* src/nwfilter/nwfilter_driver.c: Fix locking & NULL checks
in nwfilterDriverActive()
The user probably doesn't care what the gai error numbers are, as
much as what the failed conversion IP address was.
* src/remote/remote_driver.c (addrToString): Mention which address
could not be converted.
* daemon/remote.c (addrToString): Likewise.
* The error messages coming from qemu's DAC support contain strings
from the original SELinux security driver code. This just removes
references to "security context" and other SELinux-isms from the DAC
code.
Signed-off-by: Spencer Shimko <sshimko@tresys.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
- using INT_BUFSIZE_BOUND() to determine the length of the buffersize
for printing and integer into
- not explicitly initializing static var threadsTerminate to false
anymore, since that's done automatically
Changes after V2:
- removed while looks in case of OOM error
- removed on ifaceDown() call
- preceding one ifaceDown() call with an ifaceCheck() call
Since the name of an interface can be the same between stops and starts
of different VMs I have to switch the IP address learning thread to use
the index of the interface to determine whether an interface is still
available or not - in the case of macvtap the thread needs to listen for
traffic on the physical interface, thus having to time out periodically
to check whether the VM's macvtap device is still there as an indication
that the VM is still alive. Previously the following sequence of 2 VMs
with macvtap device
virsh start testvm1; virsh destroy testvm1 ; virsh start testvm2
would not terminate the thread upon testvm1's destroy since the name of
the interface on the host could be the same (i.e, macvtap0) on testvm1
and testvm2, thus it was easily race-able. The thread would then
determine the IP address parameter for testvm2 but apply the rule set
for testvm1. :-(
I am also introducing a lock for the interface (by name) that the thread
must hold while it listens for the traffic and releases when it
terminates upon VM termination or 0.5 second thereafter. Thus, the new
thread for a newly started VM with the same interface name will not
start while the old one still holds the lock. The only other code that I
see that also needs to grab the lock to serialize operation is the one
that tears down the firewall that were established on behalf of an
interface.
I am moving the code applying the 'basic' firewall rules during the IP
address learning phase inside the thread but won't start the thread
unless it is ensured that the firewall driver has the ability to apply
the 'basic' firewall rules.
The hang fix in d376b7d63e was incomplete
since it left quite a few {Enter,Exit}Monitor calls which require driver
to be unlocked. Since the driver is locked throughout the whole
function, {Enter,Exit}MonitorWithDriver need to be used instead to
ensure driver is not locked when issuing monitor commands.
The comment in qemuDomainWaitForMigrationComplete says we are polling
every 50ms but the code sleeps only for 50us. This was already discussed
during review but apparently forgotten when the series was pushed.
The text monitor code was checking for a '\n' prefix on several
places. Previously this would work, but since the monitor code
re-write the '\n' is already stripped off, so mustn't be checked
for.
* src/qemu/qemu_monitor_text.c: Fix monitor error checking
Probably as a result of a merge error, the CPU hotplug command
names were completely wrong.
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_text.c: Fix
the CPU hotplug command names
Adds ability to provide a preferred CPU model for CPUID data decoding.
Such model would be considered as the best possible model (if it's
supported by hypervisor) regardless on number of features which have to
be added or removed for describing required CPU.
So far, when CPUID data were converted into CPU model and features, the
features can only be added to the model. As a result, when a guest asked
for something like "qemu64,-svm" it would get a qemu32 plus a bunch of
additional features instead.
This patch adds support for removing feature from the base model.
Selection algorithm remains the same: the best CPU model is the model
which requires lowest number of features to be added/removed from it.
Qemu committed a patch which list some CPU names in [] when asked for
supported CPUs (qemu -cpu ?). Yet, it needs such CPUs to be passed
without those square braces. When probing for supported CPU models, we
can just strip the square braces and pretend we have never seen them.
First, inital VCPU pinning is set correctly but then it is reset by
assigning qemu process to a new cgroup (which contains all CPUs). It's
easily fixed by swapping these two actions.
An empty root snapshot list was considered as error condition. Creating a
new snapshot would fail if the domain didn't have snapshots yet, because
the snapshot-create function tries to lookup the list of existing snapshots
in order to verify that the snapshot name is unique. This fails if the
domain doesn't have snapshots yet.
Removing the NULL check from esxVI_LookupRootSnapshotTreeList fixes this.
I am moving some of the eb/iptables related functions into the interface
of the firewall driver and am making them only accessible via the driver's
interface. Otherwise exsiting code is adapted where needed. I am adding one
new function to the interface that checks whether the 'basic' rules can be
applied, which will then be used by a subsequent patch.
According to GCC, ATTRIBUTE_UNUSED means that an attribute _might_
be unused, not _must_ be unused. Therefore, it is easier to
blindly mark a variable, than to try and do preprocessor limiting
of when we know it is unused.
* src/remote/remote_driver.c (remoteAuthenticate): Mark attribute
as potentially unused.
Reported by Gustovo Morozowski.
The initial boot of VMs uses -device for NICs where available. The
corresponding monitor command is device_add, but the network hotplug
code was still using device_del by mistake.
* src/qemu/qemu_driver.c: Use device_add for NIC hotplug where
available
If either of the getfd or host_net_add monitor commands return
any text, this indicates an error condition. Don't ignore this!
* src/qemu/qemu_monitor_text.c: Report errors for getfd and
host_net_add
The 'device_del' command expects a parameter called 'id' but we
were passing 'config'.
* src/qemu/qemu_monitor_json.c: Fix device_del command parameter
The idea is that every API implementation in driver which has flags
parameter should first call virCheckFlags() macro to check the function
was called with supported flags:
virCheckFlags(VIR_SUPPORTED_FLAG_1 |
VIR_SUPPORTED_FLAG_2 |
VIR_ANOTHER_SUPPORTED_FLAG, -1);
The error massage which is printed when unsupported flags are passed
looks like:
invalid argument in virFooBar: unsupported flags (0x2)
Where the unsupported flags part only prints those flags which were
passed but are not supported rather than all flags passed.
Based on a warning from coverity. The safe* functions
guarantee complete transactions on success, but don't guarantee
freedom from failure.
* src/util/util.h (saferead, safewrite, safezero): Add
ATTRIBUTE_RETURN_CHECK.
* src/remote/remote_driver.c (remoteIO, remoteIOEventLoop): Ignore
some failures.
(remoteIOReadBuffer): Adjust error messages on read failure.
* daemon/event.c (virEventHandleWakeup): Ignore read failure.
Disk devices in QEMU have two parts, the guest device and the host
backend driver. Historically these two parts have had the same
"unique" name. With the switch to using -device though, they now
have separate names. Thus when changing CDROM media, for guests
using -device syntax, we need to prepend the QEMU_DRIVE_HOST_PREFIX
constant
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Add helper function
qemuDeviceDriveHostAlias() for building a host backend alias
* src/qemu/qemu_driver.c: Use qemuDeviceDriveHostAlias() to determine
the host backend alias for performing eject/change commands in the
monitor
The device_add command was added in JSON mode in a way I didn't
expect. Instead of passing the normal device string to the JSON
command:
{ "execute": "device_add", "arguments": { "device": "ne2k_pci,id=nic.1,netdev=net.1" } }
We need to split up the device string into a full JSON object
{ "execute": "device_add", "arguments": { "driver": "ne2k_pci", "id": "nic.1", "netdev": "net.1" } }
* src/qemu/qemu_conf.h, src/qemu/qemu_conf.c: Rename the
qemuCommandLineParseKeywords method to qemuParseKeywords
and export it to monitor
* src/qemu/qemu_monitor_json.c: Split up device string into
a JSON object for device_add command
The parameter for the qemuMonitorDeviceDel() is a device alias,
not a device config string. Rename the parameter reflect this
and avoid confusion to readers.
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Rename devicestr to devalias in qemuMonitorDeviceDel()
The QEMU developers have stated that they will not be porting
the commands 'pci_add', 'pci_del', 'usb_add', 'usb_del' to the
JSON mode monitor, since they're obsoleted by 'device_add'
and 'device_del'. libvirt has (untested) code that would have
supported those commands in theory, but since we already use
device_add/del where available, there's no need to keep the
legacy stuff anymore.
The text mode monitor keeps support for all commands for sake
of historical compatability.
* src/qemu/qemu_monitor_json.c: Remove 'pci_add', 'pci_del',
'usb_add', 'usb_del' commands
The QEMU driver is mistakenly calling directly into the text
mode monitor for the domain memory stats query.
* src/qemu/qemu_driver.c: Replace qemuMonitorTextGetMemoryStats with
qemuMonitorGetMemoryStats
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Add the new
wrapper for qemuMonitorGetMemoryStats
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h: Add
qemuMonitorJSONGetMemoryStats implementation
Instead of reporting VIR_ERR_INTERNAL_ERROR use the more specific
VIR_ERR_CONFIG_UNSUPPORTED
* src/qemu/qemu_conf.c: Report VIR_ERR_CONFIG_UNSUPPORTED for
unsupported video adapters
To avoid race-conditions, the tear down of a filter has to happen before
the tap interface disappears and another tap interface with the same
name can re-appear. This patch tries to fix this. In one place, where
communication with the qemu monitor may fail, I am only tearing the
filters down after knowing that the function did not fail.
I am also moving the tear down functions into an include file for other
drivers to reuse.
* This patch implements a memory allocator to obtain memory for
structures whose last member is a variable length array. C99 refers
to these variable length objects as structs containing flexible
array members.
* Fixed macro parentheses per Eric Blake
* src/xen/xend_internal.c (xend_parse_sexp_desc_char): Add three
uses of sa_assert, each preceding a strchr(value,... to assure
clang that "value" is non-NULL.
* src/qemu/qemu_driver.c (qemudDomainAttachSCSIDisk):
Initialize "cont" to NULL, so clang knows it's set.
Add an sa_assert so it knows it's non-NULL when dereferenced.
* src/nwfilter/nwfilter_ebiptables_driver.c (ebiptablesApplyNewRules):
Don't dereference a NULL or uninitialized pointer when given
an empty list of rules. Add an sa_assert(inst) in each loop to
tell clang that the uses of "inst[i]" are valid.
Among some here, there is a strong aversion to the use of "assert", yet
some others think it is essential (when applied judiciously) even --
perhaps "especially" -- at the heart of libraries and core hypervisor-
related code.
Here is a compromise that lets us make assertions about the code (e.g.,
to tell static analyzers about invariants) without even a hint of risk
of an abort.
* src/internal.h [STATIC_ANALYSIS]: Include <assert.h>.
(sa_assert): Define. A no-op most of the time, but equivalent
to classical assert when STATIC_ANALYSIS is nonzero.
* src/storage/storage_backend_fs.c (virStorageBackendFileSystemMount):
Clang was not smart enough, and mistakenly reported that "options"
could be used uninitialized. Initialize it.
Somehow the backend of this function was never implemented in
libvirt's netcf driver, and nobody noticed until now. (The required
netcf function was already in place, so nothing needs to change
there.)
* src/interface/netcf_driver.c: add in the backend function, and point
to it from the table of driver functions.
* src/openvz/openvz_driver.c (openvzGetProcessInfo): Reorganize
so that unexpected /proc/vz/vestat content cannot make us use
uninitialized variables. Without this change, an input line with
a matching "readvps", but fewer than 4 numbers would result in our
using at least "systime" uninitialized.
I am getting rid of determining the path to necessary CLI tools at
compile time. Instead, now the firewall driver has an initialization
function that uses virFindFileInPath() to determine the path to
necessary CLI tools and a shutdown function to free allocated memory.
The rest of the patch mostly deals with availability of the CLI tools
and to not call certain code blocks if a tool is not available and that
strings now have to be built slightly differently.
Generate almost all SOAP method mapping code.
Update the driver code to use the complete paramater list of some methods
that had parameters skipped before.
Improve the ESX_VI__METHOD marco to do automatic output deserialization
based on output occurrence. Also incorporate automatic _this binding and
output pointer check.
* src/esx/esx_vmx.c (esxVMX_GatherSCSIControllers): Do not dereference
a NULL disk->driverName. We already detect this condition in another
case. Check for it here, too.
When building libvirt on RHEL-5, I saw this error:
cc1: warnings being treated as errors
openvz/openvz_conf.c: In function 'openvzGetVPSUUID':
openvz/openvz_conf.c:835: warning: 'saveptr' may be used uninitialized in this function
make[3]: *** [libvirt_driver_openvz_la-openvz_conf.lo] Error 1
gcc in RHEL-5 gets upset about this usage of strtok_r (even though
it is perfectly valid). Just set *saveptr to NULL at the
start to quiet it down.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Changes from v1 to v2:
- changed function name prefixes to 'iface' from previous 'Iface'
- Further to make make syntax-check pass:
- indentation fix in interface.h
- added entry to POTFILES.in
I am consolidating network interface related functions used in nwfilter
and macvtap code in utils/interface.c. All function names are prefixed
with 'Iface'. The following functions are now available through
interface.h:
int ifaceCtrl(const char *name, bool up);
int ifaceUp(const char *name);
int ifaceDown(const char *name);
int ifaceCheck(bool reportError, const char *ifname,
const unsigned char *macaddr, int ifindex);
int ifaceGetIndex(bool reportError, const char *ifname, int *ifindex);
I added 'int ifindex' as parameter to ifaceCheck to the original
function and modified the code accordingly.
* configure.ac docs/news.html.in libvirt.spec.in src/libvirt_public.syms:
updates for release of 0.8.0
* po/*.po po/libvirt.pot: updated a lar set of localizations, and merge
the messages
I mistakenly took the op field in the DHCP message as the DHCP_OFFER
type. Rather than basing the decision to read the VM's IP address on
that field, process the appended DHCP options where option 53 indicates
the actual type of the packet. I am also reading the broadcast address
of the VM, but don't use it so far.
In a couple of cases typos meant we were firing the wrong type
of event. In the python code my previous commit accidentally
missed some chunks of the code.
* python/libvirt-override-virConnect.py: Add missing python glue
accidentally left out of previous commit
* src/conf/domain_event.c, src/qemu/qemu_monitor_json.c: Fix typos
in event name / method name to invoke
Currently when we attempt to change the cdrom in a qemu VM the monitor
doesn't generate an error if the target filename doesn't exist. I've
submitted a patch[1] for this. This patch is the libvirt qemu-driver
side which catches the error message from the monitor and reportes the
error to libvirt. This means that virsh attach-disk cdrom commands
won't appear to succeed when qemu change command actually failed.
* src/qemu/qemu_monitor_text.c: in qemuMonitorTextChangeMedia() look
for failure to access the new data
Fix invalid code generating in esx_vi_generator.py regarding deep copy
types that contain enum properties.
Add strptime and timegm to bootstrap.conf. Both are used to convert a
xsd:dateTime to calendar time.
Add a testcase of the xsd:dateTime conversion.
The network filter / snapshot / hooks code introduced some
non-portable pices that broke the win32 build
* configure.ac: Check for net/ethernet.h required by nwfile config
parsing code
* src/conf/nwfilter_conf.c: Define ethernet protocol constants
if net/ethernet.h is missing
* src/util/hooks.c: Disable hooks build on Win32 since it lacks
fork/exec/pipe
* src/util/threads-win32.c: Fix unchecked return value
* tools/virsh.c: Disable SIGPIPE on Win32 since it doesn't exist.
Fix non-portable strftime() formats
Changes from V1 to V2 of this patch
- I had reversed the logic thinking that icmp type 0 is a echo
request,but it's reply -- needed to reverse the logic
- Found that ebtables takes the --ip-tos argument only as a hex number
This patch enables the skipping of some of the ICMP traffic rules on the
iptables level under certain circumstances so that the following filter
properly enables unidirectional pings:
<filter name='testcase'>
<uuid>d6b1a2af-def6-2898-9f8d-4a74e3c39558</uuid>
<!-- allow incoming ICMP Echo Request -->
<rule action='accept' direction='in' priority='500'>
<icmp type='8'/>
</rule>
<!-- allow outgoing ICMP Echo Reply -->
<rule action='accept' direction='out' priority='500'>
<icmp type='0'/>
</rule>
<!-- drop all other ICMP traffic -->
<rule action='drop' direction='inout' priority='600'>
<icmp/>
</rule>
</filter>
Extend tests to cover all SCSI controller types and document the
new type.
The lsisas1068 SCSI controller type was added in ESX 4.0. The VMX
parser reports an error when this controller type is present. This
makes virsh dumpxml fail for every domain that uses this controller
type.
This patch fixes this and adds lsisas1068 to the list of accepted
SCSI controller types.
Reported by Jonathan Kelley.
This patch implements support for learning a VM's IP address. It uses
the pcap library to listen on the VM's backend network interface (tap)
or the physical ethernet device (macvtap) and tries to capture packets
with source or destination MAC address of the VM and learn from DHCP
Offers, ARP traffic, or first-sent IPv4 packet what the IP address of
the VM's interface is. This then allows to instantiate the network
traffic filtering rules without the user having to provide the IP
parameter somewhere in the filter description or in the interface
description as a parameter. This only supports to detect the parameter
IP, which is for the assumed single IPv4 address of a VM. There is not
support for interfaces that may have multiple IP addresses (IP
aliasing) or IPv6 that may then require more than one valid IP address
to be detected. A VM can have multiple independent interfaces that each
uses a different IP address and in that case it will be attempted to
detect each one of the address independently.
So, when for example an interface description in the domain XML has
looked like this up to now:
<interface type='bridge'>
<source bridge='mybridge'/>
<model type='virtio'/>
<filterref filter='clean-traffic'>
<parameter name='IP' value='10.2.3.4'/>
</filterref>
</interface>
you may omit the IP parameter:
<interface type='bridge'>
<source bridge='mybridge'/>
<model type='virtio'/>
<filterref filter='clean-traffic'/>
</interface>
Internally I am walking the 'tree' of a VM's referenced network filters
and determine with the given variables which variables are missing. Now,
the above IP parameter may be missing and this causes a libvirt-internal
thread to be started that uses the pcap library's API to listen to the
backend interface (in case of macvtap to the physical interface) in an
attempt to determine the missing IP parameter. If the backend interface
disappears the thread terminates assuming the VM was brought down. In
case of a macvtap device a timeout is being used to wait for packets
from the given VM (filtering by VM's interface MAC address). If the VM's
macvtap device disappeared the thread also terminates. In all other
cases it tries to determine the IP address of the VM and will then apply
the rules late on the given interface, which would have happened
immediately if the IP parameter had been explicitly given. In case an
error happens while the firewall rules are applied, the VM's backend
interface is 'down'ed preventing it to communicate. Reasons for failure
for applying the network firewall rules may that an ebtables/iptables
command failes or OOM errors. Essentially the same failure reasons may
occur as when the firewall rules are applied immediately on VM start,
except that due to the late application of the filtering rules the VM
now is already running and cannot be hindered anymore from starting.
Bringing down the whole VM would probably be considered too drastic.
While a VM's IP address is attempted to be determined only limited
updates to network filters are allowed. In particular it is prevented
that filters are modified in such a way that they would introduce new
variables.
A caveat: The algorithm does not know which one is the appropriate IP
address of a VM. If the VM spoofs an IP address in its first ARP traffic
or IPv4 packets its filtering rules will be instantiated for this IP
address, thus 'locking' it to the found IP address. So, it's still
'safer' to explicitly provide the IP address of a VM's interface in the
filter description if it is known beforehand.
* configure.ac: detect libpcap
* libvirt.spec.in: require libpcap[-devel] if qemu is built
* src/internal.h: add the new ATTRIBUTE_PACKED define
* src/Makefile.am src/libvirt_private.syms: add the new modules and symbols
* src/nwfilter/nwfilter_learnipaddr.[ch]: new module being added
* src/nwfilter/nwfilter_driver.c src/conf/nwfilter_conf.[ch]
src/nwfilter/nwfilter_ebiptables_driver.[ch]
src/nwfilter/nwfilter_gentech_driver.[ch]: plu the new functionality in
* tests/nwfilterxml2xmltest: extend testing
When comparing a CPU to host CPU, the result would be
VIR_CPU_COMPARE_SUPERSET (or even VIR_CPU_COMPARE_INCOMPATIBLE if strict
match was required) even though the two CPUs were identical.
When qemu libvirt driver doesn't support guest CPU selection with given
qemu binary, guests requiring specific CPU should fail to start instead
of being silently supplied with a default CPU.
git grep found 12 of the former but 100 of the latter in src/.
* src/remote/remote_driver.c (initialise_gnutls): Rename...
(initialize_gnutls): ...to this.
(doRemoteOpen): Adjust caller.
* src/xen/xen_driver.c (xenUnifiedOpen): Adjust output string.
* src/util/network.c: Adjust comments.
Suggested by Matthias Bolte.
* src/conf/domain_event.c (virDomainEventGraphicsNewFromDom):
Return NULL when handling out-of-memory error, rather than
falling through with ev=NULL and then assigning to ev->member.
(virDomainEventGraphicsNewFromObj): Likewise.
The attached patch fixes a problem due to the mac match in iptables only
supporting --mac-source and no --mac-destination, thus it not being
symmetric. Therefore a rule like this one
<rule action='drop' direction='out'>
<all match='no' srcmacaddr='$MAC'/>
</rule>
should only have the MAC match on traffic leaving the VM and not test
for the same source MAC address on traffic that the VM receives.
* src/qemu/qemu_driver.c (qemudStartVMDaemon): Initialize "logfile"
to ensure that we don't use it uninitialized -- thus closing an
arbitrary file descriptor -- in the cleanup block.
* src/security/virt-aa-helper.c: adjust virt-aa-helper to handle pci
devices. Update valid_path() to have an override array to check against,
and add "/sys/devices/pci" to it. Then rename file_iterate_cb() to
file_iterate_hostdev_cb() and create file_iterate_pci_cb() based on it
To avoid an error when hitting the <seclabel...> definition
* src/security/virt-aa-helper.c: add VIR_DOMAIN_XML_INACTIVE flag
to virDomainDefParseString
Don't exit with error if the user unloaded the profile outside of
libvirt
* src/security/virt-aa-helper.c: check the exit error from apparmor_parser
before exiting with a failure
The calls to virExec() in security_apparmor.c when
invoking virt-aa-helper use VIR_EXEC_CLEAR_CAPS. When compiled without
libcap-ng, this is not a problem (it's effectively a no-op) but with
libcap-ng this causes MAC_ADMIN to be cleared. MAC_ADMIN is needed by
virt-aa-helper to manipulate apparmor profiles and without it VMs will
not start[1]. This patch calls virExec with the default VIR_EXEC_NONE
instead.
* src/security/security_apparmor.c: fallback to VIR_EXEC_NONE flags for
virExec of virt_aa_helper
Also define ESX_ERROR and ESX_VI_ERROR in a central place, instead of
defining them in each source file.
Add ESX_ERROR and ESX_VI_ERROR to the msg_gen_function list in cfg.mk.
Update po/POTFILES.in accordingly.
With Eric Blake's suggestions applied.
The following rule for direction 'in'
<rule direction='in' action='drop'>
<mac srcmacaddr='1:2:3:4:5:6'/>
</rule>
drops all traffic from the given mac address.
The following rule for direction 'out'
<rule direction='out' action='drop'>
<mac dstmacaddr='1:2:3:4:5:6'/>
</rule>
drops all traffic to the given mac address.
The following rule in direction 'inout'
<rule direction='inout' action='drop'>
<mac srcmacaddr='1:2:3:4:5:6'/>
</rule>
now drops all traffic from and to the given MAC address.
So far it would have dropped traffic from the given MAC address
and outgoing traffic with the given source MAC address, which is not useful
since the packets will always have the VM's MAC address as source
MAC address. The attached patch fixes this.
This is the last bug I currently know of and want to fix.
When starting up qemu VNC autoport guests, we were
only looking through ports 5900 to 6000, meaning we
were limited to 100 total clients. Increase that
limit to 65535 (the last available port), so we can
have up to 59635 VNC autoport guests.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
While playing around with def/newDef with the qemu code,
I noticed that newDef was *always* getting set to a value,
even when I didn't redefine the domain. I think the problem
is the virDomainLoadConfig is always doing virDomainAssignDef
regardless of whether the domain already exists in the hashtable.
In turn, virDomainAssignDef is assigning the definition (which
is actually a duplicate) to newDef. Fix this so that newDef stays
NULL until we actually have a new def.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
values. Rather use the strspn() function. Along with this cleanup the
initialization function for the code that used the regular expression
can also be removed.
The images are saved in /var/lib/libvirt/qemu/save/
and named $domainname.save . The directory is created appropriately
at daemon startup. When a domain is started while a saved image is
available, libvirt will try to load this saved image, and start the
domain as usual in case of failure. In any case the saved image is
discarded once the domain is created.
* src/qemu/qemu_conf.h: adds an extra save path to the driver config
* src/qemu/qemu_driver.c: implement the 3 new operations and handling
of the image directory
* src/remote/remote_protocol.x src/remote/remote_protocol.h
src/remote/remote_protocol.c src/remote/remote_driver.c: add the entry
points in the remote driver
* daemon/remote.c daemon/remote_dispatch_args.h
daemon/remote_dispatch_prototypes.h daemon/remote_dispatch_table.h:
and implement the daemon counterpart
virDomainManagedSave() is to be run on a running domain. Once the call
complete, as in virDomainSave() the domain is stopped upon completion,
but there is no restore counterpart as any order to start the domain
from the API would load the state from the managed file, similary if
the domain is autostarted when libvirtd starts.
Once a domain has restarted his managed save image is destroyed,
basically managed save image can only exist for a stopped domain,
for a running domain that would be by definition outdated data.
* include/libvirt/libvirt.h.in src/libvirt.c src/libvirt_public.syms:
adds the new entry points virDomainManagedSave(),
virDomainHasManagedSaveImage() and virDomainManagedSaveRemove()
* src/driver.h src/esx/esx_driver.c src/lxc/lxc_driver.c
src/opennebula/one_driver.c src/openvz/openvz_driver.c
src/phyp/phyp_driver.c src/qemu/qemu_driver.c src/vbox/vbox_tmpl.c
src/remote/remote_driver.c src/test/test_driver.c src/uml/uml_driver.c
src/xen/xen_driver.c: add corresponding new internal drivers entry
points