The qemu_conf.c code is doing three jobs, driver config file
loading, QEMU capabilities management and QEMU command line
management. Move the capabilities code into its own file
* src/qemu/qemu_capabilities.c, src/qemu/qemu_capabilities.h: New
capabilities management code
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Delete capabilities
code
* src/qemu/qemu_conf.h: Adapt for API renames
* src/Makefile.am: add src/qemu/qemu_capabilities.c
So far, CPUID data were stored in two different data structures. First
of them was a structure allowing direct access for CPUID data according
to function number and the second was a plain array of struct
cpuX86cpuid. This was a silly design which resulted in converting data
from one type to the other and back again or implementing similar
functionality for both data structures.
The patch leaves only the direct access structure. This makes the code
both smaller and more maintainable since operations on different objects
can use common low-level operations.
All 57 tests for cpu subsystem still pass after this rewrite.
Allows compilation, but no creation of child processes yet. Take it
one step at a time.
* src/util/util.c (virExecWithHook) [WIN32]: New dummy function.
* src/libvirt_private.syms: Export it.
* .gnulib: Update to latest.
* bootstrap.conf (gnulib_modules): Import pipe-posix and waitpid
for mingw.
* src/remote/remote_driver.c (pipe) [WIN32]: Drop dead macro.
* daemon/event.c (pipe) [WIN32]: Drop dead function.
Currently, all of domain "save/dump/managed save/migration"
use the same function "qemudDomainWaitForMigrationComplete"
to wait the job finished, but the error messages are all
about "migration", e.g. when a domain saving job is canceled
by user, "migration was cancled by client" will be throwed as
an error message, which will be confused for user.
As a solution, intoduce two new job types(QEMU_JOB_SAVE,
QEMU_JOB_DUMP), and set "priv->jobActive" to "QEMU_JOB_SAVE"
before saving, to "QEMU_JOB_DUMP" before dumping, so that we
could get the real job type in
"qemudDomainWaitForMigrationComplete", and give more clear
message further.
And as It's not important to figure out what's the exact job
is in the DEBUG and WARN log, also we don't need translated
string in logs, simply repace "migration" with "job" in some
statements.
* src/qemu/qemu_driver.c
Current code does not pass VM mac address to a 802.1Qbh direct attach
interface using IFLA_VF_MAC. This patch adds support in macvtap code to
send IFLA_VF_MAC netlink request during port profile association on a
802.1Qbh interface.
Stefan Cc'ed for comments because this patch changes a condition for
802.1Qbg
802.1Qbh support for IFLA_VF_MAC in enic driver has been posted and is
pending acceptance at http://marc.info/?l=linux-netdev&m=129185244410557&w=2
This is pretty straightforward - even though dnsmasq gets daemonized
and uses a pid file, those things are both handled by the dnsmasq
binary itself. And libvirt doesn't need any of the output of the
dnsmasq command either, so we just setup the args and call
virRun(). Mainly it was just a (mostly) mechanical job of replacing
the APPEND_ARG() macro (and some other *printfs()) with
virCommandAddArg*().
Instead of just reporting that a task failed get the
localized message from the TaskInfo error and include
it in the reported error message.
Implement minimal deserialization support for the
MethodFault type in order to obtain the actual fault
type.
For example, this changes the reported error message
when trying to create a volume with zero size from
Could not create volume
to
Could not create volume: InvalidArgument - A specified parameter was not correct.
Not perfect yet, but better than before.
Changes common to all network disks:
-Make source name optional in the domain schema, since NBD doesn't use it
-Add a hostName type to the domain schema, and use it instead of genericName, which doesn't include .
-Don't leak host names or ports
-Set the source protocol in qemuParseCommandline
Signed-off-by: Josh Durgin <joshd@hq.newdream.net>
This patch adds network disk support to libvirt/QEMU. The currently
supported protocols are nbd, rbd, and sheepdog. The XML syntax is like
this:
<disk type="network" device="disk">
<driver name="qemu" type="raw" />
<source protocol='rbd|sheepdog|nbd' name="...some image identifier...">
<host name="mon1.example.org" port="6000">
<host name="mon2.example.org" port="6000">
<host name="mon3.example.org" port="6000">
</source>
<target dev="vda" bus="virtio" />
</disk>
Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
`dump' watchdog action lets libvirtd to dump the guest when receives a
watchdog event (which probably means a guest crash)
Currently only qemu is supported.
When we get an EOF event on monitor connection, it may be a result of
either crash or graceful shutdown. QEMU which supports async events
(i.e., we are talking to it using JSON monitor) emits SHUTDOWN event on
graceful shutdown. In case we don't get this event by the time monitor
connection is closed, we assume the associated domain crashed.
Currently libvirt doesn't confirm whether the guest has responded to the
disk removal request. In some cases this can leave the guest with
continued access to the device while the mgmt layer believes that it has
been removed. With a recent qemu monitor command[1] we can
deterministically revoke a guests access to the disk (on the QEMU side)
to ensure no futher access is permitted.
This patch adds support for the drive_del() command and introduces it
in the disk removal paths. If the guest is running in a QEMU without this
command we currently explicitly check for unknown command/CommandNotFound
and log the issue.
If QEMU supports the command we issue the drive_del command after we attempt
to remove the device. The guest may respond and remove the block device
before we get to attempt to call drive_del. In that case, we explicitly check
for 'Device not found' from the monitor indicating that the target drive
was auto-deleted upon guest responds to the device removal notification.
1. http://thread.gmane.org/gmane.comp.emulators.qemu/84745
Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
Currently libvirt doesn't confirm whether the guest has responded to the
disk removal request. In some cases this can leave the guest with
continued access to the device while the mgmt layer believes that it has
been removed. With a recent qemu monitor command[1] we can
deterministically revoke a guests access to the disk (on the QEMU side)
to ensure no futher access is permitted.
This patch adds support for the drive_unplug() command and introduces it
in the disk removal paths. There is some discussion to be had about how
to handle the case where the guest is running in a QEMU without this
command (and the fact that we currently don't have a way of detecting
what monitor commands are available).
Changes since v2:
- use VIR_ERROR to report when unplug command not found
Changes since v1:
- return > 0 when command isn't present, < 0 on command failure
- detect when drive_unplug command isn't present and log error
instead of failing entire command
Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
- qemudDomainAttachPciControllerDevice: Don't build "devstr"
if "-device" of qemu is not available, as "devstr" will only
be used by "qemuMonitorAddDevice", which depends on "-device"
argument of qemu is supported.
- "qemudDomainSaveImageOpen": Fix indent problem.
* src/qemu/qemu_driver.c
Commit febc591683 introduced -vga none in
case no video card is included in domain XML. However, old qemu
versions do not support this and such domain cannot be successfully
started.
* .gnulib: Update to latest, for at least a stdint.h fix
* src/storage/storage_driver.c (storageVolumeZeroSparseFile)
(storageWipeExtent): Use better type, although it still triggers
spurious -Wformat warning on MacOS's gcc.
popen must be matched with pclose (not fclose), or it will leak
resources. Furthermore, it is a lousy interface when it comes to
signal handling. We're much better off using our decent command
wrapper. Note that virCommand guarantees that VIR_FREE(outbuf) is
both required and safe to call, whether virCommandRun succeeded or
failed.
* src/openvz/openvz_conf.c (openvzLoadDomains, openvzGetVEID):
Replace popen with virCommand usage.
Guarantee that outbuf/errbuf are allocated on success, even if to the
empty string. Caller always has to free the result, and empty output
check requires checking if *outbuf=='\0'. Makes the API easier to use
safely. Failure is best effort allocation (some paths, like
out-of-memory, cannot allocate a buffer, but most do), so caller must
free buffer on failure.
* docs/internals/command.html.in: Update documentation.
* src/util/command.c (virCommandSetOutputBuffer)
(virCommandSetErrorBuffer, virCommandProcessIO) Guarantee empty
string on no output.
* tests/commandtest.c (test17): New test.
* docs/internals/command.html.in: Better documentation of buffer
vs. fd considerations.
* src/util/command.c (virCommandRunAsync): Reject raw execution
with string io.
(virCommandRun): Reject execution with user-specified fds not
visiting a regular file.
* src/conf/domain_conf.c (virDomainDefParseXML): Prefer sysinfo
uuid over generating one, and if both uuids are present, require
them to be identical.
* src/qemu/qemu_conf.c (qemuBuildSmbiosSystemStr): Allow skipping
the uuid.
(qemudBuildCommandLine): Adjust caller; <smbios mode=host/> must
not use host uuid in place of guest uuid.
The log lists things like -smbios type=1,vendor="Red Hat", which
is great for shell parsing, but not so great when you realize that
execve() then passes those literal "" on as part of the command
line argument, such that qemu sets SMBIOS with extra literal quotes.
The eventual addition of virCommand is needed before we have the API
to shell-quote a string representation of a command line, so that the
log can still be pasted into a shell, but without inserting extra
bytes into the execve() arguments.
* src/qemu/qemu_conf.c (qemuBuildSmbiosBiosStr)
(qemuBuildSmbiosSystemStr): Qemu doesn't like quotes around uuid
arguments, and the remaining quotes are passed literally to
smbios, making <smbios mode='host'/> inaccurate. Removing the
quotes makes the log harder to parse, but that can be fixed later
with virCommand improvements.
* tests/qemuxml2argvdata/qemuxml2argv-smbios.args: 'Fix' test; it
will need fixing again once virCommand learns how to shell-quote a
potential command line.
Humans consider January as month #1, while gmtime_r(3) calls it month #0.
While fixing it, render qemu's rtc parameter with leading zeros, as is more
commonplace.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=660194
* src/util/threads.h (virThreadID): New prototype.
* src/util/threads-pthread.c (virThreadID): New function.
* src/util/threads-win32.c (virThreadID): Likewise.
* src/libvirt_private.syms (threads.h): Export it.
* daemon/event.c (virEventInterruptLocked): Use it to avoid
warning on BSD systems.
"virCommandRun": if "cmd->outbuf" or "cmd->errbuf" is NULL,
libvirtd will be crashed when trying to start a qemu domain
(which invokes "virCommandRun"), it caused by we try to use
"*cmd->outbuf" and "*cmd->errbuf" regardless of cmd->outbuf
or cmd->errbuf is NULL.
* src/util/command.c (virCommandRun)
Two more calls to remote libvirtd have to be surrounded by
qemuDomainObjEnterRemoteWithDriver() and
qemuDomainObjExitRemoteWithDriver() to prevent possible deadlock between
two communicating libvirt daemons.
See commit f0c8e1cb37 for further details.
virDrvSupportsFeature API is allowed to return -1 on error while all but
one uses of VIR_DRV_SUPPORTS_FEATURE only check for (non)zero return
value. Let's make this macro return zero on error, which is what
everyone expects anyway.
This patch adds a mode_t parameter to virFileWriteStr().
If mode is different from 0, virFileWriteStr() will try
to create the file if it doesn't exist.
* src/util/util.h (virFileWriteStr): Alter signature.
* src/util/util.c (virFileWriteStr): Allow file creation.
* src/network/bridge_driver.c (networkEnableIpForwarding)
(networkDisableIPV6): Adjust clients.
* src/node_device/node_device_driver.c
(nodeDeviceVportCreateDelete): Likewise.
* src/util/cgroup.c (virCgroupSetValueStr): Likewise.
* src/util/pci.c (pciBindDeviceToStub, pciUnBindDeviceFromStub):
Likewise.
* src/qemu/qemu_conf.c (qemudExtractVersionInfo): Check for file
before executing it here, rather than in callers.
(qemudBuildCommandLine): Rewrite with virCommand.
* src/qemu/qemu_conf.h (qemudBuildCommandLine): Update signature.
* src/qemu/qemu_driver.c (qemuAssignPCIAddresses)
(qemudStartVMDaemon, qemuDomainXMLToNative): Adjust callers.
This proof of concept shows how two existing uses of virExec
and virRun can be ported to the new virCommand APIs, and how
much simpler the code becomes
This introduces a new set of APIs in src/util/command.h
to use for invoking commands. This is intended to replace
all current usage of virRun and virExec variants, with a
more flexible and less error prone API.
* src/util/command.c: New file.
* src/util/command.h: New header.
* src/Makefile.am (UTIL_SOURCES): Build it.
* src/libvirt_private.syms: Export symbols internally.
* tests/commandtest.c: New test.
* tests/Makefile.am (check_PROGRAMS): Run it.
* tests/commandhelper.c: Auxiliary program.
* tests/commanddata/test2.log - test15.log: New expected outputs.
* cfg.mk (useless_free_options): Add virCommandFree.
(msg_gen_function): Add virCommandError.
* po/POTFILES.in: New translation.
* .x-sc_avoid_write: Add exemption.
* tests/.gitignore: Ignore new built file.
* src/qemu/qemu_driver.c (though MACROS QEMU_VNC_PORT_MAX, and
QEMU_VNC_PORT_MIN are defined at the beginning, numbers (65535, 5900)
are still used, replace them)
The arguments passed to the thread function must be allocated on
the heap, rather than the stack, since it is possible for the
spawning thread to continue before the new thread runs at all.
In such a case, it is possible that the area of stack where the
thread args were stored is overwritten.
* src/util/threads-pthread.c, src/util/threads-win32.c: Allocate
thread arguments on the heap
Use macvtap specific functions depending on WITH_MACVTAP.
Use #if instead of #ifdef to check for WITH_MACVTAP, because
WITH_MACVTAP is always defined with value 0 or 1.
Also export virVMOperationType{To|From}String unconditional,
because they are used unconditional in the domain config code.
When dumping a domain, it's reasonable to save dump-file in raw format
if dump format is misconfigured or the corresponding compress program
is not available rather then fail dumping.
This patch introduces the usage of the pre-associate state of the IEEE 802.1Qbg standard on incoming VM migration on the target host. It is in response to bugzilla entry 632750.
https://bugzilla.redhat.com/show_bug.cgi?id=632750
For being able to differentiate the exact reason as to why a macvtap device is being created, either due to a VM creation or an incoming VM migration, I needed to pass that reason as a parameter from wherever qemudStartVMDaemon is being called in order to determine whether to send an ASSOCIATE (VM creation) or a PRE-ASSOCIATE (incoming VM migration) towards lldpad.
I am also fixing a problem with the virsh domainxml-to-native call on the way.
Gerhard successfully tested the patch with a recent blade network 802.1Qbg-compliant switch.
The patch should not have any side-effects on the 802.1Qbh support in libvirt, but Roopa (cc'ed) may want to verify this.
We currently use the next free veid although there's one given in the
domain xml. This currently breaks defining new domains since vmdef->name
and veid don't match leading to the following error later on:
error: Failed to define domain from 110.xml
error: internal error Could not set UUID
Since silently ignoring vmdef->name is not nice respect it instead. We
avoid veid collisions in the upper levels already.
This reverts commit
Log all errors at level INFO to stop polluting syslog
04bd0360f3.
and makes virRaiseErrorFull() log errors at debug priority
when called from inside libvirtd. This stops libvirtd from
polluting it's own log with client errors at error priority
that'll be reported and logged on the client side anyway.
When we set migrate_speed by json, we receive the following
error message:
libvirtError: internal error unable to execute QEMU command
'migrate_set_speed': Invalid parameter type, expected: number
The reason is that: the arguments of migrate_set_speed
by json is json number, not json string.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
The nodeinfo structure includes
nodes : the number of NUMA cell, 1 for uniform mem access
sockets : number of CPU socket per node
cores : number of core per socket
threads : number of threads per core
which does not work well for NUMA topologies where each node does not
consist of integral number of CPU sockets.
We also have VIR_NODEINFO_MAXCPUS macro in public libvirt.h which
computes maximum number of CPUs as (nodes * sockets * cores * threads).
As a result, we can't just change sockets to report total number of
sockets instead of sockets per node. This would probably be the easiest
since I doubt anyone is using the field directly. But because of the
macro, some apps might be using sockets indirectly.
This patch leaves sockets to be the number of CPU sockets per node (and
fixes qemu driver to comply with this) on machines where sockets can be
divided by nodes. If we can't divide sockets by nodes, we behave as if
there was just one NUMA node containing all sockets. Apps interested in
NUMA should consult capabilities XML, which is what they probably do
anyway.
This way, the only case in which apps that care about NUMA may break is
on machines with funky NUMA topology. And there is a chance libvirt
wasn't able to start any guests on those machines anyway (although it
depends on the topology, total number of CPUs and kernel version).
Nothing changes at all for apps that don't care about NUMA.
security_context_t happens to be a typedef for char*, and happens to
begin with a string usable as a raw context string. But in reality,
it is an opaque type that may or may not have additional information
after the first NUL byte, where that additional information can
include pointers that can only be freed via freecon().
Proof is from this valgrind run of daemon/libvirtd:
==6028== 839,169 (40 direct, 839,129 indirect) bytes in 1 blocks are definitely lost in loss record 274 of 274
==6028== at 0x4A0515D: malloc (vg_replace_malloc.c:195)
==6028== by 0x3022E0D48C: selabel_open (label.c:165)
==6028== by 0x3022E11646: matchpathcon_init_prefix (matchpathcon.c:296)
==6028== by 0x3022E1190D: matchpathcon (matchpathcon.c:317)
==6028== by 0x4F9D842: SELinuxRestoreSecurityFileLabel (security_selinux.c:382)
800k is a lot of memory to be leaking.
* src/storage/storage_backend.c
(virStorageBackendUpdateVolTargetInfoFD): Avoid leak on error.
* src/security/security_selinux.c
(SELinuxReserveSecurityLabel, SELinuxGetSecurityProcessLabel)
(SELinuxRestoreSecurityFileLabel): Use correct function to free
security_context_t.
Making this change makes it easier to spot the memory leaks
that will be fixed in the next patch.
* cfg.mk (sc_prohibit_xmlGetProp): New rule.
* .x-sc_prohibit_xmlGetProp: New exception.
* Makefile.am (EXTRA_DIST): Ship exception file.
* tools/virsh.c (cmdDetachInterface, cmdDetachDisk): Adjust
offenders.
* src/conf/storage_conf.c (virStoragePoolDefParseSource):
Likewise.
* src/conf/network_conf.c (virNetworkDHCPRangeDefParseXML)
(virNetworkIPParseXML): Likewise.
virConnectClose calls virUnrefConnect which in turn closes
all open drivers when the refcount of that connection dropped
to zero. This works fine when you free all other objects that
hold a ref to the connection before you close it, because in
this case virUnrefConnect is the one that removes the last
ref to the connection.
But it doesn't work when you close the connection first before
freeing the other objects. This is because the other virUnref*
functions call virReleaseConnect when they detect that the
connection's refcount dropped to zero. In this case another
virUnref* function (different from virUnrefConnect) removes the
last ref to the connection. This results in not closing the
open drivers and leaking things that should have been cleaned
up in the driver close functions.
To fix this move the driver close calls to virReleaseConnect.
Except LXC and UML driver, implementations of all other drivers
simply return 0, because these drivers doesn't have config both
in memory and on disk, no need to track if the domain of these
drivers updated or not.
Rename "xenUnifiedDomainisPersistent" to "xenUnifiedDomainIsPersistent"
* esx/esx_driver.c
* lxc/lxc_driver.c
* opennebula/one_driver.c
* openvz/openvz_driver.c
* phyp/phyp_driver.c
* test/test_driver.c
* uml/uml_driver.c
* vbox/vbox_tmpl.c
* xen/xen_driver.c
* xenapi/xenapi_driver.c
introduce new public API "virDomainIsUpdated"
* src/conf/domain_conf.h (new member "updated" for "virDomainObj")
* src/libvirt_public.syms
* include/libvirt/libvirt.h.in
gnulib wraps Windows' SOCKET handle based send() and recv() functions
into file descriptor based ones that are used in libvirt.
Even though GnuTLS is using gnulib too, it explicitly doesn't use
gnulib's replacement functions on Windows. By default GnuTLS uses the
SOCKET handle based send() and recv(). This makes gnutls_handshake()
fail internally with a WSAENOTSOCK error because libvirt passes a
file descriptor; GnuTLS needs the SOCKET handle.
To avoid this mismatch make sure that GnuTLS uses gnulib's replacment
functions, by setting custom pull() and push() functions for GnuTLS.
The stdio.h header has a function called 'remove' declared. This
clashes with the 'remove' parameter in virShrinkN
* src/util/memory.c: Rename 'remove' to 'toremove'
The SCSI volumes currently get a name like '17:0:0:1' based
on $host:$bus:$target:$lun. The names are intended to be unique
per pool and stable across pool restarts. The inclusion of the
$host component breaks this, because the $host number for iSCSI
pools is dynamically allocated by the kernel at time of login.
This changes the name to be 'unit:0:0:1', ie removes the leading
host component. The 'unit:' prefix is just to ensure the volume
name doesn't start with a number and make it clearer when seen
out of context.
* src/storage/storage_backend_scsi.c: Improve volume name
field value stability and uniqueness
Many operations are not valid on inactive storage pools. The
storage driver is currently returning VIR_ERR_INTERNAL_ERROR
in these cases, rather than the more suitable error code
VIR_ERR_OPERATION_INVALID
* src/storage/storage_driver.c: Fix error code when pool
is not active
When libvirt starts up all storage pools default to the inactive
state, even if the underlying storage is already active on the
host. This introduces a new API into the internal storage backend
drivers that checks whether a storage pool is already active. If
the pool is active at libvirtd startup, the volume list will be
immediately populated.
* src/storage/storage_backend.h: New internal API for checking
storage pool state
* src/storage/storage_driver.c: Check whether a pool is active
upon driver startup
* src/storage/storage_backend_fs.c, src/storage/storage_backend_iscsi.c,
src/storage/storage_backend_logical.c, src/storage/storage_backend_mpath.c,
src/storage/storage_backend_scsi.c: Add checks for pool state
Since the previous patch added support for parsing the output of
the 'sendtargets' command, it is now trivial to support the
storage pool discovery API.
Given a hostname and optional portnumber and initiator IQN,
the code can return a full list of storage pool source docs,
each one representing a iSCSI target.
* src/storage/storage_backend_iscsi.c: Wire up target
auto-discovery
The Linux iSCSI initiator toolchain has the dubious feature that
if you ever run the 'sendtargets' command to merely query what
targets are available from a server, the results will be recorded
in /var/lib/iscsi. Any time the '/etc/init.d/iscsi' script runs
in the future, it will then automatically login to all those
targets. /etc/init.d/iscsi is automatically run whenever a NIC
comes online.
So from the moment you ask a server what targets are available,
your client will forever more automatically try to login to all
targets without ever asking if you actually want it todo this.
To stop this stupid behaviour, we need to run
iscsiadm --portal $PORTAL --target $TARGET
--op update --name node.startup --value manual
For every target on the server.
* src/storage/storage_backend_iscsi.c: Disable automatic login
for targets found as a result of a 'sendtargets' command
The following series of patches are adding significant
extra functionality to the iSCSI driver. THe current
internal helper methods are not sufficiently flexible
to cope with these changes. This patch refactors the
code to avoid needing to have a virStoragePoolObjPtr
instance as a parameter, instead passing individual
target, portal and initiatoriqn parameters.
It also removes hardcoding of port 3260 in the portal
address, instead using the XML value if any.
* src/storage/storage_backend_iscsi.c: Refactor internal
helper methods
The XML docs describe a 'port' attribute for the
storage source <host> element, but the parser never
handled it.
* docs/schemas/storagepool.rng: Define port attribute
* src/conf/storage_conf.c: Add missing parsing/formatting
of host port number
* src/conf/storage_conf.h: Remove bogus/unused 'protocol' field
When running non-root, the QEMU log file is usually opened with
truncation, since there is no logrotate for non-root usage.
This means that when libvirt logs the shutdown timestamp, the
log is accidentally truncated
* src/qemu/qemu_driver.c: Never truncate log file with shutdown
message
The QEMU logger appends a ':' to the timestamp when it deems
it neccessary, so the virTimestamp API should not duplicate
this
* src/util/util.c: Remove trailing ':' from timestamp
Everytime a public API returns an error, libvirtd pollutes
syslog with that error message. Reduce the error logging
level to INFO so these don't appear by default.
* src/util/virterror.c: Log all errors at INFO
The virFork call resets all logging handlers that may have been
set. Re-enable them after fork in virExec, so that env variables
fir LIBVIRT_LOG_OUTPUTS and LIBVIRT_LOG_FILTERS take effect
until the execve()
* src/util/util.c: Preserve logging in child in virExec
To allow messages from different threads to be untangled,
include an integer thread identifier in log messages.
* src/util/logging.c: Include thread ID
* src/util/threads.h, src/util/threads.h, src/util/threads-pthread.c:
Add new virThreadSelfID() function
* configure.ac: Check for sys/syscall.h
Do this by adding a helper function to get the persistent domain config. This
should be useful for other functions that may eventually want to alter
the persistent domain config (attach/detach device). Also make similar changes
to the test drivers setvcpus command.
A caveat is that the function will return the running config for a transient
domain, rather than error. This simplifies callers, as long as they use
other methods to ensure the guest is persistent.
Doing 'virsh setvcpus $vm --config 10' doesn't check the value against the
domains maxvcpus value. A larger value for example will prevent the guest
from starting.
Also make a similar change to the test driver.
The current semantics of non-persistent hotplug/update are confusing: the
changes will persist as long as the in memory domain definition isn't
overwritten. This means hotplug changes stay around until the domain is
redefined or libvirtd is restarted.
Call virDomainObjSetDefTransient at VM startup, so that we properly discard
hotplug changes when the VM is shutdown.
This function sets the running domain definition as transient, by reparsing
the persistent config and assigning it to newDef. This ensures that any
changes made to the running definition and not the persistent config are
discarded when the VM is shutdown.
This patch makes two corrections to the newly-added QED support patch series:
- Correct the QED header field offsets
- Remove XML parsing for VIR_STORAGE_FILE_AUTO_SAFE
Signed-off-by: Adam Litke <agl@us.ibm.com>
The IP address learning thread was causing a deadlock when it instantiated a filter while a filter update/change was ongoing. The reason for this was the ordering of locks due to the following calls
virNWFilterUnlockFilterUpdates()
virNWFilterPoolObjFindByName()
The below patch now puts the order of the locks in the above shown order when instantiating the filter from the IP address learning thread.
Implement getBackingStore() for QED images. The header format is defined in
the QED spec: http://wiki.qemu.org/Features/QED .
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Stefan Hajnoczi <stefan.hajnoczi@uk.ibm.com>
Cc: Anthony Liguori <aliguori@linux.vnet.ibm.com>
Add an entry in fileTypeInfo for QED image files.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Stefan Hajnoczi <stefan.hajnoczi@uk.ibm.com>
Cc: Anthony Liguori <aliguori@linux.vnet.ibm.com>
Disk image formats that wish to opt-out of version validation are supposed to
set versionOffset to -1 in their fileTypeInfo entry.
By unconditionally returning False for these formats,
virStorageFileMatchesVersion() incorrectly reports a version mismatch when the
test was actually skipped. The correct behavior is to return True so these
formats can be successfully probed using the magic bytes alone.
Signed-off-by: Adam Litke <agl@us.ibm.com>
The code in SELinuxRestoreSecurityChardevLabel() was trying to
use SELinuxSetFilecon directly for devices or file types while
it should really use SELinuxRestoreSecurityFileLabel encapsulating
routine, which avoid various problems like resolving symlinks,
making sure he file exists and work around NFS problems
Include locale.h for setlocale().
Revert the usage string back to it's original form.
Use puts() instead of fputs(), as fputs() expects a FILE*.
Add closing parenthesis to some vah_error() calls.
Use argv[0] instead of an undefined argv0.
These messages are visible to the user, so they should be
consistently translated.
* cfg.mk (msg_gen_function): Add vah_error, vah_warning.
* src/security/virt-aa-helper.c: Translate messages.
(catchXMLError): Fix capitalization.
Per the gettext developer:
http://lists.gnu.org/archive/html/bug-gnu-utils/2010-10/msg00019.htmlhttp://lists.gnu.org/archive/html/bug-gnu-utils/2010-10/msg00021.html
gettext() doesn't work correctly on all platforms unless you have
called setlocale(). Furthermore, gnulib's gettext.h has provisions
for setting up a default locale, which is the preferred method for
libraries to use gettext without having to call textdomain() and
override the main program's default domain (virInitialize already
calls bindtextdomain(), but this is insufficient without the
setlocale() added in this patch; and a redundant bindtextdomain()
in this patch doesn't hurt, but serves as a good example for other
packages that need to bind a second translation domain).
This patch is needed to silence a new gnulib 'make syntax-check'
rule in the next patch.
* daemon/libvirtd.c (main): Setup locale and gettext.
* src/lxc/lxc_controller.c (main): Likewise.
* src/security/virt-aa-helper.c (main): Likewise.
* src/storage/parthelper.c (main): Likewise.
* tools/virsh.c (main): Fix exit status.
* src/internal.h (DEFAULT_TEXT_DOMAIN): Define, for gettext.h.
(_): Simplify definition accordingly.
* po/POTFILES.in: Add src/storage/parthelper.c.
I am replacing the last instances of close() I found with VIR_CLOSE() / VIR_FORCE_CLOSE respectively.
The first part patches virsh, which I missed out on previously.
The 2nd patch I had left out intentionally to look at it more carefully:
The 'closed' variable could be easily removed since it wasn't used anywhere else. The possible race condition that could result from the filedescriptor being closed and not set to -1 (and possibly let us write into 'something' totally different if the fd was allocated by another thread) seems to be prevented by the qemuMonitorLock() already placed around the code that reads from or writes to the fd. So the change of this code as shown in the patch should not have any side-effects.
Rather than only cleaning any remaining ebtables rules, also clean those applied to iptables and ip6tables when detecting the IP address of an interface. Previous applied iptables rules may hinder DHCP packets.
Similarly to deprecating close(), I am now deprecating fclose() and
introduce VIR_FORCE_FCLOSE() and VIR_FCLOSE(). Also, fdopen() is replaced with
VIR_FDOPEN().
Most of the files are opened in read-only mode, so usage of
VIR_FORCE_CLOSE() seemed appropriate. Others that are opened in write
mode already had the fclose()< 0 check and I converted those to
VIR_FCLOSE()< 0.
I did not find occurrences of possible double-closed files on the way.
Currently only support domain start and shutdown, for domain start,
record timestamp before the qemu command line, and for domain shutdown,
just say it's shutting down with timestamp.
* src/qemu/qemu_driver.c (qemudStartVMDaemon, qemudShutdownVMDaemon
introduced two macros - START_POSTFIX, SHUTDOWN_POSTFIX)
* src/nwfilter/nwfilter_ebiptables_driver.c
(ebiptablesWriteToTempFile): Use /bin/sh.
(bash_cmd_path): Delete.
(ebiptablesDriverInit, ebiptablesDriverShutdown): No need to
search for bash.
(CMD_EXEC): Prefer $() over ``, since we can assume POSIX.
(iptablesSetupVirtInPost): Use portable 'test' syntax.
(iptablesLinkIPTablesBaseChain): Use POSIX $(()) syntax.
This is more flexible regarding the location of the python binary
but doesn't allow to pass the -u flag. The -i flag can be passed
from inside the script using the PYTHONINSPECT env variable.
This fixes a problem with the esx_vi_generator.py on FreeBSD.
This makes the storage driver fail when the connection is
opened with the VIR_CONNECT_RO flag, resulting in a read-only
connection with no storage driver.
In a first step I am converting the netlink message construction in
macvtap code to use libnl. It's pretty much a 1:1 conversion except that
now the message needs to be allocated and deallocated.
When <uuid> is not in the XML, a virUUIDGenerate() ends up being called which
is unnecessary and can lead to crashes if /dev/urandom isn't available
because virRandomInitialize() is not called within virt-aa-helper. This patch
adds verify_xpath_context() and updates caps_mockup() to use it.
Bug-Ubuntu: https://launchpad.net/bugs/672943
If virDomainAttachDevice() was called with an image that was located
on a root-squashed NFS server, and in a directory that was unreadable
by root on the machine running libvirtd, the attach would fail due to
an attempt to change the selinux label of the image with EACCES (which
isn't covered as an ignore case in SELinuxSetFilecon())
NFS doesn't support SELinux labelling anyway, so we mimic the failure
handling of commit 93a18bbafa, which
just ignores the errors if the target is on an NFS filesystem (in
SELinuxSetSecurityAllLabel() only, though.)
This can be seen as a follow-on to commit
347d266c51, which ignores file open
failures of files on NFS that occur directly in
virDomainDiskDefForeachPath() (also necessary), but does not ignore
failures in functions that are called from there (eg
SELinuxSetSecurityFileLabel()).
Introduce implementations of the virDomainOpenConsole() API
for LXC, Xen and UML drivers.
* src/lxc/lxc_driver.c, src/lxc/lxc_driver.c,
src/xen/xen_driver.c: Wire up virDomainOpenConsole
The util/threads.c/h code already has APIs for mutexes,
condition variables and thread locals. This commit adds
in code for actually creating threads.
* src/libvirt_private.syms: Export new symbols
* src/util/threads.h: Define APIs virThreadCreate, virThreadSelf,
virThreadIsSelf and virThreadJoin
* src/util/threads-win32.c, src/util/threads-win32.h: Win32
impl of threads
* src/util/threads-pthread.c, src/util/threads-pthread.h: POSIX
impl of threads
This provides an implementation of the virDomainOpenConsole
API with the QEMU driver. For the streams code, this reuses
most of the code previously added for the tunnelled migration
streams since it is generic.
* src/qemu/qemu_driver.c: Support virDomainOpenConsole
To avoid the need for duplicating implementations of virStream
drivers, provide a generic implementation that can handle any
FD based stream. This code is copied from the existing impl
in the QEMU driver, with the locking moved into the stream
impl, and addition of a read callback
The FD stream code will refuse to operate on regular files or
block devices, since those can't report EAGAIN properly when
they would block on I/O
* include/libvirt/virterror.h, include/libvirt/virterror.h: Add
VIR_FROM_STREAM error domain
* src/qemu/qemu_driver.c: Remove code obsoleted by the new
generic streams driver.
* src/fdstream.h, src/fdstream.c, src/fdstream.c,
src/libvirt_private.syms: Generic reusable FD based streams
Now that bi-directional, non-blocking streams are supported
in the remote driver, some of the VIR_WARN statements need
to be reduced to VIR_DEBUG.
* src/remote/remote_driver.c: Lower logging level
This provides an implementation of the virDomainOpenConsole
API for the remote driver client and server.
* daemon/remote.c: Server side impl
* src/remote/remote_driver.c: Client impl
* src/remote/remote_protocol.x: Wire definition
To enable virsh console (or equivalent) to be used remotely
it is necessary to provide remote access to the /dev/pts/XXX
pseudo-TTY associated with the console/serial/parallel device
in the guest. The virStream API provide a bi-directional I/O
stream capability that can be used for this purpose. This
patch thus introduces a virDomainOpenConsole API that uses
the stream APIs.
* src/libvirt.c, src/libvirt_public.syms,
include/libvirt/libvirt.h.in, src/driver.h: Define the
new virDomainOpenConsole API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c,
src/opennebula/one_driver.c, src/openvz/openvz_driver.c,
src/phyp/phyp_driver.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c,
src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c, src/xenapi/xenapi_driver.c: Stub
API entry point
The current remote driver code for streams only supports
blocking I/O mode. This is fine for the usage with migration
but is a problem for more general use cases, in particular
bi-directional streams.
This adds supported for the stream callbacks and non-blocking
I/O. with the minor caveat is that it doesn't actually do
non-blocking I/O for sending stream data, only receiving it.
A future patch will try to do non-blocking sends, but this is
quite tricky to get right.
* src/remote/remote_driver.c: Allow non-blocking I/O for
streams and support callbacks
The /dev/console device inside the container must NOT map
to the real /dev/console device node, since this allows the
container control over the current host console. A fun side
effect of this is that starting a container containing a
real Fedora OS will kill off your X server.
Remove the /dev/console node, and replace it with a symlink
to the primary console TTY
* src/lxc/lxc_container.c: Replace /dev/console with a
symlink to /dev/pty/0
* src/lxc/lxc_controller.c: Remove /dev/console from cgroups
ACL
QEMU allows forcing a CDROM eject even if the guest has locked the device.
Expose this via a new UpdateDevice flag, VIR_DOMAIN_DEVICE_MODIFY_FORCE.
This has been requested for RHEV:
https://bugzilla.redhat.com/show_bug.cgi?id=626305
v2: Change flag name, bool cleanups
I am trying to use a qcow image with libvirt where the backing 'file' is a
qemu-nbd server. Unfortunately virDomainDiskDefForeachPath() assumes that
backingStore is always a real file so something like 'nbd:0:3333' is rejected
because a file with that name cannot be accessed. Note that I am not worried
about directly using nbd images. That would require a new disk type with XML
markup, etc. I only want it to be permitted as a backingStore
The following patch implements danpb's suggestion:
> I think I'm inclined to push the logic for skipping NBD one stage higher.
> I'd rather expect virStorageFileGetMetadata() to return all backing
> stores, even if not files. The virDomainDiskDefForeachPath() method
> should definitely ignore non-file backing stores though.
>
> So what I'm thinking is to extend the virStorageFileMetadata struct and
> just add a 'bool isFile' field to it. Default this field to true, unless
> you see the prefix of nbd: in which case set it to false. The
> virDomainDiskDefForeachPath() method can then skip over any backing
> store with isFile == false
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
xencapstest calls xenHypervisorMakeCapabilitiesInternal with conn == NULL
which calls xenDaemonNodeGetTopology with conn == NULL when a recent
enough Xen was detected (sys_interface_version >= SYS_IFACE_MIN_VERS_NUMA).
But xenDaemonNodeGetTopology insists in having conn != NULL and fails,
because it expects to be able to talk to an actual xend.
We cannot do that in a 'make check' test. Therefore, only call the xend
subdriver function when conn isn't NULL.
Reported by Andy Howell and Jim Fehlig.
Using automated replacement with sed and editing I have now replaced all
occurrences of close() with VIR_(FORCE_)CLOSE() except for one, of
course. Some replacements were straight forward, others I needed to pay
attention. I hope I payed attention in all the right places... Please
have a look. This should have at least solved one more double-close
error.
This extends the SPICE XML to allow channel security options
<graphics type='spice' port='-1' tlsPort='-1' autoport='yes'>
<channel name='main' mode='secure'/>
<channel name='record' mode='insecure'/>
</graphics>
Any non-specified channel uses the default, which allows both
secure & insecure usage
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/libvirt_private.syms: Add XML syntax for specifying per
channel security options for spice.
* src/qemu/qemu_conf.c: Configure channel security with spice
QEMU crashes & burns if you try multiple Cirrus video cards, but
QXL copes fine. Adapt QEMU config code to allow multiple QXL
video cards
* src/qemu/qemu_conf.c: Support multiple QXL video cards
This extends the XML syntax for <graphics> to allow a password
expiry time to be set
eg
<graphics type='vnc' port='5900' autoport='yes' keymap='en-us' passwd='12345' passwdValidTo='2010-04-09T15:51:00'/>
The timestamp is in UTC.
* src/conf/domain_conf.h: Pull passwd out into separate struct
virDomainGraphicsAuthDef to allow sharing between VNC & SPICE
* src/conf/domain_conf.c: Add parsing/formatting of new passwdValidTo
argument
* src/opennebula/one_conf.c, src/qemu/qemu_conf.c, src/qemu/qemu_driver.c,
src/xen/xend_internal.c, src/xen/xm_internal.c: Update for changed
struct containing VNC password
In common with VNC, the QEMU driver configuration file is used
specify the host level TLS certificate location and a default
password / listen address
* src/qemu/qemu.conf: Add spice_listen, spice_tls,
spice_tls_x509_cert_dir & spice_password config params
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Parsing of
spice config parameters and updating -spice arg generation
to use them
* tests/qemuxml2argvdata/qemuxml2argv-graphics-spice-rhel6.args,
tests/qemuxml2argvtest.c: Expand test case to cover driver
level configuration
This supports the -spice argument posted for review against
the latest upstream QEMU/KVM. This supports the bare minimum
config with port, TLS port & listen address. The x509 bits are
added in a later patch.
* src/qemu_conf.c, src/qemu_conf.h: Add SPICE flag. Check for
-spice availability. Format -spice arg for command line
* qemuhelptest.c: Add SPICE flag
* qemuxml2argvdata/qemuxml2argv-graphics-spice.args: Add <graphics>
for spice
* qemuxml2argvdata/qemuxml2argv-graphics-spice.xml: Add -spice arg
* qemuxml2argvtest.c: Add SPICE flag
This supports the '-vga qxl' parameter in upstream QEMU/KVM
which has SPICE support added. This isn't particularly useful
until you get the next patch for -spice support. Also note that
while the libvirt XML supports multiple video devices, this
patch only supports a single one. A later patch can add support
for 2nd, 3rd, etc PCI devices for QXL
* src/qemu/qemu_conf.h: Flag for QXL support
* src/qemu/qemu_conf.c: Probe for '-vga qxl' support and implement it
* tests/qemuxml2argvtest.c, tests/qemuxml2xmltest.c,
tests/qemuxml2argvdata/qemuxml2argv-graphics-spice.args,
tests/qemuxml2argvdata/qemuxml2argv-graphics-spice.xml: Test
case for generating spice args with RHEL6 kvm
This adds an element
<graphics type='spice' port='5903' tlsPort='5904' autoport='yes' listen='127.0.0.1'/>
This is the bare minimum that should be exposed in the guest
config for SPICE. Other parameters are better handled as per
host level configuration tunables
* docs/schemas/domain.rng: Define the SPICE <graphics> schema
* src/domain_conf.h, src/domain_conf.c: Add parsing and formatting
for SPICE graphics config
* src/qemu_conf.c: Complain about unsupported graphics types
* src/qemu_conf.c: Add dummy entry in enumeration
* docs/schemas/domain.rng: Add 'qxl' as a type for the <video> tag
* src/domain_conf.c, src/domain_conf.h: Add QXL to video type
enumerations
There is no point in trying to fill params beyond the first error,
because when lxcDomainGetMemoryParameters returns -1 then the caller
cannot detect which values in params are valid.
The patch is based on the possiblity in the QEmu command line to
add -smbios options allowing to override the default values picked
by QEmu. We need to detect this first from QEmu help output.
If the domain is defined with smbios to be inherited from host
then we pass the values coming from the Host own SMBIOS, but
if the domain is defined with smbios to come from sysinfo, we
use the ones coming from the domain definition.
* src/qemu/qemu_conf.h: add the QEMUD_CMD_FLAG_SMBIOS_TYPE enum
value
* src/qemu/qemu_conf.c: scan the help output for the smbios support,
and if available add support based on the domain definitions,
and host data
* tests/qemuhelptest.c: add the new enum in the outputs
Move existing routines about virSysinfoDef to an util module,
add a new entry point virSysinfoRead() to read the host values
with dmidecode
* src/conf/domain_conf.c src/conf/domain_conf.h src/util/sysinfo.c
src/util/sysinfo.h: move to a new module, add virSysinfoRead()
* src/Makefile.am: handle the new module build
* src/libvirt_private.syms: new internal symbols
* include/libvirt/virterror.h src/util/virterror.c: defined a new
error code for that module
* po/POTFILES.in: add new file for translations
the element has a mode attribute allowing only 3 values:
- emulate: use the smbios emulation from the hypervisor
- host: try to use the smbios values from the node
- sysinfo: grab the values from the <sysinfo> fields
* docs/schemas/domain.rng: extend the schemas
* src/conf/domain_conf.h: add the flag to the domain config
* src/conf/domain_conf.h: parse and serialize the smbios if present
* src/conf/domain_conf.h: defines a new internal type added to the
domain structure
* src/conf/domain_conf.c: parsing and serialization of that new type
During a shutdown/restart cycle libvirtd forgot the macvtap device name that it had created on behalf of a VM so that a stale macvtap device remained on the host when the VM terminated. Libvirtd has to actively tear down a macvtap device and it uses its name for identifying which device to tear down.
The solution is to not blank out the <target dev='...'/> completely, but only blank it out on VMs that are not active. So, if a VM is active, the device name makes it into the XML and is also being parsed. If a VM is not active, the device name is discarded.
virPipeReadUntilEOF is used to read the stdout of exec'ed
and this could fail to capture the full output and read only
1024 bytes.
The problem is that this is based on a poll loop, and in the
loop we read at most 1024 bytes per file descriptor, but we also
note in the loop if poll indicates that the process won't output
more than that on that fd by setting finished[i] = 1.
The simplest way is that if we read a full buffer make sure
finished[i] is still 0 because we will need another pass in the
loop.
The remoteIO() method has wierd calling conventions, where
it is passed a pre-allocated 'struct remote_call *' but
then free()s it itself, instead of letting the caller free().
This fixes those weird semantics
* src/remote/remote_driver.c: Sanitize semantics of remoteIO
method wrt to memory release
A couple of places in the text monitor were overwriting the
'ret' variable with a >= 0 value before success was actually
determined. So later error paths would not correctly return
the -1 value. The drive_add code was not checking for errors
like missing command
* src/qemu/qemu_monitor_text.c: Misc error handling fixes
NFS in root squash mode may prevent opening disk images to
determine backing store. Ignore errors in this scenario.
* src/security/security_selinux.c: Ignore open failures on disk
images
NFS does not support file labelling, so ignore this error
for stdin_path when on NFS.
* src/security/security_selinux.c: Ignore failures on labelling
stdin_path on NFS
* src/util/storage_file.c, src/util/storage_file.h: Refine
virStorageFileIsSharedFS() to allow it to check for a
specific FS type.
Commit 06f81c63eb attempted to make
QEMU driver ignore the failure to relabel 'stdin_path' if it was
on NFS. The actual result was that it ignores *all* failures to
label any aspect of the VM, unless stdin_path is non-NULL and
is not on NFS.
* src/qemu/qemu_driver.c: Treat all relabel failures as terminal
* src/xen/xend_internal.c (xenDaemonFormatSxpr): Hoist verify
outside of function to avoid a -Wnested-externs warning.
* src/xen/xm_internal.c (xenXMDomainConfigFormat): Likewise.
Reported by Daniel P. Berrange.
* src/xen/xen_hypervisor.c (MAX_VIRT_CPUS): Move...
* src/xen/xen_driver.h (MAX_VIRT_CPUS): ...so all xen code can see
same value.
* src/xen/xend_internal.c (sexpr_to_xend_domain_info)
(xenDaemonDomainGetVcpusFlags, xenDaemonParseSxpr)
(xenDaemonFormatSxpr): Work if MAX_VIRT_CPUS is 64 on a platform
where long is 64-bits.
* src/xen/xm_internal.c (xenXMDomainConfigParse)
(xenXMDomainConfigFormat): Likewise.