Similar to the last patch in isolating the filtering from the
client actions, so that clients don't have to reinvent the
filtering.
* src/conf/domain_conf.h (virDomainSnapshotForEachChild): New
prototype.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/conf/domain_conf.c (virDomainSnapshotActOnChild)
(virDomainSnapshotForEachChild): New functions.
(virDomainSnapshotCountChildren): Delete.
(virDomainSnapshotHasChildren): Simplify.
* src/qemu/qemu_driver.c (qemuDomainSnapshotReparentChildren)
(qemuDomainSnapshotDelete): Likewise.
This one's nasty. Ever since we fixed virHashForEach to prevent
nested hash iterations for safety reasons (commit fba550f6),
virDomainSnapshotDelete with VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN
has been broken for qemu: it deletes children, while leaving
grandchildren intact but pointing to a no-longer-present parent.
But even before then, the code would often appear to succeed in
cleaning up grandchildren, but risked memory corruption if you had
a large and deep hierarchy of snapshots.
For acting on just children, a single virHashForEach is sufficient.
But for acting on an entire subtree, it requires iteration; and
since we declared recursion as invalid, we have to switch to a
while loop. Doing this correctly requires quite a bit of overhaul,
so I added a new helper function to isolate the algorithm from the
actions, so that callers do not have to reinvent the iteration.
Note that this _still_ does not handle CHILDREN correctly if one
of the children is the current snapshot; that will be next.
* src/conf/domain_conf.h (_virDomainSnapshotDef): Add mark.
(virDomainSnapshotForEachDescendant): New prototype.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/conf/domain_conf.c (virDomainSnapshotMarkDescendant)
(virDomainSnapshotActOnDescendant)
(virDomainSnapshotForEachDescendant): New functions.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiscardChildren):
Replace...
(qemuDomainSnapshotDiscardDescendant): ...with callback that
doesn't nest hash traversal.
(qemuDomainSnapshotDelete): Use new function.
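The non-recursive subtree walk described above can be sketched as a mark-and-repeat loop: each pass is a single flat iteration, and passes repeat until no new descendant is marked. This is only an illustration under simplifying assumptions; the array-based `struct snap` and `mark_descendants` are hypothetical stand-ins, not libvirt's hash table or its actual helpers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified snapshot record (not libvirt's struct). */
struct snap {
    int parent;   /* index of the parent in the array, or -1 for a root */
    bool mark;    /* set once the node is known to be a descendant */
};

/* Mark every descendant of snaps[root] without recursion and without
 * nesting one iteration inside another.  Each pass is a flat scan; the
 * outer while loop repeats until a pass marks nothing new.
 * Returns the number of descendants marked. */
size_t mark_descendants(struct snap *snaps, size_t n, int root)
{
    size_t count = 0;
    bool more = true;

    for (size_t i = 0; i < n; i++)
        snaps[i].mark = false;

    while (more) {
        more = false;
        for (size_t i = 0; i < n; i++) {
            int p = snaps[i].parent;

            if (snaps[i].mark || p < 0)
                continue;
            /* A node is a descendant if its parent is the root of the
             * subtree or is itself already marked. */
            if (p == root || snaps[p].mark) {
                snaps[i].mark = true;
                count++;
                more = true;
            }
        }
    }
    return count;
}
```

Once the marks are in place, one more flat pass can act on every marked node, so the action callback never triggers a nested traversal.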
This API labels all sockets created until ClearSocketLabel is called in
such a way that a VM can access them (i.e., in SELinux they are labeled
with an svirt_t-based label).
The APIs are designed to label a socket in a way that the libvirt daemon
itself is able to access it (i.e., in SELinux the label is virtd_t based
as opposed to svirt_* we use for labeling resources that need to be
accessed by a vm). The new name reflects this.
Often, we want to use XPath functions on the just-parsed document;
fold this into the parser function for convenience.
* src/util/xml.h (virXMLParseHelper): Add argument.
(virXMLParseStrHelper, virXMLParseFileHelper): Delete.
(virXMLParseCtxt, virXMLParseStringCtxt, virXMLParseFileCtxt): New
macros.
* src/libvirt_private.syms (xml.h): Remove deleted functions.
* src/util/xml.c (virXMLParseHelper): Add argument.
(virXMLParseStrHelper, virXMLParseFileHelper): Delete.
Get rid of the #if __linux__ check in virPidFileReadPathIfAlive that
was preventing the check of the /proc/<pid>/exe symbolic link against
an expected executable on non-Linux platforms. Replace it with a
run-time check that tests whether /proc/<pid>/exe is a symbolic link
and, if so, calls the function that compares the link target against
the expected file.
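A run-time test of this kind might look like the following sketch, which uses lstat() so that the link itself is examined rather than its target. The function name `proc_exe_is_symlink` is hypothetical and this is not the code from the patch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Return true if /proc/<pid>/exe exists and is a symbolic link.
 * On platforms without a Linux-style /proc this simply returns false,
 * so the caller can skip the executable comparison. */
bool proc_exe_is_symlink(pid_t pid)
{
    char path[64];
    struct stat sb;

    snprintf(path, sizeof(path), "/proc/%lld/exe", (long long)pid);

    /* lstat() does not follow the link, unlike stat(). */
    return lstat(path, &sb) == 0 && S_ISLNK(sb.st_mode);
}
```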
In some versions of qemu, both virtio-blk-pci and virtio-net-pci
devices can have an event_idx setting that determines some details of
event processing. When it is enabled, it "reduces the number of
interrupts and exits for the guest". qemu will automatically enable
this feature when it is available, but there may be cases where this
new feature could actually make performance worse (NB: no such case
has been found so far).
As a safety switch in case such a situation is encountered in the
field, this patch adds a new attribute "event_idx" to the <driver>
element of both disk and interface devices. event_idx can be set to
"on" (to force event_idx on in case qemu has it disabled by default)
or "off" (to force event_idx off). In the case that event_idx support
isn't present in qemu, the attribute is ignored (this is on the advice
of the qemu developer).
docs/formatdomain.html.in: document the new flag (marking it as
"don't mess with this!")
docs/schemas/domain.rng: add event_idx in appropriate places
src/conf/domain_conf.[ch]: add event_idx to parser and formatter
src/libvirt_private.syms: export
virDomainVirtioEventIdx(From|To)String
src/qemu/qemu_capabilities.[ch]: detect and report event_idx in
disk/net
src/qemu/qemu_command.c: add event_idx parameter to qemu commandline
when appropriate.
tests/qemuxml2argvdata/qemuxml2argv-event_idx.args,
tests/qemuxml2argvdata/qemuxml2argv-event_idx.xml,
tests/qemuxml2argvtest.c,
tests/qemuxml2xmltest.c: test cases for event_idx.
In daemons using pidfiles to protect against concurrent
execution there is a possibility that a crash may leave a stale
pidfile on disk, which then prevents later restart of the daemon.
To avoid this problem, introduce a pair of APIs which make
use of virFileLock to ensure crash-safe & race condition-safe
pidfile acquisition & release.
* src/libvirt_private.syms, src/util/virpidfile.c,
src/util/virpidfile.h: Add virPidFileAcquire and virPidFileRelease
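The idea behind lock-based pidfile acquisition can be sketched as follows: the pidfile is opened, an fcntl() write lock is taken on it, and only then is the pid written. Because the kernel drops the lock when the process dies, a stale file left by a crash no longer blocks a restart. `pidfile_acquire` is a hypothetical name, and this sketch deliberately omits details a real implementation needs, such as handling a pidfile that is unlinked between open() and the lock.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to take ownership of the pidfile at 'path' for 'pid'.
 * Returns an open, locked fd on success (keep it open for the life of
 * the daemon; closing it releases the lock), or -1 on failure, for
 * example when another live process already holds the lock. */
int pidfile_acquire(const char *path, pid_t pid)
{
    struct flock fl = {
        .l_type = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start = 0,
        .l_len = 1,
    };
    char buf[32];
    int len;
    int fd = open(path, O_WRONLY | O_CREAT, 0644);

    if (fd < 0)
        return -1;

    /* Non-blocking lock attempt: fails if someone else holds it. */
    if (fcntl(fd, F_SETLK, &fl) < 0) {
        close(fd);
        return -1;
    }

    /* We own the file now; replace any stale contents with our pid. */
    len = snprintf(buf, sizeof(buf), "%lld\n", (long long)pid);
    if (ftruncate(fd, 0) < 0 || write(fd, buf, len) != len) {
        close(fd);
        return -1;
    }
    return fd;
}
```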
In some cases the caller of virPidFileRead might like extra checks
to determine whether the pid just read is really the one they are
expecting. This adds virPidFileReadIfAlive which will check whether
the pid is still alive with kill(pid, 0), and (on linux only) will
look at /proc/$PID/exe
* libvirt_private.syms, util/virpidfile.c, util/virpidfile.h: Add
virPidFileReadIfAlive and virPidFileReadPathIfAlive
* network/bridge_driver.c: Use new APIs to check PID validity
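A liveness probe of this kind is conventionally written by sending signal 0, which performs the permission and existence checks without delivering anything. The sketch below is illustrative only; `pid_is_alive` is a hypothetical name, and treating EPERM as "alive" is a design choice of this example.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>

/* Return true if a process with the given pid currently exists.
 * Signal 0 is never delivered; kill() only reports whether it could
 * have been.  EPERM means the process exists but belongs to someone
 * else, so it still counts as alive here. */
bool pid_is_alive(pid_t pid)
{
    if (pid <= 0)
        return false;   /* 0 and negative values address groups */

    return kill(pid, 0) == 0 || errno == EPERM;
}
```

Note that a live pid is not proof of identity, since pids are recycled; that is why the text pairs this probe with a check of the executable.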
The functions for manipulating pidfiles are in util/util.{c,h}.
We will shortly be adding some further pidfile related functions.
To avoid further growing util.c, this moves the pidfile related
functions into a dedicated virpidfile.{c,h}. The functions are
also all renamed to have 'virPidFile' as their name prefix
* util/util.h, util/util.c: Remove all pidfile code
* util/virpidfile.c, util/virpidfile.h: Add new APIs for pidfile
handling.
* lxc/lxc_controller.c, lxc/lxc_driver.c, network/bridge_driver.c,
qemu/qemu_process.c: Add virpidfile.h include and adapt for API
renames
Add some simple wrappers around the fcntl() discretionary file
locking capability.
* src/util/util.c, src/util/util.h, src/libvirt_private.syms: Add
virFileLock and virFileUnlock APIs
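Thin wrappers over fcntl() record locking could be shaped roughly like this. The names `file_lock` and `file_unlock` and their parameter lists are assumptions for illustration; they are not claimed to match the signatures added by the patch.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Take a non-blocking lock on bytes [start, start+len) of fd.
 * 'shared' selects a read lock; otherwise a write lock is taken.
 * Returns 0 on success, -1 if the lock is held elsewhere or on error. */
int file_lock(int fd, bool shared, off_t start, off_t len)
{
    struct flock fl = {
        .l_type = shared ? F_RDLCK : F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start = start,
        .l_len = len,
    };

    return fcntl(fd, F_SETLK, &fl) < 0 ? -1 : 0;
}

/* Release a lock previously taken on the same byte range. */
int file_unlock(int fd, off_t start, off_t len)
{
    struct flock fl = {
        .l_type = F_UNLCK,
        .l_whence = SEEK_SET,
        .l_start = start,
        .l_len = len,
    };

    return fcntl(fd, F_SETLK, &fl) < 0 ? -1 : 0;
}
```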
Originally noticed by comparing the xml generated by virDomainSave
with the xml produced by reparsing and redumping that xml, but I
also did an audit of every last use of VIR_DOMAIN_XML_INACTIVE in
domain_conf.c to ensure that no other discrepancies exist.
* src/conf/domain_conf.c (virDomainDeviceInfoIsSet): Add
parameter, and update all callers. Make static.
(virDomainNetDefFormat): Skip generated ifname.
(virDomainDefFormatInternal): Skip default <seclabel>.
(virDomainChrSourceDefParseXML): Skip generated pty path, and add
parameter. Update callers.
* src/conf/domain_conf.h (virDomainDeviceInfoIsSet): Delete.
* src/libvirt_private.syms (domain_conf.h): Update.
Once it's plugged in, the <listen> element will be an optional
replacement for the "listen" attribute that graphics elements already
have. If the <listen> element is type='address', it will have an
attribute called 'address' which will contain an IP address or DNS
name that the guest's display server should listen on. If, however,
type='network', the <listen> element should have an attribute called
'network' that will be set to the name of a network configuration to
get the IP address from.
* docs/schemas/domain.rng: updated to allow the <listen> element
* docs/formatdomain.html.in: document the <listen> element and its
attributes.
* src/conf/domain_conf.[hc]:
1) The domain parser, formatter, and data structure are modified to
support 0 or more <listen> subelements to each <graphics>
element. The old style "legacy" listen attribute is also still
accepted, and will be stored internally just as if it were a
separate <listen> element. On output (i.e. format), the address
attribute of the first <listen> element of type 'address' will be
duplicated in the legacy "listen" attribute of the <graphics>
element.
2) The "listenAddr" attribute has been removed from the unions in
virDomainGraphicsDef for graphics types vnc, rdp, and spice.
This attribute is now in the <listen> subelement (aka
virDomainGraphicsListenDef)
3) Helper functions were written to provide simple access
(both Get and Set) to the listen elements and their attributes.
* src/libvirt_private.syms: export the listen helper functions
* src/qemu/qemu_command.c, src/qemu/qemu_hotplug.c,
src/qemu/qemu_migration.c, src/vbox/vbox_tmpl.c,
src/vmx/vmx.c, src/xenxs/xen_sxpr.c, src/xenxs/xen_xm.c
Modify all these files to use the listen helper functions rather
than directly referencing the (now missing) listenAddr
attribute. There can be multiple <listen> elements to a single
<graphics>, but the drivers all currently only support one, so all
replacements of direct access with a helper function indicate index
"0".
* tests/* - only 3 of these are new files added explicitly to test the
new <listen> element. All the others have been modified to reflect
the fact that any legacy "listen" attributes passed in to the domain
parse will be saved in a <listen> element (i.e. one of the
virDomainGraphicsListenDefs), and during the domain format function,
both the <listen> element as well as the legacy attributes will be
output.
Every DomainNetDef has a bandwidth, as does every portgroup.
Whenever a DomainNetDef of type NETWORK is about to be used, a call is
made to networkAllocateActualDevice(). This function chooses the "best"
bandwidth object and places it in the DomainActualNetDef.
From that point on, whenever some code needs to use the bandwidth data
for the interface, it's retrieved with virDomainNetGetActualBandwidth(),
which will always return the "best" info as determined in the
previous step.
These functions parse the given XML node and return a pointer to the
output. Unknown elements are silently ignored. Attributes must
be integers and must fit in an unsigned long long.
The free function frees the elements of the virBandwidth structure.
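The constraint on attribute values (an integer that fits in unsigned long long) can be checked along the following lines. `parse_ull` is a hypothetical helper written for this note, not the function the patch uses.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a decimal attribute value into *out.
 * Returns 0 on success, -1 if the string is not a plain non-negative
 * integer or does not fit in unsigned long long. */
int parse_ull(const char *s, unsigned long long *out)
{
    char *end;
    unsigned long long v;

    /* strtoull() accepts leading whitespace and a '-' sign, so insist
     * on a leading digit to reject those forms up front. */
    if (!s || !isdigit((unsigned char)*s))
        return -1;

    errno = 0;
    v = strtoull(s, &end, 10);
    if (errno == ERANGE || *end != '\0')
        return -1;   /* overflow or trailing garbage */

    *out = v;
    return 0;
}
```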
This function uses ioctl(SIOCGIFADDR), which limits it to returning
the first IPv4 address of an interface, but that's what we want right
now (the place we're going to use the address only accepts one).
The bind mount setup is about to get more complicated.
To avoid having to deal with several copies, pull it
out into a separate lxcContainerMountFSBind method.
Also pull out the iteration over container filesystems,
so that it will be easier to drop in support for non-bind
mount filesystems
* src/lxc/lxc_container.c: Pull bind mount code out into
lxcContainerMountFSBind
When an operation started by virDomainBlockPull completes (either with
success or with failure), raise an event to indicate the final status.
This API allows users to avoid polling on virDomainGetBlockJobInfo if
they would prefer to use an event mechanism.
* daemon/remote.c: Dispatch events to client
* include/libvirt/libvirt.h.in: Define event ID and callback signature
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Extend API to handle the new event
* src/qemu/qemu_driver.c: Connect to the QEMU monitor event
for block_stream completion and emit a libvirt block pull event
* src/remote/remote_driver.c: Receive and dispatch events to application
* src/remote/remote_protocol.x: Wire protocol definition for the event
* src/remote_protocol-structs: structure definitions for protocol verification
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c: Watch for BLOCK_STREAM_COMPLETED event
from QEMU monitor
O_DIRECT has stringent requirements. Rather than make lots of changes
at each site that wants to use O_DIRECT, it is easier to offload
the work through a helper process that mirrors the I/O between a
pipe and the actual direct fd, so that the other end of the pipe
no longer has to worry about constraints.
Plus, if the kernel ever gains better posix_fadvise support, then we
only have to touch a single file to let all callers benefit from a
more efficient way to avoid file system caching.
* src/util/virfile.h (virFileDirectFdFlag, virFileDirectFdNew)
(virFileDirectFdClose, virFileDirectFdFree): New prototypes.
* src/util/virdirect.c: Implement new wrapper object.
* src/libvirt_private.syms (virfile.h): Export new symbols.
* cfg.mk (useless_free_options): Add to list.
* po/POTFILES.in: Add new translations.
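One of the stringent requirements mentioned above is alignment: direct I/O typically needs the buffer address and the transfer length to be multiples of some block size. The sketch below shows the kind of bookkeeping a single helper can centralize. The names `align_up` and `alloc_aligned` are hypothetical, and the correct alignment value depends on the filesystem and device, so it is left as a parameter.

```c
#include <stddef.h>
#include <stdlib.h>

/* Round 'len' up to the next multiple of 'align' (align must be > 0). */
size_t align_up(size_t len, size_t align)
{
    return (len + align - 1) / align * align;
}

/* Allocate a buffer whose address and size are both multiples of
 * 'align'.  posix_memalign() requires 'align' to be a power of two
 * and a multiple of sizeof(void *).  Free the result with free(). */
void *alloc_aligned(size_t len, size_t align)
{
    void *buf = NULL;

    if (posix_memalign(&buf, align, align_up(len, align)) != 0)
        return NULL;
    return buf;
}
```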
The network XML is updated in the following ways:
1) The <forward> element can now contain a list of forward interfaces:
<forward .... >
<interface dev='eth10'/>
<interface dev='eth11'/>
<interface dev='eth12'/>
<interface dev='eth13'/>
</forward>
The first of these takes the place of the dev attribute that is
normally in <forward> - when defining a network you can specify
either one, and on output both will be present. If you specify
both on input, they must match.
2) In addition to forward modes of 'nat' and 'route', these new modes
are supported:
private, passthrough, vepa - when this network is referenced by a
domain's interface, it will have the same effect as if the
interface had been defined as type='direct', e.g.:
<interface type='direct'>
<source mode='${mode}' dev='${dev}'/>
...
</interface>
where ${mode} is one of the three new modes, and ${dev} is an interface
selected from the list given in <forward>.
bridge - if a <forward> dev (or multiple devs) is defined, and
forward mode is 'bridge' this is just like the modes 'private',
'passthrough', and 'vepa' above. If there is no forward dev
specified but a bridge name is given (e.g. "<bridge
name='br0'/>"), then guest interfaces using this network will use
libvirt's "host bridge" mode, equivalent to this:
<interface type='bridge'>
<source bridge='${bridge-name}'/>
...
</interface>
3) A network can have multiple <portgroup> elements, which may be
selected by the guest interface definition (by adding
"portgroup='${name}'" in the <source> element along with the
network name). Currently a portgroup can only contain a
virtportprofile, but the intent is that other configuration items
may be put there in the future (e.g. bandwidth config). When
building a guest's interface, if the <interface> XML itself has no
virtportprofile, and if the requested network has a portgroup with
a name matching the name given in the <interface> (or if one of the
network's portgroups is marked with the "default='yes'" attribute),
the virtportprofile from that portgroup will be used by the
interface.
4) A network can have a virtportprofile defined at the top level,
which will be used by a guest interface when connecting in one of
the 'direct' modes if the guest interface XML itself hasn't
specified any virtportprofile, and if there are also no matching
portgroups on the network.
The domain XML <interface> element is updated in the following ways:
1) <virtualportprofile> can be specified when source type='network'
(previously it was only valid for source type='direct')
2) A new attribute "portgroup" has been added to the <source>
element. When source type='network' (the only time portgroup is
recognized), extra configuration information will be taken from the
<portgroup> element of the given name in the network definition.
3) Each virDomainNetDef now also potentially has a
virDomainActualNetDef which is a private object (never
exported/imported via the public API, and not defined in the RNG) that
is used to maintain information about the physical device that was
actually used for a NetDef of type VIR_DOMAIN_NET_TYPE_NETWORK.
The virDomainActualNetDef will only be parsed/formatted if the
parse/format function is called with the
VIR_DOMAIN_XML_INTERNAL_ACTUAL_NET flag set (which is only needed when
saving/loading a running domain's state info to the stateDir).
virtPortProfiles are currently only used in the domain XML, but will
soon also be used in the network XML. To prepare for that change, this
patch moves the structure definition into util/network.h and the parse
and format functions into util/network.c (I decided that this was a
better choice than macvtap.h/c for something that needed to always be
available on all platforms).
Add virtkey lib to improve usability and to translate keycodes.
Add 4 internal APIs for this purpose:
const char *virKeycodeSetTypeToString(int codeset);
int virKeycodeSetTypeFromString(const char *name);
int virKeycodeValueFromString(virKeycodeSet codeset, const char *keyname);
int virKeycodeValueTranslate(virKeycodeSet from_codeset,
virKeycodeSet to_offset,
int key_value);
* include/libvirt/libvirt.h.in: extend virKeycodeSet enum
* src/Makefile.am: add new virtkeycode module and rule to generate
virkeymaps.h
* src/util/virkeycode.c src/util/virkeycode.h: new module
* src/util/virkeycode-mapgen.py: python generator for virkeymaps.h
out of keymaps.csv
* src/libvirt_private.syms: extend private symbols for new module
* .gitignore: add generated virkeymaps.h
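Conceptually, translation between code sets is a table lookup in which each row holds the same key in every code set. The sketch below uses a tiny hand-written table with a few well-known Linux input and USB HID values; the types and function names are hypothetical and far simpler than the generated virkeymaps.h.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a keycode table: one row per key, one
 * column per code set. */
enum codeset { CODESET_LINUX, CODESET_USB, CODESET_LAST };

struct keymap_row {
    const char *name;
    int codes[CODESET_LAST];
};

static const struct keymap_row keymap[] = {
    { "KEY_ESC",   {  1, 41 } },
    { "KEY_ENTER", { 28, 40 } },
    { "KEY_A",     { 30,  4 } },
};

#define KEYMAP_LEN (sizeof(keymap) / sizeof(keymap[0]))

/* Look up a key by name and return its value in 'set', or -1. */
int keycode_from_name(enum codeset set, const char *name)
{
    for (size_t i = 0; i < KEYMAP_LEN; i++) {
        if (strcmp(keymap[i].name, name) == 0)
            return keymap[i].codes[set];
    }
    return -1;
}

/* Translate 'value' from one code set to another, or return -1 if the
 * value is not present in the table. */
int keycode_translate(enum codeset from, enum codeset to, int value)
{
    for (size_t i = 0; i < KEYMAP_LEN; i++) {
        if (keymap[i].codes[from] == value)
            return keymap[i].codes[to];
    }
    return -1;
}
```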
When using virCommandRunAsync and saving the pid for later, it
is useful to be able to reap that pid in the same way that it
would have been auto-reaped by virCommand if we had passed
NULL for the pid argument in the first place.
* src/util/command.c (virPidWait, virPidAbort): New functions,
created from...
(virCommandWait, virCommandAbort): ...bodies of these.
(includes): Drop duplicate <stdlib.h>. Ensure that our pid_t
assumptions hold.
(virCommandRunAsync): Improve documentation.
* src/util/command.h (virPidWait, virPidAbort): New prototypes.
* src/libvirt_private.syms: Export them.
* docs/internals/command.html.in: Document them.
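Reaping a saved pid usually comes down to a waitpid() loop that retries on EINTR. The sketch below illustrates that pattern; `pid_wait` is a hypothetical name and its exact semantics are an assumption of this example, not a description of virPidWait.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Wait for child 'pid' to exit.
 * If 'exitstatus' is non-NULL the raw wait status is stored there and
 * 0 is returned.  If it is NULL, any non-zero status is an error.
 * Returns -1 on invalid pid or waitpid() failure. */
int pid_wait(pid_t pid, int *exitstatus)
{
    int status;
    pid_t ret;

    if (pid <= 0)
        return -1;

    /* Retry if a signal interrupts the wait. */
    while ((ret = waitpid(pid, &status, 0)) == -1 && errno == EINTR)
        ;

    if (ret != pid)
        return -1;

    if (exitstatus == NULL)
        return status == 0 ? 0 : -1;

    *exitstatus = status;
    return 0;
}
```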
Getting metadata on storage allocates memory (path) which needs to
be freed after use; otherwise it is leaked. This means that after using
virStorageFileGetMetadataFromFD or virStorageFileGetMetadata one
must call virStorageFileFreeMetadata to free it. This function frees
the structure's internals and the structure itself.
When passing through filesystems from the host to a guest, the
host filesystem passed must be audited.
* src/conf/domain_audit.{c,h}: Add virDomainAuditFS
The LXC and UML drivers can both make use of auditing. Move
the qemu_audit.{c,h} files to src/conf/domain_audit.{c,h}
* src/conf/domain_audit.c: Rename from src/qemu/qemu_audit.c
* src/conf/domain_audit.h: Rename from src/qemu/qemu_audit.h
* src/Makefile.am: Remove qemu_audit.{c,h}, add domain_audit.{c,h}
* src/qemu/qemu_audit.h, src/qemu/qemu_cgroup.c,
src/qemu/qemu_command.c, src/qemu/qemu_driver.c,
src/qemu/qemu_hotplug.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c: Update for changed audit API names
Avoid re-formatting the pidfile path every time we need it. Create
it once when starting the guest, and preserve it until the guest
is shut down.
* src/libvirt_private.syms, src/util/util.c,
src/util/util.h: Add virFileReadPidPath
* src/qemu/qemu_domain.h: Add pidfile field
* src/qemu/qemu_process.c: Store pidfile path in qemuDomainObjPrivate
This option accepts 3 values:
-keep, to keep current client connected (Spice+VNC)
-disconnect, to disconnect client (Spice)
-fail, to fail setting password if there is a client connected (Spice)
The next patch wants to adjust an end pointer to trim trailing
spaces but without modifying the underlying string, but a more
generally useful ability to trim trailing spaces in place is
also worth providing.
* src/util/util.h (virTrimSpaces, virSkipSpacesBackwards): New
prototypes.
* src/util/util.c (virTrimSpaces, virSkipSpacesBackwards): New
functions.
* src/libvirt_private.syms (util.h): Export new functions.
Inspired by a patch by Minoru Usui.
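The two abilities described above, moving an end pointer back past trailing spaces without touching the string and trimming in place, can be sketched as below. The names and signatures here are hypothetical simplifications and do not reproduce the prototypes added to util.h.

```c
#include <ctype.h>
#include <string.h>

/* Return the position just past the last non-space character in
 * [str, end).  The string itself is not modified. */
const char *skip_spaces_backwards(const char *str, const char *end)
{
    while (end > str && isspace((unsigned char)end[-1]))
        end--;
    return end;
}

/* Remove trailing whitespace from 'str' in place by writing a NUL
 * terminator at the position found above. */
void trim_spaces(char *str)
{
    size_t keep = skip_spaces_backwards(str, str + strlen(str)) - str;

    str[keep] = '\0';
}
```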
Most clients of virSkipSpaces don't want to skip backslashes.
Also, open-coding the list of spaces is not as nice as using
c_isspace.
* src/util/util.c (virSkipSpaces): Use c_isspace.
(virSkipSpacesAndBackslash): New function.
* src/util/util.h (virSkipSpacesAndBackslash): New prototype.
* src/xen/xend_internal.c (sexpr_to_xend_topology): Update caller.
* src/libvirt_private.syms (util.h): Export new function.
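The split between the two behaviours can be illustrated with a pair of small functions built on the standard isspace(); the patch itself uses gnulib's c_isspace, which is locale-independent. The names and return-value style below are hypothetical.

```c
#include <ctype.h>

/* Advance past leading whitespace only. */
const char *skip_spaces(const char *s)
{
    while (isspace((unsigned char)*s))
        s++;
    return s;
}

/* Advance past leading whitespace and backslashes, for the rare
 * caller that wants both skipped. */
const char *skip_spaces_and_backslash(const char *s)
{
    while (isspace((unsigned char)*s) || *s == '\\')
        s++;
    return s;
}
```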
Add a new API pciDeviceReAttachInit() in pci.c to initialize state
values for nodedev reattach. It initializes three state values of the
device driver to 1. This is just for a new call to
qemudNodeDeviceReAttach().
Add a new security driver method for labelling an FD with
the process label, rather than the image label
* src/libvirt_private.syms, src/security/security_apparmor.c,
src/security/security_dac.c, src/security/security_driver.h,
src/security/security_manager.c, src/security/security_manager.h,
src/security/security_selinux.c, src/security/security_stack.c:
Add virSecurityManagerSetProcessFDLabel & impl
The virSecurityManagerSetFDLabel method is used to label
file descriptors associated with disk images. There will
shortly be a need to label other file descriptors in a
different way. So the current name is ambiguous. Rename
the method to virSecurityManagerSetImageFDLabel to clarify
its purpose
* src/libvirt_private.syms,
src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
src/security/security_apparmor.c, src/security/security_dac.c,
src/security/security_driver.h, src/security/security_manager.c,
src/security/security_manager.h, src/security/security_selinux.c,
src/security/security_stack.c: s/FDLabel/ImageFDLabel/
We already have a public virDomainPinVcpu, which implies that
Pin and Vcpu are treated as separate words. Unreleased commit
e261987c introduced virDomainGetVcpupinInfo as the first public
API that used Vcpupin, although we had prior internal uses of
that spelling. For consistency, change the spelling to be two
words everywhere, regardless of whether pin comes first or last.
* daemon/remote.c: Treat vcpu and pin as separate words.
* include/libvirt/libvirt.h.in: Likewise.
* src/conf/domain_conf.c: Likewise.
* src/conf/domain_conf.h: Likewise.
* src/driver.h: Likewise.
* src/libvirt.c: Likewise.
* src/libvirt_private.syms: Likewise.
* src/libvirt_public.syms: Likewise.
* src/libxl/libxl_driver.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/remote/remote_driver.c: Likewise.
* src/xen/xend_internal.c: Likewise.
* tools/virsh.c: Likewise.
* src/remote/remote_protocol.x: Likewise.
* src/remote_protocol-structs: Likewise.
Suggested by Matthias Bolte.