This patch resolves the problem reported in:
https://bugzilla.redhat.com/show_bug.cgi?id=886663
The source of the problem was the fix for CVE 2011-3411:
https://bugzilla.redhat.com/show_bug.cgi?id=833033
which was originally committed upstream in commit
753ff83a50. That commit improperly
removed the "--except-interface lo" from dnsmasq commandlines when
--bind-dynamic was used (based on comments in the latter bug).
It turns out that the problem reported in the CVE could be eliminated
without removing "--except-interface lo", and removing it actually
caused each instance of dnsmasq to listen on localhost on port 53,
which created a new problem:
If another instance of dnsmasq using "bind-interfaces" (instead of
"bind-dynamic") had already been started (or if another instance
started later used "bind-dynamic"), this wouldn't have any immediately
visible ill effects, but if you tried to start another dnsmasq
instance using "bind-interfaces" *after* starting any libvirt
networks, the new dnsmasq would fail to start, because there was
already another process listening on port 53.
(Subsequent to the CVE fix, another patch changed the network driver
to put dnsmasq options in a conf file rather than directly on the
dnsmasq commandline, but preserved the same options.)
This patch changes the network driver to *always* add
"except-interface=lo" to dnsmasq conf files, regardless of whether we use
bind-dynamic or bind-interfaces. This way no libvirt dnsmasq instances
are listening on localhost (and the CVE is still fixed).
The actual code change is miniscule, but must be propogated through all
of the test files as well.
The default lockd driver behavour is to acquire leases
directly on the disk files. This introduces an alternative
mode, where leases are acquire indirectly on a file that
is based on a SHA256 hash of the disk filename.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This adds a 'lockd' lock driver which is just a client which
talks to the lockd daemon to perform all locking. This will
be the default lock driver for any hypervisor which needs one.
* src/Makefile.am: Add lockd.so plugin
* src/locking/lock_driver_lockd.c: Lockd driver impl
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virtlockd daemon maintains file locks on behalf of libvirtd
and any VMs it is running. These file locks must be held for as
long as any VM is running. If virtlockd itself ever quits, then
it is expected that a node would be fenced/rebooted. Thus to
allow for software upgrads on live systemd, virtlockd needs the
ability to re-exec() itself.
Upon receipt of SIGUSR1, virtlockd will save its current live
state out to a file /var/run/virtlockd-restart-exec.json
It then re-exec()'s itself with exactly the same argv as it
originally had, and loads the state file, reconstructing any
objects as appropriate.
The state file contains information about all locks held and
all network services and clients currently active. An example
state document is
{
"server": {
"min_workers": 1,
"max_workers": 20,
"priority_workers": 0,
"max_clients": 20,
"keepaliveInterval": 4294967295,
"keepaliveCount": 0,
"keepaliveRequired": false,
"services": [
{
"auth": 0,
"readonly": false,
"nrequests_client_max": 1,
"socks": [
{
"fd": 6,
"errfd": -1,
"pid": 0,
"isClient": false
}
]
}
],
"clients": [
{
"auth": 0,
"readonly": false,
"nrequests_max": 1,
"sock": {
"fd": 9,
"errfd": -1,
"pid": 0,
"isClient": true
},
"privateData": {
"restricted": true,
"ownerPid": 1722,
"ownerId": 6,
"ownerName": "f18x86_64",
"ownerUUID": "97586ba9-df27-9459-c806-f016c8bbd224"
}
},
{
"auth": 0,
"readonly": false,
"nrequests_max": 1,
"sock": {
"fd": 10,
"errfd": -1,
"pid": 0,
"isClient": true
},
"privateData": {
"restricted": true,
"ownerPid": 1784,
"ownerId": 7,
"ownerName": "f16x86_64",
"ownerUUID": "7b8e5e42-b875-61e9-b981-91ad8fa46979"
}
}
]
},
"defaultLockspace": {
"resources": [
{
"name": "/var/lib/libvirt/images/f16x86_64.raw",
"path": "/var/lib/libvirt/images/f16x86_64.raw",
"fd": 14,
"lockHeld": true,
"flags": 0,
"owners": [
1784
]
},
{
"name": "/var/lib/libvirt/images/shared.img",
"path": "/var/lib/libvirt/images/shared.img",
"fd": 12,
"lockHeld": true,
"flags": 1,
"owners": [
1722,
1784
]
},
{
"name": "/var/lib/libvirt/images/f18x86_64.img",
"path": "/var/lib/libvirt/images/f18x86_64.img",
"fd": 11,
"lockHeld": true,
"flags": 0,
"owners": [
1722
]
}
]
},
"lockspaces": [
],
"magic": "30199"
}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This enhancement virtlockd so that it can receive a pre-opened
UNIX domain socket from systemd at launch time, and adds the
systemd service/socket unit files
* daemon/libvirtd.service.in: Require virtlockd to be running
* libvirt.spec.in: Add virtlockd systemd files
* src/Makefile.am: Install systemd files
* src/locking/lock_daemon.c: Support socket activation
* src/locking/virtlockd.service.in, src/locking/virtlockd.socket.in:
systemd unit files
* src/rpc/virnetserverservice.c, src/rpc/virnetserverservice.h:
Add virNetServerServiceNewFD() method
* src/rpc/virnetsocket.c, src/rpc/virnetsocket.h: Add virNetSocketNewListenFD
method
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce a lock_daemon_dispatch.c file which implements the
server side dispatcher the RPC APIs previously defined in the
lock protocol.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virtlockd daemon will be responsible for managing locks
on virtual machines. Communication will be via the standard
RPC infrastructure. This provides the XDR protocol definition
* src/locking/lock_protocol.x: Wire protocol for virtlockd
* src/Makefile.am: Include lock_protocol.[ch] in virtlockd
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virtlockd daemon will maintain locks on behalf of libvirtd.
There are two reasons for it to be separate
- Avoid risk of other libvirtd threads accidentally
releasing fcntl() locks by opening + closing a file
that is locked
- Ensure locks can be preserved across libvirtd restarts.
virtlockd will need to be able to re-exec itself while
maintaining locks. This is simpler to achieve if its
sole job is maintaining locks
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Refactor virLockManagerPluginNew() so that the caller does
not need to pass in the config file path itself - just the
config directory and driver name.
Fix QEMU to actually pass in a config file when creating the
default lock manager plugin, rather than NULL.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current virStorageFileGet{LVM,SCSI}Key methods return
the key as the return value. Unfortunately it is desirable
for "NULL" to be a valid return value, as well as an error
indicator. Thus the returned key must instead be provided
as an out-parameter.
When we invoke lvs or scsi_id to extract ID for block devices,
we don't want virCommandWait logging errors messages. Thus we
must explicitly check 'status != 0', rather than letting
virCommandWait do it.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The libxl driver ignored boot devices in the domain config,
preventing PXE booting HVM domains. This patch accounts for
user-specified boot devices when building the libxl domain
configuration.
The QED file format is non-versioned, so although the magic
value matched, libvirt rejected it due to lack of a version
number to compare against. We need to distinguish this case
by allowing a value of '-2' to indicate a non-versioned file
where only the magic is required to match
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To help us detect when new storage file versions come into
existance log a warning if the storage file magic matches,
but the version does not
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Fully stub out the virCgroupGetAppRoot method as done with other
methods in the file, rather than just the body. This lets us
annotate the unused parameter to avoid a warning
I noticed that /var/lib/libvirt/dnsmasq/*.conf used the wrong word;
it was intended to match the wording in src/util/xml.c.
* src/network/bridge_driver.c (networkDnsmasqConfContents): Fix typo.
* tests/networkxml2confdata/*.conf: Update accordingly.
* Autotools changes:
- Don't assume Qemu is Linux-only
- Check Linux headers only on Linux
- Disable firewalld on FreeBSD
* Initctl:
Initctl seem to present only on Linux, so stub it on other platforms
* Raw I/O: Linux-only as well
* Headers cleanup
make check fails in check-symsorting if configure is not run in
the source directory. Prefixing symfile names with $(srcdir)
fixes this.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
This patch adds a new domain lookup helper qemuDomObjFromDomainDriver
that lookups the domain and leaves the driver locked. The driver is
returned as the second argument of that function. If the lookup fails
the driver is unlocked to help avoid cleanup codepaths.
This patch also improves docs for the helpers.
This patch gets rid of the undeterministic error reporting code done on
return values of get(pw|gr)nam_r. With this patch, if the group record
is not returned by the corresponding function this error is not
considered fatal even if errno != 0. The error is logged in such case.
Move the code for matching hostdev instances out of virDomainHostdevFind
and into virDomainHostdevMatch method, which in turn calls out to other
helper methods depending on the type of hostdev.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Rename virDomainHostdevPartsParse to virDomainHostdevDefParseSubsys
to reflect the fact that it only deals with hostdevs uing the
traditional mode=subsystem, and not mode=capabilities
Rename virDomainHostSourceFormat to virDomainHostdevDefFormatSubsys
for the same reason.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
virStorageFileGetLVMKey and virStorageFileGetSCSIKey
both return heap allocated strings, so the return value
should not be marked const.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add check-symsorting.pl to perform case-insensitive alphabetical
sorting of groups of symbols. Fix all violations it reports
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When using vnc gaphics over a unix socket, virt-aa-helper needs to provide
access for the qemu domain to access the sockfile.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
When a qemu domain is backed by huge pages, apparmor needs to grant the domain
rw access to files under the hugetlbfs mount point. Add a hook, called in
qemu_process.c, which ends up adding the read-write access through
virt-aa-helper. Qemu will be creating a randomly named file under the
mountpoint and unlinking it as soon as it has mmap()d it, therefore we
cannot predict the full pathname, but for the same reason it is generally
safe to provide access to $path/**.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
This patch exports qemuMigrationIsAllowed and adds a new parameter to it
to denote if it's a remote migration or a local migration. Local
migrations are used in snapshots and saving of the machine state and
have fewer restrictions. This patch also adjusts callers of the function
and tweaks some error messages to be more universal.
Currently there is no way to detect it via QMP and requesting "-sandbox
off" works correctly even if it was compiled out, so this will work
unless someone both requests the sandbox in qemu.conf and builds QEMU
without the support for it.
Interfaces keeps a class_id, which is an ID from which bridge
part of QoS settings is derived. We need to store class_id
in domain status file, so we can later pass it to
virNetDevBandwidthUnplug.
Currently, we are only keeping a inactive XML configuration
in status dir. This is no longer enough as we need to keep
this class_id attribute so we don't overwrite old entries
when the daemon restarts. However, since there has already
been release which has just <network/> as root element,
and we want to keep things compatible, detect that loaded
status file is older one, and don't scream about it.
Network should be notified if we plug in or unplug an
interface, so it can perform some action, e.g. set/unset
network part of QoS. However, we are doing this in very
early stage, so iface->ifname isn't filled in yet. So
whenever we want to report an error, we must use a different
identifier, e.g. the MAC address.
This will be used whenever a NIC with guaranteed throughput is to
be plugged into a bridge. It will adjust the average throughput of
non guaranteed NICs (classid 1:2) to meet new requirements.
These set bridge part of QoS when bringing domain's interface up.
Long story short, if there's a 'floor' set, a new QoS class is created.
ClassID MUST be unique within the bridge and should be kept for
unplug phase.
These classes can borrow unused bandwidth. Basically,
only egress qdsics can have classes, therefore we can
do this kind of traffic shaping only on host's outgoing,
that is domain's incoming traffic.
This is however supported only on domain interfaces with
type='network'. Moreover, target network needs to have at least
inbound QoS set. This is required by hierarchical traffic shaping.
From now on, the required attribute for <inbound/> is either 'average'
(old) or 'floor' (new). This new attribute can be used just for
interfaces type of network (<interface type='network'/>) currently.
Stochastic Fairness Queuing (SFQ) is queuing discipline
(qdisc) which doesn't really shape any traffic but 'just'
re-arrange packets in sending buffer so no stream starve.
The goal is to ensure fairness. There is basically only one
configuration parameter (perturb) which is set to advised
value of 10.
Network adapters of type 'routed' is a special case. Other adapters
have 'network' parameter in prlctl's output instead.
Routed network adapters should be connected to 'routed' network
from libvirt's view.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Historically if traffic from the adapter is routed to LAN without
NAT, it isn't connected to any virtual networks, but has a 'type'
instead. Sinse libvirt has special virtual network type for such case,
let's add pseudo network 'routed' to fit libvirt's API well.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Parallels Cloud Server uses virtual networks model for network
configuration. It uses own tools for virtual network management.
So add network driver, which will be responsible for listing
virtual networks and performing different operations on them
(in consequent patched).
This patch only allows listing virtual network names, without
any parameters like DHCP server settings.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Allow changing network interfaces in domain configuration.
ifname is used as iterface identifier: if there is interface
with some ifname in old config and there are no interfaces with
such name in the new config - issue prlctl command to delete
the network interface. And vice versa - if interface with
some ifname exists only in new config - issue prlctl command
to create it.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Parse network interfaces info from prlctl output.
Parallels Cloud Server uses virtual networks model for
network configuration: You can add network adapter to
VM and connect it to some predefined virtual network.
Fill type, mac, network name and linkstate fields of
virDomainNetDef structure.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
When the disk snapshot part of an external system checkpoint fails the
memory image is retained. This patch adds code to remove the image in
such case.
In case the snapshot code isn't able to restart CPUs after an external
checkpoint we would leak a copy of the domains XML definition. This
patch fixes the cleanup path.
False positive, but it breaks the build with gcc-4.6.3.
qemu/qemu_migration.c:2931:37: error: 'offline' may be used
uninitialized in this function [-Werror=uninitialized]
qemu/qemu_migration.c:2887:10: note: 'offline' was declared here
This patch changes how parameters are passed to dnsmasq. Instead of
being on the command line, the parameters are put into a file (one
parameter per line) and a commandline --conf-file= specifies the
location of the file. The file is located in the same directory as
the leases file.
Putting the dnsmasq parameters into a configuration file
allows them to be examined and more easily understood than
examining the command lines displayed by "ps ax". This is
especially true when a number of networks have been started.
When the use of dnsmasq was originally done, the required command line
was simple, but it has gotten more complicated over time and will
likely become even more complicated in the future.
Note: The test conf files have all been renamed .conf instead of
.argv, and tests/networkxml2xmlargvdata was moved to
tests/networkxml2xmlconfdata.
The DHCPv6 support includes IPV6 dhcp-range and dhcp-host for one
IPv6 subnetwork on one interface. This support will only work
if dnsmasq version >= 2.64; otherwise an error occurs if
dhcp-range or dhcp-host is specified for an IPv6 address.
Essentially, this change provides the same DHCP support for IPv6
that has been available for IPv4.
With dnsmasq >= 2.64, support for the RA service is also now provided
by dnsmasq (radvd is no longer used/started). (Although at least one
version of dnsmasq prior to 2.64 "supported" IPv6 Router
Advertisement, there were bugs (fixed in 2.64) that rendered it
unusable.)
Documentation and the network schema has been updated
to reflect the new support.
virNetworkDefUpdateForward requires separate functions to parse and
clear a virNetworkForwardDef by itself, but they were previously just
inlined in the virNetworkDef parse and free functions. This patch
makes them into separate functions.
The attributes of a <network> element's <forward> element were
previously stored directly in the virNetworkDef object, but
virNetworkUpdateForward() needs to operate on a <forward> in
isolation, so this patchs pulls out all those attributes into a
separate virNetworkForwardDef struct (and shortens their names
appropriately). This new object is contained in the virNetworkDef, not
pointed to by it, so there is no extra memory management.
This patch makes no functional changes, it only changes, e.g.,
"nForwardIfs" to "forward.nifs".
The other clear functions in network_conf.c that clear out arrays of
sub-objects do so by using the n[itemname]s value as a counter going
down to 0. Make this one consistent. There's no functional value, just
makes the style more consistent with the rest of the file.
This makes some function names and arg lists for consistent with other
parse functions in network_conf.c. While modifying
virNetworkIPParseXML(), also change its "error" label to "cleanup",
since the code at that label is executed on success as well as
failure.
These three functions are very similar - none allow a MODIFY
operation; you can only add or delete.
The biggest difference between them (other than the data itself) is in
the criteria for determining a match, and whether or not multiple
matches are possible:
1) for HOST records, it's considered a match if the IP address or any
of the hostnames of an existing record matches.
2) for SRV records, it's a match if all of
domain+service+protocol+target *which have been specified* are
matched.
3) for TXT records, there is only a single field to match - name
(value can be the same for multiple records, and isn't considered a
search term), so by definition there can be no ambiguous matches.
In all three cases, if any matches are found, ADD will fail; if
multiple matches are found, it means the search term was ambiguous,
and a DELETE will fail.
The upper level code in bridge_driver.c is already implemented for
these functions - appropriate conf files will be re-written, and
dnsmasq will be SIGHUPed or restarted as appropriate.
Since there is only a single virNetworkDNSDef for any virNetworkDef,
and it's trivial to determine whether or not it contains any real
data, it's much simpler (and fits more uniformly with the parse
function calling sequence of the parsers for many other objects that
are subordinates of virNetworkDef) if virNetworkDef *contains* an
virNetworkDNSDef rather than pointing to one.
Since it is now just a part of another object rather than its own
object, it no longer makes sense to have a *Free() function, so that
is changed to a *Clear() function.
More importantly though, ParseXML and Clear functions are needed for
the individual items contained in a virNetworkDNSDef (srv, txt, and
host records), but none of them have a *Clear(), and only two of the
three had *ParseXML() functions (both of which used a non-uniform
arglist). Those problems are cleared up by this patch - it splits the
higher-level Clear function into separate functions for each of the
three, creates a parse for txt records, and cleans up the srv and host
parsers, so we now have all the utility functions necessary to
implement virNetworkDefUpdateDNS(Host|Srv|Txt).
This shortens the name of the structs for srv and txt, and their
instances in virNetworkDNSDef, to be more compact and uniform with the
naming of the dns host array. It also changes the type of ntxts, etc
from unsigned int to size_t, so that they can be used directly as args
to VIR_*_ELEMENT.
The already-written backend functions for virNetworkUpdate that add
and delete items into lists within the a network were already debugged
to work properly, but future such functions will use
VIR_(INSERT|DELETE)_ELEMENT instead, so these are changed for
uniformity.
I noticed when writing the backend functions for virNetworkUpdate that
I was repeating the same sequence of memmove, VIR_REALLOC, nXXX-- (and
messed up the args to memmove at least once), and had seen the same
sequence in a lot of other places, so I decided to write a few
utility functions/macros - see the .h file for full documentation.
The intent is to reduce the number of lines of code, but more
importantly to eliminate the need to check the element size and
element count arithmetic every time we need to do this (I *always*
make at least one mistake.)
VIR_INSERT_ELEMENT: insert one element at an arbitrary index within an
array of objects. The size of each object is determined
automatically by the macro using sizeof(*array). The new element's
contents are copied into the inserted space, then the original copy
of contents are 0'ed out (if everything else was
successful). Compile-time assignment and size compatibility between
the array and the new element is guaranteed (see explanation below
[*])
VIR_INSERT_ELEMENT_COPY: identical to VIR_INSERT_ELEMENT, except that
the original contents of newelem are not cleared to 0 (i.e. a copy
is made).
VIR_APPEND_ELEMENT: This is just a special case of VIR_INSERT_ELEMENT
that "inserts" one past the current last element.
VIR_APPEND_ELEMENT_COPY: identical to VIR_APPEND_ELEMENT, except that
the original contents of newelem are not cleared to 0 (i.e. a copy
is made).
VIR_DELETE_ELEMENT: delete one element at an arbitrary index within an
array of objects. It's assumed that the element being deleted is
already saved elsewhere (or cleared, if that's what is appropriate).
All five of these macros have an _INPLACE variant, which skips the
memory re-allocation of the array, assuming that the caller has
already done it (when inserting) or will do it later (when deleting).
Note that VIR_DELETE_ELEMENT* can return a failure, but only if an
invalid index is given (index + amount to delete is > current array
size), so in most cases you can safely ignore the return (that's why
the helper function virDeleteElementsN isn't declared with
ATTRIBUTE_RETURN_CHECK). A warning is logged if this ever happens,
since it is surely a coding error.
[*] One initial problem with the INSERT and APPEND macros was that,
due to both the array pointer and newelem pointer being cast to void*
when passing to virInsertElementsN(), any chance of type-checking was
lost. If we were going to move in newelem with a memmove anyway, we
would be no worse off for this. However, most current open-coded
insert/append operations use direct struct assignment to move the new
element into place (or just populate the new element directly) - thus
use of the new macros would open a possibility for new usage errors
that didn't exist before (e.g. accidentally sending &newelemptr rather
than newelemptr - I actually did this quite a lot in my test
conversions of existing code).
But thanks to Eric Blake's clever thinking, I was able to modify the
INSERT and APPEND macros so that they *do* check for both assignment
and size compatibility of *ptr (an element in the array) and newelem
(the element being copied into the new position of the array). This is
done via clever use of the C89-guaranteed fact that the sizeof()
operator must have *no* side effects (so an assignment inside sizeof()
is checked for validity, but not actually evaluated), and the fact
that virInsertElementsN has a "# of new elements" argument that we
want to always be 1.
When restarting CPUs after an external snapshot, the restarting function
was called without the appropriate async job type. This caused that a
new sync job wasn't created and allowed races in the monitor.
New VM will have default values for all parameters, like
cpu number, we have to change its configuration as provided
by xml definition, given to parallelsDomainDefineXML.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Implement creation of new disks - if a new disk found
in configuration, find a volume by disk path and
actually create a disk image by issuing prlctl command.
If it's successfully finished - remove the file with volume
definition.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Add function for convertion bus from libvirt's numeric constant
to a name, used in a parallels command-line tools.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Move part, which deletes existing volume, to a new function
parallelsStorageVolumeDefRemove so that we can use it later
in parallels_driver.c
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Read disk images size from xml description and fill
virStorageVolDef.capacity and allocation (let's consider
that allocation is the same as capacity, calculating real
allcoation will be implemented later).
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Disk images in Parallels Cloud Server stored in directories. Each
one has files with data and xml description of an image stored in
file DiskDescriptior.xml.
Since we have to support 'detached' images, which are not used by
any VM, the better way to collect info about volumes is searching for
directories with a file DiskDescriptior.xml in each VM directory.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
There are no storage pools in Parallels Cloud Server -
All VM data stored in a single directory: config, snapshots,
memory dump together with disk images.
Let's look through list of VMs and create a storage pool for
each directory, containing VMs.
So if you have 3 vms: /var/parallels/vm-1.pvm,
/var/parallels/vm-2.pvm and /root/test.pvm - 2 storage pools
appear: -var-parallels and -root. xml descriptions of the pools
will be saved in /etc/libvirt/parallels-storage, so UUIDs will
not change netween connections to libvirt.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
We don't support unprivileged users anymore, so remove code, which
selects configuration directory depending on user.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Allow changing some parameters of the hard disks: bus,
image and drive address.
Creating new disk devices and removing existing ones
require changes in the storage driver, so it will be
implemented later.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Offline migration transfers inactive definition of a domain (which may
or may not be active). After successful completion, the domain remains
in its current state on source host and is defined but inactive on
destination host. It's a bit more clever than virDomainGetXMLDesc() on
source host followed by virDomainDefineXML() on destination host, as
offline migration will run pre-migration hook to update the domain XML
on destination host. Currently, copying non-shared storage is not
supported during offline migration.
Offline migration can be requested with a new migration flag called
VIR_MIGRATE_OFFLINE (which has to be combined with
VIR_MIGRATE_PERSIST_DEST flag).
This fixes a problem that showed up during testing of:
https://bugzilla.redhat.com/show_bug.cgi?id=881480
Due to a logic error in the function that gets the name of the bridge
an interface connects to, any time a bridge was specified directly
(type='bridge') rather than indirectly (type='network'), An error
would be logged (although the operation would then complete
successfully):
Network type 6 is not supported
The final virReportError() in the function
qemuDomainNetGetBridgeName() was apparently avoided in the past with a
"goto cleanup" at the end of each case, but the case of bridge somehow
no longer has that final goto cleanup.
The proper solution is anyway to not rely on goto's, but put the error
log inside an else {} clause, so that it's executed only if the type
is neither bridge nor network (in reality, this function should only
ever be called for those two types, that's why this is an internal
error).
While making this change, the error message was also tuned to be more
correct (since it's not really the type of the network, but the type
of the interface, and it *is* otherwise supported, it's just that the
interface type in question doesn't *have* a bridge device associated
with it, or at least we don't know how to get it).
If a network interface model is not specified, libvirt will run
into an unchecked NULL pointer coredump. On the other hand if
the empty model is ignored, a PCI bus address would be generated,
which is not supported by S390.
Since the only valid network type model for S390 is virtio,
we use this as the default value, which is the same for QEMU.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Things are supposed to look like:
<machine canonical='pc-0.12'>pc</machine>
But are currently swapped. This can cause many VMs to revert to having
machine type='pc' which will affect save/restore across qemu upgrades.
Add VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA flag to virStorageVolCreateXML
and virStorageVolCreateXMLFrom. This flag requests metadata
preallocation when creating/cloning qcow2 images, resulting in creating
a sparse file with qcow2 metadata. It has only slightly larger disk usage
compared to new image with no allocation, but offers higher performance.
QEMU supports setting vendor and product strings for disk since
1.2.0 (only scsi-disk, scsi-hd, scsi-cd support it), this patch
exposes it with new XML elements <vendor> and <product> of disk
device.
Based on a patch originally authored by Daniel De Graaf
http://lists.xen.org/archives/html/xen-devel/2012-05/msg00565.html
This patch converts the Xen libxl driver to support only Xen >= 4.2.
Support for Xen 4.1 libxl is dropped since that version of libxl is
designated 'technology preview' only and is incompatible with Xen 4.2
libxl. Additionally, the default toolstack in Xen 4.1 is still xend,
for which libvirt has a stable, functional driver.
virGetGroupIDByName is documented as returning 1 if the groupname
cannot be found. getgrnam_r is documented as returning:
« 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name
or gid was not found. »
and that:
« The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
It does not call "not found" an error, hence does not specify what
value errno might have in this situation. But that makes it impossible to
recognize errors. One might argue that according to POSIX errno should be
left unchanged if an entry is not found. Experiments on various UNIX-like
systems shows that lots of different values occur in this situation: 0,
ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetGroupIDByName returns an error when the return value of getgrnam_r
is non-0. However on my RHEL system, getgrnam_r returns ENOENT when the
requested user cannot be found, which then causes virGetGroupID not
to behave as documented (it returns an error instead of falling back
to parsing the passed-in value as an gid).
This commit makes virGetGroupIDByName only report an error when errno
is set to one of the values in the posix description of getgrnam_r
(which are the same as the ones described in the manpage on my system).
virGetUserIDByName is documented as returning 1 if the username
cannot be found. getpwnam_r is documented as returning:
« 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name
or uid was not found. »
and that:
« The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
It does not call "not found" an error, hence does not specify what
value errno might have in this situation. But that makes it impossible to
recognize errors. One might argue that according to POSIX errno should be
left unchanged if an entry is not found. Experiments on various UNIX-like
systems shows that lots of different values occur in this situation: 0,
ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetUserIDByName returns an error when the return value of getpwnam_r
is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the
requested user cannot be found, which then causes virGetUserID not
to behave as documented (it returns an error instead of falling back
to parsing the passed-in value as an uid).
This commit makes virGetUserIDByName only report an error when errno
is set to one of the values in the posix description of getpwnam_r
(which are the same as the ones described in the manpage on my system).
If debugging is enabled, the debug messages are sent to stderr.
Moreover, if a command has catching of stderr set, the messages
gets mixed with stdout output (assuming both outputs are stored
in the same variable). The resulting string then doesn't
necessarily have to start with desired prefix then. This bug
exposes itself when parsing dnsmasq output:
2012-12-06 11:18:11.445+0000: 18491: error :
dnsmasqCapsSetFromBuffer:664 : internal error cannot parse
/usr/sbin/dnsmasq version number in '2012-12-06
11:11:02.232+0000: 18492: debug : virFileClose:72 : Closed fd 22'
We can clearly see that the output of dnsmasq --version doesn't
start with expected "Dnsmasq version " string but a libvirt debug
output.
If the debugging is enabled, the virCommand subsystem catches debug
messages in the command output as well. In that case, we can't assume
the string corresponding to command's stdout will start with specific
prefix. But the prefix can be moved deeper in the string. This bug
shows itself when parsing dnsmasq output:
2012-12-06 11:18:11.445+0000: 18491: error :
dnsmasqCapsSetFromBuffer:664 : internal error cannot parse
/usr/sbin/dnsmasq version number in '2012-12-06 11:11:02.232+0000:
18492: debug : virFileClose:72 : Closed fd 22'
We can clearly see that the output of dnsmasq --version
doesn't start with expected "Dnsmasq version " string but a libvirt
debug output.
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=767057
It was possible to define a network with <forward mode='bridge'> that
had both a bridge device and a forward device defined. These two are
mutually exclusive by definition (if you are using a bridge device,
then this is a host bridge, and if you have a forward dev defined,
this is using macvtap). It was also possible to put <ip>, <dns>, and
<domain> elements in this definition, although those aren't supported
by the current driver (although it's conceivable that some other
driver might support that).
The items that are invalid by definition, are now checked in the XML
parser (since they will definitely *always* be wrong), and the others
are checked in networkValidate() in the network driver (since, as
mentioned, it's possible that some other network driver, or even this
one, could some day support setting those).
This patch adds the capability for virtual guests to do IPv6
communication via a virtual network interface with no IPv6 (gateway)
addresses specified. This capability has always been enabled by
default for IPv4, but disabled for IPv6 for security concerns, and
because it requires the ip6tables command to be operational (which
isn't the case on a system with the ipv6 module completely disabled).
This patch adds a new attribute "ipv6" at the toplevel of a <network>
object. If ipv6='yes', the extra ip6tables rules required to permite
inter-guest communications are added when the network is started. If
it is 'no', or not present, those rules will not be added; thus the
default behavior doesn't change, so there should be no compatibility
issues with any existing installations.
Note that virtual guests cannot communication with the virtualization
host via this interface, because the following kernel tunable has
been set:
net.ipv6.conf.<bridge_interface_name>.disable_ipv6 = 1
This assures that the bridge interface will not have an IPv6
link-local (fe80::) address.
To control this behavior so that it is not enabled by default, the parameter
ipv6='yes' on the <network> statement has been added.
Documentation related to this patch has been updated.
The network schema has also been updated.
https://bugzilla.redhat.com/show_bug.cgi?id=832302
It's odd to fall through to buildVol, and the existed file is
removed when buildVol fails. This checks if the volume target
path already exists in createVol. The reason for not using
error like "Volume already exists" is that there isn't volume
maintained by libvirt for the path until a operation like
pool-refresh, using error like that will just cause confusion.
https://bugzilla.redhat.com/show_bug.cgi?id=866524
Since the virConnect object is not locked wholely when doing
virConenctDispose, a thread can get the lock and thus might
cause the race.
Detected by valgrind:
==23687== Invalid read of size 4
==23687== at 0x38BAA091EC: pthread_mutex_lock (pthread_mutex_lock.c:61)
==23687== by 0x3FBA919E36: remoteClientCloseFunc (remote_driver.c:337)
==23687== by 0x3FBA936BF2: virNetClientCloseLocked (virnetclient.c:688)
==23687== by 0x3FBA9390D8: virNetClientIncomingEvent (virnetclient.c:1859)
==23687== by 0x3FBA851AAE: virEventPollRunOnce (event_poll.c:485)
==23687== by 0x3FBA850846: virEventRunDefaultImpl (event.c:247)
==23687== by 0x40CD61: vshEventLoop (virsh.c:2128)
==23687== by 0x3FBA8626F8: virThreadHelper (threads-pthread.c:161)
==23687== by 0x38BAA077F0: start_thread (pthread_create.c:301)
==23687== by 0x33F68E570C: clone (clone.S:115)
==23687== Address 0x4ca94e0 is 144 bytes inside a block of size 312 free'd
==23687== at 0x4A0595D: free (vg_replace_malloc.c:366)
==23687== by 0x3FBA8588B8: virFree (memory.c:309)
==23687== by 0x3FBA86AAFC: virObjectUnref (virobject.c:145)
==23687== by 0x3FBA8EA767: virConnectClose (libvirt.c:1458)
==23687== by 0x40C8B8: vshDeinit (virsh.c:2584)
==23687== by 0x41071E: main (virsh.c:3022)
The above race is caused by the eventLoop thread tries to handle
the net client event by calling the callback set by:
virNetClientSetCloseCallback(priv->client,
remoteClientCloseFunc,
conn, NULL);
I.E. remoteClientCloseFunc, which lock/unlock the virConnect object.
This patch is to fix the bug by setting the callback to NULL when
doRemoteClose.
The pciWrite32 function assembled the array of data to be written to the
fd with a bad offset on the last byte. This issue was probably caused by
a typo (14, 24).
Unmanaged PCI devices were only leaked if pciDeviceListAdd failed but
managed devices were always leaked. And leaking PCI device is likely to
leave PCI config file descriptor open. This patch fixes
qemuReattachPciDevice to either free the PCI device or add it to the
inactivePciHostdevs list.
An attempt to attach device that is already attached to a domain results
in the following error:
virsh # attach-device rhel6 pci2 --persistent
error: Failed to attach device from pci2
error: invalid argument: device is already in the domain configuration
The "invalid argument" error code looks wrong, we usually use "operation
invalid" when the action cannot be done in current state.
Only one error in qemu_monitor was already using the relatively
new OPERATION_UNSUPPORTED error, even though it is a better fit
for all of the messages related to options that are unsupported
due to the version of qemu in use rather than due to a user's
XML or .conf file choice. Suggested by Osier Yang.
* src/qemu/qemu_monitor.c (qemuMonitorSendFileHandle)
(qemuMonitorAddHostNetwork, qemuMonitorRemoveHostNetwork)
(qemuMonitorAttachDrive, qemuMonitorDiskSnapshot)
(qemuMonitorDriveMirror, qemuMonitorTransaction)
(qemuMonitorBlockCommit, qemuMonitorDrivePivot)
(qemuMonitorBlockJob, qemuMonitorSystemWakeup)
(qemuMonitorGetVersion, qemuMonitorGetMachines)
(qemuMonitorGetCPUDefinitions, qemuMonitorGetCommands)
(qemuMonitorGetEvents, qemuMonitorGetKVMState)
(qemuMonitorGetObjectTypes, qemuMonitorGetObjectProps)
(qemuMonitorGetTargetArch): Use better error category.
Without this patch, attempts to create a disk snapshot when qemu
is too old results in a cryptic message:
virsh # snapshot-create 23 --disk-only
error: operation failed: Failed to take snapshot: unknown command: 'snapshot_blkdev'
Now it reports:
virsh # snapshot-create 23 --disk-only
error: unsupported configuration: live disk snapshot not supported with this QEMU binary
All versions of qemu that support live disk snapshot also support
QMP (basically upstream qemu 1.1 and later, and backports to RHEL 6.2).
* src/qemu/qemu_capabilities.h (QEMU_CAPS_DISK_SNAPSHOT): New
capability.
* src/qemu/qemu_capabilities.c (qemuCaps): Track it.
(qemuCapsProbeQMPCommands): Set it.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive): Use
it.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): Simplify.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextDiskSnapshot):
Delete.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextDiskSnapshot):
Likewise.
RHEL 6.3 uses dbus-devel-1.2.24, which lacked support for the
DBUS_TYPE_UNIX_FD define (contrast with Fedora 18 using 1.6.8).
But since it is an older dbus, it also lacks support for shutdown
inhibitions as provided by newer systemd.
Compilation failure introduced in commit 31330926.
* src/rpc/virnetserver.c (virNetServerAddShutdownInhibition):
Compile out if dbus is too old.
Implement the domainManagedSave, domainHasManagedSaveImage, and
domainManagedSaveRemove functions in the libvirt legacy xen driver.
domainHasManagedSaveImage check the managedsave image from filesystem
everytime. This is different from qemu and libxl driver. In qemu or
libxl driver, there is a hasManagesSave flag in virDomainObjPtr which
is not used in xen legacy driver. This flag could not add into xen
driver ptr either, because the driver ptr will be released at the end of
every libvirt api call. Meanwhile, AFAIK, xen store all the flags in
xen not in libvirt xen driver. There is no need to add this flag in xen.
Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com>
Use the freedesktop inhibition DBus service to prevent host
shutdown or session logout while any VMs are running.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently to deal with auto-shutdown libvirtd must periodically
poll all stateful drivers. Thus sucks because it requires
acquiring both the driver lock and locks on every single virtual
machine. Instead pass in a "inhibit" callback to virStateInitialize
which drivers can invoke whenever they want to inhibit shutdown
due to existance of active VMs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The only important state that should prevent libvirtd shutdown
is from running VMs. Networks, host devices, network filters
and storage pools are all long lived resources that have no
significant in-memory state. They should not block shutdown.
The patch adds the backend driver to support iSCSI format storage pools
and volumes for ESX host. The mapping of ESX iSCSI specifics to Libvirt
is as follows:
1. ESX static iSCSI target <------> Libvirt Storage Pools
2. ESX iSCSI LUNs <------> Libvirt Storage Volumes.
The above understanding is based on http://libvirt.org/storage.html.
The operation supported on iSCSI pools includes:
1. List storage pools & volumes.
2. Get XML descriptor operaion on pools & volumes.
3. Lookup operation on pools & volumes by name, UUID and path (if applicable).
iSCSI pools does not support operations such as: Create / remove pools
and volumes.
Since we can't (currently) rely on the ability to provide blanket
support for all possible network changes by calling the toplevel
netdev hostside disconnect/connect functions (due to qemu only
supporting a lockstep between initialization of host side and guest
side of devices), in order to support live change of an interface's
nwfilter we need to make a special purpose function to only call the
nwfilter teardown and setup functions if the filter for an interface
(or its parameters) changes. The pattern is nearly identical to that
used to change the bridge that an interface is connected to.
This patch was inspired by a request from Guido Winkelmann
<guido@sagersystems.de>, who tested an earlier version.
To detect if an interface's nwfilter has changed, we need to also
compare the filterparams, which is a hashtable of virNWFilterVarValue.
virHashEqual can do this nicely, but requires a pointer to a function
that will compare two of the items being stored in the hashes.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=881480
These three functions:
virDomainNetGetActualBridgeName
virDomainNetGetActualDirectDev
virDomainNetGetActualDirectMode
return attributes that are in a union whose contents are interpreted
differently depending on the actual->type and so they should only
return non-0 when actual->type is 'bridge' (in the first case) or
'direct' (in the other two cases, but I had neglected to do that, so
...DirectDev() was returning bridge.brname (which happens to share the
same spot in the union with direct.linkdev) if actual->type was
'bridge', and ...BridgeName was returning direct.linkdev when
actual->type was 'direct'.
How does this involve Bug 881480 (which was about the inability to
switch between two networks that both have "<forward mode='bridge'/>
<bridge name='xxx'/>"? Whenever the return value of
virDomainNetGetActualDirectDev() for the new and old network
definitions doesn't match, qemuDomainChangeNet() requires a "complete
reconnect" of the device, which qemu currently doesn't
support. ...DirectDev() *should* have been returning NULL for old and
new, but was instead returning the old and new bridge names, which
differ.
(The other two functions weren't causing any behavioral problems in
virDomainChangeNet(), but their problem and fix was identical, so I
included them in this same patch).
Fix the null pointer access when UUID is not specified.
Introduce a bool 'uuidUsable' to virStoragePoolAuthCephx that indicates
if uuid was specified or not and use it instead of the pointless
comparison of the static UUID array to NULL.
Add an error message if both uuid and usage are specified.
Fixes:
Error: FORWARD_NULL (CWE-476):
libvirt-0.10.2/src/conf/storage_conf.c:461: var_deref_model: Passing
null pointer "uuid" to function "virUUIDParse(char const *, unsigned
char *)", which dereferences it. (The dereference is assumed on the
basis of the 'nonnull' parameter attribute.)
Error: NO_EFFECT (CWE-398):
libvirt-0.10.2/src/conf/storage_conf.c:979: array_null: Comparing an
array to null is not useful: "src->auth.cephx.secret.uuid != NULL".
Commit a21f5112 fixed one API, but missed two others that also
failed to log their 'flags' argument.
* src/libvirt.c (virNodeSuspendForDuration, virDomainGetHostname):
Log flags parameter.
This introduces a few new APIs for dealing with strings.
One to split a char * into a char **, another to join a
char ** into a char *, and finally one to free a char **
There is a simple test suite to validate the edge cases
too. No more need to use the horrible strtok_r() API,
or hand-written code for splitting strings.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add support for doing controlled shutdown / reboot in the LXC
driver. The default behaviour is to try talking to /dev/initctl
inside the container's virtual root (/proc/$INITPID/root). This
works with sysvinit or systemd. If that file does not exist
then send SIGTERM (for shutdown) or SIGHUP (for reboot). These
signals are not any kind of particular standard for shutdown
or reboot, just something apps can choose to handle. The new
virDomainSendProcessSignal allows for sending custom signals.
We might allow the choice of SIGTERM/HUP to be configured for
LXC containers via the XML in the future.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The fact that only the guest agent, or ACPI flag can be used
when requesting reboot/shutdown is merely a limitation of the
QEMU driver impl at this time. Thus it should not be in
libvirt.c code
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To be able todo controlled shutdown/reboot of containers an
API to talk to init via /dev/initctl is required. Fortunately
this is quite straightforward to implement, and is supported
by both sysvinit and systemd. Upstart support for /dev/initctl
is unclear.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When seeing a message
virNetSASLContextCheckIdentity:146 : SASL client admin not allowed in whitelist
it isn't immediately obvious that 'admin' is the identity
being checked. Quote the string to make it more obvious
The default machine type must be stored in the first element of
the caps->machineTypes array. This was done for help output
parsing but not for QMP probing.
Added a helper function qemuSetDefaultMachine to apply the same
fix up for both probing methods.
Further, it was necessary to set caps->nmachineTypes after QMP
probing.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
https://bugzilla.redhat.com/show_bug.cgi?id=872292
Libvirt should not attempt to call a QMP command that has not been
documented in qemu.git - if future qemu introduces a command by the
same name but with subtly different semantics, then libvirt will be
broken when trying to use that command.
We also had some code that could never be reached - some of our
commands have an alternate for new vs. old qemu HMP commands; but
if we are new enough to support QMP, we only need a fallback to
the new HMP counterpart, and don't need to try for a QMP counterpart
for the old HMP version.
See also this attempt to convert the three snapshot commands to QMP:
https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg01597.html
although it looks like that will still not happen before qemu 1.3.
That thread eventually decided that qemu would use the name
'save-vm' rather than 'savevm', which mitigates the fact that
libvirt's attempt to use a QMP 'savevm' would be broken, but we
might not be as lucky on the other commands.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONSetCPU)
(qemuMonitorJSONAddDrive, qemuMonitorJSONDriveDel)
(qemuMonitorJSONCreateSnapshot, qemuMonitorJSONLoadSnapshot)
(qemuMonitorJSONDeleteSnapshot): Use only HMP fallback for now.
(qemuMonitorJSONAddHostNetwork, qemuMonitorJSONRemoveHostNetwork)
(qemuMonitorJSONAttachDrive, qemuMonitorJSONGetGuestDriveAddress):
Delete; QMP implies QEMU_CAPS_DEVICE, which prefers AddNetdev,
RemoveNetdev, and AddDrive anyways (qemu_hotplug.c has all callers).
* src/qemu/qemu_monitor.c (qemuMonitorAddHostNetwork)
(qemuMonitorRemoveHostNetwork, qemuMonitorAttachDrive): Reflect
deleted commands.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONAddHostNetwork)
(qemuMonitorJSONRemoveHostNetwork, qemuMonitorJSONAttachDrive):
Likewise.
https://bugzilla.redhat.com/show_bug.cgi?id=876828
Commit 38c4a9cc introduced a regression in hot unplugging of disks
from qemu, where cgroup device ACLs were no longer being revoked
(thankfully not a security hole: cgroup ACLs only prevent open()
of the disk; so reverting the ACL prevents future abuse but doesn't
stop abuse from an fd that was already opened before the ACL change).
Commit 1b2ebf95 overlooked that there were two spots affected.
* src/qemu/qemu_hotplug.c (qemuDomainDetachDiskDevice):
Transfer backing chain before deletion.
* src/qemu/qemu_driver.c (qemuDomainDetachDeviceDiskLive): Fix
spacing (partly to ensure a different-looking patch).
Also removed some unreachable code found by coverity:
libvirt-0.10.2/src/nwfilter/nwfilter_driver.c:259: unreachable: This
code cannot be reached: "nwfilterDriverUnlock(driver...".
This patch adds two labels and gets rid of a ton of duplicated code.
This patch also fixes some error message and switches most of them to
proper error reporting functions.
This patch adds macros to help retrieve configuration values from qemu
driver's configuration. Some configuration options are grouped
together in the process.
This bug resolves CVE-2012-3411, which is described in the following
bugzilla report:
https://bugzilla.redhat.com/show_bug.cgi?id=833033
The following report is specifically for libvirt on Fedora:
https://bugzilla.redhat.com/show_bug.cgi?id=874702
In short, a dnsmasq instance run with the intention of listening for
DHCP/DNS requests only on a libvirt virtual network (which is
constructed using a Linux host bridge) would also answer queries sent
from outside the virtualization host.
This patch takes advantage of a new dnsmasq option "--bind-dynamic",
which will cause the listening socket to be setup such that it will
only receive those requests that actually come in via the bridge
interface. In order for this behavior to actually occur, not only must
"--bind-interfaces" be replaced with "--bind-dynamic", but also all
"--listen-address" options must be replaced with a single
"--interface" option. Fully:
--bind-interfaces --except-interface lo --listen-address x.x.x.x ...
(with --listen-address possibly repeated) is replaced with:
--bind-dynamic --interface virbrX
Of course libvirt can't use this new option if the host's dnsmasq
doesn't have it, but we still want libvirt to function (because the
great majority of libvirt installations, which only have mode='nat'
networks using RFC1918 private address ranges (e.g. 192.168.122.0/24),
are immune to this vulnerability from anywhere beyond the local subnet
of the host), so we use the new dnsmasqCaps API to check if dnsmasq
supports the new option and, if not, we use the "old" option style
instead. In order to assure that this permissiveness doesn't lead to a
vulnerable system, we do check for non-private addresses in this case,
and refuse to start the network if both a) we are using the old-style
options, and b) the network has a publicly routable IP
address. Hopefully this will provide the proper balance of not being
disruptive to those not practically affected, and making sure that
those who *are* affected get their dnsmasq upgraded.
(--bind-dynamic was added to dnsmasq in upstream commit
54dd393f3938fc0c19088fbd319b95e37d81a2b0, which was included in
dnsmasq-2.63)
This new function returns true if the given address is in the range of
any "private" or "local" networks as defined in RFC1918 (IPv4) or
RFC3484/RFC4193 (IPv6), otherwise they return false.
These ranges are:
192.168.0.0/16
172.16.0.0/16
10.0.0.0/24
FC00::/7
FEC0::/10
In order to optionally take advantage of new features in dnsmasq when
the host's version of dnsmasq supports them, but still be able to run
on hosts that don't support the new features, we need to be able to
detect the version of dnsmasq running on the host, and possibly
determine from the help output what options are in this dnsmasq.
This patch implements a greatly simplified version of the capabilities
code we already have for qemu. A dnsmasqCaps device can be created and
populated either from running a program on disk, reading a file with
the concatenated output of "dnsmasq --version; dnsmasq --help", or
examining a buffer in memory that contains the concatenated output of
those two commands. Simple functions to retrieve capabilities flags,
the version number, and the path of the binary are also included.
bridge_driver.c creates a single dnsmasqCaps object at driver startup,
and disposes of it at driver shutdown. Any time it must be used, the
dnsmasqCapsRefresh method is called - it checks the mtime of the
binary, and re-runs the checks if the binary has changed.
networkxml2argvtest.c creates 2 "artificial" dnsmasqCaps objects at
startup - one "restricted" (doesn't support --bind-dynamic) and one
"full" (does support --bind-dynamic). Some of the test cases use one
and some the other, to make sure both code pathes are tested.
On OOM, xdr_destroy got called even though it wasn't created yet.
Found by coverity:
Error: UNINIT (CWE-457):
libvirt-0.10.2/src/rpc/virnetmessage.c:214: var_decl: Declaring
variable "xdr" without initializer.
libvirt-0.10.2/src/rpc/virnetmessage.c:219: cond_true: Condition
"virReallocN(&msg->buffer, 1UL /* sizeof (*msg->buffer) */,
msg->bufferLength) < 0", taking true branch
libvirt-0.10.2/src/rpc/virnetmessage.c:221: goto: Jumping to label
"cleanup"
libvirt-0.10.2/src/rpc/virnetmessage.c:257: label: Reached label
"cleanup"
libvirt-0.10.2/src/rpc/virnetmessage.c:258: uninit_use: Using
uninitialized value "xdr.x_ops".
Found by coverity:
Error: REVERSE_INULL (CWE-476):
libvirt-0.10.2/src/util/processinfo.c:141: deref_ptr: Directly
dereferencing pointer "map".
libvirt-0.10.2/src/util/processinfo.c:142: check_after_deref:
Null-checking "map" suggests that it may be null, but it has already
been dereferenced on all paths leading to the check.
Found by coverity:
Error: REVERSE_INULL (CWE-476):
libvirt-0.10.2/src/conf/netdev_bandwidth_conf.c:99: deref_ptr:
Directly dereferencing pointer "node".
libvirt-0.10.2/src/conf/netdev_bandwidth_conf.c:107:
check_after_deref: Null-checking "node" suggests that it may be
null, but it has already been dereferenced on all paths leading to
the check.
The virStateInitialize method and several cgroups methods were
using an 'int privileged' parameter or similar for dual-state
values. These are better represented with the bool type.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To allow actions to be performed in libvirtd when the host
shuts down, or user session exits, introduce a 'stop'
method to virDriverState. This will do things like saving
the VM state to a file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Implement the new API for sending signals to processes in a guest
for the LXC driver. Only support sending signals to the init
process for now, because
- The kernel does not appear to expose the mapping between
container PID numbers and host PID numbers anywhere in the
host OS namespace
- There is no race-free way to validate whether a host PID
corresponds to a process in a container.
* src/lxc/lxc_driver.c: Allow sending processes signals
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
* src/remote/remote_protocol.x: message definition
* src/remote/remote_driver.c: Register driver function
* src/remote_protocol-structs: Test case
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add an API for sending signals to arbitrary processes in the
guest OS. This is primarily useful for container based virt,
but can be used for machine virt too, if there is a suitable
guest agent,
* include/libvirt/libvirt.h.in: Add virDomainSendProcessSignal
and virDomainProcessSignal enum
* src/driver.h: Driver entry point
* src/libvirt.c, src/libvirt_public.syms: Impl for new API
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As of 1a50ba2cb0 we fail to connect to the
monitor instead of getting an exit status != 0 from qemu itself. This
breaks capabilities probing for the non QMP case.
The documentation to this API has some defects from
grammar and wording POV. These were raised after I've
pushed the patches, so they are in a separate commit.
It makes no sense to fail the whole getting command if there is
a parameter unsupported by the kernel. This patch fixes it by
omitting the unsupported parameter for getMemoryParameters.
And for setMemoryParameters, this checks if there is an unsupported
parameter up front of the setting, and just returns failure if not
all parameters are supported.
Replace the following names
* struct qemu_snap_remove with virQEMUSnapRemovePtr
* struct qemu_snap_reparent with virQEMUSnapReparentPtr
* struct qemu_save_header with virQEMUSaveHeaderPtr
* enum qemu_save_formats with virQEMUSaveFormat
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove the obsolete 'qemud' naming prefix and underscore
based type name. Introduce virQEMUDriverPtr as the replacement,
in common with LXC driver naming style
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=879473
The name attribute is required for portgroup elements (yes, the RNG
specifies that), and there is code in libvirt that assumes it is
non-null. Unfortunately, the portgroup parsing function wasn't
checking for lack of portgroup. One adverse result of this was that
attempts to update a network by adding a portgroup with no name would
cause libvirtd to segfault. For example:
virsh net-update default add portgroup "<portgroup default='yes'/>"
This patch causes virNetworkPortGroupParseXML to fail if no name is
specified, thus avoiding any later problems.
Throughout the code, we've always used VIR_DOMAIN_SHUTDOWN* flags
even for virDomainReboot() API and its implementation. Fortunately,
the appropriate macros has the same value. But if we want to keep
things consistent, we should be using the correct macros. This
patch doesn't break anything, luckily.
Commit cb022152 went overboard and introduced a dead conditional
while trying to get rid of a potential NULL dereference.
* src/nwfilter/nwfilter_dhcpsnoop.c (virNWFilterSnoopReqNew):
Remove redundant conditional.
In a few places, the return value could get passed to VIR_ALLOC_N without
being checked, resulting in a request to allocate a lot of memory if the
return value was negative.
This can't lead to a crash since virNWFilterSnoopReqNew is only called
with a static array as the argument, but if we check for NULL we should
do it right.
Error messages produced while dispatching guest agent commands didn't
have an apparent reference to the fact that they are dealing with guest
agent commands. This patch fixes up some of the messages to contain that
reference.
using qemu guest agent. As said in previous patch,
@mountPoint must be NULL and @flags zero because
qemu guest agent doesn't support these arguments
yet. If qemu learns them, we can start supporting
them as well.
This will call FITRIM within guest. The API has 4 arguments,
however, only 2 will be used for now (@dom and @minumum).
The rest two are there if in future qemu guest agent learns them.
With QMP capability probing, the version was not set.
virsh version returns:
...
Cannot extract running QEMU hypervisor version
This is fixed by computing caps->version from QMP major,
minor, micro values.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
QMP Capability probing will fail if QEMU cannot bind to the
QMP monitor socket in the qemu_driver->libDir directory.
That's because the child process is stripped of all
capabilities and this directory is chown'ed to the configured
QEMU user/group (normally qemu:qemu) by the QEMU driver.
To prevent this from happening, the driver startup will now pass
the QEMU uid and gid down to the capability probing code.
All capability probing invocations of QEMU will be run with
the configured QEMU uid instead of libvirtd's.
Furter, the pid file handling is moved to libvirt, as QEMU
cannot write to the qemu_driver->runDir (root:root). This also
means that the libvirt daemonizing must be used.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
If qemuMonitorOpenUnix is called without a related pid, i.e. for
QMP probing, a connect failure can happen as the result of a race.
Without a pid there is no retry and thus we give up too early.
This changes the code to retry if no pid is supplied.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
we already have virtualize meminfo for container through fuse filesystem,
add function lxcContainerMountProcFuse to mount this meminfo file to
the container's /proc/meminfo.
So we can isolate container's /proc/meminfo from host now.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
with this patch,container's meminfo will be shown based on
containers' mem cgroup.
Right now,it's impossible to virtualize all values in meminfo,
I collect some values such as MemTotal,MemFree,Cached,Active,
Inactive,Active(anon),Inactive(anon),Active(file),Inactive(anon),
Active(file),Inactive(file),Unevictable,SwapTotal,SwapFree.
if I miss something, please let me know.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
because libvirt_lxc's cgroup mountpoint is what it shown
in /proc/self/cgroup.
we can get container's cgroup through virCgroupNew("/", &group),
add interface virCgroupGetAppRoot to help container to
get it's cgroup.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
virCgroupGetMemSwapUsage is used to get container's swap usage,
with this interface,we can get swap usage in fuse filesystem.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
this patch addes fuse support for libvirt lxc.
we can use fuse filesystem to generate sysinfo dynamically,
So we can isolate /proc/meminfo,cpuinfo and so on through
fuse filesystem.
we mount fuse filesystem for every container.
the mount name is libvirt,mount point is
localstatedir/run/libvirt/lxc/containername.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
This bug leads to getting incorrect vcpupin information via
qemudDomainGetVcpuPinInfo() API when the number of maximum
cpu on a host falls into a range such as 31 < ncpus < 64.
gcc warning:
left shift count >= width of type
The following bug is such the case
https://bugzilla.redhat.com/show_bug.cgi?id=876415
Change some legacy function names to use 'qemu' as their
prefix instead of 'qemud' which was a hang over from when
the QEMU driver ran inside a separate daemon
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When starting an LXC guest with a virNetwork based NIC device,
if the network was not active, the virNetworkPtr device would
be leaked
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In virNetDevVethDelete the virRun method will properly report
errors, but when checking the exit status for non-zero exit
code no error is reported
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When starting a container, newDef is initialized to a
copy of 'def', but when startup fails newDef is never
removed. This cause later attempts to use 'virDomainDefine'
to lose the new data being defined.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A mistaken initialization of 'ret' caused failure to create
macvtap devices to be ignored. The libvirt_lxc process
would later fail to start due to missing devices
Also make sure code checks '< 0' and not '!= 0' since only
-1 is considered an error condition
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the <interface> device did not contain any <target>
element, LXC would crash on a NULL pointer if starting
the container failed
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When failing to create a macvlan interface, make sure the
error message contains the name of the host interface
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The LXC driver relies on use of cgroups to kill off LXC processes
in shutdown. If cgroups aren't available, we're unable to kill
off processes, so we must treat lack of cgroups as a fatal startup
error.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The code setting up LXC cgroups used an 'rc' variable both
for capturing the return value of methods it calls, and
its own return status. The result was that several failures
in setting up cgroups would actually result in success being
returned.
Use a separate 'ret' for tracking return value as per normal
code design in other parts of libvirt
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The initpid will be required long term to enable LXC to
implement various hotplug operations. Thus it needs to be
persisted in the domain status XML. LXC has not used the
domain status XML before, so this introduces use of the
helpers.
Currently the lxcContainerSetupMounts method uses the
virSecurityManagerPtr instance to obtain the mount options
string and then only passes the string down into methods
it calls. As functionality in LXC grows though, those
methods need to have direct access to the virSecurityManagerPtr
instance. So push the code down a level.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The impls of virSecurityManagerGetMountOptions had no way to
return errors, since the code was treating 'NULL' as a success
value. This is somewhat pointless, since the calling code did
not want NULL in the first place and has to translate it into
the empty string "". So change the code so that the impls can
return "" directly, allowing use of NULL for error reporting
once again
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=876828
Commit 38c4a9cc introduced a regression in hot unplugging of disks
from qemu, where cgroup device ACLs were no longer being revoked
(thankfully not a security hole: cgroup ACLs only prevent open()
of the disk; so reverting the ACL prevents future abuse but doesn't
stop abuse from an fd that was already opened before the ACL change).
The actual regression is due to a latent bug. The hot unplug code
was computing the set of files needing cgroup ACL revocation based
on the XML passed in by the user, rather than based on the domain's
details on which disk was being deleted. As long as the revoke
path was always recomputing the backing chain, this didn't really
matter; but now that we want to compute the chain exactly once and
remember that computation, we need to hang on to the backing chain
until after the revoke has happened.
* src/qemu/qemu_hotplug.c (qemuDomainDetachPciDiskDevice):
Transfer backing chain before deletion.
This patch introduces the RNG schema and updates necessary data strucutures
to allow various hypervisors to make use of Gluster protocol as one of the
supported network disk backend. Next patch will add support to make use of
this feature in Qemu since it now supports Gluster protocol as one of the
network based storage backend.
Two new optional attributes for <host> element are introduced - 'transport'
and 'socket'. Valid transport values are tcp, unix or rdma. If none specified,
tcp is assumed. If transport is unix, socket specifies path to unix socket.
This patch allows users to specify disks on gluster backends like this:
<disk type='network' device='disk'>
<driver name='qemu' type='raw'/>
<source protocol='gluster' name='Volume1/image'>
<host name='example.org' port='6000' transport='tcp'/>
</source>
<target dev='vda' bus='virtio'/>
</disk>
<disk type='network' device='disk'>
<driver name='qemu' type='raw'/>
<source protocol='gluster' name='Volume2/image'>
<host transport='unix' socket='/path/to/sock'/>
</source>
<target dev='vdb' bus='virtio'/>
</disk>
Signed-off-by: Harsh Prateek Bora <harsh@linux.vnet.ibm.com>
Although we require various C99 features, we don't yet require a
complete C99 compiler. On RHEL 5, compilation complained:
qemu/qemu_command.c: In function 'qemuBuildGraphicsCommandLine':
qemu/qemu_command.c:4688: error: 'for' loop initial declaration used outside C99 mode
* src/qemu/qemu_command.c (qemuBuildGraphicsCommandLine): Declare
variable sooner.
* src/qemu/qemu_process.c (qemuProcessInitPasswords): Likewise.
The patch refactors the current ESX storage driver due to following reasons:
1. Given most of the public APIs exposed by the storage driver in Libvirt
remains same, ESX storage driver should not implement logic specific
for only one supported format (current implementation only supports VMFS).
2. Decoupling interface from specific storage implementation gives us an
extensible design to hook implementation for other supported storage
formats.
This patch refactors the current driver to implement it as a facade pattern i.e.
the driver exposes all the public libvirt APIs, but uses backend drivers to get
the required task done. The backend drivers provide implementation specific to
the type of storage device.
File changes:
------------------
esx_storage_driver.c ----> esx_storage_driver.c (base storage driver)
|
|---> esx_storage_backend_vmfs.c (VMFS backend)
When no security driver is specified libvirt_lxc segfaults as a debug
message tries to access security labels for the container that are not
present.
This problem was introduced in commit 6c3cf57d6c.
Early jumps to the cleanup label caused a crash of the libvirt_lxc
container helper as the cleanup section called
virLXCControllerDeleteInterfaces(ctrl) without checking the ctrl argument
for NULL. The argument was de-referenced soon after.
$ /usr/libexec/libvirt_lxc
/usr/libexec/libvirt_lxc: missing --name argument for configuration
Segmentation fault
This will simplify the refactoring of the ESX storage driver to support
a VMFS and an iSCSI backend.
One of the tasks the storage driver needs to do is to decide which backend
driver needs to be invoked for a given request. This approach extends
virStoragePool and virStorageVol to store extra parameters:
1. privateData: stores pointer to respective backend storage driver.
2. privateDataFreeFunc: stores cleanup function pointer.
virGetStoragePool and virGetStorageVol are modfied to accept these extra
parameters as user params. virStoragePoolDispose and virStorageVolDispose
checks for cleanup operation if available.
The private data pointer allows the ESX storage driver to store a pointer
to the used backend with each storage pool and volume. This avoids the need
to detect the correct backend in each storage driver function call.
The new model supports following features in addition to those supported
by SandyBridge:
fma, pcid, movbe, fsgsbase, bmi1, hle, avx2, smep, bmi2, erms, invpcid,
rtm
Commit 258e06c removed setting of the volume type to
VIR_STORAGE_VOL_BLOCK, which leads to failures in
storageVolumeCreateXMLFrom.
The type (and target.format) of the volume was set to zero. In
virStorageBackendGetBuildVolFromFunction, this gets interpreted as
VIR_STORAGE_FILE_NONE and the qemu-img tool is called with unknown
"none" format.
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=879780
bridge_driver.h: silence gcc warnings:
statement with no effect [-Wunused-value]
unused variable 'net' [-Wunused-variable]
virdrivermoduletest.c: don't require network driver module
if it hasn't been built.
The virLXCControllerClientCloseHook method was mistakenly
assuming that the private data associated with the network
client was the virLXCControllerPtr. In fact it was just a
dummy int, so we were derefencing a bogus struct. The
frequent result of this was that we would never quit, because
we tried to arm a non-existant timer.
Fix the code by removing the dummy private data and just
using the virLXCControllerPtr instance as private data
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
It is possible for there to be deleted timers when we
calculate the next timeout, and they must be skipped.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The event code is a no-op if requested to update a non-existent
timer/handle watch. This makes it hard to detect bugs in the
caller who have passed bogus data. Add a VIR_WARN output in
such cases, since the API does not allow for return errors.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The docs for virDiskNameToIndex claim it ignores partition
numbers. In actual fact though, a code ordering bug means
that a partition number will cause the code to accidentally
multiply the result by 26.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit e0c469e58b that fixes the detection
of image chain wasn't complete. Iteration through the backing image
chain has to stop at the last existing image if some of the images are
missing otherwise the backing chain that is cached contains entries with
paths being set to NULL resulting to:
error: Unable to allow access for disk path (null): Bad address
Fortunately stat() is kind enough not to crash when it's presented with
a NULL argument. At least on Linux.
The error "... but the cause is unknown" appeared for XMLs similar to
this:
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/dev/zero'/>
<target dev='sr0'/>
</disk>
Notice unsupported disk type (for the driver), but also no address
specified. The first part is not a problem and we should not abort
immediately because of that, but the combination with the address
unknown was causing an unspecified error.
While fixing this, I added an error to one place where this return
value was not managed properly.
Fixes this error when building with -Werror on Alpine Linux:
util/processinfo.c: In function 'virProcessInfoSetAffinity':
util/processinfo.c:52:5: error: implicit declaration of function 'malloc' [-Werror=implicit-function-declaration]
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Currently the LXC driver logs audit messages when a container
is started or stopped. These audit messages, however, contain
the PID of the libvirt_lxc supervisor process. To enable
sysadmins to correlate with audit messages generated by
processes /inside/ the container, we need to include the
container init process PID.
We can't do this in the main 'start' audit message, since
the init PID is not available at that point. Instead we output
a completely new audit record, that lists both PIDs.
type=VIRT_CONTROL msg=audit(1353433750.071:363): pid=20180 uid=0 auid=501 ses=3 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 msg='virt=lxc op=init vm="busy" uuid=dda7b947-0846-1759-2873-0f375df7d7eb vm-pid=20371 init-pid=20372 exe="/home/berrange/src/virt/libvirt/daemon/.libs/lt-libvirtd" hostname=? addr=? terminal=pts/6 res=success'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The LXC controller code currently directly invokes the
libvirt main loop code. The problem is that this misses
the cleanup of virNetServerClient connections that
virNetServerRun takes care of.
The result is that when libvirtd is stopped, the
libvirt_lxc controller process gets stuck in a I/O loop.
When libvirtd is then started again, it fails to connect
to the controller and thus kills off the entire domain.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Regression introduced by commit 258e06c85b, "ret" could be set to 1
or 0 by virStorageBackendFileSystemIsMounted before goto cleanup.
This could mislead the callers (up to the public API
virStoragePoolDestroy) to return success even the underlying umount
command fails.
I have been testing libvirt v1.0.0 for deployment within my
organization, and in the process discovered what appears to be a bug
that breaks virsh attach-device, when attaching an RBD volume to an
instance. First, here is the error presented, with v1.0.0 (this worked
in v0.10.2):
[root@host ~]# virsh attach-device W5APQ8 G84VV1.xml
error: Failed to attach device from G84VV1.xml
error: cannot open file 'dc3-1-test/G84VV1': No such file or directory
Using git bisect, I narrowed the problem down to this as the first
commit to break this setup:
4d34c92947 is the first bad commit
Commit a4c19459aa only added the
QEMU capability flag, command line option and added the boot element
for redirdev's in the XML schema.
This patch adds support for parsing and writing the XML with redirdevs
with the boot flag. It also ignores unknown XML elements in redirdev
instead of failing with:
"error: An error occurred, but the cause is unknown"
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=805414
Upcoming patches for revert-and-clone branching of snapshots need
to be able to copy a domain definition; make this step reusable.
* src/conf/domain_conf.h (virDomainDefCopy): New prototype.
* src/conf/domain_conf.c (virDomainObjCopyPersistentDef): Split...
(virDomainDefCopy): ...into new function.
(virDomainObjSetDefTransient): Use it.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Use it.
Relatively straight-forward. And since qemu was already using
VIR_DOMAIN_SNAPSHOT_FILTERS_ALL, with 6 different APIs all calling
into this common code, I've instantly added all 5 flags to 6 APIs.
* src/conf/snapshot_conf.h (VIR_DOMAIN_SNAPSHOT_FILTERS_ALL):
Enable new filters.
* src/conf/snapshot_conf.c (virDomainSnapshotObjListGetNames):
Prep the new flags.
(virDomainSnapshotObjListCopyNames): Actually do the filtering.
As we enable more modes of snapshot creation, it becomes more important
to be able to quickly filter based on snapshot properties. This patch
introduces new filter flags; subsequent patches will introduce virsh
back-compat filtering, as well as actual libvirt filtering.
* include/libvirt/libvirt.h.in (virDomainSnapshotListFlags): Add
five new flags in two new groups.
* src/libvirt.c (virDomainSnapshotNum, virDomainSnapshotListNames)
(virDomainListAllSnapshots, virDomainSnapshotNumChildren)
(virDomainSnapshotListChildrenNames)
(virDomainSnapshotListAllChildren): Document them.
* src/conf/snapshot_conf.h (VIR_DOMAIN_SNAPSHOT_FILTERS_STATUS)
(VIR_DOMAIN_SNAPSHOT_FILTERS_LOCATION): Add new convenience filter
collection macros.
* tools/virsh-snapshot.c (cmdSnapshotList): Add 5 new flags.
* tools/virsh.pod (snapshot-list): Document them.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=873134
The reported problem is that an attempt to restore a saved domain that
was configured with <currentMemory> and <memory> set to some (same for
both) number that's not a multiple of 4096KiB results in an error like
this:
error: Failed to start domain libvirt_test_api
error: XML error: current memory '4001792k' exceeds maximum '4000768k'
(in this case, currentMemory was set to 4000000KiB).
The reason for this failure is:
1) a saved image contains the "live xml" of the domain at the time of
the save.
2) the live xml of a running domain gets its currentMemory
(a.k.a. cur_balloon) directly from the qemu monitor rather than from
the configuration of the domain.
3) the value reported by qemu is (sometimes) not exactly what was
originally given to qemu when the domain was started, but is rounded
up to [some indeterminate granularity] - in some versions of qemu that
granularity is apparently 1MiB, and in others it is 4MiB.
4) When the XML is parsed to setup the state of the restored domain,
the XML parser for <currentMemory> compares it to <memory> (which is
the maximum allowed memory size for the domain) and if <currentMemory>
is greater than the next 1024KiB boundary above <memory>, it spits out
an error and fails.
For example (from the BZ) if you start qemu on RHEL6 with both
<currentMemory> and <memory> of 4000000 (this number is in KiB),
libvirt's dominfo or dumpxml will report "4001792" back (rounded up to
next 4MiB) for 10-20 seconds after the start, then revert to reporting
"4000000". On Fedora 16 (which uses qemu-1.0), it will instead report
"4000768" (rounded up to next 1MiB). On Fedora 17 (qemu-1.2), it seems
to always report "4000000". ("4000000" is of course okay, and
"4000768" is also okay since that's the next 1024KiB boundary above
"4000000" and the parser was already allowing for that. But "4001792
is *not* okay and produces the error message.)
This patch solves the problem by changing the allowed "fudge factor"
when parsing from 1024KiB to 4096KiB to match the maximum up-rounding
that could be done in qemu.
(I had earlier thought to fix this by up-rounding <memory> in the
dumpxml that's put into the saved image, but that wouldn't have fixed
the case where the save image was produced by an "unfixed"
libvirtd.)
Prior to this patch, 'virsh nodecpumap' on older kernels reported:
error: Unable to get cpu map
error: out of memory
* src/nodeinfo.c (linuxParseCPUmax): Don't overwrite error.
(nodeGetCPUBitmap): Provide backup implementation.
On RHEL 5, I was getting a segfault trying to start libvirtd,
because we were failing virNodeParseSocket but not checking
for errors, and then calling CPU_SET(-1, &sock_map) as a result.
But if you don't have a topology/physical_package_id file,
then you can just assume that the cpu belongs to socket 0.
* src/nodeinfo.c (virNodeGetCpuValue): Change bool into
default_value.
(virNodeParseSocket): Allow for default value when file is missing,
different from fatal error on reading file.
(virNodeParseNode): Update call sites to fail on error.
For disk snapshots, the user could request an external snapshot
but not supply a filename; later on, we would check this condition
and generate a suitable name if possible, or gracefully error out
when not possible (such as when the original file was a block
device). But unless we come up with a suitable way to generate
external memory file names, we have no later code point that was
checking for NULL, so we should forbid this up front.
* src/conf/snapshot_conf.c (virDomainSnapshotDefParseString):
Avoid NULL deref, since we don't generate names yet.
It may take some time for sanlock to add a lockspace. And if user
restart libvirtd service meanwhile, the fresh daemon can fail adding
the same lockspace with EINPROGRESS. Recent sanlock has
sanlock_inq_lockspace() function which should block until lockspace
changes state. If we are building against older sanlock we should
retry a few times before claiming an error. This issue can be easily
reproduced:
for i in {1..1000} ; do echo $i; service libvirtd restart; sleep 2; done
20
Stopping libvirtd daemon: [FAILED]
Starting libvirtd daemon: [ OK ]
21
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]
22
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]
error : virLockManagerSanlockSetupLockspace:334 : Unable to add
lockspace /var/lib/libvirt/sanlock/__LIBVIRT__DISKS__: Operation now in
progress
Since /sys/devices/system/cpu/present is not available on
older kernels like on RHEL 5.x nodeGetCPUCount will
fail there. The fallback implemented is to scan for
/sys/devices/system/cpu/cpuNN entries.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
This simplifies the top-level code, at the cost of using a little more
stack space. The primary benefit is being able to send more fields
without knowing in advance how many of them, and of which types, these
fields will be, and without having to individually add buffer variables.
The code imposes an upper limit on the total number of iovs/buffers
used, and fields that wouldn't fit are silently dropped. This is not
significant in this patch, but will affect the following one.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... and update all users. No change in functionality, the parameter
will be used later.
The metadata representation is as minimal as possible, but requires
the caller to allocate an array on stack explicitly.
The alternative of using varargs in the virLogMessage() callers:
* Would not allow the caller to optionally omit some metadata elements,
except by having two calls to virLogMessage.
* Would not be as type-safe (e.g. using int vs. size_t), and the compiler
wouldn't be able to do type checking
* Depending on parameter order:
a) virLogMessage(..., message format, message params...,
metadata..., NULL)
can not be portably implemented (parse_printf_format() is a glibc
function)
b) virLogMessage(..., metadata..., NULL,
message format, message params...)
would prevent usage of ATTRIBUTE_FMT_PRINTF and the associated
compiler checking.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The "restart" function for locks allocates a new array according to
and pre-sets its length, then reads the owner pids from a JSON
document in a loop. Rather than adding each owner at a different
index, though, it repeatedly overwrites the last element of the array
with all the owners.
This patch adds a helper to determine if snapshots are external and uses
the helper to fix detection of those in snapshot deletion code.
Snapshots are external if they have an external memory image or if the
disk locations are external. As mixed snapshots are forbidden for now
we need to check just one disk to know.
Lately there were a few reports of the output of the virsh nodeinfo
command being inaccurate. This patch tries to avoid that by checking if
the topology actually makes sense. If it doesn't we then report a
synthetic topology that indicates to the user that the host capabilities
should be checked for the actual topology.
Currently, if user calls virDomainAbortJob we just issue
'migrate_cancel' and hope for the best. However, if user calls
the API in wrong phase when migration hasn't been started yet
(perform phase) the cancel request is just ignored. With this
patch, the request is remembered and as soon as perform phase
starts, migration is cancelled.
For S390, the default console target type cannot be of type 'serial'.
It is necessary to at least interpret the 'arch' attribute
value of the os/type element to produce the correct default type.
Therefore we need to extend the signature of defaultConsoleTargetType
to account for architecture. As a consequence all the drivers
supporting this capability function must be updated.
Despite the amount of changed files, the only change in behavior is
that for S390 the default console target type will be 'virtio'.
N.B.: A more future-proof approach could be to to use hypervisor
specific capabilities to determine the best possible console type.
For instance one could add an opaque private data pointer to the
virCaps structure (in case of QEMU to hold capsCache) which could
then be passed to the defaultConsoleTargetType callback to determine
the console target type.
Seems to be however a bit overengineered for the use case...
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
When the libvirt daemon is restarted it tries to reconnect to running
qemu domains. Since commit d38897a5d4 the
re-connection code runs in separate threads. In the original
implementation the maximum of domain ID's (that is used as an
initializer for numbering guests created next) while libvirt was
reconnecting to the guest.
With the threaded implementation this opens a possibility for race
conditions with the thread that is autostarting guests. When there's a
guest running with id 1 and the daemon is restarted. The autostart code
is reached first and spawns the first guest that should be autostarted
as id 1. This results into the following unwanted situation:
# virsh list
Id Name State
----------------------------------------------------
1 guest1 running
1 guest2 running
This patch extracts the detection code before the re-connection threads
are started so that the maximum id of the guests being reconnected to is
known.
The only semantic change created by this is if the guest with greatest ID
quits before we are able to reconnect it's ID is used anyway as the
greatest one as without this patch the greatest ID of a process we could
successfuly reconnect to would be used.
82507838 refactored the code to keep both the raw and canonicalized form
of the backingStore, which breaks badly when the storage pool contains a
storage volume, which is missing its backing store file:
# ./daemon/libvirtd -l
2012-11-07 12:43:33.279+0000: 22175: info : libvirt version: 1.0.0
2012-11-07 12:43:33.279+0000: 22175: error : absolutePathFromBaseFile:542 : Can't canonicalize path '/var/lib/libvirt/images/base.qcow2': No such file or directory
2012-11-07 12:43:33.280+0000: 22175: error : storageDriverAutostart:115 : Failed to autostart storage pool 'default': Can't canonicalize path '/var/lib/libvirt/images/base.qcow2': No such file or directory
This is because virStorageFileGetMetadataFromBuf() aborts with -1 if the
filename of the backingStore can not be canonicalized:
#0 absolutePathFromBaseFile () at util/storage_file.c:541
#1 virStorageFileGetMetadataFromBuf () at util/storage_file.c:728
#2 virStorageFileGetMetadataFromFD () at util/storage_file.c:932
#3 virStorageBackendProbeTarget () at storage/storage_backend_fs.c:94
#4 virStorageBackendFileSystemRefresh () at storage/storage_backend_fs.c:849
#5 storagePoolStart () at storage/storage_driver.c:700
#6 virStoragePoolCreate () at libvirt.c:12471
...
Treat files which miss their backing file as standalone files.
Signed-off-by: Philipp Hahn <hahn@univention.de>
This patch adds support for external disk snapshots of inactive domains.
The snapshot is created by calling using qemu-img by calling:
qemu-img create -f format_of_snapshot -o
backing_file=/path/to/src,backing_fmt=format_of_backing_image
/path/to/snapshot
in case the backing image format is known or probing is allowed and
otherwise:
qemu-img create -f format_of_snapshot -o backing_file=/path/to/src
/path/to/snapshot
on each of the disks selected for snapshotting. This patch also modifies
the snapshot preparing function to support creating external snapshots
and to sanitize arguments. For now the user isn't able to mix external
and internal snapshots but this restriction might be lifted in the
future.
Some operations, APIs needs domain to be paused prior operation can be
performed, e.g. (managed-) save of a domain. The processors should be
restored in the end. However, if 'cont' fails for some reason, we log a
message but this is not sufficient as an event should be emitted as
well. Mgmt application can then decide what to do.
The code that was split out into the qemuDomainSaveMemory expands the
pointer containing the XML description of the domain that it gets from
higher layers. If the pointer changes the old one is invalid and the
upper layer function tries to free it causing an abort.
This patch changes the expansion of the original string to a new
allocation and copy of the contents.
After the connection to ESX 5.1 being broken since g1e7cd39, the fix
in bab7752c helped a bit, but still missed a spot, so the connection
is now successful, but some APIs (for example defineXML) don't work.
Two cases missing are added in this patch to avoid that.
qemu is sensitive to the order of arguments passed. Hence, if a
device requires a controller, the controller cmd string must
precede device cmd string. The same apply for controllers, when
for instance ccid controller requires usb controller. So
controllers create partial ordering in which they should be added
to qemu cmd line.
Some FDs may not implement fdatasync() functionality,
e.g. pipes. In that case EINVAL or EROFS is returned.
We don't want to fail then nor report any error.
Reported-by: Christophe Fergeau <cfergeau@redhat.com>
Some of the pre-snapshot check have restrictions wired in regarding
configuration options that influence taking of external checkpoints.
This patch removes restrictions that would inhibit taking of such a
snapshot.
This patch adds support to take external system checkpoints.
The functionality is layered on top of the previous disk-only snapshot
code. When the checkpoint is requested the domain memory is saved to the
memory image file using migration to file. (The user may specify to
take the memory image while the guest is live with the
VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag.)
The memory save image shares format with the image created by
virDomainSave() API.
Before now, libvirt supported only internal snapshots for active guests.
This patch renames this function to qemuDomainSnapshotCreateActiveInternal
to prepare the grounds for external active snapshots.
The new external system checkpoints will require an async job while the
snapshot is taken. This patch adds QEMU_ASYNC_JOB_SNAPSHOT to track this
job type.
The default behavior while creating external checkpoints is to pause the
guest while the memory state is captured. We want the users to sacrifice
space saving for creating the memory save image while the guest is live
to minimize downtime.
This patch adds a flag that causes the guest not to be paused before
taking the snapshot.
*include/libvirt/libvirt.h.in:
- add new paused reason: VIR_DOMAIN_PAUSED_SNAPSHOT
- add new flag for taking snapshot: VIR_DOMAIN_SNAPSHOT_CREATE_LIVE
*tools/virsh-domain-monitor.c:
- add string representation for VIR_DOMAIN_PAUSED_SNAPSHOT
*tools/virsh-snapshot.c:
- add support for VIR_DOMAIN_SNAPSHOT_CREATE_LIVE
*tools/virsh.pod:
- add docs for --live option added to use
VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag
The code that saves domain memory by migration to file can be reused
while doing external checkpoints of a machine. This patch extracts the
common code and places it in a separate function.
When pausing the guest while migration is running (to speed up
convergence) the virDomainSuspend API checks if the migration job is
active before entering the job. This could cause a possible race if the
virDomainSuspend is called while the job is active but ends before the
Suspend API enters the job (this would require that the migration is
aborted). This would cause a incorrect event to be emitted.
Both system checkpoint snapshots and disk snapshots were iterating
over all disks, doing a final sanity check before doing any work.
But since future patches will allow offline snapshots to be either
external or internal, it makes sense to share the pass over all
disks, and then relax restrictions in that pass as new modes are
implemented. Future patches can then handle external disks when
the domain is offline, then handle offline --disk-snapshot, and
finally, combine with migration to file to gain a complete external
system checkpoint snapshot of an active domain without using 'savevm'.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotIsAllowed): Merge...
(qemuDomainSnapshotPrepare): ...into one function.
(qemuDomainSnapshotCreateXML): Update caller.
Now that the XML supports listing internal snapshots, it is worth
always populating the <memory> and <disks> element to match.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Always
parse disk info and set memory info.
There were not previous callers with require_match set to true.
I originally implemented this bool with the intent of supporting
ESX snapshot semantics, where the choice of internal vs. external
vs. non-checkpointable must be made at domain start, but as ESX
has not been wired up to use it yet, we might as well fix it to
work with our next qemu patch for now, and worry about any further
improvements (changing the bool to a flags argument) if the ESX
driver decides to use this function in the future.
* src/conf/snapshot_conf.c (virDomainSnapshotAlignDisks): Alter
logic when require_match is true to deal with new XML.
Each <domainsnapshot> can now contain an optional <memory>
element that describes how the VM state was handled, similar
to disk snapshots. The new element will always appear in
output; for back-compat, an input that lacks the element will
assume 'no' or 'internal' according to the domain state.
Along with this change, it is now possible to pass <disks> in
the XML for an offline snapshot; this also needs to be wired up
in a future patch, to make it possible to choose internal vs.
external on a per-disk basis for each disk in an offline domain.
At that point, using the --disk-only flag for an offline domain
will be able to work.
For some examples below, remember that qemu supports the
following snapshot actions:
qemu-img: offline external and internal disk
savevm: online internal VM and disk
migrate: online external VM
transaction: online external disk
=====
<domainsnapshot>
<memory snapshot='no'/>
...
</domainsnapshot>
implies that there is no VM state saved (mandatory for
offline and disk-only snapshots, not possible otherwise);
using qemu-img for offline domains and transaction for online.
=====
<domainsnapshot>
<memory snapshot='internal'/>
...
</domainsnapshot>
state is saved inside one of the disks (as in qemu's 'savevm'
system checkpoint implementation). If needed in the future,
we can also add an attribute pointing out _which_ disk saved
the internal state; maybe disk='vda'.
=====
<domainsnapshot>
<memory snapshot='external' file='/path/to/state'/>
...
</domainsnapshot>
This is not wired up yet, but future patches will allow this to
control a combination of 'virsh save /path/to/state' plus disk
snapshots from the same point in time.
=====
So for 1.0.1 (and later, as needed), I plan to implement this table
of combinations, with '*' designating new code and '+' designating
existing code reached through new combinations of xml and/or the
existing DISK_ONLY flag:
domain memory disk disk-only | result
-----------------------------------------
offline omit omit any | memory=no disk=int, via qemu-img
offline no omit any |+memory=no disk=int, via qemu-img
offline omit/no no any | invalid combination (nothing to snapshot)
offline omit/no int any |+memory=no disk=int, via qemu-img
offline omit/no ext any |*memory=no disk=ext, via qemu-img
offline int/ext any any | invalid combination (no memory to save)
online omit omit off | memory=int disk=int, via savevm
online omit omit on | memory=no disk=default, via transaction
online omit no/ext off | unsupported for now
online omit no on | invalid combination (nothing to snapshot)
online omit ext on | memory=no disk=ext, via transaction
online omit int off |+memory=int disk=int, via savevm
online omit int on | unsupported for now
online no omit any |+memory=no disk=default, via transaction
online no no any | invalid combination (nothing to snapshot)
online no int any | unsupported for now
online no ext any |+memory=no disk=ext, via transaction
online int/ext any on | invalid combination (disk-only vs. memory)
online int omit off |+memory=int disk=int, via savevm
online int no/ext off | unsupported for now
online int int off |+memory=int disk=int, via savevm
online ext omit off |*memory=ext disk=default, via migrate+trans
online ext no off |+memory=ext disk=no, via migrate
online ext int off | unsupported for now
online ext ext off |*memory=ext disk=ext, via migrate+transaction
* docs/schemas/domainsnapshot.rng (memory): New RNG element.
* docs/formatsnapshot.html.in: Document it.
* src/conf/snapshot_conf.h (virDomainSnapshotDef): New fields.
* src/conf/domain_conf.c (virDomainSnapshotDefFree)
(virDomainSnapshotDefParseString, virDomainSnapshotDefFormat):
Manage new fields.
* tests/domainsnapshotxml2xmltest.c: New test.
* tests/domainsnapshotxml2xmlin/*.xml: Update existing tests.
* tests/domainsnapshotxml2xmlout/*.xml: Likewise.
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When assigning the new persistent definition for a transient network
(thus making it persistent) the network needs to be marked persistent
before actually atempting to assign the definition.
Until now, the network undefine API was able to undefine only inactive
networks. The restriction doesn't make sense any more so this patch
implements changing networks to transient.
When a transient network was created some of the checks weren't run on
the definition allowing to start invalid networks.
This patch splits out code to the network validation function and
re-uses that code when creating transient networks.
The network driver didn't care about config files when a network was
destroyed, just when it was undefined leaving behind files for transient
networks.
This patch splits out the cleanup code to a helper function that handles
the cleanup if the inactive network object is being removed and re-uses
this code when getting rid of inactive networks.
The hosts file was created in the network definition function. This
patch moves the place the file is being created to the point where
dnsmasq is being started.
With our fix of mkostemp (pushed as 2b435c15) we define a macro
to compile with uclibc. However, this definition is conditional
and thus needs to be properly indented. Moreover, with this definition
sc_prohibit_mkstemp syntax-check rule keeps yelling:
src/util/logging.c:63:# define mkostemp(x,y) mkstemp(x)
maint.mk: use mkostemp with O_CLOEXEC instead of mkstemp
Therefore we should ignore this file for this rule.
BZ:https://bugzilla.redhat.com/show_bug.cgi?id=871273
when using virsh qemu-attach to attach an existing qemu process,
if it misses the -M option in qemu command line, libvirtd crashed
because the NULL value of def->os.machine in later use.
Example:
/usr/libexec/qemu-kvm -name foo \
-cdrom /var/lib/libvirt/images/boot.img \
-monitor unix:/tmp/demo,server,nowait \
error: End of file while reading data: Input/output error
error: Failed to reconnect to the hypervisor
This patch tries to set default machine type if the value of
def->os.machine is still NULL after qemu command line parsing.
* configure.ac docs/news.html.in libvirt.spec.in: update for the new release
* po/*.po*: update from transifex, a lot of added support e.g. Indian
languages, and regenerate
It turns out that calling virNodeGetCPUMap(conn, NULL, NULL, 0)
is both useful, and with Viktor's patches, common enough to
optimize. Since this interface hasn't been released yet, we
can change the RPC call.
A bit more background on the optimization - learning the cpu count
is a single file read (/sys/devices/system/cpu/possible), but
learning the number of online cpus can possibly trigger a file
read per cpu, depending on the age of the kernel, and all wasted
if the caller passed NULL for both arguments.
* src/nodeinfo.c (nodeGetCPUMap): Avoid bitmap when not needed.
* src/remote/remote_protocol.x (remote_node_get_cpu_map_args):
Supply two separate flags for needed arguments.
* src/remote/remote_driver.c (remoteNodeGetCPUMap): Update
caller.
* daemon/remote.c (remoteDispatchNodeGetCPUMap): Likewise.
* src/remote_protocol-structs: Regenerate.
Per the code comment in qemuCapsInitQMPBasic() and commit 43e23c7, we
should only use QMP for capabilities probing starting with 1.2 and
newer. The old code had dead logic that probed on 1.0 and newer.
Signed-off-by: Eric Blake <eblake@redhat.com>
This needs to be done before the container starts. Turning
off the mknod capability is noticed by systemd, which will
no longer attempt to create device nodes.
This eliminates SELinux AVC messages and ugly failure messages in the journal.
The string comparison logic was inverted and matched the first drive
that does *not* have the name we search for.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The QEMU -drive id= begins with libvirt's QEMU host drive prefix
("drive-"), which is stripped off in several places two convert between
host ("-drive") and guest ("-device") device names.
In the case of BlkIoTune it is unnecessary to strip the QEMU host drive
prefix because we operate on "info block"/"query-block" output that uses
host drive names.
Stripping the prefix incorrectly caused string comparisons to fail since
we were comparing the guest device name against the host device name.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently, when we are doing (managed) save, we insert the
iohelper between the qemu and OS. The pipe is created, the
writing end is passed to qemu and the reading end to the
iohelper. It reads data and write them into given file. However,
with write() being asynchronous data may still be in OS
caches and hence in some (corner) cases, all migration data
may have been read and written (not physically though). So
qemu will report success, as well as iohelper. However, with
some non local filesystems, where ENOSPACE is polled every X
time units, we may get into situation where all operations
succeeded but data hasn't reached the disk. And in fact will
never do. Therefore we ought sync caches to make sure data
has reached the block device on remote host.
QEMU uses 'i386' for its 32-bit x86 architecture, but libvirt
wants that to be 'i686', so we must fix it up
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
virPidFileReadPathIfAlive passed in an 'int *' where a 'pid_t *'
was expected, which breaks on Mingw64 targets. Also a few places
were using '%d' for formatting pid_t, change them to '%lld' and
force a cast to the longer type as done elsewhere in the same
file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=871756
Commit cd1e8d1 assumed that systems new enough to have journald
also have mkostemp; but this is not true for uclibc.
For that matter, use of mkstemp[s] is unsafe in a multi-threaded
program. We should prefer mkostemp[s] in the first place.
* bootstrap.conf (gnulib_modules): Add mkostemp, mkostemps; drop
mkstemp and mkstemps.
* cfg.mk (sc_prohibit_mkstemp): New syntax check.
* tools/virsh.c (vshEditWriteToTempFile): Adjust caller.
* src/qemu/qemu_driver.c (qemuDomainScreenshot)
(qemudDomainMemoryPeek): Likewise.
* src/secret/secret_driver.c (replaceFile): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Likewise.
https://bugzilla.redhat.com/show_bug.cgi?id=871312
Recent fixes made almost all the right steps to make emulator pinned
to the cpuset of the whole domain in case <emulatorpin> isn't
specified, but qemudDomainGetEmulatorPinInfo still reports all the
CPUs even when cpuset is specified. This patch fixes that.
There are multiple reasons canonicalize_file_name() used in
absolutePathFromBaseFile helper can fail. This patch enhances error
reporting from that helper.
When there is no 'qemu-kvm' binary and the emulator used for a machine
is, for example, 'qemu-system-x86_64' that, by default, runs without
kvm enabled, libvirt still supplies '-no-kvm' option to this process,
even though it does not recognize such option (making the start of a
domain fail in that case).
This patch fixes building a command-line for QEMU machines without KVM
acceleration and is based on following assumptions:
- QEMU_CAPS_KVM flag means that QEMU is running KVM accelerated
machines by default (without explicitly requesting that using a
command-line option). It is the closest to the truth according to
the code with the only exception being the comment next to the
flag, so it's fixed in this patch as well.
- QEMU_CAPS_ENABLE_KVM flag means that QEMU is, by default, running
without KVM acceleration and in case we need KVM acceleration it
needs to be explicitly instructed to do so. This is partially
true for the past (this option essentially means that QEMU
recognizes the '-enable-kvm' option, even though it's almost the
same).
Three FORWARD chain rules are added and two INPUT chain rules
are added when a network is started but only the FORWARD chain
rules are removed when the network is destroyed.
I noticed this while answering a list question about Java bindings
of volume creation. All other functions that take xml logged xmlDesc.
* src/libvirt.c (virStorageVolCreateXML)
(virStorageVolCreateXMLFrom): Use consistent spelling of xmlDesc,
and log the argument.
This patch resolves: https://bugzilla.redhat.com/show_bug.cgi?id=871201
If libvirt is restarted after updating the dnsmasq or radvd packages,
a subsequent "virsh net-destroy" will fail to kill the dnsmasq/radvd
process.
The problem is that when libvirtd restarts, it re-reads the dnsmasq
and radvd pidfiles, then does a sanity check on each pid it finds,
including checking that the symbolic link in /proc/$pid/exe actually
points to the same file as the path used by libvirt to execute the
binary in the first place. If this fails, libvirt assumes that the
process is no longer alive.
But if the original binary has been replaced, the link in /proc is set
to "$binarypath (deleted)" (it literally has the string " (deleted)"
appended to the link text stored in the filesystem), so even if a new
binary exists in the same location, attempts to resolve the link will
fail.
In the end, not only is the old dnsmasq/radvd not terminated when the
network is stopped, but a new dnsmasq can't be started when the
network is later restarted (because the original process is still
listening on the ports that the new process wants).
The solution is, when the initial "use stat to check for identical
inodes" check for identity between /proc/$pid/exe and $binpath fails,
to check /proc/$pid/exe for a link ending with " (deleted)" and if so,
truncate that part of the link and compare what's left with the
original binarypath.
A twist to this problem is that on systems with "merged" /sbin and
/usr/sbin (i.e. /sbin is really just a symlink to /usr/sbin; Fedora
17+ is an example of this), libvirt may have started the process using
one path, but /proc/$pid/exe lists a different path (indeed, on F17
this is the case - libvirtd uses /sbin/dnsmasq, but /proc/$pid/exe
shows "/usr/sbin/dnsmasq"). The further bit of code to resolve this is
to call virFileResolveAllLinks() on both the original binarypath and
on the truncated link we read from /proc/$pid/exe, and compare the
results.
The resulting code still succeeds in all the same cases it did before,
but also succeeds if the binary was deleted or replaced after it was
started.
After separating 5.x and 5.1 versions of ESX, we forgot to add 5.1
into the list of allowed connections, so connections to 5.1 fail since
v1.0.0-rc1-5-g1e7cd39
Ever since commit eefb881, ATTRIBUTE_NONNULL has normally been a
no-op under gcc (since it tends to cause more bugs than it cures
given gcc's current lame implementation of the attribute). However,
the macro is still useful to Coverity and other static-analysis
tools, but only if we use it correctly. Coverity follows gcc's lead
in accepting function declarations with attributes at the end, but
function bodies must attach attributes to the return type. That is,
these are valid:
void foo(void *arg) ATTRIBUTE_NONNULL(1);
void ATTRIBUTE_NONNULL(1) foo(void *arg);
void ATTRIBUTE_NONNULL(1) foo(void *arg) {}
but this is not:
void foo(void *arg) ATTRIBUTE_NONNULL(1) {}
even though you don't get a compile failure until you do static
analysis. Bug introduced in commit 80533ca, with these symptoms:
nodeinfo.c:206: error: expected ',' or ';' before '{' token
cc1: warning: unrecognized command line option "-Wno-suggest-attribute=const"
cc1: warning: unrecognized command line option "-Wno-suggest-attribute=pure"
make[3]: *** [libvirt_driver_la-nodeinfo.lo] Error 1
* src/nodeinfo.c (virNodeParseNode): Fix syntax error when
non-null attribute is in use.
Commit 34e8f63a3 altered virfile.o to drag in additional symbols,
which in turn led to pulling in other .o files and eventually causing
a link failure when systemtap probes are enabled, such as:
./.libs/libvirt_util.a(libvirt_util_la-event_poll.o): In function `virEventPollRunOnce':
/home/dummy/libvirt/src/util/event_poll.c:614: undefined reference to `libvirt_event_poll_run_semaphore'
./.libs/libvirt_util.a(libvirt_util_la-event_poll.o):(.note.stapsdt+0x24): undefined reference to `libvirt_event_poll_add_handle_semaphore'
Even though libvirt_iohelper and libvirt_parthelper don't directly
use the portion of virfile.o that drags in probing, it was easier
to satisfy the linker and get the build back up, than to figure out
whether it is even possible or worth trying to disentangle the mess.
* src/Makefile.am (libvirt_iohelper_LDADD)
(libvirt_parthelper_LDADD): Use libvirt_probes.lo when needed.
Currently, we use iohelper when saving/restoring a domain.
However, if there's some kind of error (like I/O) it is not
propagated to libvirt. Since it is not qemu who is doing
the actual write() it will not get error. The iohelper does.
Therefore we should check for iohelper errors as it makes
libvirt more user friendly.
In the XML warning, we print a virsh command line that can be used to
edit that XML. This patch prints UUIDs if the entity name contains
special characters (like shell metacharacters, or "--" that would break
parsing of the XML comment). If the entity doesn't have a UUID, just
print the virsh command that can be used to edit it.
When using block copy to pivot over to a new chain, the backing files
for the new chain might still need labeling (particularly if the user
passes --reuse-ext with a relative backing file name). Relabeling a
file that is already labeled won't hurt, so this just labels the entire
chain at the point of the pivot. Doing the relabel of the chain uses
the fact that we already safely probed the file type of an external
file at the start of the block copy.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot): Relabel chain before
asking qemu to pivot.
Use the recent addition of qemuDomainPrepareDiskChainElement to
obtain locking manager lease, permit a block device through cgroups,
and set the SELinux label; then audit the fact that we hand a new
file over to qemu. Alas, releasing the lease and label at the end
of the mirroring is a trickier prospect (we would have to trace the
backing chain of both source and destination, and be sure not to
revoke rights to any part of the chain that is shared), so for now,
virDomainBlockJobAbort still leaves things with additional access
granted (as block-pull and block-commit have the same problem of
not clamping access after completion, a future cleanup would cover
all three commands).
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Set up labeling.
Support the REUSE_EXT flag, in part by copying sanity checks from
snapshot code. This code introduces a case of probing an external
file for its type; such an action would be a security risk if the
existing file is supposed to be raw but the contents resemble some
other format; however, since the virDomainBlockRebase API has a
flag to force treating the file as raw rather than probe, we can
assume that probing is safe in all other instances. Besides, if
we don't probe or force raw, then qemu will.
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Allow REUSE_EXT
flag.
(qemuDomainBlockCopy): Wire up flag, and add some sanity checks.
Minimal patch to wire up all the pieces in the previous patches
to actually enable a block copy job. By minimal, I mean that
qemu creates the file (that is, no REUSE_EXT flag support yet),
SELinux must be disabled, a lock manager is not informed, and the
audit logs aren't updated. But those will be added as
improvements in future patches.
This patch is designed so that if we ever add a future API
virDomainBlockCopy with more bells and whistles (such as letting
the user specify a destination image format different than the
source), where virDomainBlockRebase is a wrapper around the
simpler portions of the new functionality, then the new API can
just reuse the new qemuDomainBlockCopy function and already
support _SHALLOW and _REUSE_EXT flags. Also note that libvirt.c
already filtered the new flags if _COPY is not present, so that
we are not impacting the case of BlockRebase being a wrapper
around BlockPull.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): New function.
(qemuDomainBlockRebase): Call it when appropriate.
Since libvirt drops locks between issuing a monitor command and
getting a response, it is possible for libvirtd to be restarted
before getting a response on a block-job-complete command; worse, it
is also possible for the guest to shut itself down during the window
while libvirtd is down, ending the qemu process. A management app
needs to know if the pivot happened (and the destination file
contains guest contents not in the source) or failed (and the source
file contains guest contents not in the destination), but since
the job is finished, 'query-block-jobs' no longer tracks the
status of the job, and if the qemu process itself has disappeared,
even 'query-block' cannot be checked to ask qemu its current state.
At the time of this patch, the design for persistent bitmap has not
been clarified, so a followup patch will be needed once qemu
actually figures out how to expose it, and we figure out how to use
it. In the meantime, we have a solution that avoids the worst of
the problem. [This problem was first analyzed with the RHEL 6.3
__com.redhat_drive-reopen command; which partly explains why
upstream qemu 1.3 ditched the drive-reopen idea and went with
block-job-complete plus persistent bitmap instead.]
If we surround 'drive-reopen' with a pause/resume pair, then we can
guarantee that the guest cannot modify either source or destination
files in the window of libvirtd uncertainty, and the management app
is guaranteed that either libvirt knows the outcome and reported it
correctly; or that on libvirtd restart, the guest will still be
paused and that the qemu process cannot have disappeared due to
guest shutdown; and use that as a clue that the management app must
implement recovery protocol, with both source and destination files
still being in sync and with 'query-block' still being an option as
part of that recovery. My testing shows that the pause window will
typically be only a fraction of a second.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot): Pause around
drive-reopen.
(qemuDomainBlockJobImpl): Update caller.
This is the bare minimum to end a copy job (of course, until a
later patch adds the ability to start a copy job, this patch
doesn't do much in isolation; I've just split the patches to
ease the review).
This patch intentionally avoids SELinux, lock manager, and audit
actions. Also, if libvirtd restarts at the exact moment that a
'block-job-complete' is in flight, the proposed proper way to
detect the outcome of that would be with a persistent bitmap and
some additional query commands when libvirtd restarts. This
patch is enough to test the common case of success when used
correctly, while saving the subtleties of proper cleanup for
worst-case errors for later.
When a mirror job is started, cancelling the job safely reverts back
to the source disk, regardless of whether the destination is in
phase 1 (streaming, in which case the destination is worthless) or
phase 2 (mirroring, in which case the destination is synced up to
the source at the time of the cancel). Our existing code does just
fine in either phase, other than some bookkeeping cleanup; this
implements live block copy.
Ideas for future enhancements via new flags:
Depending on when persistent bitmap support is added, it may be
worth adding a VIR_DOMAIN_REBASE_COPY_ATOMIC flag that fails up
front if we detect an older qemu with risky pivot operation.
Interesting side note: while snapshot-create --disk-only creates a
copy of the disk at a point in time by moving the domain on to a
new file (the copy is the file now in the just-extended backing
chain), blockjob --abort of a copy job creates a copy of the disk
while keeping the domain on the original file. There may be
potential improvements to the snapshot code to exploit block copy
over multiple disks all at one point in time. And, if
'block-job-cancel' were made part of 'transaction', you could
copy multiple disks at the same point in time without pausing
the domain. This also implies we may want to add a --quiesce flag
to virDomainBlockJobAbort, so that when breaking a mirror (whether
by cancel or pivot), the side of the mirror that we are abandoning
is at least in a stable state with regards to guest I/O.
* src/qemu/qemu_driver.c (qemuDomainBlockJobAbort): Accept new flag.
(qemuDomainBlockPivot): New helper function.
(qemuDomainBlockJobImpl): Implement it.
Handle the new type of block copy event and info. Of course,
this patch does nothing until a later patch actually allows the
creation/abort of a block copy job.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_JOB_READY): New
block job status.
* src/libvirt.c (virDomainBlockRebase): Document the event.
* src/qemu/qemu_monitor_json.c (eventHandlers): New event.
(qemuMonitorJSONHandleBlockJobReady): New function.
(qemuMonitorJSONGetBlockJobInfoOne): Translate new job type.
(qemuMonitorJSONHandleBlockJobImpl): Handle new event and job type.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Recognize
the event to minimize snooping.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Snoop a successful
info query to save effort on a pivot request.
For now, disk migration via block copy job is not implemented in
libvirt. But when we do implement it, we have to deal with the
fact that qemu does not yet provide an easy way to re-start a qemu
process with mirroring still intact. Paolo has proposed an idea
for a persistent dirty bitmap that might make this possible, but
until that design is complete, it's hard to say what changes
libvirt would need. Even something like 'virDomainSave' becomes
hairy, if you realize the implications that 'virDomainRestore'
would be stuck with recreating the same mirror layout.
But if we step back and look at the bigger picture, we realize that
the initial client of live storage migration via disk mirroring is
oVirt, which always uses transient domains, and that if a transient
domain is destroyed while a mirror exists, oVirt can easily restart
the storage migration by creating a new domain that visits just the
source storage, with no loss in data.
We can make life a lot easier by being cowards for now, forbidding
certain operations on a domain. This patch guarantees that we
never get in a state where we would have to restart a domain with
a mirroring block copy, by preventing saves, snapshots, migration,
hot unplug of a disk in use, and conversion to a persistent domain
(thankfully, it is still relatively easy to 'virsh undefine' a
running domain to temporarily make it transient, run tests on
'virsh blockcopy', then 'virsh define' to restore the persistence).
Later, if the qemu design is enhanced, we can relax our code.
The change to qemudDomainDefine looks a bit odd for undoing an
assignment, rather than probing up front to avoid the assignment,
but this is because of how virDomainAssignDef combines both a
lookup and assignment into a single function call.
* src/conf/domain_conf.h (virDomainHasDiskMirror): New prototype.
* src/conf/domain_conf.c (virDomainHasDiskMirror): New function.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal)
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot)
(qemuDomainBlockJobImpl, qemudDomainDefine): Prevent dangerous
actions while block copy is already in action.
* src/qemu/qemu_hotplug.c (qemuDomainDetachDiskDevice): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Likewise.
Upstream qemu 1.3 is adding two new monitor commands, 'drive-mirror'
and 'block-job-complete'[1], which can drive live block copy and
storage migration. [Additionally, RHEL 6.3 had backported an earlier
version of most of the same functionality, but under the names
'__com.redhat_drive-mirror' and '__com.redhat_drive-reopen' and with
slightly different JSON arguments, and has been using patches similar
to these upstream patches for several months now.]
The libvirt API virDomainBlockRebase as already committed for 0.9.12
is flexible enough to expose the basics of block copy, but some
additional features in the 'drive-mirror' qemu command, such as
setting error policy, setting granularity, or using a persistent
bitmap, may later require a new libvirt API virDomainBlockCopy. I
will wait to add that API until we know more about what qemu 1.3
will finally provide.
This patch caters only to the upstream qemu 1.3 interface, although
I have proven that the changes for RHEL 6.3 can be isolated to
just qemu_monitor_json.c, and the rest of this series will
gracefully handle either interface once the JSON differences are
papered over in a downstream patch.
For consistency with other block job commands, libvirt must handle
the bandwidth argument as MiB/sec from the user, even though qemu
exposes the speed argument as bytes/sec; then again, qemu rounds
up to cluster size internally, so using MiB hides the worst effects
of that rounding if you pass small numbers.
[1]https://lists.gnu.org/archive/html/qemu-devel/2012-10/msg04123.html
* src/qemu/qemu_capabilities.h (QEMU_CAPS_DRIVE_MIRROR)
(QEMU_CAPS_DRIVE_REOPEN): New bits.
* src/qemu/qemu_capabilities.c (qemuCaps): Name them.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set
them.
(qemuMonitorJSONDriveMirror, qemuMonitorDrivePivot): New functions.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDriveMirror)
(qemuMonitorDrivePivot): Declare them.
* src/qemu/qemu_monitor.c (qemuMonitorDriveMirror)
(qemuMonitorDrivePivot): New passthroughs.
* src/qemu/qemu_monitor.h (qemuMonitorDriveMirror)
(qemuMonitorDrivePivot): Declare them.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=862515
which describes inconsistencies in dealing with duplicate mac
addresses on network devices in a domain.
(at any rate, it resolves *almost* everything, and prints out an
informative error message for the one problem that isn't solved, but
has a workaround.)
A synopsis of the problems:
1) you can't do a persistent attach-interface of a device with a mac
address that matches an existing device.
2) you *can* do a live attach-interface of such a device.
3) you *can* directly edit a domain and put in two devices with
matching mac addresses.
4) When running virsh detach-device (live or config), only MAC address
is checked when matching the device to remove, so the first device
with the desired mac address will be removed. This isn't always the
one that's wanted.
5) when running virsh detach-interface (live or config), the only two
items that can be specified to match against are mac address and model
type (virtio, etc) - if multiple netdevs match both of those
attributes, it again just finds the first one added and assumes that
is the only match.
Since it is completely valid to have multiple network devices with the
same MAC address (although it can cause problems in many cases, there
*are* valid use cases), what is needed is:
1) remove the restriction that prohibits doing a persistent add of a
netdev with a duplicate mac address.
2) enhance the backend of virDomainDetachDeviceFlags to check for
something that *is* guaranteed unique (but still work with just mac
address, as long as it yields only a single results.
This patch does three things:
1) removes the check for duplicate mac address during a persistent
netdev attach.
2) unifies the searching for both live and config detach of netdevices
in the subordinate functions of qemuDomainModifyDeviceFlags() to use the
new function virDomainNetFindIdx (which matches mac address and PCI
address if available, checking for duplicates if only mac address was
specified). This function returns -2 if multiple matches are found,
allowing the callers to print out an appropriate message.
Steps 1 & 2 are enough to fully fix the problem when using virsh
attach-device and detach-device (which require an XML description of
the device rather than a bunch of commandline args)
3) modifies the virsh detach-interface command to check for multiple
matches of mac address and show an error message suggesting use of the
detach-device command in cases where there are multiple matching mac
addresses.
Later we should decide how we want to input a PCI address on the virsh
commandline, and enhance detach-interface to take a --address option,
eliminating the need to use detach-device
* src/conf/domain_conf.c
* src/conf/domain_conf.h
* src/libvirt_private.syms
* added new virDomainNetFindIdx function
* removed now unused virDomainNetIndexByMac and
virDomainNetRemoveByMac
* src/qemu/qemu_driver.c
* remove check for duplicate max from qemuDomainAttachDeviceConfig
* use virDomainNetFindIdx/virDomainNetRemove instead
of virDomainNetRemoveByMac in qemuDomainDetachDeviceConfig
* use virDomainNetFindIdx instead of virDomainIndexByMac
in qemuDomainUpdateDeviceConfig
* src/qemu/qemu_hotplug.c
* use virDomainNetFindIdx instead of a homespun loop in
qemuDomainDetachNetDevice.
* tools/virsh-domain.c: modified detach-interface command as described
above
It turns out that the cpuacct results properly account for offline
cpus, and always returns results for every possible cpu, not just
the online ones. So there is no need to check the map of online
cpus in the first place, merely only a need to know the maximum
possible cpu. Meanwhile, virNodeGetCPUBitmap had a subtle change
from returning the maximum id to instead returning the width of
the bitmap (one larger than the maximum id) in commit 2f4c5338,
which made this code encounter some off-by-one logic leading to
bad error messages when a cpu was offline:
$ virsh cpu-stats dom
error: Failed to virDomainGetCPUStats()
error: An error occurred, but the cause is unknown
Cleaning this up unraveled a chain of other unused variables.
* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Drop
pointless check for cpumap changes, and use correct number of
cpus. Simplify signature.
(qemuDomainGetCPUStats): Adjust caller.
* src/nodeinfo.h (nodeGetCPUCount): New prototype.
(nodeGetCPUBitmap): Drop unused parameter.
* src/nodeinfo.c (nodeGetCPUBitmap): Likewise.
(nodeGetCPUMap): Adjust caller.
(nodeGetCPUCount): New function.
* src/libvirt_private.syms (nodeinfo.h): Export it.
Commit 246143b fixed a warning on older gcc, but caused a warning
on newer gcc.
../../src/rpc/virnetserverservice.c: In function 'virNetServerServiceNewPostExecRestart':
../../src/rpc/virnetserverservice.c:277:41: error: pointer targets in passing argument 3 of 'virJSONValueObjectGetNumberUint' differ in signedness [-Werror=pointer-sign]
* src/rpc/virnetserverservice.c: Use correct types.
With older gcc and 64-bit size_t, the compiler issues a real warning:
rpc/virnetserverservice.c:277: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
Introduced in commit 0cc79255. Depending on machine endianness,
this warning represents a real bug that could mis-interpret the
value by a factor of 2^32. I don't know why I couldn't get newer
gcc to report the same warning message.
* src/rpc/virnetserverservice.c
(virNetServerServiceNewPostExecRestart): Use temporary instead.
Found this when building on RHEL5:
parallels/parallels_storage.c: In function 'parallelsStorageOpen':
parallels/parallels_storage.c:180: error: 'for' loop initial declaration used outside C99 mode
(and similar error in parallels_driver.c). This was in spite of
configuring with "-Wno-error".
This commit changes the behavior of LIBVIRT_DEBUG=1 libvirtd:
$ git show 7022b09111
commit 7022b09111
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Thu Sep 27 13:13:09 2012 +0100
Automatically enable systemd journal logging
Probe to see if the systemd journal is accessible, and if
so enable logging to the journal by default, rather than
stderr (current default under systemd).
Previously 'LIBVIRT_DEBUG=1 /usr/sbin/libvirtd' would show all debug
output to stderr, now it send debug output to the journal.
Only use the journal by default if running in daemon mode, or
if stdin is _not_ a tty. This should make libvirtd launched from
systemd use the journal, but preserve the old behavior in most
situations.
This was found during testing of the fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=868483
networkValidate was supposed to check for the existence of multiple
portgroups and report an error if this was encountered. It did, but
there were two problems:
1) even though it logged an error, it still returned success, allowing
the operation to continue.
2) It could exit the portgroup checking loop early (or possibly not
even do it once) if a vlan tag was supplied in the base network config
or one of the portgroups.
This patch fixes networkValidate to return failure in addition to
logging the error, and also changes it to not exit the portgroup
checking loop early. The logic was a bit off in the checking for vlan
anyway, and it's intertwined with fixing the early loop exit, so I
fixed that as well. Now it correctly checks for combinations where a
<virtualport> is specified in the base network def and <vlan> is given
in a portgroup, as well as the opposite (<vlan> in base network def
and <virtualport> in portgroup), and ignores the case of a disallowed
vlan when using *no* portgroup if there is a default portgroup (since
in that case there is no way to not use any portgroup).
Added an implemention of virNodeGetCPUMap to nodeinfo.c,
(nodeGetCPUMap) which can be used by all drivers for a Linux
hypervisor host.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Callers should not need to know what the name of the file to
be read in the Linux-specific version of nodeGetCPUmap;
furthermore, qemu cares about online cpus, not present cpus,
when determining which cpus to skip.
While at it, I fixed the fact that we were computing the maximum
online cpu id by doing a slow iteration, when what we really want
to know is the max available cpu.
* src/nodeinfo.h (nodeGetCPUmap): Rename...
(nodeGetCPUBitmap): ...and simplify signature.
* src/nodeinfo.c (linuxParseCPUmax): New function.
(linuxParseCPUmap): Simplify and alter signature.
(nodeGetCPUBitmap): Change implementation.
* src/libvirt_private.syms (nodeinfo.h): Reflect rename.
* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Update
caller.
Sometimes it's handy to know how many bits are set.
* src/util/bitmap.h (virBitmapCountBits): New prototype.
(virBitmapNextSetBit): Use correct type.
* src/util/bitmap.c (virBitmapNextSetBit): Likewise.
(virBitmapSetAll): Maintain invariant of clear tail bits.
(virBitmapCountBits): New function.
* src/libvirt_private.syms (bitmap.h): Export it.
* tests/virbitmaptest.c (test2): Test it.
Also remove warnings for upcoming versions. There hadn't been any
compatibility problems with new ESX version over the whole lifetime
of the ESX driver, so I don't expect any in the future.
Update documentation to mention vSphere 5.x support.
Qemu has added some new feature flags. This patch adds them to libvirt.
The new features are for the cpuid function 0x7 that takes an argument
in the ecx register. Currently only 0x0 is used as the argument so I was
lazy and I just clear the registers to 0 before calling cpuid. In future
when there maybe will be some other possible arguments, we will need to
improve the cpu detection code to take this into account.
On one hand, numad probably will manage the affinity of domain process
dynamically in future. On the other hand, even numad won't manage it,
it still could confusion. Let's make things simpler enough to avoid
the lair for now.
When the cpu placement model is "auto", it sets the affinity for
domain process with the advisory nodeset from numad, however,
creating cgroup for the domain process (called emulator thread
in some contexts) later overrides that with pinning it to all
available pCPUs.
How to reproduce:
* Configure the domain with "auto" placement for <vcpu>, e.g.
<vcpu placement='auto'>4</vcpu>
* % virsh start dom
* % cat /proc/$dompid/status
Though the emulator cgroup cause conflicts, but we can't simply
prohibit creating it, as other tunables are still useful, such
as "emulator_period", which is used by API
virDomainSetSchedulerParameter. So this patch doesn't prohibit
creating the emulator cgroup, but inherit the nodeset from numad,
and reset the affinity for domain process.
* src/qemu/qemu_cgroup.h: Modify definition of qemuSetupCgroupForEmulator
to accept the passed nodenet
* src/qemu/qemu_cgroup.c: Set the affinity with the passed nodeset
Abstract the codes to prepare cpumap into a helper a function,
which can be used later.
* src/qemu/qemu_process.h: Declare qemuPrepareCpumap
* src/qemu/qemu_process.c: Implement qemuPrepareCpumap, and use it.
- Defined the wire protocol format for virNodeGetCPUMap and its
arguments
- Implemented remote method invocation (remoteNodeGetCPUMap)
- Implemented method dispatcher (remoteDispatchNodeGetCPUMap)
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Adding a new API to obtain information about the
host node's present, online and offline CPUs.
int virNodeGetCPUMap(virConnectPtr conn,
unsigned char **cpumap,
unsigned int *online,
unsigned int flags);
The function will return the number of CPUs present on the host
or -1 on failure;
If cpumap is non-NULL virNodeGetCPUMap will allocate an array
containing a bit map representation of the online CPUs. It's
the callers responsibility to deallocate cpumap using free().
If online is non-NULL, the variable pointed to will contain
the number of online host node CPUs.
The variable flags has been added to support future extensions
and must be set to 0.
Extend the driver structure by nodeGetCPUMap entry in support of the
new API virNodeGetCPUMap.
Added implementation of virNodeGetCPUMap to libvirt.c
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Transport Open vSwitch per-port data during live
migration by using the utility functions
virNetDevOpenvswitchGetMigrateData() and
virNetDevOpenvswitchSetMigrateData().
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Add utility functions for Open vSwitch to both save
per-port data before a live migration, and restore the
per-port data after a live migration.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Add the ability for the Qemu V3 migration protocol to
include transporting network configuration. A generic
framework is proposed with this patch to allow for the
transfer of opaque data.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Laine Stump <laine@laine.org>
In commit 371ddc98, I mistakenly added the check for sysctl
version 9 after setting the hypercall version to 1, which will
fail with
error : xenHypervisorDoV1Op:967 : Unable to issue hypervisor
ioctl 3166208: Function not implemented
This check should be included along with the others that use
hypercall version 2.
When restoring selinux labels after a VM is stopped, any non-standard
path that doesn't have a default selinux label causes the process
to stop and exit early. This isn't really an error condition IMO.
Of course the selinux API could be erroring for some other reason
but hopefully that's rare enough to not need explicit handling.
Common example here is storing disk images in a non-standard location
like under /mnt.
We put a comment containing "virsh edit <domain_name>" at the start of
the XML. W3C recommendation forbids the use of "--" in comments [1] and
libvirt can't parse it either. This patch omits the domain name if it
contains a double hyphen.
[1] http://www.w3.org/TR/REC-xml/#sec-comments
Rename the 'wait' parameter to 'loop'.
This silences the warning:
storage/storage_backend.c:1348:34: error: declaration of 'wait' shadows
a global declaration [-Werror=shadow]
and fixes the build with -Werror.
--
Note: loop is pool backwards.
The snapshot code when reusing an existing file had hard-to-read
logic, as well as a missing sanity check: REUSE_EXT should require
the destination to already be present.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare): Require
destination on REUSE_EXT, rename variable for legibility.
Fixes a build failure on cygwin:
cc1: warnings being treated as errors
security/security_dac.c: In function 'virSecurityDACSetProcessLabel':
security/security_dac.c:862:5: error: format '%u' expects type 'unsigned int', but argument 7 has type 'uid_t' [-Wformat]
security/security_dac.c:862:5: error: format '%u' expects type 'unsigned int', but argument 8 has type 'gid_t' [-Wformat]
* src/security/security_dac.c (virSecurityDACSetProcessLabel)
(virSecurityDACGenLabel): Use proper casts.
virStorageVolLookupByPath is an API call that virt-manager uses
quite a bit when dealing with storage. This call use BackendStablePath
which has several usleep() heuristics that can be tripped up
and hang virt-manager for a while.
Current example: an empty mpath pool pointing to /dev/mapper makes
_any_ calls to virStorageVolLookupByPath take 5 seconds.
The sleep heuristics are actually only needed in certain cases
when we are waiting for new storage to appear, so let's skip the
timeout steps when calling from LookupByPath.
Currently it's assumed that qemu always supports VNC, however it is
definitely possible to compile qemu without VNC support so we should at
the very least check for it and handle that correctly.
Yet another instance of where using plain open() mishandles files
that live on root-squash NFS, and where improving the API can
improve the chance of a successful probe.
* src/util/storage_file.h (virStorageFileProbeFormat): Alter
signature.
* src/util/storage_file.c (virStorageFileProbeFormat): Use better
method for opening file.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Update caller.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise.
In v2 migration protocol, XML is obtained by calling domainGetXMLDesc.
This includes the default USB controller in XML, which breaks migration
to older libvirt (before 0.9.2).
Commit 409b5f5495
qemu: Emit compatible XML when migrating a domain
only fixed this for v3 migration.
This patch uses the new VIR_DOMAIN_XML_MIGRATABLE flag (detected by
VIR_DRV_FEATURE_XML_MIGRATABLE) to obtain XML without the default controller,
enabling backward v2 migration.
As we switched to setting capabilities based on QMP communication,
qemu seamless-migration capability was not set. In the -help output
this knob is called seamless-migration=[on|off]. The equivalent in
QMP world is SPICE_MIGRATE_COMPLETED event (qemu upstream commit
2fdd16e2).
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=868483
virNetworkUpdate, virNetworkDefine, and virNetworkCreate all three
allow network definitions to contain multiple <portgroup> elements
with default='yes'. Only a single default portgroup should be allowed
for each network.
This patch updates networkValidate() (called by both
virNetworkCreate() and virNetworkDefine()) and
virNetworkDefUpdatePortGroup (called by virNetworkUpdate() to not
allow multiple default portgroups.
This fixes the problem reported in:
https://bugzilla.redhat.com/show_bug.cgi?id=868389
Previously, the dnsmasq hosts file (used for static dhcp entries, and
addnhosts file (used for additional dns host entries) were only
created/referenced on the dnsmasq commandline if there was something
to put in them at the time the network was started. Once we can update
a network definition while it's active (which is now possible with
virNetworkUpdate), this is no longer a valid strategy - if there were
0 dhcp static hosts (resulting in no reference to the hosts file on the
commandline), then one was later added, the commandline wouldn't have
linked dnsmasq up to the file, so even though we create it, dnsmasq
doesn't pay any attention.
The solution is to just always create these files and reference them
on the dnsmasq commandline (almost always, anyway). That way dnsmasq
can notice when a new entry is added at runtime (a SIGHUP is sent to
dnsmasq by virNetworkUdpate whenever a host entry is added or removed)
The exception to this is that the dhcp static hosts file isn't created
if there are no lease ranges *and* no static hosts. This is because in
this case dnsmasq won't be setup to listen for dhcp requests anyway -
in that case, if the count of dhcp hosts goes from 0 to 1, dnsmasq
will need to be restarted anyway (to get it listening on the dhcp
port). Likewise, if the dhcp hosts count goes from 1 to 0 (and there
are no dhcp ranges) we need to restart dnsmasq so that it will stop
listening on port 67. These special situations are handled in the
bridge driver's networkUpdate() by checking for ((bool)
nranges||nhosts) both before and after the update, and triggering a
dnsmasq restart if the before and after don't match.
https://bugzilla.redhat.com/show_bug.cgi?id=866364
pointed out a crash due to virNetworkObjAssignDef free'ing
network->newDef without NULLing it afterward. A fix for this is in
upstream commit b7e9202401. While the
NULLing of newDef was a legitimate fix, newDef should have already
been empty (NULL) anyway (as indicated in the comment that was deleted
by that commit).
The reason that newDef had a non-NULL value (i.e. the root cause) was
that networkStartNetwork() had failed after populating
network->newDef, but then neglected to free/NULL newDef in the
cleanup.
(A bit of background here: network->newDef should contain the
persistent config of a network when a network is active (and of course
only when it is persisten), and NULL at all other times. There is also
a network->def which should contain the persistent definition of the
network when it is inactive, and the current live state at all other
times. The idea is that you can make changes to network->newDef which
will take effect the next time the network is restarted, but won't
mess with the current state of the network (virDomainObj has a similar
pair of virDomainDefs that behave in the same fashion). Personally I
think there should be a network->live and network->config, and the
location of the persistent config should *always* be in
network->config, but that's for a later cleanup).
Since I love things to be symmetric, I created a new function called
virNetworkObjUnsetDefTransient(), which reverses the effects of
virNetworkObjSetDefTransient(). I don't really like the name of the
new function, but then I also didn't really like the name of the old
one either (it's just named that way to match a similar function in
the domain conf code).
Gcc with optimization warns:
../../src/qemu/qemu_driver.c: In function 'qemuDomainBlockCommit':
../../src/qemu/qemu_driver.c:12813:46: error: 'disk' may be used uninitialized in this function [-Werror=maybe-uninitialized]
../../src/qemu/qemu_driver.c:12698:25: note: 'disk' was declared here
cc1: all warnings being treated as errors
so obviously I had only been testing with optimization off.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Guard cleanup.
I finally have all the pieces in place to perform a block-commit with
SELinux enforcing. There's still missing cleanup work when the commit
completes, but doing that requires tracking both the backing chain and
the base and top files within that chain in domain XML across libvirtd
restarts. Furthermore, from a security standpoint, once you have
granted access, you must assume any damage that can be done will be
done; later revoking access is nice to minimize the window of damage,
but less important as it does not affect the fact that damage can be
done in the first place. Therefore, deferring the revoke efforts until
we have better XML tracking of what chain operations are in effect,
including across a libvirtd restart, is reasonable.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Label disks as
needed.
(qemuDomainPrepareDiskChainElement): Cast away const.
Previously, snapshot code did its own permission granting (lock
manager, cgroup device controller, and security manager labeling)
inline. But now that we are adding block-commit and block-copy
which also have to change permissions, it's better to reuse
common code for the task. While snapshot should fall back to
no access if read-write access failed, block-commit will want to
fall back to read-only access. The common code doesn't know
whether failure to grant read-write access should revert to no
access (snapshot, block-copy) or read-only access (block-commit).
This code can also be used to revoke access to unused files after
block-pull.
It might be nice to clean things up in a future patch by adding
new functions to the lock manager, cgroup manager, and security
manager that takes a single file name and applies context of a
disk to that file, rather than the current semantics of applying
context to the entire chain already associated to a disk. That
way, we could avoid the games this patch plays of temporarily
swapping out the disk->src and related fields of the disk. But
that would involve more code changes, so this patch really is
the smallest hack for doing the necessary work; besides, this
patch is more or less code motion (the hack was already employed
by the snapshot creation code, we are just making it reusable).
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Refactor labeling hacks...
(qemuDomainPrepareDiskChainElement): ...into new function.
Now that we can crawl the chain of backing files, we can do
argument validation and implement the 'shallow' flag. In
testing this, I discovered that it can be handy to pass the
shallow flag and an explicit base, as a means of validating
that the base is indeed the file we expected.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): Crawl through
chain to implement shallow flag.
* src/libvirt.c (virDomainBlockCommit): Relax API.
This is the bare minimum to kick off a block commit. In particular,
flags support is missing (shallow requires us to crawl the backing
chain to determine the file name to pass to the qemu monitor command;
delete requires us to track what needs to be deleted at the time
the completion event fires). Also, we are relying on qemu to do
error checking (such as validating 'top' and 'base' as being members
of the backing chain), including the fact that the current qemu code
does not support committing the active layer (although it is still
planned to add that before qemu 1.3). Since the active layer won't
change, we have it easy and do not have to alter the domain XML.
Additionally, this will fail if SELinux is enforcing, because we fail
to grant qemu proper read/write access to the files it will modify.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit): New function.
(qemuDriver): Register it.
qemu 1.3 will be adding a 'block-commit' monitor command, per
qemu.git commit ed61fc1. It matches nicely to the libvirt API
virDomainBlockCommit.
* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCK_COMMIT): New bit.
* src/qemu/qemu_capabilities.c (qemuCapsProbeQMPCommands): Set it.
* src/qemu/qemu_monitor.h (qemuMonitorBlockCommit): New prototype.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockCommit):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockCommit): Implement it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockCommit):
Likewise.
(qemuMonitorJSONHandleBlockJobImpl)
(qemuMonitorJSONGetBlockJobInfoOne): Handle new event type.
We used to walk the backing file chain at least twice per disk,
once to set up cgroup device whitelisting, and once to set up
security labeling. Rather than walk the chain every iteration,
which possibly includes calls to fork() in order to open root-squashed
NFS files, we can exploit the cache of the previous patch.
* src/conf/domain_conf.h (virDomainDiskDefForeachPath): Alter
signature.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Require caller
to supply backing chain via disk, if recursion is desired.
* src/security/security_dac.c
(virSecurityDACSetSecurityImageLabel): Adjust caller.
* src/security/security_selinux.c
(virSecuritySELinuxSetSecurityImageLabel): Likewise.
* src/security/virt-aa-helper.c (get_files): Likewise.
* src/qemu/qemu_cgroup.c (qemuSetupDiskCgroup)
(qemuTeardownDiskCgroup): Likewise.
(qemuSetupCgroup): Pre-populate chain.
Technically, we should not be re-probing any file that qemu might
be currently writing to. As such, we should cache the backing
file chain prior to starting qemu. This patch adds the cache,
but does not use it until the next patch.
Ultimately, we want to also store the chain in domain XML, so that
it is remembered across libvirtd restarts, and so that the only
kosher way to modify the backing chain of an offline domain will be
through libvirt API calls, but we aren't there yet. So for now, we
merely invalidate the cache any time we do a live operation that
alters the chain (block-pull, block-commit, external disk snapshot),
as well as tear down the cache when the domain is not running.
* src/conf/domain_conf.h (_virDomainDiskDef): New field.
* src/conf/domain_conf.c (virDomainDiskDefFree): Clean new field.
* src/qemu/qemu_domain.h (qemuDomainDetermineDiskChain): New
prototype.
* src/qemu/qemu_domain.c (qemuDomainDetermineDiskChain): New
function.
* src/qemu/qemu_driver.c (qemuDomainAttachDeviceDiskLive)
(qemuDomainChangeDiskMediaLive): Pre-populate chain.
(qemuDomainSnapshotCreateSingleDiskActive): Uncache chain before
snapshot.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Update
chain after block pull.
In order to temporarily label files read/write during a commit
operation, we need to crawl the backing chain and find the absolute
file name that needs labeling in the first place, as well as the
name of the file that owns the backing file.
* src/util/storage_file.c (virStorageFileChainLookup): New
function.
* src/util/storage_file.h: Declare it.
* src/libvirt_private.syms (storage_file.h): Export it.
In order to search for a backing file name as literally present
in a chain, we need to remember if the chain had relative names.
Also, searching for absolute names is easier if we only have
to canonicalize once, rather than on every iteration.
* src/util/storage_file.h (_virStorageFileMetadata): Add field.
* src/util/storage_file.c (virStorageFileGetMetadataFromBuf):
(virStorageFileFreeMetadata): Manage it
(absolutePathFromBaseFile): Store absolute names in canonical form.
Requiring pre-allocation was an unusual idiom. It allowed iteration
over the backing chain to use fewer mallocs, but made one-shot
clients harder to read. Also, this makes it easier for a future
patch to move away from opening fds on every iteration over the chain.
* src/util/storage_file.h (virStorageFileGetMetadataFromFD): Alter
signature.
* src/util/storage_file.c (virStorageFileGetMetadataFromFD): Allocate
return value.
(virStorageFileGetMetadata): Update clients.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Likewise.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Likewise.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Likewise.
Previously, no one was using virStorageFileGetMetadata, and for good
reason - it couldn't support root-squash NFS. Change the signature
and make it useful to future patches, including enhancing the metadata
to recursively track the entire chain.
* src/util/storage_file.h (_virStorageFileMetadata): Add field.
(virStorageFileGetMetadata): Alter signature.
* src/util/storage_file.c (virStorageFileGetMetadata): Rewrite.
(virStorageFileGetMetadataRecurse): New function.
(virStorageFileFreeMetadata): Handle recursion.
Backing chains can end on a network protocol, such as nbd:xxx; we
should not attempt to probe the file system in this case.
* src/storage/storage_backend_fs.c (virStorageBackendProbeTarget):
Only probe files.
This is the last use of raw strings for disk formats throughout
the src/conf directory.
* src/conf/snapshot_conf.h (_virDomainSnapshotDiskDef): Store enum
rather than string for disk type.
* src/conf/snapshot_conf.c (virDomainSnapshotDiskDefClear)
(virDomainSnapshotDiskDefParseXML, virDomainSnapshotDefFormat):
Adjust users.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotCreateSingleDiskActive): Likewise.
Express the default disk type as an enum, for easier handling.
* src/conf/capabilities.h (_virCaps): Store enum rather than
string for disk type.
* src/conf/domain_conf.c (virDomainDiskDefParseXML): Adjust
clients.
* src/qemu/qemu_driver.c (qemuCreateCapabilities): Likewise.
We have historically allowed 'aio' as a synonym for 'raw' for
back-compat to xen, but since a future patch will move to using
an enum value, we have to pick one to be our preferred output
name. This is a slight change in the output XML, but the sexpr
and xm outputs should still be identical, and the input XML can
still use either form.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Move aio
back-compat...
(virDomainDiskDefParseXML): ...to parse time.
* src/xenxs/xen_sxpr.c (xenParseSxprDisks, xenFormatSxprDisk): ...and
to output time.
* src/xenxs/xen_xm.c (xenParseXM, xenFormatXMDisk): Likewise.
* tests/sexpr2xmldata/sexpr2xml-*.xml: Update tests.
When an image has no backing file, using VIR_STORAGE_FILE_AUTO
for its type is a bit confusing. Additionally, a future patch
would like to reserve a default value for the case of no file
type specified in the XML, but different from the current use
of -1 to imply probing, since probing is not always safe.
Also, a couple of file types were missing compared to supported
code: libxl supports 'vhd', and qemu supports 'fat' for directories
passed through as a file system.
* src/util/storage_file.h (virStorageFileFormat): Add
VIR_STORAGE_FILE_NONE, VIR_STORAGE_FILE_FAT, VIR_STORAGE_FILE_VHD.
* src/util/storage_file.c (virStorageFileMatchesVersion): Match
documentation when version probing not supported.
(cowGetBackingStore, qcowXGetBackingStore, qcow1GetBackingStore)
(qcow2GetBackingStoreFormat, qedGetBackingStore)
(virStorageFileGetMetadataFromBuf)
(virStorageFileGetMetadataFromFD): Take NONE into account.
* src/conf/domain_conf.c (virDomainDiskDefForeachPath): Likewise.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Likewise.
* src/conf/storage_conf.c (virStorageVolumeFormatFromString): New
function.
(poolTypeInfo): Use it.
Relabeling tapfd right after the tap device is created.
qemuPhysIfaceConnect is common function called both for static
netdevs and for hotplug netdevs.
Having hostuuid in migration cookie is a nice bonus since it provides an
easy way of detecting migration to the same host. However, requiring it
breaks backward compatibility with older libvirt releases.
Recently, patches were added support for (managed)saving, restoring, and
migrating domains with host USB devices. However, qemu driver would
still forbid migration of such domains because qemuMigrationIsAllowed
was not updated.
If we can't probe the architecture from QMP we parse the architecture
from the qemu binaries name. This results in the architecture being i386
instead of i686 which then results in QEMU_CAPS_PCI_MULTIBUS being unset
which gives a broken qemu command line.
This probably didn't show up earlier since most of the time there's also
a /usr/bin/qemu around which results in i686 capabilities.
which frees all allocated memory but doesn't set the passed pointer to
NULL. Therefore, we must do it ourselves. This is causing actual
libvirtd crash: Basically, when doing 'virsh net-edit' the newDef should
be dropped. And the memory is freed, indeed. However, the pointer is
not set to NULL but kept instead. And the next duo of calls 'virsh
net-start' and 'virsh net-destroy' starts the disaster. The latter one
does the same as 'virsh destroy'; it sees that newDef is nonNULL so it
replaces def with newDef (which has been freed already as said a few
lines above). Therefore any subsequent call accessing def will hit the ground.
When libvirt cannot find a suitable CPU model for host CPU (easily
reproducible by running libvirt in a guest), it would not provide CPU
topology in capabilities XML either. Even though CPU topology is known
and can be queried by virNodeGetInfo. With this patch, CPU topology will
always be provided in capabilities XML regardless on the presence of CPU
model.
Hypervisors are starting to support HyperV Enlightenment features that
improve behavior of guests running Microsoft Windows operating systems.
This patch adds support for the "relaxed" feature that improves timer
behavior and also establishes a framework to add these features in
future.
Currently we query-spice after the main migration has completed
before moving to next state. Qemu reports this as boolean (not
enclosed within quotes). Therefore it is not correct to use
virJSONValueObjectGetString but virJSONValueObjectGetBoolean instead.
The machine in the last output line of <qemu-binary> -M ?
was always reported as default machine even if this wasn't the
actual default. Trivial fix.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
According to our recent changes (clarifications), we should be pinning
qemu's emulator processes using the <vcpu> 'cpuset' attribute in case
there is no <emulatorpin> specified. This however doesn't work
entirely as expected and this patch should resolve all the remaining
issues.
When p2p migration fails early because qemuMigrationIsAllowed or
qemuMigrationIsSafe say migration should be cancelled, we fail to clear
the migration-out async job. As a result of that, further APIs called
for the same domain may fail with Timed out during operation: cannot
acquire state change lock.
Reported by Guido Winkelmann.
Added support for retrieving the XML defining a specific interface via
the udev based backend to virInterface. Implement the following APIs
for the udev based backend:
* virInterfaceGetXMLDesc()
Note: Does not support bond devices.
There are some descriptions not right in PowerPC CPU model driver.
This patch is to fix them.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
Currently, the CPU model driver is not implemented for PowerPC.
Host's CPU information is needed to exposed to guests' XML file some
time.
This patch is to implement the callback functions of CPU model driver.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
CPU version can be got by PVR on PowerPC. So this PVR is defined in
the CPU data in cpuData structure.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Acked-by: Michal Privoznik <mprivozn@redhat.com>
It should relabel tapfd of virtual network of type VIR_DOMAIN_NET_TYPE_DIRECT
rather than VIR_DOMAIN_NET_TYPE_NETWORK and VIR_DOMAIN_NET_TYPE_BRIDGE
(commit ae368ebfcc introduced this bug)
Caution: The context of the two hunks is identical other than indentation.
Please be extremely cautious of where the patch gets applied.
On F17 at least, this command fails:
$ sudo /usr/sbin/lvcreate --name sparsetest -L 0K --virtualsize 16384K vgvirt
Unable to create new logical volume with no extents
Which is unfortunate since allocation=0 is what virt-manager tries to use
by default.
Rather than telling the user 'don't do that', let's just give them the
smallest allocation possible if alloc=0 is requested.
https://bugzilla.redhat.com/show_bug.cgi?id=866481
libvirt started using sanlock_killpath to implement on_lockfailure
action. Since sanlock_killpath was introduced in sanlock 2.4, libvirt
fails to build with older sanlock.
Currently there is a restriction that multi-threaded applications
must manually call virInitialize, before threads start using
libvirt, because it is not thread-safe. By switching it to use
a virOnceControl initializer we gain thread safety, and thus
applications no longer need to manually call it. They can rely
on virConnectOpen invoking it for them.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Win32 platforms don't have SIGKILL defined, but they do have
SIGABRT. Since our virProcess wrapper treats anything which
isn't SIGTERM/SIGINT as equivalent to SIGKILL, just use
SIGABRT on Win32.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add two new APIs virNetServerNewPostExecRestart and
virNetServerPreExecRestart which allow a virNetServerPtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.
This includes serialization of all registered services
and clients
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add two new APIs virNetServerClientNewPostExecRestart and
virNetServerClientPreExecRestart which allow a virNetServerClientPtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.
This includes serialization of the connected socket associated
with the client
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add two new APIs virNetServerServiceNewPostExecRestart and
virNetServerServicePreExecRestart which allow a virNetServerServicePtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.
This includes serialization of the listening sockets associated
with the service
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add two new APIs virNetSocketNewPostExecRestart and
virNetSocketPreExecRestart which allow a virNetSocketPtr
object to be created from a JSON object and saved to a
JSON object, for the purpose of re-exec'ing a process.
As well as saving the state in JSON format, the second
method will disable the O_CLOEXEC flag so that the open
file descriptors are preserved across the process re-exec()
Since it is not possible to serialize SASL or TLS encryption
state, an error will be raised if attempting to perform
serialization on non-raw sockets
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add two new APIs virLockSpaceNewPostExecRestart and
virLockSpacePreExecRestart which allow a virLockSpacePtr
object to be created from a JSON object and saved to a
JSON object, for the purposes of re-exec'ing a process.
As well as saving the state in JSON format, the second
method will disable the O_CLOEXEC flag so that the open
file descriptors are preserved across the process re-exec()
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The previously introduced virFile{Lock,Unlock} APIs provide a
way to acquire/release fcntl() locks on individual files. For
unknown reason though, the POSIX spec says that fcntl() locks
are released when *any* file handle referring to the same path
is closed. In the following sequence
threadA: fd1 = open("foo")
threadB: fd2 = open("foo")
threadA: virFileLock(fd1)
threadB: virFileLock(fd2)
threadB: close(fd2)
you'd expect threadA to come out holding a lock on 'foo', and
indeed it does hold a lock for a very short time. Unfortunately
when threadB does close(fd2) this releases the lock associated
with fd1. For the current libvirt use case for virFileLock -
pidfiles - this doesn't matter since the lock is acquired
at startup while single threaded an never released until
exit.
To provide a more generally useful API though, it is necessary
to introduce a slightly higher level abstraction, which is to
be referred to as a "lockspace". This is to be provided by
a virLockSpacePtr object in src/util/virlockspace.{c,h}. The
core idea is that the lockspace keeps track of what files are
already open+locked. This means that when a 2nd thread comes
along and tries to acquire a lock, it doesn't end up opening
and closing a new FD. The lockspace just checks the current
list of held locks and immediately returns VIR_ERR_RESOURCE_BUSY.
NB, the API as it stands is designed on the basis that the
files being locked are not being otherwise opened and used
by the application code. One approach to using this API is to
acquire locks based on a hash of the filepath.
eg to lock /var/lib/libvirt/images/foo.img the application
might do
virLockSpacePtr lockspace = virLockSpaceNew("/var/lib/libvirt/imagelocks");
lockname = md5sum("/var/lib/libvirt/images/foo.img");
virLockSpaceAcquireLock(lockspace, lockname);
NB, in this example, the caller should ensure that the path
is canonicalized before calculating the checksum.
It is also possible to do locks directly on resources by
using a NULL lockspace directory and then using the file
path as the lock name eg
virLockSpacePtr lockspace = virLockSpaceNew(NULL);
virLockSpaceAcquireLock(lockspace, "/var/lib/libvirt/images/foo.img");
This is only safe to do though if no other part of the process
will be opening the files. This will be the case when this
code is used inside the soon-to-be-reposted virlockd daemon
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Given Daniel's announcement[1], code targetting the next release will
be in 1.0.0, not 0.10.3. Changed mechanically with:
for f in $(git grep -l '0\(.\)10\13\b') ; do
sed -i -e 's/0\(.\)10\13/1\10\10/g' $f
done
[1]https://www.redhat.com/archives/libvir-list/2012-October/msg00403.html
* docs/formatdomain.html.in: Use 1.0.0 for next release.
* src/interface/interface_backend_udev.c: Likewise.
There was a crash possible when both <boot dev... and <boot
order... were specified due to virDomainDefParseBootXML() erroring out
before setting *tmp (which was free'd in cleanup). As a fix, I
created this cleanup that uses one pointer for all the temporary
stored XPath strings and values, plus this pointer is correctly
initialized to NULL.
BZ:https://bugzilla.redhat.com/show_bug.cgi?id=851981
When using macvtap, a character device gets first created by
kernel with name /dev/tapN, its selinux context is:
system_u:object_r:device_t:s0
Shortly, when udev gets notification when new file is created
in /dev, it will then jump in and relabel this file back to the
expected default context:
system_u:object_r:tun_tap_device_t:s0
There is a time gap happened.
Sometimes, it will have migration failed, AVC error message:
type=AVC msg=audit(1349858424.233:42507): avc: denied { read write } for
pid=19926 comm="qemu-kvm" path="/dev/tap33" dev=devtmpfs ino=131524
scontext=unconfined_u:system_r:svirt_t:s0:c598,c908
tcontext=system_u:object_r:device_t:s0 tclass=chr_file
This patch will label the tapfd device before qemu process starts:
system_u:object_r:tun_tap_device_t:MCS(MCS from seclabel->label)
This patch adds support for SUSPEND_DISK event; both lifecycle and
separated. The support is added for QEMU, machines are changed to
PMSUSPENDED, but as QEMU sends SHUTDOWN afterwards, the state changes
to shut-off. This and much more needs to be done in order for libvirt
to work with transient devices, wake-ups etc. This patch is not
aiming for that functionality.
Commit e8fd8757c8 changed 'const char *'
category to virLogSource enum. This changes it in virLogEatParams as
well, thus fixing the build with --disable-debug.
--
Hopefully moving the enum declarations is less ugly than using int.
Upstream kernel introduced new sysfs knob "merge_across_nodes" to
specify if pages from different numa nodes can be merged. When set
to 0, only pages which physically reside in the memory area of
same NUMA node can be merged. When set to 1, pages from all nodes
can be merged.
This patch supports the tuning by adding new param field
"shm_merge_across_nodes".
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=805071
to the extent that it can be resolved with current qemu functionality.
It attempts to detect as many situations as possible when the simple
operation of disconnecting an existing tap device from one bridge and
attaching it to another will satisfy the change requested in
virDomainUpdateDeviceFlags() for a network device. Before this patch,
that situation could only be detected if the pre-change interface
*and* the post-change interface definition were both "type='bridge'".
After this patch, it can also be detected if the before or after
interfaces are any combination of type='bridge' and type='network'
(the networks can be <forward mode='nat|route|bridge'>, as long as
they use a Linux host bridge and not macvtap connections).
This extra effort is especially useful since the recent discovery that
a netdev_del+netdev_add combo (to reconnect the network device with
completely different hostside configuration) doesn't work properly
with current qemu (1.2) unless it is accompanied by the matching
device_del+device_add - see this mailing list message for details:
http://lists.nongnu.org/archive/html/qemu-devel/2012-10/msg02355.html
(A slight modification of the patch referenced there has been prepared
to apply on top of this patch, but won't be pushed until qemu can be
made to work with it.)
* qemuDomainChangeNet needs access to the virDomainDeviceDef that
holds the new netdef (so that it can clear out the virDomainDeviceDef
if it ends up using the NetDef to replace the original), so the
virDomainNetDefPtr arg is replaced with a virDomainDeviceDefPtr.
* qemuDomainChangeNet previously checked for *some* changes to the
interface config, but this check was by no means complete. It was also
a bit disorganized.
This refactoring of the code is (I believe) complete in its check of
all NetDef attributes that might be changed, and either returns a
failure (for changes that are simply impossible), or sets one of three
flags:
needLinkStateChange - if the device link state needs to go up/down
needBridgeChange - if everything else is the same, but it needs
to be connected to a difference linux host
bridge
needReconnect - if the entire host side of the device needs
to be torn down and reconstructed (currently
non-working, as mentioned above)
Note that this function will refuse to make any change that requires
the *guest* side of the device to be detached (e.g. changing the PCI
address or mac address). Those would be disruptive enough to the guest
that it's reasonable to require an explicit detach/attach sequence
from the management application.
* As mentioned above, qemuDomainChangeNet also does its best to
understand when a simple change in attached bridge for the existing
tap device will work vs. the need to completely tear down/reconstruct
the host side of the device (including tap device).
This patch *does not* implement the "reconnect" code anyway - there is
a placeholder that turns that into an error. Rather, the purpose of
this patch is to replicate existing behavior with code that is ready
to have that functionality plugged in in a later patch.
* The expanded uses for qemuDomainChangeNetBridge meant that it needed
to be enhanced as well - it no longer replaces the original brname
string in olddev with the new brname; instead, it relies on the
caller to replace the *entire* olddev with newdev (since we've gone
to great lengths to assure they are functionally identical other
than the name of the bridge, this is now not only safe, but more
correct). Additionally, qemuDomainNetChangeBridge can now set the
bridge for type='network' interfaces as well as plain type='bridge'
interfaces. (Note that I had to make this change simultaneous to the
reorganization of qemuDomainChangeNet because the two are too
closely intertwined to separate).
This function really should have been taking virDevicePCIAddress*
instead of the inefficient virDevicePCIAddress (results in copying two
entire structs onto the stack rather than just two pointers), and
returning a bool true/false (not matching is not necessarily a
"failure", as a -1 return would imply, and also using "if
(!virDevicePCIAddressEqual(x, y))" to mean "if x == y" is just a bit
counterintuitive).
When vcpu placement is "auto", the domain process will be pinned
to advisory nodeset from querying numad, While emulatorpin will
override the pinning. That means both of them are to set the
pinning policy for domain process, but conflicts with each other.
This patch ingore emulatorpin if vcpu placement is "auto", because
<vcpu> placement can't be simply ignored for <numatune> placement
could default to it.
The onlined vcpu pinning policy should inherit def->cpuset if
it's not specified explicitly, and the affinity should be set
in this case. Oppositely, the offlined vcpu pinning policy should
be free()'ed.
Various APIs use cgroup to either set or get the statistics of
host or guest. Hotplug or hot unplug new vcpus without creating
or removing the cgroup for the vcpus could cause problems for
those APIs. E.g.
% virsh vcpucount dom
maximum config 10
maximum live 10
current config 1
current live 1
% virsh setvcpu dom 2
% virsh schedinfo dom --set vcpu_quota=1000
Scheduler : posix
error: Unable to find vcpu cgroup for rhel6.2(vcpu: 1): No such file or
directory
This patch fixes the problem by creating cgroups for each of the
onlined vcpus, and destroying cgroups for each of the offlined
vcpus.
Document for <vcpu>'s "cpuset" says:
Since 0.4.4, this element can contain an optional cpuset attribute,
which is a comma-separated list of physical CPU numbers that virtual
CPUs can be pinned to.
However, it's not the truth, libvirt actually pins the domain
process to the specified pCPUs by "cpuset" of <vcpu>. And the
vcpu thread are pinned to all available pCPUs if no <vcpupin>
is specified for it.
This patch is to implement the codes to inherit <vcpu>'s "cpuset" for
vcpu that doesn't have <vcpupin> specified, and <vcpupin>
for these vcpu will be ignored when formating. Underlying
driver implementation will make sure the vcpu thread pinned
to correct pCPUs.
Setting pinning policy for vcpu which exceeds current vcpus number
just makes no sense, however, it could cause various problems, E.g.
<vcpu current='1'>4</vcpu>
<cputune>
<vcpupin vcpuid='3' cpuset='4'/>
</cputune>
% virsh start linux
error: Failed to start domain linux
error: cannot set CPU affinity on process 32534: No such process
We must have some odd codes underlying which produces the
"on process 32534", but the point is why we not to prevent
earlier when parsing? Note that this is only one of the
problem it could cause.
This patch is to ignore the <vcpupin> for not onlined vcpus.
We are currently able to work only with non-translated SELinux
contexts, but we are using functions that work with translated
contexts throughout the code. This patch swaps all SELinux context
translation relative calls with their raw sisters to avoid parsing
problems.
The problems can be experienced with mcstrans for example. The
difference is that if you have translations enabled (yum install
mcstrans; service mcstrans start), fgetfilecon_raw() will get you
something like 'system_u:object_r:virt_image_t:s0', whereas
fgetfilecon() will return 'system_u:object_r:virt_image_t:SystemLow'
that we cannot parse.
I was trying to confirm that the _raw variants were here since the dawn of
time, but the only thing I see now is that it was imported together in
the upstream repo [1] from svn, so before 2008.
Thanks Laurent Bigonville for finding this out.
[1] http://oss.tresys.com/git/selinux.git
When startupPolicy set for a USB devices allows such device to be
missing, there was no way this could be detected from domain XML. With
this patch, libvirt emits a new missing='yes' attribute for such devices
when active domain XML is generated.
The comment stated that you may call qemuDomainObjBeginJobWithDriver
without passing qemud_driver to signal it's not locked.
qemuDomainObjBeginJobWithDriver still accesses the qemud_driver
structure and the lock singaling is done through a separate parameter.
Save/restore with passed through USB devices currently only works if the
USB device can be found at the same USB address where it used to be
before saving a domain. This makes sense in case a user explicitly
configure the USB address in domain XML. However, if the device was
found automatically by vendor/product identification, we should try to
search for that device when restoring the domain and use any device we
find as long as there is only one available. In other words, the USB
device can now be removed and plugged again or the host can be rebooted
between saving and restoring the domain.
Using VIR_DOMAIN_XML_MIGRATABLE flag, one can request domain's XML
configuration that is suitable for migration or save/restore. Such XML
may contain extra run-time stuff internal to libvirt and some default
configuration may be removed for better compatibility of the XML with
older libvirt releases.
This flag may serve as an easy way to get the XML that can be passed
(after desired modifications) to APIs that accept custom XMLs, such as
virDomainMigrate{,ToURI}2 or virDomainSaveFlags.
All USB device lookup functions emit an error when they cannot find the
requested device. With this patch, their caller can choose if a missing
device is an error or normal condition.
The code which looks up a USB device specified by hostdev is duplicated
in two places. This patch creates a dedicated function that can be
called in both places.
USB devices can disappear without OS being mad about it, which makes
them ideal for startupPolicy. With this attribute, USB devices can be
configured to be mandatory (the default), requisite (will disappear
during migration if they cannot be found), or completely optional.
While the changes to sanlock driver should be stable, the actual
implementation of sanlock_helper is supposed to be replaced in the
future. However, before we can implement a better sanlock_helper, we
need an administrative interface to libvirtd so that the helper can just
pass a "leases lost" event to the particular libvirt driver and
everything else will be taken care of internally. This approach will
also allow libvirt to pass such event to applications and use
appropriate reasons when changing domain states.
The temporary implementation handles all actions directly by calling
appropriate libvirt APIs (which among other things means that it needs
to know the credentials required to connect to libvirtd).
While current on_{poweroff,reboot,crash} action configuration is about
configuring life cycle actions, they can all be considered events and
actions that need to be done on a particular event. Let's generalize the
code by renaming life cycle actions to event actions so that it can be
reused later for non-lifecycle events.
Done with:
sed -i -e "s/no pool with matching uuid/no storage pool with matching uuid/g" src/storage/storage_driver.c
sed -i -e 's/"%s", _("no storage pool with matching uuid")/_("no storage pool with matching uuid %s"), obj->uuid/g' src/storage/storage_driver.c
sed -i -e 's/"%s", _("storage pool is not active")/_("storage pool '%s' is not active"), pool->def->name/g' src/storage/storage_driver.c
And a couple fixups before, during, and after, and a manual inspection
pass to make sure nothing was wonky.
When adding variants of parameter setting APIs which accepted
flags, the existing APIs were all adapted internally to pass
VIR_DOMAIN_AFFECT_CURRENT to the new API. The QEMU impl
qemuSetSchedularParameters was an exception, which instead
used VIR_DOMAIN_AFFECT_LIVE. Change this to match other
compatibility scenarios, so that calling
virDomainSetSchedularParameters(dom, params, nparams);
Has the same semantics as
virDomainSetSchedularParametersFlags(dom, params, nparams, 0);
And
virDomainSetSchedularParametersFlags(dom, params, nparams, VIR_DOMAIN_AFFECT_CURRENT);
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently virNetSocketNew fails because virSetCloseExec fails as there
is no proper implementation for it on Windows at the moment. Workaround
this by pretending that setting close-on-exec on the fd works. This can
be done because libvirt currently lacks the ability to create child
processes on Windows anyway. So there is no point in failing to set a
flag that isn't useful at the moment anyway.
Traverse the whole inheritance hierarchy for dynamic dispatch as it is
already done for the dynamic cast.
Also make AnyType cast errors more verbose.
Reported by Ata Bohra.
Add support to check if a specific interface is active by supporting the
following API function in the udev based virInterface backend:
* virConnectInterfaceIsActive()
All other backends for virInterface or other HVs implementations of
virInterface list their own names for the name instead of the generic
'Interface' value. This does the same for the netcf based backend.
Also, report any errors during registration.
Add a read-only udev based backend for virInterface. Useful for distros
that do not have netcf support yet. Multiple libvirt based utilities use
a HAL based fallback when virInterface is not available which is less
than ideal. This implements:
* virConnectNumOfInterfaces()
* virConnectListInterfaces()
* virConnectNumOfDefinedInterfaces()
* virConnectListDefinedInterfaces()
* virConnectListAllInterfaces()
* virConnectInterfaceLookupByName()
* virConnectInterfaceLookupByMACString()
The code was reporting raw exit status without decoding it into
normal vs. signal exit. virCommandRun already does this, but
with a different error type, so all we have to do is recast
the error to the correct type.
Reported by li guang.
* src/util/hooks.c (virHookCall): Simplify.
As a side effect of changes in the functions virGetUserID and
virGetGroupID, the user and group configurations for DAC in qemu.conf
are now able to accept both names and IDs, supporting a leading plus
sign to ensure that a numeric value will not be interpreted as a name.
This patch updates the comments in qemu.conf, including a description of
this new behavior.
With the recent introduction of QMP capabilities probing, libvirt failed
to detect support for QXL graphics in QEMU 1.2 and newer. In addition to
fixing that, this patch also causes libvirt to detect QXL support for
qemu-kvm-0.13.0, which doesn't advertise it in -help output but mentions
it in device list. Since qemu-kvm-0.13.0 supported -spice, it looks like
not having qxl in -help was a bug.
The functions virGetUserID and virGetGroupID are now able to parse
user/group names and IDs in a similar way to coreutils' chown. So, user
and group parsing in security_dac can be simplified.
This patch updates virGetUserID and virGetGroupID to be able to parse a
user or group name in a similar way to coreutils' chown. This means that
a numeric value with a leading plus sign is always parsed as an ID,
otherwise the functions try to parse the input first as a user or group
name and if this fails they try to parse it as an ID.
This patch includes Peter Krempa's changes to correctly handle errors
returned by getpwnam_r and getgrnam_r.
curl_global_init is not thread-safe. curl_easy_init might call
curl_global_init when it was no called before. But curl_easy_init
can be called from different threads by the ESX driver. Therefore,
call curl_global_init from virInitialize to stop curl_easy_init from
calling it.
Reported by Benjamin Wang.
When both kvmclock and kvm_pv_eoi are configured (either disabled or
enabled) libvirt will generate invalid CPU specification due to the
fact that even though kvmclock causes the CPU to be specified, it
doesn't set have_cpu flag to true (and the new kvm_pv_eoi as well).
This patch fixes the issue and adds a test exactly for that to show
that it is fixed correctly (and also to keep it that way in the future
of course).
libcurl uses a SIGALRM in combination with sigsetjmp/siglongjmp to be
able to abort a DNS lookup when it takes too long. The problem with this
in a multi-threaded application is that the signal handler for SIGALRM
and the call to siglongjmp can be executed on a thread that is different
from the one that initially did the SIGALRM setup and the call to
sigsetjmp. In the reported case this triggered a segfault.
Disable libcurl's use of signals to avoid this situation. This has the
disadvantage of losing the ability to abort synchronous DNS lookups which
might result in libcurl getting stuck in a DNS lookup in the worst case.
When libcurl was build with an asynchronous DNS backend such as c-ares
then there is no problem because the timeout mechanism works without
signals here anyway.
Reported by Benjamin Wang.
The output buffer for virFileReadAll was too small for systems with
more than 30 CPUs which leads to a log entry and incorrect behavior.
The new size will be sufficient for the current
architectural limits.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
This reverts part of commit 5468594f465; the perl changes in that
patch were sufficient. Since libvirt.syms is already a generated
file created in part from libvirt_private.syms, we don't need a
second pass over libvirt_private.syms in isolation.
* src/Makefile.am: Undo addition of check-private-symfile.
Currently, we are checking if libvirt.so contains public symbols.
However, sometimes we rename an internal symbol and forget to
change libvirt_private.syms accordingly. Hence, it's safer to check
for internal symbols as well.
Correct the check for the return value of virStrcpyStatic()
when copying port-profile names. Fixes Open vSwitch ports
which utilize port-profiles from network definitions.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
The DAC driver is missing parsing of group and user names for DAC labels
and currently just parses uid and gid. This patch extends it to support
names, so the following security label definition is now valid:
<seclabel type='static' model='dac' relabel='yes'>
<label>qemu:qemu</label>
<imagelabel>qemu:qemu</imagelabel>
</seclabel>
When it tries to parse an owner or a group, it first tries to resolve it as
a name, if it fails or it's an invalid user/group name then it tries to
parse it as an UID or GID. A leading '+' can also be used for both owner and
group to force it to be parsed as IDs, so the following example is also
valid:
<seclabel type='static' model='dac' relabel='yes'>
<label>+101:+101</label>
<imagelabel>+101:+101</imagelabel>
</seclabel>
This ensures that UID 101 and GUI 101 will be used instead of an user or
group named "101".
Introduced in commit 0caccb58.
CC libvirt_driver_qemu_impl_la-qemu_capabilities.lo
../../src/qemu/qemu_capabilities.c: In function 'qemuCapsInitQMP':
../../src/qemu/qemu_capabilities.c:2327:13: error: format '%d' expects argument of type 'int', but argument 8 has type 'const char *' [-Werror=format]
* src/qemu/qemu_capabilities.c (qemuCapsInitQMP): Use correct format.
Since libvirt switched to QMP capabilities probing recently, it starts
QEMU process used for this probing with -daemonize, which means
virCommandAbort can no longer reach these processes. As a result of
that, restarting libvirtd will leave several new QEMU processes behind.
Let's use QEMU's -pidfile and use it to kill the process when QMP caps
probing is done.
When doing snapshots, the filesystem freeze function used the agent
entering function that expects the qemud_driver unlocked. This might
cause a deadlock of the qemu driver if the agent does not respond.
The only call path of this function has the qemud_driver locked, so this
patch changes the entering functions to those expecting the driver
locked.
There was an inverted return value in lxcCgroupControllerActive().
The function assumes cgroups are active and do couple of checks
to prove that. If any of them fails, false is returned. Therefore,
at the end, after all checks are done we must return true, not false.
Commit f6430390 broke builds on RHEL 5, where glibc (2.5) is too
old to support mkostemp (2.7) or htole64 (2.9). While gnulib
has mkostemp, it still lacks htole64; and it's not worth dragging
in replacements on systems where journald is unlikely to exist
in the first place, so we just use an extra configure-time check
as our witness of whether to attempt compiling the code.
* src/util/logging.c (virLogParseOutputs): Don't attempt to
compile journald on older glibc.
* configure.ac (AC_CHECK_DECLS): Check for htole64.
Use MATCH for all flags checks.
hypervMsvmComputerSystemToDomain expects the domain pointer to the
initialized to NULL.
All items in doms up to the count-th one are valid, no need to double
check before freeing them.
Avoid requesting information such as identity or power state when it
is not necessary.
Lookup virtual machine list with the required fields (configStatus,
name, and config.uuid) to make esxVI_GetVirtualMachineIdentity work.
No need to call esxVI_GetNumberOfSnapshotTrees. rootSnapshotTreeList
can be tested for emptiness by checking it for NULL.
esxVI_LookupRootSnapshotTreeList already does the error reporting,
don't overwrite it.
Check if autostart is enabled at all before looking up the individual
autostart setting of a virtual machine.
Reorder VIR_EXPAND_N(doms, ndoms, 1) to avoid leaking the result of
the call to virGetDomain if VIR_EXPAND_N fails.
Replace VIR_EXPAND_N by VIR_RESIZE_N to avoid quadratic scaling, as in
the Hyper-V version of the function.
If virGetDomain fails it already reports an error, don't overwrite it
with an OOM error.
All items in doms up to the count-th one are valid, no need to double
check before freeing them.
Finally, don't leak autoStartDefaults and powerInfoList.
Start a QEMU process using
$QEMU -S -no-user-config -nodefaults \
-nographic -M none -qmp unix:/some/path,server,nowait
and talk QMP over stdio to discover what capabilities the
binary supports. This works for QEMU 1.2.0 or later and
for older QEMU automatically fallback to the old approach
of parsing -help and related command line args.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some architectures provide the query-cpu-definitions command,
but are set to always return a "GenericError" from it :-(
Catch this & treat it as if there was an empty list of CPUs
returned
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
After calling qemuMonitorClose(), it is still possible for
the QEMU monitor I/O event callback to get invoked. This
will trigger an error message because mon->fd has been set
to -1 at this point. Silently ignore the case where mon->fd
is -1, likewise for mon->watch being zero.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemuMonitorOpen method only needs a virDomainObjPtr in order
to access the QEMU pid. This is not critical when detecting the
QEMU capabilties, so can easily be skipped
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The help output for QEMU 1.2.0 changed 'pci-assign' to 'kvm-pci-assign'.
Since the new capabilities code does exact device name matching
instead of substring matching, this caused the capabilities to go
missing.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add support for logging to the systemd journal, using its
simple client library. The benefit over syslog is that it
accepts structured log data, so the journald can store
individual items like code file/line/func separately from
the string message. Tools which require structured log
data can then query the journal to extract exactly what
they desire without resorting to string parsing
While systemd provides a simple client library for logging,
it is more convenient for libvirt to directly write its
own client code. This lets us build up the iovec's on
the stack, avoiding the need to alloc memory when writing
log messages.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the qemuCapsParseDeviceStr method has a bunch of open
coded string searches/comparisons to detect devices and their
properties. Soon this data will be obtained from QMP queries
instead of -device help output. Maintaining the list of device
and properties in two places is undesirable. Thus the existing
qemuCapsParseDeviceStr() method needs to be refactored to
separate the device types and properties from the actual
search code.
Thus the -device help output is now parsed to construct a
list of device names, and device properties. These are then
checked against a set of datatables to set the capability
flags
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The 'const char *category' parameter only has a few possible
values now that the filename has been separated. Turn this
parameter into an enum instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the logging APIs have a 'const char *category' parameter
which indicates where the log message comes from. This is typically
a combination of the __FILE__ string and other prefix. Split the
__FILE__ off into a dedicated parameter so it can passed to the
log outputs
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
General whitespace cleanup in the logging files
- Move '{' to a new line after funtion declaration
- Put each parameter on a new line to avoid long lines
- Put return type on new line
- Leave 2 blank lines between functions
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The log destinations are an enum, but most of the code was
just using a plain 'int' for function params / variables.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The __LINE__ macro value is specified to fit in the size_t
type, so use that instead of 'long long' in the logging code
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The log priority levels are an enum, but most of the code was
just using a plain 'int' for function params / variables.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
I hit this problem recently when trying to create a bridge with an IPv6
address on a 3.2 kernel: dnsmasq (and, further, radvd) would not bind to
the given address, waiting 20s and then giving up with -EADDRNOTAVAIL
(resp. exiting immediately with "error parsing or activating the config
file", without libvirt noticing it, BTW). This can be reproduced with (I
think) any kernel >= 2.6.39 and the following XML (to be used with
"virsh net-create"):
<network>
<name>test-bridge</name>
<bridge name='testbr0' />
<ip family='ipv6' address='fd00::1' prefix='64'>
</ip>
</network>
(it happens even when you have an IPv4, too)
The problem is that since commit [1] (which, ironically, was made to
“help IPv6 autoconfiguration”) the linux bridge code makes bridges
behave like “real” devices regarding carrier detection. This makes the
bridges created by libvirt, which are started without any up devices,
stay with the NO-CARRIER flag set, and thus prevents DAD (Duplicate
address detection) from happening, thus letting the IPv6 address flagged
as “tentative”. Such addresses cannot be bound to (see RFC 2462), so
dnsmasq fails binding to it (for radvd, it detects that "interface XXX
is not RUNNING", thus that "interface XXX does not exist, ignoring the
interface" (sic)). It seems that this behavior was enhanced somehow with
commit [2] by avoiding setting NO-CARRIER on empty bridges, but I
couldn't reproduce this behavior on my kernel. Anyway, with the “dummy
tap to set MAC address” trick, this wouldn't work.
To fix this, the idea is to get the bridge's attached device to be up so
that DAD can happen (deactivating DAD altogether is not a good idea, I
think). Currently, libvirt creates a dummy TAP device to set the MAC
address of the bridge, keeping it down. But even if we set this device
up, it is not RUNNING as soon as the tap file descriptor attached to it
is closed, thus still preventing DAD. So, we must modify the API a bit,
so that we can get the fd, keep the tap device persistent, run the
daemons, and close it after DAD has taken place. After that, the bridge
will be flagged NO-CARRIER again, but the daemons will be running, even
if not happy about the device's state (but we don't really care about
the bridge's daemons doing anything when no up interface is connected to
it).
Other solutions that I envisioned were:
* Keeping the *-nic interface up: this would waste an fd for each
bridge during all its life. May be acceptable, I don't really
know.
* Stop using the dummy tap trick, and set the MAC address directly
on the bridge: it is possible since quite some time it seems,
even if then there is the problem of the bridge not being
RUNNING when empty, contrary to what [2] says, so this will need
fixing (and this fix only happened in 3.1, so it wouldn't work
for 2.6.39)
* Using the --interface option of dnsmasq, but I saw somewhere
that it's not used by libvirt for backward compatibility. I am
not sure this would solve this problem, though, as I don't know
how dnsmasq binds itself to it with this option.
This is why this patch does what's described earlier.
This patch also makes radvd start even if the interface is
“missing” (i.e. it is not RUNNING), as it daemonizes before binding to
it, and thus sometimes does it after the interface has been brought down
by us (by closing the tap fd), and then originally stops. This also
makes it stop yelling about it in the logs when the interface is down at
a later time.
[1]
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=1faa4356a3bd89ea11fb92752d897cff3a20ec0e
[2]
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=b64b73d7d0c480f75684519c6134e79d50c1b341
If QEMU reports CommandNotFound for the 'query-events' command,
we must treat that as success, returning a zero-length array
of events
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In Xen 4.2, xs.h is deprecated in favor of xenstore.h. xs.h now
contains
#warning xs.h is deprecated use xenstore.h instead
#include <xenstore.h>
which fails compilation when warnings are treated as errors.
Introduce a configure-time check for xenstore.h and if found,
use it instead of xs.h.
In addition to the preformatted text line, pass the raw message as well,
to allow the output functions to use a different output format.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Allow for the code converting from libvirt log levels to syslog
log levels to be reused.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemuMonitorSetCapabilities() API is used to initialize the QMP
protocol capabilities. It has since been abused to initialize some
libvirt internal capabilities based on command/event existance too.
Move the latter code out into qemuCapsProbeQMP() in the QEMU
capabilities source file instead
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemu monitor does not require qemu_conf.h, and the
qemu capabilities code actually wants bitmap.h
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetTargetArch() method to support invocation
of the 'query-target' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetObjectProps() method to support invocation
of the 'device-list-properties' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetObjectTypes() method to support invocation
of the 'qom-list-types' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetEvents() method to support invocation
of the 'query-events' JSON monitor command. No HMP equivalent
is required, since this will only be used when JSON is available
The existing qemuMonitorJSONCheckEvents() method is refactored
to use this new method
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetCPUCommands() method to support invocation
of the 'query-commands' JSON monitor command. No HMP equivalent
is required, since this will only be used when JSON is available
The existing qemuMonitorJSONCheckCommands() method is refactored
to use this new method
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetCPUDefinitions() method to support invocation
of the 'query-cpu-definitions' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetMachines() method to support invocation
of the 'query-machines' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a new qemuMonitorGetVersion() method to support invocation
of the 'query-version' JSON monitor command. No HMP equivalent
is provided, since this will only be used for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemuCapsProbeMachineTypes & qemuCapsProbeCPUModels methods
do not need to be invoked directly anymore. Make them static
and refactor them to directly populate the qemuCapsPtr object
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When launching a QEMU guest the binary is probed to discover
the list of supported CPU names. Remove this probing with a
simple lookup of CPU models in the qemuCapsPtr object. This
avoids another invocation of the QEMU binary during the
startup path.
As a nice benefit we can now remove all the nasty hacks from
the test suite which were done to avoid having to exec QEMU
on the test system. The building of the -cpu command line
can just rely on data we pre-populate in qemuCapsPtr.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When XML for a new guest is received, the machine type is
immediately canonicalized into the version specific name.
This involves probing QEMU for supported machine types.
Replace this probing with a lookup of the machine types
in the (hopefully cached) qemuCapsPtr object
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove all use of the existing APIs for querying QEMU
capability flags. Instead obtain a qemuCapsPtr object
from the global cache. This avoids the execution of
'qemu -help' (and related commands) when launching new
guests.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When building up a virCapsPtr instance, the QEMU driver
was copying the list of machine types across from the
previous virCapsPtr instance, if the QEMU binary had not
changed. Replace this ad-hoc caching of data with use
of the new qemuCapsCache global cache.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce a qemuCapsCachePtr object to provide a global cache
of capabilities for QEMU binaries. The cache auto-populates
on first request for capabilities about a binary, and will
auto-refresh if the binary has changed since a previous cache
was populated
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
For historical compat we use 'itanium' as the arch name, so
if the QEMU binary suffix is 'ia64' we need to translate it
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the qemuAgentClose method is called from a place which holds
the domain lock, it is theoretically possible to get a deadlock
in the agent destroy callback. This has not been observed, but
the equivalent code in the QEMU monitor destroy callback has seen
a deadlock.
Remove the redundant locking while unrefing the object and the
bogus assignment
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Many parts of virDomainDefPtr were using 'int' variables as
array length counts. Replace all these with size_t and update
various format strings & API signatures to adapt
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some users report (very rarely) seeing a deadlock in the QEMU
monitor callbacks
Thread 10 (Thread 0x7fcd11e20700 (LWP 26753)):
#0 0x00000030d0e0de4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00000030d0e09ca6 in _L_lock_840 () from /lib64/libpthread.so.0
#2 0x00000030d0e09ba8 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007fcd162f416d in virMutexLock (m=<optimized out>)
at util/threads-pthread.c:85
#4 0x00007fcd1632c651 in virDomainObjLock (obj=<optimized out>)
at conf/domain_conf.c:14256
#5 0x00007fcd0daf05cc in qemuProcessHandleMonitorDestroy (mon=0x7fcccc0029e0,
vm=0x7fcccc00a850) at qemu/qemu_process.c:1026
#6 0x00007fcd0db01710 in qemuMonitorDispose (obj=0x7fcccc0029e0)
at qemu/qemu_monitor.c:249
#7 0x00007fcd162fd4e3 in virObjectUnref (anyobj=<optimized out>)
at util/virobject.c:139
#8 0x00007fcd0db027a9 in qemuMonitorClose (mon=<optimized out>)
at qemu/qemu_monitor.c:860
#9 0x00007fcd0daf61ad in qemuProcessStop (driver=driver@entry=0x7fcd04079d50,
vm=vm@entry=0x7fcccc00a850,
reason=reason@entry=VIR_DOMAIN_SHUTOFF_DESTROYED, flags=flags@entry=0)
at qemu/qemu_process.c:4057
#10 0x00007fcd0db323cf in qemuDomainDestroyFlags (dom=<optimized out>,
flags=<optimized out>) at qemu/qemu_driver.c:1977
#11 0x00007fcd1637ff51 in virDomainDestroyFlags (
domain=domain@entry=0x7fccf00c1830, flags=1) at libvirt.c:2256
At frame #10 we are holding the domain lock, we call into
qemuProcessStop() to cleanup QEMU, which triggers the monitor
to close, which invokes qemuProcessHandleMonitorDestroy() which
tries to obtain the domain lock again. This is a non-recursive
lock, hence hang.
Since qemuMonitorPtr is a virObject, the unref call in
qemuProcessHandleMonitorDestroy no longer needs mutex
protection. The assignment of priv->mon = NULL, can be
instead done by the caller of qemuMonitorClose(), thus
removing all need for locking.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If QEMU quits immediately after we opened the monitor it was
possible we would skip the clearing of the SELinux process
socket context
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In the cgroups APIs we have a virCgroupKillPainfully function
which does the loop sending SIGTERM, then SIGKILL and waiting
for the process to exit. There is similar functionality for
simple processes in qemuProcessKill, but it is tangled with
the QEMU code. Untangle it to provide a virProcessKillPainfuly
function
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When calling qemuProcessKill from the virDomainDestroy impl
in QEMU, do not ignore the return value. This ensures that
if QEMU fails to respond to SIGKILL, the caller will know
about the failure.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Depending on the scenario in which LXC containers exit, it is
possible for the EOF callback of the LXC monitor to deadlock
the driver.
#0 0x00000038a0a0de4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00000038a0a09ca6 in _L_lock_840 () from /lib64/libpthread.so.0
#2 0x00000038a0a09ba8 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007f4bd9579d55 in virMutexLock (m=<optimized out>) at util/threads-pthread.c:85
#4 0x00007f4bcacc7597 in lxcDriverLock (driver=0x7f4bc40c8290) at lxc/lxc_conf.h:81
#5 virLXCProcessMonitorEOFNotify (mon=<optimized out>, vm=0x7f4bb4000b00) at lxc/lxc_process.c:581
#6 0x00007f4bd9645c91 in virNetClientCloseLocked (client=client@entry=0x7f4bb4009e60)
at rpc/virnetclient.c:554
#7 0x00007f4bd96460f8 in virNetClientIOEventLoopPassTheBuck (thiscall=0x0, client=0x7f4bb4009e60)
at rpc/virnetclient.c:1306
#8 virNetClientIOEventLoopPassTheBuck (client=0x7f4bb4009e60, thiscall=0x0)
at rpc/virnetclient.c:1287
#9 0x00007f4bd96467a2 in virNetClientCloseInternal (reason=3, client=0x7f4bb4009e60)
at rpc/virnetclient.c:589
#10 virNetClientCloseInternal (client=0x7f4bb4009e60, reason=3) at rpc/virnetclient.c:561
#11 0x00007f4bcacc4a82 in virLXCMonitorClose (mon=0x7f4bb4000a00) at lxc/lxc_monitor.c:201
#12 0x00007f4bcacc55ac in virLXCProcessCleanup (reason=<optimized out>, vm=0x7f4bb4000b00,
driver=0x7f4bc40c8290) at lxc/lxc_process.c:240
#13 virLXCProcessStop (driver=0x7f4bc40c8290, vm=vm@entry=0x7f4bb4000b00,
reason=reason@entry=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:735
#14 0x00007f4bcacc5bd2 in virLXCProcessAutoDestroyDom (payload=<optimized out>,
name=0x7f4bb4003c80, opaque=0x7fff41af2df0) at lxc/lxc_process.c:94
#15 0x00007f4bd9586649 in virHashForEach (table=0x7f4bc409b270,
iter=iter@entry=0x7f4bcacc5ab0 <virLXCProcessAutoDestroyDom>, data=data@entry=0x7fff41af2df0)
at util/virhash.c:514
#16 0x00007f4bcacc52d7 in virLXCProcessAutoDestroyRun (driver=driver@entry=0x7f4bc40c8290,
conn=conn@entry=0x7f4bb8000ab0) at lxc/lxc_process.c:120
#17 0x00007f4bcacca628 in lxcClose (conn=0x7f4bb8000ab0) at lxc/lxc_driver.c:128
#18 0x00007f4bd95e67ab in virReleaseConnect (conn=conn@entry=0x7f4bb8000ab0) at datatypes.c:114
When the driver calls virLXCMonitorClose, there is really no
need for the EOF callback to be invoked in this case, since
the caller can easily handle events itself. In changing this,
the monitor needs to take a deep copy of the callback list,
not merely a reference.
Also adds debug statements in various places to aid
troubleshooting
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
<interface> elements are location inside the <forward> element of a
network. There is only one <forward> element in any network, but it
might have many <interface> elements. This element only contains a
single attribute, "dev", which is the name of a network device
(e.g. "eth0").
Since there is only a single attribute, the modify operation isn't
supported for this "section", only add-first, add-last, and
delete. Also, note that it's not permitted to delete an interface from
the list while any guest is using it. We may later decide this is safe
(because removing it from the list really only excludes it from
consideration in future guest allocations of interfaces, but doesn't
affect any guests currently connected), but for now this limitation
seems prudent (of course when changing the persistent config, this
limitation doesn't apply, because the persistent config doesn't
support the concept of "in used").
Another limitation - it is also possible for the interfraces in this
list to be described by PCI address rather than netdev name. However,
I noticed while writing this function that we currently don't support
defining interfaces that way in config - the only method of getting
interfaces specified as <adress type='pci' ..../> instead of
<interface dev='xx'/> is to provide a <pf dev='yy'/> element under
forward, and let the entries in the interface list be automatically
populated with the virtual functions (VF) of the physical function
device given in <pg>.
As with the other virNetworkUpdate section backends, support for this
section is completely contained within a single static function, no
other changes were required, and only functions already called from
elsewhere within the same file are used in the new content for this
existing function (i.e., adding this code should not cause a new build
problem on any platform).
Jim Fehlig reported a compilation error with older gcc 4.3.4:
libvirt.c: In function 'virDomainGetEmulatorPinInfo':
libvirt.c:9111: error: logical '&&' with non-zero constant will always evaluate as true [-Wlogical-op]
It looks like someone programmed via too much copy-and-paste.
* src/libvirt.c (virDomainGetEmulatorPinInfo): Multiplying by 1 is
a no-op, and thus will never overflow.
Recently, there have been some improvements made to qemu so it
supports seamless migration or something very close to it.
However, it requires libvirt interaction. Once qemu is migrated,
the SPICE server needs to send its internal state to the destination.
Once it's done, it fires SPICE_MIGRATE_COMPLETED event and this
fact is advertised in 'query-spice' output as well.
We must not kill qemu until SPICE server finishes the transfer.
SELinux wants all log files opened with O_APPEND. When
running non-root though, libvirtd likes to use O_TRUNC
to avoid log files growing in size indefinitely. Instead
of using O_TRUNC though, we can use O_APPEND and then
call ftruncate() which keeps SELinux happier.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Asynchronously setting priv->mon to NULL was pointless,
just remove the destroy callback entirely.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Continue consolidation of process functions by moving some
helpers out of command.{c,h} into virprocess.{c,h}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are a number of process related functions spread
across multiple files. Start to consolidate them by
creating a virprocess.{c,h} file
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virCommand prefix was inappropriate because the API
does not use any virCommandPtr object instance. This
API closely related to waitpid/exit, so use virProcess
as the prefix
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In most of the snapshot API's there's no need to hold the driver lock
the whole time.
This patch adds helper functions that get the domain object in functions
that don't require the driver lock and simplifies call paths from
snapshot-related API's.
It might need some time till the LUN's stable path shows up on
initiator host, and although the time window is not foreseeable,
as a better than nothing fix, this patch adds timeout for the
stable path discovery process.
Every level of the code for virNetworkUpdate was assuming that some
other level was checking for validity of the "command" arg, but none
actually were. The result was that an invalid command code would do
nothing, but also report success.
Since the command code isn't used until the very lowest level backend
functions, that's where I put the check. I made a separate one-line
function to log the error. The compiler would have combined the
identical strings used by multiple calls if I'd just called
virReportError directly in each location, but sending them all to the
same string in the source guards against inadvertant divergence (which
would lead to extra work for translators.)
1) virNetworkObjUpdate should be an all or none operation, but in the
case that we want to update both the live state and persistent config
versions of the network, it was committing the update to the live
state before starting to update the persistent config. If update of
the persistent config failed, we would leave with things in an
inconsistent state - the live state would be updated (even though an
error was returned), but persistent config unchanged.
This patch changed virNetworkObjUpdate to use a separate pointer for
each copy of the virNetworkDef, and not commit either of them in the
virNetworkObj until both live and config parts of the update have
successfully completed.
2) The parsers for various pieces of the virNetworkDef have all sorts
of subtle limitations on them that may not be known by the
Update[section] function, making it possible for one of these
functions to make a modification directly to the object that may not
pass the scrutiny of a subsequent parse. But normally another parse
wouldn't be done on the data until the *next* time the object was
updated (which could leave the network definition in an unusable
state).
Rather than fighting the losing battle of trying to duplicate all the
checks from the parsers into the update functions as well, the more
foolproof solution to this is to simply do an extra
virNetworkDefCopy() operation on the updated networkdef -
virNetworkDefCopy() does a virNetworkFormat() followed by a
virNetworkParseString(), so it will do all the checks we need. If this
fails, then we don't commit the changed def.
The bridge driver implementation of virNetworkUpdate() removes and
re-adds iptables rules any time a network has an <ip>, <forward>, or
<forward>/<interface> element updated. There are some types of
networks that have those elements and yet have no iptables rules
associated with them, and unfortunately the functions that remove/add
iptables rules don't check the type of network before attempting to
remove/add the rules, sometimes leading to an erroneous failure of the
entire update operation.
Under normal circumstances I would refactor the lower level functions
to be more robust, but to avoid code churn as much as possible, I've
just added extra checks directly to networkUpdate().
Nothing uses the return value, and creating it requries otherwise
unnecessary strlen () calls.
This cleanup is conceptually independent from the rest of the series
(although the later patches won't apply without it). This just seems
a good opportunity to clean this up, instead of entrenching the unnecessary
return value in the virLogOutputFunc instance that will be added in this
series.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
maxcpu and hostcpus are defined and calculated in qemudDomainPinVcpuFlags()
and qemudDomainPinEmulator(), but never used. So remove them including nodeinfo.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
This allows the user to control labelling of each character device
separately (the default is to inherit from the VM).
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
This is just code motion, allowing us to reuse the same function to
parse the <seclabel> from character devices too.
However it also fixes a possible segfault in the original code if
VIR_ALLOC_N returns an error and the cleanup code (at the error:
label) tries to iterate over the unallocated array (thanks Michal
Privoznik for spotting this).
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Disk hotplug is a two phase action: qemuMonitorAddDrive followed by
qemuMonitorAddDevice. When the first part succeeds but the second one
fails, we need to rollback the drive addition.
The README file seems to be a leftover from some previous version of
locking driver. It is not consistent with what the code does nor is it
consistent with existing documentation in internals/locking.html.
Some kernel versions (at least RHEL-6 2.6.32) do not let you over-mount
an existing selinuxfs instance with a new one. Thus we must unmount the
existing instance inside our namespace.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When auto-probing hypervisor drivers, the conn->uri field will
initially be NULL. Care must be taken not to access members
when doing auth lookups in the config file
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
portgroup elements are located in the toplevel of <network>
objects. There can be multiple <portgroup> elements, and they each
have a unique name attribute.
Add, delete, and modify are all supported for portgroup. When deleting
a portgroup, only the name must be specified in the provided xml - all
other attributes and subelements are ignored for the purposes of
matching and existing portgroup.
The bridge driver and virsh already know about the portgroup element,
so providing this backend should cause the entire stack to work. Note
that in the case of portgroup, there is no external daemon based on
the portgroup config, so nothing must be restarted.
It is important to note that guests make a copy of the appropriate
network's portgroup data when they are started, so although an updated
portgroup's configuration will have an affect on new guests started
after the cahange, existing guests won't magically have their
bandwidth changed, for example. If something like that is desired, it
will take a lot of redesign work in the way network devices are setup
(there is currently no link from the network back to the individual
interfaces using it, much less from a portgroup within a network back
to the individual interfaces).
The dhcp range element is contained in the <dhcp> element of one of a
network's <ip> elements. There can be multiple <range>
elements. Because there are only two attributes (start and end), and
those are exactly what you would use to identify a particular range,
it doesn't really make sense to modify an existing element, so
VIR_NETWORK_UPDATE_COMMAND_MODIFY isn't supported for this section,
only ADD_FIRST, ADD_LAST, and DELETE.
Since virsh already has support for understanding all the defined
sections, this new backend is automatically supported by virsh. You
would use it like this:
virsh net-update mynet add ip-dhcp-range \
"<range start='1.2.3.4' end='1.2.3.20'/>" --live --config
The bridge driver also already supports all sections, so it's doing
the correct thing in this case as well - since the dhcp range is
placed on the dnsmasq commandline, the bridge driver recreates the
dnsmasq commandline, and re-runs dnsmasq whenever a range is
added/deleted (and AFFECT_LIVE is specified in the flags).
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.
* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/; If/. If/
The "dump-guest-core' option is new option for the machine type
(-machine pc,dump-guest-core) that controls whether the guest memory
will be marked as dumpable.
While testing this, I've found out that the value for the '-M' options
is not parsed correctly when additional parameters are used. However,
when '-machine' is used for the same options, it gets parsed as
expected. That's why this patch also modifies the parsing and creating
of the command line, so both '-M' and '-machine' are recognized. In
QEMU's help there is only mention of the 'machine parameter now with
no sign of the older '-M'.
Sometimes when guest machine crashes, coredump can get huge due to the
guest memory. This can be limited using madvise(2) system call and is
being used in QEMU hypervisor. This patch adds an option for configuring
that in the domain XML and related documentation.
Whenever the guest machine fails to boot, new parameter (reboot-timeout)
controls whether it should reboot and after how many ms it should do so.
Docs included.
The DAC security driver silently ignored errors when parsing the DAC
label and used default values instead.
With a domain containing the following label definition:
<seclabel type='static' model='dac' relabel='yes'>
<label>sdfklsdjlfjklsdjkl</label>
</seclabel>
the domain would start normaly but the disk images would be still owned
by root and no error was displayed.
This patch changes the behavior if the parsing of the label fails (note
that a not present label is not a failure and in this case the default
label should be used) the error isn't masked but is raised that causes
the domain start to fail with a descriptive error message:
virsh # start tr
error: Failed to start domain tr
error: internal error invalid argument: failed to parse DAC seclabel
'sdfklsdjlfjklsdjkl' for domain 'tr'
I also changed the error code to "invalid argument" from "internal
error" and tweaked the various error messages to contain correct and
useful information.
This patch cleans up building the "-boot" parameter and while on that
fixes one inconsistency by modifying these things:
- I completed the unfinished virDomainBootMenu enum by specifying
LAST, declaring it and also declaring the TypeFromString and
TypeToString parameters.
- Previously mentioned TypeFromString and TypeToString are used when
parsing the XML.
- Last, but not least, visible change is that the "-boot" parameter
is built and parsed properly:
- The "order=" prefix is used only when additional parameters are
used (menu, etc.).
- It's rewritten in a way that other parameters can be added
easily in the future (used in following patch).
- The "order=" parameter is properly parsed regardless to where it
is placed in the string (e.g. "menu=on,order=nc").
- The "menu=" parameter (and others in the future) are created
when they should be (i.e. even when bootindex is supported and
used, but not when bootloader is selected).
Currently, we mark domain PAUSED (but not emit an event)
just before we issue 'stop' on monitor; This command can
take ages to finish, esp. when domain's doing a lot of
IO - users can enforce qemu to open files with O_DIRECT
which doesn't return from write() until data reaches the
block device. Having said that, we report PAUSED even if
domain is not paused yet.
The memmove to move elements in the dhcp hosts array when inserting
and deleting items was mistakenly basing the length of the copy on the
size of a virNetworkDHCPHostDefPtr rather than virNetworkDHCPHostDef,
with the expected disastrous results.
The memmove to delete an entry commits two errors - along with the
size of each element being wrong, it also omits some required
parentheses.
Based exclusively on work by Eric Blake in a patch posted with the same
subject. However some modifications related to comments and my plans to
add another backend.
Added WITH_INTERFACE as the only automake variable deciding whether to
build the driver and using WITH_NETCF to identify that we're wanting to
use the netcf library as the backend.
* configure.ac: Added with_interface
* src/interface/netcf_driver.c: Renamed..
* src/interface/interface_backend_netcf.c: ..to this to match storage.
* src/interface/netcf_driver.h: Renamed..
* src/interface/interface_driver.h: ..to this.
* daemon/Makefile.am: Respect WITH_INTERFACE and WITH_NETCF.
* libvirt.spec.in: Add RPM support for --with-interface
Commit f36309d added an export with no matching implementation;
probably a misspelling of an earlier version of the final addition
of virNetworkObjSetDefTransient.
* src/libvirt_private.syms (network_conf.h): Drop bogus
virNetworkSetDefTransient.
Commit aaa8ab3 added new static functions that are only used on Linux;
but commit 22acfdc didn't go far enough to fix compiler issues.
* src/nodeinfo.c (nodeSetMemoryParameterValue)
(nodeGetMemoryParameterValue): Conditionally compile based on use.
Commit ee3d3893 missed the fact that (unsigned char)<<(int)
is truncated to int, and therefore failed for any bitmap data
longer than four bytes.
Also, I failed to run 'make syntax-check' on my commit 4bba6579;
for whatever odd reason, ffs lives in a different header than ffsl.
* src/util/bitmap.c (virBitmapNewData): Use correct shift type.
(includes): Glibc (and therefore gnulib) decided ffs is in
<strings.h>, but ffsl is in <string.h>.
* tests/virbitmaptest.c (test5): Test it.
Commit 0fc89098 used functions only available on glibc, completely
botched 32-bit environments, and risked SIGBUS due to unaligned
memory access on platforms that aren't as forgiving as x86_64.
* bootstrap.conf (gnulib_modules): Import ffsl.
* src/util/bitmap.c (includes): Use <strings.h> for ffsl.
(virBitmapNewData, virBitmapToData): Avoid 64-bit assumptions and
non-portable functions.
The introduction of APIC EOI patches had a few little details that
could look better, so this patch fixes that and one more place in the
file as well (same problem).
When trying to get the value of a private secret, the code used
'operation denied' error. That error is specified as a error for
read-only connections trying to perform denied operation. The
following error seems more accurate.
To compare the difference:
- BEFORE
error: operation secret is private forbidden for read only access
- AFTER
error: Invalid secret: secret is private
Two changes are introduced in this patch:
- The first change removes ATTRIBUTE_RETURN_CHECK from
virNetDevBandwidthClear, because it was called with ignore_value
always, anyway. The function is used even when it's not necessary
to call it, just for cleanup purposes.
- The second change is added ignoring of the command's exit status,
since it may report an error even when run just as "to be sure we
clean up" function. No libvirt errors are suppresed by this.
Commit 7a99b0abaf adds a new RPC struct
but one of the members has different names in remote_protocol.x and
remote_protocol-struct breaking make check.
Currently we search along the hard-coded names:
SBINDIR "/libvirtd"
SBINDIR "/libvirtd_dbg"
but if the environment variable $LIBVIRTD_PATH is set to the
name of the libvirtd binary, that is used instead. Fix the
error message so it accurately reflects current behaviour
($PATH is NOT searched).
This patch fills in the first implementation for one of the
virNetworkUpdate sections. With this code, you can now add/delete/edit
<host> entries in a network's <ip> address <dhcp> element (by
specifying a section of VIR_NETWORK_SECTION_IP_DHCP_HOST).
If you pass in a parentIndex of -1, the code will automatically find
the one ip element that has a <dhcp> section and make the updates
there. Otherwise, you can specify an index >= 0, and libvirt will look
for that particular instance of <ip> in the network, and modify its
<dhcp> element. (This currently isn't very useful, because libvirt
only supports having dhcp information on a single IP address, but that
could change in the future).
When adding a new host entry
(VIR_NETWORK_UPDATE_COMMAND_ADD_(FIRST|LAST)), the existing entries
will be compared to the new entry, and if any non-empty attribute
matches, the add will fail. When updating an existing entry
(VIR_NETWORK_UPDATE_COMMAND_MODIFY), the mac address or name will be
used to find the existing entry, and other fields will only be updated
(note there is some potential for ambiguity here if you specify the
mac address from one entry and the name from another). When deleting
an existing entry (VIR_NETWORK_UPDATE_COMMAND_DELETE), all non-empty
attributes in the supplied xml arg will be compared - all of them must
match before libvirt will delete the host.
The xml should be a fully formed <host> element as it would appear in
a network definition, e.g. "<host mac=00:11:22:33:44:55 ip=10.1.23.22
name='testbox'/>" (when adding/updating, ip and one of mac|name is
required; when deleting, you can specify any one, two, or all
attributes, but they all must match the target element).
As with the update of any other section, you can choose to affect the
live config (with flag VIR_NETWORK_UPDATE_AFFECT_LIVE), the persistent
config (VIR_NETWORK_UPDATE_AFFECT_CONFIG), or both. If you've chosen
to affect the live config, those changes will take effect immediately,
with no need to destroy/restart the network.
An example of adding a host entry:
virNetworkUpdate(net, VIR_NETWORK_UPDATE_COMMAND_ADD_LAST,
VIR_NETWORK_SECTION_IP_DHCP_HOST, -1,
"<host mac='00:11:22:33:44:55' ip='192.168.122.5'/>",
VIR_NETWORK_UPDATE_AFFECT_LIVE
| VIR_NETWORK_UPDATE_AFFECT_CONFIG);
To delete that same entry:
virNetworkUpdate(net, VIR_NETWORK_UPDATE_COMMAND_DELETE,
VIR_NETWORK_SECTION_IP_DHCP_HOST, -1,
"<host mac='00:11:22:33:44:55'/>",
VIR_NETWORK_UPDATE_AFFECT_LIVE
| VIR_NETWORK_UPDATE_AFFECT_CONFIG);
(you could also delete it by replacing "mac='00:11:22:33:44:55'" with
"ip='192.168.122.5'".)
A user on IRC had accidentally killed all of his libvirt-started
dnsmasq instances (due to a buggy dnsmasq service script in Fedora
16), and had hoped that libvirtd would notice this on restart and
reload all the dnsmasq daemons (as it does with iptables
rules). Unfortunately this was not the case - as long as the network
object had a pid registered for dnsmasq and/or radvd, it assumed that
the processes were running.
This patch takes advantage of the new utility functions in
bridge_driver.c to do a "refresh" of all radvd and dnsmasq processes
started by libvirt each time libvirtd is restarted - this function
attempts to do a SIGHUP of each existing process, and if that fails,
it restarts the process, rebuilding all the associated config files
and commandline parameters in the process. This normally has no
effect, but will be useful in solving the occasional "odd situation"
without needing to take the drastic step of destroying/re-starting the
network.
The test driver does nothing outside of keeping track of each
network's config/state in the in-memory database maintained by
network_conf functions, so all we have to do is call the function that
updates the network's entry in the in-memory database.
Call the network_conf function that modifies the live/persistent/both
config, then refresh/restart dnsmasq/radvd if necessary, and finally
save the config in the proper place(s).
This patch also needed to uncomment a few utility functions that were
added inside #if 0 in the previous commit (to avoid compiler errors
due to unreferenced static functions).
This patch splits the starting of dnsmasq and radvd into multiple
files, and adds new networkRefreshXX() and networkRestartXX()
functions for each. These new functions are currently commented out
because they won't be used until the next commit, and the compile options
require all static functions to be used.
networkRefreshXX() - rewrites any file-based config for dnsmasq/radvd,
and sends SIGHUP to the process to make it reread its config. If the
program isn't already running, it's just started.
networkRestartXX() - kills the given program, waits for it to exit
(see the comments in the function networkKillDaemon()), then calls
networkStartXX().
This commit is here mostly as a checkpoint to verify no change in
functional behavior after refactoring networkStartXX() functions to
fit in with these new functions.
virNetworkObjUpdate takes care of all virNetworkUpdate-related changes
to the data stored in the in-memory virNetworkObj list. It should be
called by network drivers that use this in-memory list.
virNetworkObjUpdate *does not* take care of updating any disk-based
copies of the config, nor does it perform any other operations
necessary to have the new config data take effect (e.g. it won't
re-write dnsmasq host files, nor will it send a SIGHUP to dnsmasq) -
those things should all be taken care of in the network driver
function that calls virNetworkObjUpdate (assuming that it returns
success).
These new functions are highly inspired by those in domain_conf.c (but
not identical), and are intended to make it simpler to update the
various combinations of live/persistent network configs.
The network driver wasn't previously as careful about the separation
between the live "status" in network->def and the persistent "config"
in network->newDef (or sometimes in network->def). This series
attempts to remedy some of that, but probably doesn't go all the way
(enough to get these functions working and enable continued work on
virNetworkUpdate though).
bridge_driver.c and test_driver.c were updated in a few places to take
advantage of the new functions and/or account for changes in argument
lists.
This is very short, because almost everything is autogenerated. All
that's needed are:
* src/remote/remote_driver.c: add pointer to autogenerated
remoteNetworkUpdate to the function table for the remote
network driver.
* src/remote/remote_protocol.x: add the "args" struct and add one more
item to the remote_procedure enum for this function.
* src/remote_protocol-struct: update to match remote_protocol.x
This patch adds a new public API virNetworkUpdate that will permit
updating an existing network configuration without requiring that the
network be destroyed/restarted for the changes to take effect.
This series adds support to run QEMU with seccomp sandbox enabled. It can be
configured in qemu.conf to on, off, or the QEMU default, which is off in 1.2.
Default value is the QEMU default.
On agent EOF the qemuProcessHandleAgentEOF() callback is called
which locks virDomainObjPtr. Then qemuAgentClose() is called
(with domain object locked) which eventually calls qemuAgentDispose()
and qemuProcessHandleAgentDestroy(). This tries to lock the
domain object again. Hence the deadlock.
All of ide-drive, ide-hd, ide-cd, scsi-disk, scsi-hd, and scsi-cd
supports wwn property. (NB, scsi-block doesn't support to set wwn).
* src/qemu/qemu_command.c: Error out if underlying QEMU doesn't
support wwn property for the device; Set wwn for the device otherwise.
* tests/qemuxml2argvdata/qemuxml2argv-disk-ide-wwn.args: New test
* tests/qemuxml2argvdata/qemuxml2argv-disk-ide-wwn.xml: Likewise
* tests/qemuxml2argvdata/qemuxml2argv-disk-scsi-disk-wwn.args: Likewise
* tests/qemuxml2argvdata/qemuxml2argv-disk-scsi-disk-wwn.xml: Likewise
* tests/qemuxml2argvtest.c: Add the new tests.
This assumes ide-drive.wwn, ide-hd.wwn, ide-cd.wwn were supported
at the same time, similar for scsi-disk.wwn, scsi-hd.wwn, and
scsi-cd.wwn. So only two new caps (QEMU_CAPS_IDE_DRIVE_WWN,
and QEMU_CAPS_SCSI_DISK_WWN) are introduced.
Validates the wwn while parsing, error out if it's malformed.
* src/util/util.h: Declare virValidateWWN
* src/util/util.c: Implement virValidateWWN
* src/libvirt_private.syms: Export virValidateWWN.
* src/conf/domain_conf.h: New member 'wwn' for disk def.
* src/conf/domain_conf.c: Parse and format disk <wwn>
Relatively straightforward. Our decision to make block job
speed a long keeps haunting us on new API.
* src/remote/remote_protocol.x (remote_domain_block_commit_args):
New struct.
* src/remote/remote_driver.c (remote_driver): Enable it.
* src/remote_protocol-structs: Regenerate.
* src/rpc/gendispatch.pl (long_legacy): Exempt another bandwidth.
A block commit moves data in the opposite direction of block pull.
Block pull reduces the chain length by dropping backing files after
data has been pulled into the top overlay, and is always safe; block
commit reduces the chain length by dropping overlays after data has
been committed into the backing file, and any files that depended
on base but not on top are invalidated at any point where they have
unallocated data that is now pointing to changed contents in base.
Both directions are useful, however: a qcow2 layer that is more than
50% allocated will typically be faster with a pull operation, while
a qcow2 layer with less than 50% allocation will be faster as a
commit operation. Committing across multiple layers can be more
efficient than repeatedly committing one layer at a time, but
requires extra support from the hypervisor.
This API matches Jeff Cody's proposed qemu command 'block-commit':
https://lists.gnu.org/archive/html/qemu-devel/2012-09/msg02226.html
Jeff's command is still in the works for qemu 1.3, and may gain
further enhancements, such as the ability to control on-error
handling (it will be comparable to the error handling Paolo is
adding to 'drive-mirror', so a similar solution will be needed
when I finally propose virDomainBlockCopy with more functionality
than the basics supported by virDomainBlockRebase). However, even
without qemu support, this API will be useful for _offline_ block
commits, by wrapping qemu-img calls and turning them into a block
job, so this API is worth committing now.
For some examples of how this will be implemented, all starting
with the chain: base <- snap1 <- snap2 <- active
+ These are equivalent:
virDomainBlockCommit(dom, disk, NULL, NULL, 0, 0)
virDomainBlockCommit(dom, disk, NULL, "active", 0, 0)
virDomainBlockCommit(dom, disk, "base", NULL, 0, 0)
virDomainBlockCommit(dom, disk, "base", "active", 0, 0)
but cannot be implemented for online qemu with round 1 of
Jeff's patches; and for offline images, it would require
three back-to-back qemu-img invocations unless qemu-img
is patched to allow more efficient multi-layer commits;
the end result would be 'base' as the active disk with
contents from all three other files, where 'snap1' and
'snap2' are invalid right away, and 'active' is invalid
once any further changes to 'base' are made.
+ These are equivalent:
virDomainBlockCommit(dom, disk, "snap2", NULL, 0, 0)
virDomainBlockCommit(dom, disk, NULL, NULL, 0, _SHALLOW)
they cannot be implemented for online qemu, but for offline,
it is a matter of 'qemu-img commit active', so that 'snap2'
is now the active disk with contents formerly in 'active'.
+ Similarly:
virDomainBlockCommit(dom, disk, "snap2", NULL, 0, _DELETE)
for an offline domain will merge 'active' into 'snap2', then
delete 'active' to avoid leaving a potentially invalid file
around.
+ This version:
virDomainBlockCommit(dom, disk, NULL, "snap2", 0, _SHALLOW)
can be implemented online with 'block-commit' passing a base of
snap1 and a top of snap2; and can be implemented offline by
'qemu-img commit snap2' followed by 'qemu-img rebase -u
-b snap1 active'
* include/libvirt/libvirt.h.in (virDomainBlockCommit): New API.
* src/libvirt.c (virDomainBlockCommit): Implement it.
* src/libvirt_public.syms (LIBVIRT_0.10.2): Export it.
* src/driver.h (virDrvDomainBlockCommit): New driver callback.
* docs/apibuild.py (CParser.parseSignature): Add exception.
Upstream qemu has raised a concern about whether dumping guest
memory by reading guest paging tables is a security hole:
https://lists.gnu.org/archive/html/qemu-devel/2012-09/msg02607.html
While auditing libvirt to see if we would be impacted, I noticed
that we had some dead code. It is simpler to nuke the dead code
and limit our monitor code to just the subset we make use of.
* src/qemu/qemu_monitor.h (QEMU_MONITOR_DUMP): Drop poorly named
and mostly-unused enum.
* src/qemu/qemu_monitor.c (qemuMonitorDumpToFd): Drop arguments.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDump): Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDump): Likewise.
* src/qemu/qemu_driver.c (qemuDumpToFd): Update caller.
virNetworkAssignDef was allocating a new network object, initing and
grabbing its lock, then potentially freeing it without unlocking or
destroying the lock. In practice 1) this will probably never happen,
and 2) even if it did, the lock implementation used on most (all?)
platforms doesn't actually hold any resources for an initialized or
held lock, but it still bothered me, so I moved the realloc that could
lead to this bad situation earlier in the function, and now the mutex
isn't inited or locked until we are assured of complete success.
These two objects were previously always parsed as a part of an IpDef,
but we will now need to be able to parse them on their own for
virNetworkUpdate(). Split the parsing functions out, with no
functional changes.
The final patch in Hu Tao's series to enhance virBitmap actually
removes virDomainCpuSetParse and virDomainCpuSetFormat as "no longer
used", and the rest of the series hadn't taken care of two uses of
virDomainCpuSetParse in the xen code.
This patch replaces those with appropriate virBitmap functions. It
should be pushed prior to the patch removing virDomainCpuSetParse.
In many places we store bitmap info in a chunk of data
(pointed to by a char *), and have redundant codes to
set/unset bits. This patch extends virBitmap, and convert
those codes to use virBitmap in subsequent patches.
Only implemented for linux platform.
* src/nodeinfo.h: (Declare node{Get,Set}MemoryParameters)
* src/nodeinfo.c: (Implement node{Get,Set}MemoryParameters)
* src/libvirt_private.syms: (Export those two new internal APIs to
private symbols)
* src/rpc/gendispatch.pl: (virNodeSetMemoryParameters is the
the special one which needs a connection object as the first
argument, improve the generator to support it).
* daemon/remote.c: (Implement the server side handler for
virDomainGetMemoryParameters)
* src/remote/remote_driver.c: (Implement the client side handler
for virDomainGetMemoryParameters)
* src/remote/remote_protocol.x: (New RPC procedures for the two
new APIs and structs to represent the args and ret for it)
* src/remote_protocol-structs: Likewise
* include/libvirt/libvirt.h.in: (Add macros for the param fields,
declare the APIs).
* src/driver.h: (New methods for the driver struct)
* src/libvirt.c: (Implement the public APIs)
* src/libvirt_public.syms: (Export the public symbols)
Simply returns the object list. Supports to filter the secrets
by its storage location, and whether it's private or not.
src/secret/secret_driver.c: Implement listAllSecrets
The RPC generator doesn't support returning list of object yet, this patch
does the work manually.
* daemon/remote.c:
Implement the server side handler remoteDispatchConnectListAllSecrets.
* src/remote/remote_driver.c:
Add remote driver handler remoteConnectListAllSecrets.
* src/remote/remote_protocol.x:
New RPC procedure REMOTE_PROC_CONNECT_LIST_ALL_SECRETS and
structs to represent the args and ret for it.
* src/remote_protocol-structs: Likewise.
This is to list the secret objects. Supports to filter the secrets
by its storage location, and whether it's private or not.
include/libvirt/libvirt.h.in: Declare enum virConnectListAllSecretFlags
and virConnectListAllSecrets.
python/generator.py: Skip auto-generating
src/driver.h: (virDrvConnectListAllSecrets)
src/libvirt.c: Implement the public API
src/libvirt_public.syms: Export the symbol to public
The RPC generator doesn't support returning list of object yet, this patch
do the work manually.
* daemon/remote.c:
Implemente the server side handler remoteDispatchConnectListAllNWFilters.
* src/remote/remote_driver.c:
Add remote driver handler remoteConnectListAllNWFilters.
* src/remote/remote_protocol.x:
New RPC procedure REMOTE_PROC_CONNECT_LIST_ALL_NWFILTERS and
structs to represent the args and ret for it.
* src/remote_protocol-structs: Likewise.
This is to list the network filter objects. No flags are supported
include/libvirt/libvirt.h.in: Declare enum virConnectListAllNWFilterFlags
and virConnectListAllNWFilters.
python/generator.py: Skip auto-generating
src/driver.h: (virDrvConnectListAllNWFilters)
src/libvirt.c: Implement the public API
src/libvirt_public.syms: Export the symbol to public
tools/virsh-nodedev.c:
* vshNodeDeviceSorter to sort node devices by name
* vshNodeDeviceListFree to free the node device objects list.
* vshNodeDeviceListCollect to collect the node device objects, trying
to use new API first, fall back to older APIs if it's not supported.
* Change option --cap to accept multiple capability types.
tools/virsh.pod
* Update document for --cap
src/conf/node_device_conf.h:
* New macro VIR_CONNECT_LIST_NODE_DEVICES_FILTERS_CAP
* Declare virNodeDeviceList
src/conf/node_device_conf.c:
* New helpers virNodeDeviceCapMatch, virNodeDeviceMatch.
virNodeDeviceCapMatch looks up the list of all the caps the device
support, to see if the device support the cap type.
* Implement virNodeDeviceList
src/libvirt_private.syms:
* Export virNodeDeviceList
* Export virNodeDevCapTypeFromString
The RPC generator doesn't support returning list of object yet, this patch
does the work manually.
* daemon/remote.c:
Implemente the server side handler remoteDispatchConnectListAllNodeDevices.
* src/remote/remote_driver.c:
Add remote driver handler remoteConnectListAllNodeDevices.
* src/remote/remote_protocol.x:
New RPC procedure REMOTE_PROC_CONNECT_LIST_ALL_INTERFACES and
This is to list the node device objects, supports to filter the results
by capability types.
include/libvirt/libvirt.h.in: Declare enum virConnectListAllNodeDeviceFlags
and virConnectListAllNodeDevices.
python/generator.py: Skip auto-generating
src/driver.h: (virDrvConnectListAllNodeDevices)
src/libvirt.c: Implement the public API
src/libvirt_public.syms: Export the symbol to public
virNWFilterSnoopAdjustPoll() uses a struct pollfd but poll.h is never included
nwfilter/nwfilter_dhcpsnoop.c:1297: error: 'struct pollfd' declared inside parameter list
If the qemuBuildCommandLine method raised an error before the
virCommandPtr instance was created, the local var would not
be initialized, resulting in a possible SEGV in the error
cleanup branch. Also add some debugging of the method params
Introduce a qemuCapsNewForBinary() API which creates a new
QEMU capabilities object, populated with data relating to
a specific QEMU binary. The qemuCaps object is also given
a timestamp, which makes it possible to detect when the
cached capabilities for a binary are out of date
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Don't bother checking for the existance of the HMP passthrough
command. Just try to execute it, and propagate the failure.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemuMonitorHMPCommand() API and things it calls will report
a wide variety of errors. The QEMU text monitor should not be
overwriting these errors
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch adds full support for EOI setting for domains. Because this
is CPU feature (flag), the model needs to be added even when it's not
specified. Fortunately this problem was already solved with kvmclock,
so this patch simply abuses that.
And due to the size of the patch (17 lines) I dared to include the tests.
New options is added to support EOI (End of Interrupt) exposure for
guests. As it makes sense only when APIC is enabled, I added this into
the <apic> element in <features> because this should be tri-state
option (cannot be handled as standalone feature).
Fix for CVE-2012-4423.
When generating RPC protocol messages, it's strictly needed to have a
continuous line of numbers or RPC messages. However in case anyone
tries backporting some functionality and will skip a number, there is
a possibility to make the daemon segfault with newer virsh (version of
the library, rpc call, etc.) even unintentionally.
The problem is that the skipped numbers will get func filled with
NULLs, but there is no check whether these are set before the daemon
tries to run them. This patch very simply enhances one check and fixes
that.
BZ:https://bugzilla.redhat.com/show_bug.cgi?id=843372
when qemu supports the 'transaction' monitor command,
and libvirt's --reuse-ext flag was not specified, libvirt created
a stub file with zero size in first place. After the failure of
QEMU transaction command performing qcow2 snapshots on more than
one drives, the stub file is left behind with non-empty
by the QEMU transaction command.
In order to unlink the file, the patch removes the file size checking.
Steps to reproduce the issue:
Steps:
1, Create a qemu instance with two drive images of qcow2 type (root user)
/usr/libexec/qemu-kvm -m 1024 -smp 1 -name "rhel6u1" \
-drive file=/var/lib/libvirt/images/firstqcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-drive file=/var/lib/libvirt/images/secondqcow2,if=none,id=drive-virtio-disk1,format=qcow2,cache=none \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 -qmp stdio
2, Initialize qemu qmp
{"execute":"qmp_capabilities"}
3, Remove the second drive image file
rm -f /var/lib/libvirt/images/secondqcow2
4, Run 'transaction' command with snapshot qemu commands in.
{"execute":"transaction","arguments":
{"actions":
[{"type":"blockdev-snapshot-sync","data":
{"device":"drive-virtio-disk0","snapshot-file":"/var/lib/libvirt/images/firstqcow2-snapshot.img","format":"qcow2"}
},
{"type":"blockdev-snapshot-sync","data":
{"device":"drive-virtio-disk1","snapshot-file":"/var/lib/libvirt/images/secondqcow2-snapshot.img","format":"qcow2"}
}]
},
"id":"libvirt-6"}
5, Got the error as follows:
{"id": "libvirt-6",
"error": {"class": "OpenFileFailed", "desc": "Could not open '/var/lib/libvirt/images/secondqcow2-snapshot.img'",
"data": {"filename": "/var/lib/libvirt/images/secondqcow2-snapshot.img"}
}
}
6, List first newly-created snapshot file:
-rw-r--r--. 1 root root 262144 Sep 13 11:43 firstqcow2-snapshot.img
The 'def->target.addr' hasn't been initialized in virDomainChrDefNew() and
its value is always '0xffffffff', in addition, the following test scenario
hasn't also include 'address' element in channel XML block, so the branch
'if (addrStr == NULL)' is hit in virDomainChrDefParseTargetXML(), the
programming jumps to 'error' label to release relevant resources, and the
statement 'if (VIR_ALLOC(def->target.addr) < 0)' hasn't been executed then
the virDomainChrDefFree() will free 'def->target.addr'(0xffffffff) via
VIR_FREE(), which results in libvirt crash, to use valgrind can also
find a 'Invalid free() / delete / delete[]' error. This patch just adjusts
codes order to initialize 'def->target.addr' firstly.
With this patch, libvirt hasn't crash and can get a expected error message "
XML error: guestfwd channel does not define a target address".
How to reproduce?
1. define a guest with the following channel XML configuration
$ cat foo.xml
<snip>
<channel type='pty'>
<target type='guestfwd'/>
</channel>
</snip>
$ virsh define foo.xml
2. actual result
error: Failed to define domain from /tmp/foo.xml
error: End of file while reading data: Input/output error
error: Failed to reconnect to the hypervisor
GDB debugger information:
<snip>
Breakpoint 1, virDomainChrDefFree (def=0x7f8ab000ec70) at conf/domain_conf.c:1264
...ignore
1264 {
(gdb) p def->target
$2 = {port = -1, addr = 0xffffffff, name = 0xffffffff <Address 0xffffffff out of bounds>}
</snip>
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=856489
Signed-off-by: Alex Jia <ajia@redhat.com>
Add separate function parallelsCreateCt, which creates container.
Also add example xml configuration domain-parallels-ct-simple.xml.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Fix code, which checks what is changed in virDomainDef structure.
It looks slightly different for containers and VMs: containers haven't
boot devices, but have init path
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
User may set "unlimited" cpus for containers, which means to
take all available cpus on the node.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
This patch makes parallelsLoadDomains to be able to load information
about containers. So functions, which return different information
and change state will work.
parallelsDomainDefineXML will be fixed in separate patch.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
The QEMU capabilities APIs used a misc of 'int' and
'unsigned int' for variables relating to array sizes.
Change all these to use 'size_t'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To allow each VM instance to record additional capabilities
without affecting other VMs, there needs to be a way to do
a deep copy of the qemuCapsPtr object
Add struct fields and APIs to allow the qemu capabilities object
to store version, arch, machines & cpu names, etc
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current qemu capabilities are stored in a virBitmapPtr
object, whose type is exposed to callers. We want to store
more data besides just the flags, so we need to move to a
struct type. This object will also need to be reference
counted, since we'll be maintaining a cache of data per
binary. This change introduces a 'qemuCapsPtr' virObject
class. Most of the change is just renaming types and
variables in all the callers
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If no private data needs to be maintained, it can be useful
to create virDomainObjPtr instances without having a virCapsPtr
instance around. Adapt the virDomainObjNew() function to allow
for a NULL caps
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Technically speaking we should wait until we receive the QMP
greeting message before attempting to send any QMP monitor
commands. Mostly we've got away with this, but there is a race
in some QEMU which cause it to SEGV if you sent it data too
soon after startup. Waiting for the QMP greeting avoids the
race
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=795929http://git.qemu.org/?p=qemu.git;a=commitdiff;h=6af165892cf900291046f1d25f95416f379504c2
This patch define and parse the input XML of USB redirection filter.
<devices>
...
<redirdev bus='usb' type='spicevmc'>
<address type='usb' bus='0' port='4'/>
</redirdev>
<redirfilter>
<usbdev class='0x08' vendor='0x1234' product='0xbeef' \
version='2.00' allow='yes'/>
<usbdev allow='no'/>
</redirfilter>
...
</devices>
There is no 1:1 mapping between ports and redirected devices and
qemu and spicy client couldn't decide into which usbredir ports
the client can 'plug' redirected devices. So it make sense to apply
all of filter rules global to all existing usb redirection devices.
class attribute is USB Class codes. version is bcdDevice value
of USB device. vendor and product is USB vendorId and productId.
-1 can be used to allow any value for a field. Except allow attribute
the other four are optional, default value is -1.
Add a qemu flag for USB redirection filter support.
The output:
usb-redir.chardev=chr
usb-redir.debug=uint8
usb-redir.filter=string
usb-redir.port=string
I got an off-list report about a bad diagnostic:
Target network card mac 52:54:00:49:07:ccdoes not match source 52:54:00:49:07:b8
True to form, I've added a syntax check rule to prevent it
from recurring, and found several other offenders.
* cfg.mk (sc_require_whitespace_in_translation): New rule.
* src/conf/domain_conf.c (virDomainNetDefCheckABIStability): Add
space.
* src/esx/esx_util.c (esxUtil_ParseUri): Likewise.
* src/qemu/qemu_command.c (qemuCollectPCIAddress): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSetMetadata)
(qemuDomainGetMetadata): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainChangeNetBridge): Likewise.
* src/rpc/virnettlscontext.c
(virNetTLSContextCheckCertDNWhitelist): Likewise.
* src/vmware/vmware_driver.c (vmwareDomainResume): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainGetXMLDesc, vboxAttachDrives):
Avoid false negatives.
* tools/virsh-domain.c (info_save_image_dumpxml): Reword.
Based on a report by Luwen Su.
Currently qemuMonitorOpen() requires an address of the QEMU
monitor. When doing QMP based capabilities detection it is
easier if a pre-opened FD can be provided, since then the
monitor can be run on the STDIO console. Add a new API
qemuMonitorOpenFD() for such usage
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Historically, the first <console> element is treated as the
alias of a <serial> device. In the virDomainDeviceInfoIterate,
This situation is not considered. It still handles the first <console>
element as another devices, which means that for console[0] with
serial targetType, it calls callback function another time.
It will cause the problem of address conflicts when assigning
spapr-vio address for serial device on pSeries guest.
For pSeries guest, the serial configuration in the xml file
is as the following:
<serial type='pty'>
<target port='0'/>
<address type='spapr-vio'/>
</serial>
Console configuration is default, the dumped xml file is as the following:
<serial type='pty'>
<source path='/dev/pts/5'/>
<target port='0'/>
<alias name='serial0'/>
<address type='spapr-vio' reg='0x30000000'/>
</serial>
<console type='pty' tty='/dev/pts/5'>
<source path='/dev/pts/5'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
<address type='spapr-vio' reg='0x30000000'/>
</console>
It shows that the <console> device is the alias of serial device.
So its address is the same as the serial device. When detecting
the conflicts in the qemuAssignSpaprVIOAddress the first console
and the serial device conflicts because virDomainDeviceInfoIterate()
still handle these as two different devices, and in the qemuAssignSpaprVIOAddress(),
it will compare these two devices' addressed. If they have same address,
it will report address conflict error.
So this patch is to handle the first console which targetType is serial
as the alias of serial device to avoid address conflicts error reported.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
This is not that ideal as API for other objects, as it's still
O(n). Because interface driver uses netcf APIs to manage the
stuffs, instead of by itself. And netcf APIs don't return a object.
It provides APIs like old libvirt APIs:
ncf_number_of_interfaces
ncf_list_interfaces
ncf_lookup_by_name
......
Perhaps we should further improve netcf to let it provide an API
to return the object, but it could be a later patch. And anyway,
we will still benefit from the new API for the simplification,
and no race like the old APIs.
src/interface/netcf_driver.c: Implement listAllInterfaces
The RPC generator doesn't support returning list of object yet, this patch
do the work manually.
* daemon/remote.c:
Implemente the server side handler remoteDispatchConnectListAllInterfaces.
* src/remote/remote_driver.c:
Add remote driver handler remoteConnectListAllInterfaces.
* src/remote/remote_protocol.x:
New RPC procedure REMOTE_PROC_CONNECT_LIST_ALL_INTERFACES and
structs to represent the args and ret for it.
* src/remote_protocol-structs: Likewise.
This is to list the interface objects, supported filtering flags
are: active|inactive.
include/libvirt/libvirt.h.in: Declare enum virConnectListAllInterfaceFlags
and virConnectListAllInterfaces.
python/generator.py: Skip auto-generating
src/driver.h: (virDrvConnectListAllInterfaces)
src/libvirt.c: Implement the public API
src/libvirt_public.syms: Export the symbol to public
The remote driver first looks at the libvirt auth config file to
fill in any credentials. It then invokes the auth callback for
any remaining credentials. It was accidentally invoking the
auth callback even if there were not any more credentials
required.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
All public API functions must call virResetLastError to clear
out any previous error. The virConnectOpen* functions forgot
to do this.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
src/conf/network_conf.c: Add virNetworkMatch to filter the networks;
and virNetworkList to iterate over all the networks with the filter.
src/conf/network_conf.h: Declare virNetworkList and define the macros
for filters.
src/libvirt_private.syms: Export virNetworkList.
The RPC generator doesn't support returning list of object, this patch
do the work manually.
* daemon/remote.c:
Implemente the server side handler remoteDispatchConnectListAllNetworks.
* src/remote/remote_driver.c:
Add remote driver handler remoteConnectListAllNetworks.
* src/remote/remote_protocol.x:
New RPC procedure REMOTE_PROC_CONNECT_LIST_ALL_NETWORKS and
structs to represent the args and ret for it.
* src/remote_protocol-structs: Likewise.
This is to list the network objects, supported filtering flags
are: active|inactive, persistent|transient, autostart|no-autostart.
include/libvirt/libvirt.h.in: Declare enum virConnectListAllNetworkFlags
and virConnectListAllNetworks.
python/generator.py: Skip auto-generating
src/driver.h: (virDrvConnectListAllNetworks)
src/libvirt.c: Implement the public API
src/libvirt_public.syms: Export the symbol to public
e5a1bee07 introduced a regression in Boxes: when Boxes is left idle
(it's still doing some libvirt calls in the background), the
libvirt connection gets closed after a few minutes. What happens is
that this code in virNetClientIOHandleOutput gets triggered:
if (!thecall)
return -1; /* Shouldn't happen, but you never know... */
and after the changes in e5a1bee07, this causes the libvirt connection
to be closed.
Upon further investigation, what happens is that
virNetClientIOHandleOutput is called from gvir_event_handle_dispatch
in libvirt-glib, which is triggered because the client fd became
writable. However, between the times gvir_event_handle_dispatch
is called, and the time the client lock is grabbed and
virNetClientIOHandleOutput is called, another thread runs and
completes the current call. 'thecall' is then NULL when the first
thread gets to run virNetClientIOHandleOutput.
After describing this situation on IRC, danpb suggested this:
11:37 < danpb> In that case I think the correct thing would be to change
'return -1' above to 'return 0' since that's not actually an
error - its a rare, but expected event
which is what this patch is doing. I've tested it against master
libvirt, and I didn't get disconnected in ~10 minutes while this
happens in less than 5 minutes without this patch.
The RPC generator doesn't returning support list of object, this
patch do the work manually.
* daemon/remote.c:
Implemente the server side handler remoteDispatchStoragePoolListAllVolumes
* src/remote/remote_driver.c:
Add remote driver handler remoteStoragePoolListAllVolumes
* src/remote/remote_protocol.x:
New RPC procedure REMOTE_PROC_STORAGE_POOL_LIST_ALL_VOLUMES and
structs to represent the args and ret for it.
* src/remote_protocol-structs: Likewise.
Simply returns the storage volume objects. No supported filter
flags.
include/libvirt/libvirt.h.in: Declare the API
python/generator.py: Skip the function for generating. virStoragePool.py
will be added in later patch.
src/driver.h: virDrvStoragePoolListVolumesFlags
src/libvirt.c: Implementation for the API.
src/libvirt_public.syms: Export the symbol to public
GNOME Boxes sometimes stops getting domain events from libvirtd, even
after restarting it. Further investigation in libvirtd shows that
events are properly queued with virDomainEventStateQueue, but the
timer virDomainEventTimer which flushes the events and sends them to
the clients never gets called. Looking at the event queue in gdb
shows that it's non-empty and that its size increases with each new
events.
virDomainEventTimer is set up in virDomainEventStateRegister[ID]
when going from 0 client connecte to 1 client connected, but is
initially disabled. The timer is removed in
virDomainEventStateRegister[ID] when the last client is disconnected
(going from 1 client connected to 0).
This timer (which handles sending the events to the clients) is
enabled in virDomainEventStateQueue when queueing an event on an
empty queue (queue containing 0 events). It's disabled in
virDomainEventStateFlush after flushing the queue (ie removing all
the elements from it). This way, no extra work is done when the queue
is empty, and when the next event comes up, the timer will get
reenabled because the queue will go from 0 event to 1 event, which
triggers enabling the timer.
However, with this Boxes bug, we have a client connected (Boxes), a
non-empty queue (there are events waiting to be sent), but a disabled
timer, so something went wrong.
When Boxes connects (it's the only client connecting to the libvirtd
instance I used for debugging), the event timer is not set as expected
(state->timer == -1 when virDomainEventStateRegisterID is called),
but at the same time the event queue is not empty. In other words,
we had no clients connected, but pending events. This also explains
why the timer never gets enabled as this is only done when an event
is queued on an empty queue.
I think this can happen if an event gets queued using
virDomainEventStateQueue and the client disconnection happens before
the event timer virDomainEventTimer gets a chance to run and flush
the event. In this situation, virDomainEventStateDeregister[ID] will
get called with a non-empty event queue, the timer will be destroyed
if this was the only client connected. Then, when other clients connect
at a later time, they will never get notified about domain events as
the event timer will never get enabled because the timer is only
enabled if the event queue is empty when virDomainEventStateRegister[ID]
gets called, which will is no longer the case.
To avoid this issue, this commit makes sure to remove all events from
the event queue when the last client in unregistered. As there is
no longer anyone interested in receiving these events, these events
are stale so there is no need to keep them around. A client connecting
later will have no interest in getting events that happened before it
got connected.
The introduction of /sys/fs/cgroup came in fairly recent kernels.
Prior to that time distros would pick a custom directory like
/cgroup or /dev/cgroup. We need to auto-detect where this is,
rather than hardcoding it
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add some non-null annotations to qemuMonitorOpen and also
check that the error callback is set, since it is mandatory
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch adds a helper to deal with assigning values to
virTypedParameter structures from strings. The helper parses the value
from the string and assigns it to the corresponding union value.
The quota and period tunables for cpu scheduler accept only a certain
range of values. When changing the live configuration invalid values get
rejected. This check is not performed when changing persistent config.
This patch adds a separate range check, that improves error messages
when changing live config and adds the check for persistent config.
This check is done only when using the API. It is still possible to
specify invalid values in the XML.
This patch tries to clean the code up a little bit and shorten very long
lines.
The apparent semantic change from moving the condition before calling
the setter function is a non-issue here as the setter function is a
no-op when called with both arguments zero.
Commit 2a41bc9 dropped a dependency on gawk, but we can go one step
further and avoid awk altogether.
* src/nwfilter/nwfilter_ebiptables_driver.c
(iptablesLinkIPTablesBaseChain): Simplify command.
(ebiptablesDriverInit, ebiptablesDriverShutdown): Drop awk probe.
This patch removed the "--filterwin2k" dnsmasq command line
parameter which was unnecessary for domain specification,
possibly blocked some usage, and was command line clutter.
Gene Czarcinski <gene@czarc.net>
FreeBSD and OpenBSD have a <net/if.h> that is not self-contained;
and mingw lacks the header altogether. But gnulib has just taken
care of that for us, so we might as well simplify our code. In
the process, I got a syntax-check failure if we don't also take
the gnulib execinfo module.
* .gnulib: Update to latest, for execinfo and net_if.
* bootstrap.conf (gnulib_modules): Add execinfo and net_if modules.
* configure.ac: Let gnulib check for headers. Simplify check for
'struct ifreq', while also including enough prereq headers.
* src/internal.h (IF_NAMESIZE): Drop, now that gnulib guarantees it.
* src/nwfilter/nwfilter_learnipaddr.h: Use correct header for
IF_NAMESIZE.
* src/util/virnetdev.c (includes): Assume <net/if.h> exists.
* src/util/virnetdevbridge.c (includes): Likewise.
* src/util/virnetdevtap.c (includes): Likewise.
* src/util/logging.c (includes): Assume <execinfo.h> exists.
(virLogStackTraceToFd): Handle gnulib's fallback implementation.
When the event symbols were added to the public API, not all
of them were removed from the private exports list. Solaris
gets unhappy when there are duplicated symbols. Extend the
symfile check to test for this scenario
The RPC generator doesn't support returning list of object, this patch does
the work manually.
* daemon/remote.c:
Implement the server side handler remoteDispatchConnectListAllStoragePools
* src/remote/remote_driver.c:
Add remote driver handler remoteConnectListAllStoragePools.
* src/remote/remote_protocol.x:
New RPC procedure REMOTE_PROC_CONNECT_LIST_ALL_STORAGE_POOLS and
structs to represent the args and ret for it.
* src/remote_protocol-structs: Likewise.
src/conf/storage_conf.c: Add virStoragePoolMatch to filter the
pools; Add virStoragePoolList to iterate over the pool objects
with filter.
src/conf/storage_conf.h: Declare virStoragePoolMatch,
virStoragePoolList, and the macros for filters.
src/libvirt_private.syms: Export helper virStoragePoolList.
This introduces a new API to list the storage pool objects,
4 groups of flags are provided to filter the returned pools:
* Active or not
* Autostarting or not
* Persistent or not
* And the pool type.
include/libvirt/libvirt.h.in: New enum virConnectListAllStoragePoolFlags;
Declare the API.
python/generator.py: Skip the generating
src/driver.h: (virDrvConnectListAllStoragePools)
src/libvirt.c: Implementation for the API.
src/libvirt_public.syms: Export the symbol.
ESX doesn't use the common virDomainObj implementation so this patch
adds a separate implementation.
This driver supports all currently defined filtering flags, but as with
other drivers some combinations yield a empty result list.
Hyperv doesn't use the common virDomainObj implementation so this patch
adds a separate implementation.
This driver supports all currently added flags for filtering although
some of those don't make sense with this driver (no support yet) and
thus produce no output when used.
I tested both OpenBSD and cygwin; both failed 'make check' with:
GEN check-symfile
Can't return outside a subroutine at ./check-symfile.pl line 13.
Perl requires 'exit 77' instead of 'return 77' in that context,
but even with that tweak, the build still fails, since the exit
code of 77 is only special to explicit TESTS=foo listings, and
not to make-only dependency rules where we are not going through
automake's test framework.
* src/check-symfile.pl: Kill bogus platform check...
* src/Makefile.am (check-symfile): ...and replace with an automake
conditional.
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=852984
If a network or interface is configured to use Open vSwitch, but
ovs-vswitchd (the Open vSwitch database service) isn't running, the
ovs-vsctl add-port/del-port commands will hang indefinitely rather
than returning an error. There is a --nowait option, but that appears
to have no effect on add-port and del-port commands, so instead we add
a --timeout=5 to the commands - they will retry for up to 5 seconds,
then fail if there is no response.
OpenBSD ships with gcc 4.2.1, which annoyingly treats all format
strings as though they were also attribute((nonnull)). The two
concepts are orthogonal, though, as evidenced by the number of
spurious warnings it generates on uses where we know that
virReportError specifically handles NULL instead of a format
string; worse, since we now force -Werror on git builds, it
prevents development builds on OpenBSD.
I hate to do this, as it disables ALL format checking on older
gcc, and therefore misses out on some useful checks (code that
happened to compile on Linux may still have type mismatches
when compiled on other platforms, as evidenced by the number
of times I have fixed formatting mismatches for uid_t as found
by warnings on Cygwin), but I don't see any other way to keep
-Werror alive and still compile on OpenBSD.
A more invasive change would be to make virReportError() mark
its format attribute as nonnull, and fix (a lot of) fallout;
we may end up doing that anyways as part of danpb's error
refactoring improvements, but not today.
* src/internal.h (ATTRIBUTE_FMT_PRINTF): Use preferred spellings.
* m4/virt-compile-warnings.m4 (-Wformat): Disable on older gcc.
This is another fix for the emulator-pin series. When going through
the cputune pinning settings, the current code is trying to pin all
the CPUs, even when not all of them are specified. This causes error
in the subsequent function which, of course, cannot find the cpu to
pin. Since it's enough to pass the correct VCPU ID to the function,
the fix is trivial.
Only VNC_{{DIS,}CONNECTED,INITIALIZED} and SPICE_INITIALIZED events are
documented to support server/auth field and even there it is marked as
optional. Emit "" auth scheme in case QEMU didn't send it.
On OpenBSD, clock_gettime() exists in libc rather than librt, and
blindly linking with -lrt made the build fail. Gnulib already
did the work for determining which libraries to use, so we should
reuse that work rather than doing it ourselves.
* bootstrap.conf (gnulib_modules): Pull in clock-time.
* configure.ac (RT_LIBS): Drop.
* src/Makefile.am (libvirt_util_la_LIBADD): Use gnulib variable
instead.
* src/util/virtime.c (includes): Simplify.
After discussion with DB we decided to rename the new iolimit
element as it creates the impression it would be there to
limit (i.e. throttle) I/O instead of specifying immutable
characteristics of a block device.
This is also backed by the fact that the term I/O Limits has
vanished from newer storage admin documentation.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
When reboot using qemu guest agent was requested, qemu driver kept
waiting for SHUTDOWN event from qemu. However, such event is never
emitted during guest reboot and qemu driver would keep waiting forever.
This patch adds support for running qemu guests with the required
parameters to forcefully enable or disable BIOS advertising of S3 and
S4 states. The support for this is added to capabilities and there is
also a qemu command parameter parsing implemented.
There is a new <pm/> element implemented that can control what ACPI
sleeping states will be advertised by BIOS and allowed to be switched
to by libvirt. The default keeps defaults on hypervisor, otherwise
forces chosen setting.
The documentation of the pm element is added as well.
Implementation of iolimits for the qemu driver with
capability probing for block size attribute and
command line generation for block sizes.
Including testcase for qemuxml2argvtest.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Introducing a new iolimits element allowing to override certain
properties of a guest block device like the physical and logical
block size.
This can be useful for platforms with 'non-standard' disk formats
like S390 DASD with its 4K block size.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Without this patch, logged command executions can be ambiguous if
the command contained any shell metacharacters. This has caused
more than one person to attempt to patch clients to add unnecessary
quoting, without realizing that the command itself was run with
correct args, and only the logged output was ambiguous.
* src/util/command.c (virCommandToString): Add shell escapes.
* tests/commandtest.c (test16): Test new behavior.
* tests/commanddata/test16.log: Update expected output.
* tests/qemuxml2argvdata/qemuxml2argv-*.args: Likewise.
* tests/networkxml2argvdata/*.argv: Likewise.
After fixing the last review comments on remote port searching (commit
a14b4aea51), the commit right after that
wasn't modified accordingly, therefore two values weren't changed as
they should and the configurable ports don't work as expected.
This simple commit changes last two values missed and fixes the issue.
The codes were updated to allow to reset the device as long as
there is no devices/functions behind the same bus. However, the
comments were kept without touched.
To avoid backward compatibility issues, this patch suppresses
auto-generated DAC labels from XML. This change affects commands such as
dumpxml and save.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
With this patch libvirt tries to assign a model to a single seclabel
when model is missing. Libvirt will look up at host's capabilities and
assign the first model to seclabel.
This patch fixes:
1. The problem with existing guests that have a seclabel defined in its XML.
2. A XML parse error when a guest is restored.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
When domain XML contains any of the elements for setting up CPU
scheduling parameters (period, quota, emulator_period, or
emulator_quota) we need cpu cgroup to enforce the configuration.
However, the existing code would just ignore silently such settings if
either cgroups were not available at all cpu cgroup was not available.
Moreover, APIs for manipulating CPU scheduler parameters were already
failing if cpu cgroup was not available. This patch makes cpu cgroup
mandatory for all domains that use CPU scheduling elements in their XML.
The variable max_id is initialized again in the step of
getting cpu mapping variable map2. But in the next for loop
we still expect original value of max_id, the bug will
crash libvirtd when using on NUMA machine with big number
of cpus.
On NUMA machine, the length of string got from file
cpuacct.usage_percpu is quite large, so expand the
limit of 1024 bytes.
errors like:
Failed to read file \
'/cgroup/cpuacct/libvirt/qemu/rhel6q/cpuacct.usage_percpu': \
Value too large for defined data type
Some DHCP servers send their DHCP replies to the broadcast MAC address
rather than to the MAC address of the VM. The existing DHCP snooping
code assumes that the reply always goes to the MAC address of the VM
thus filtering the traffic of some DHCP servers' replies.
The below patch adapts the code to
1) filter DHCP replies by comparing the MAC address in the reply against
the MAC address of the VM (held in the snoop request)
2) adapts the pcap filter for traffic towards the VM to accept DHCP replies
sent to any MAC address; for further filtering we rely on 1)
3) creates initial rules that are active while waiting for DHCP replies;
these rules now accept DHCP replies to the VM's MAC address or to the
MAC broadcast address
The introduction of the new VLAN code, along with the fix
from 5e465df6be, caused the
addition of OVS ports to fail with the following message:
ovs-vsctl: 00002|vsctl|ERR|: missing column name
This fix takes into account the VLAN arguments are optional,
and correctly sets up the command line to run the "ovs-vsctl"
command to add ports to the OVS bridge.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
CC: Eric Blake <eblake@redhat.com>
If a 8021.Qbh network device supports SRIOV and its VF is being used
in pci passthrough mode, when the guest is shutdown or destroyed, the
PF inteface is also brought down. qemuDomainHostdevNetConfigRestore()
finds out the PF for provided hostdev (which is VF) and passes it to
virNetDevPortProfileDisassociate() as linkdev. Later, linkdev gets passed
to virNetDevSetOnline() where the interface is brought down by clearing
IFF_UP flag.
Bringing down a PF, when only VF is being brought down is not expected
behavior. This patch adds a check so that virNetDevSetOnline() is called
only for PF and not if device is a VF.
Signed-off-by: Nishank Trivedi <nistrive@cisco.com>
The loop processing the trusted DHCP server generated one too
many rules and added one final rules that accepted responses
from all DHCP servers. Below patch fixes this.
virDomainVcpuPinAdd does a realloc on vcpupin_list if the new vcpu pin
definition doesn't fit into the array. The list is an array of pointers
but the function definition didn't support returning the changed pointer
to the caller if it was realloced. This caused segfaults if realloc
would change the base pointer.
virDomainVcpuPinDefCopy when the control flow reaches out of memory
cleanup code, the flow would end in a infinite loop as the loop variable
wasn't decremented.
Also a dereference of NULL pointers was possible if allocation of the
Vcpu pinning definiton structure failed.
Commit d0c0e79ac6 left behind some dead
code (hasDAC can't be efectively set to true, because
virSecurityManagerNew fails to load the "dac" driver).
This patch also enhances the condition for adding the default
auto-detected security manager if the manager array is allocated but
empty.
Also the configuration file for qemu driver still contains reference to
the DAC driver that can't be enabled manualy.
Before commit 05447e3af4, qemuAgentCommand
blocked until it got a reply or appropriate event. When new parameter
was added to qemuAgentCommand in the above commit, all existing callers
of it were updated in a wrong way changing them from blocking to
5-seconds timeout.
This bug was revealed by the crash described in
https://bugzilla.redhat.com/show_bug.cgi?id=852383
The vlan info pointer sent to virNetDevOpenvswitchAddPort should never
be non-NULL unless there is at least one tag. The factthat such a vlan
info pointer was receveid pointed out that a caller was passing the
wrong pointer. Instead of sending &net->vlan, the result of
virDomainNetGetActualVlan(net) should be sent - that function will
look for vlan info in net->data.network.actual->vlan, and in cany case
return NULL instead of a pointer if the vlan info it finds has no
tags.
Aside from causing the crash, sending a hardcoded &net->vlan has the
effect of ignoring vlan info from a <network> or <portgroup> config.
Fixup buffer usage when handling VLANs. Also fix the logic
used to determine if the virNetDevVlanPtr is valid or not.
Fixes crashes in the latest code when using Open vSwitch
virtualports.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
If no 'security_driver' config option was set, then the code
just loaded the 'dac' security driver. This is a regression
on previous behaviour, where we would probe for a possible
security driver. ie default to SELinux if available.
This changes things so that it 'security_driver' is not set,
we once again do probing. For simplicity we also always
create the stack driver, even if there is only one driver
active.
The desired semantics are:
- security_driver not set
-> probe for selinux/apparmour/nop
-> auto-add DAC driver
- security_driver set to a string
-> add that one driver
-> auto-add DAC driver
- security_driver set to a list
-> add all drivers in list
-> auto-add DAC driver
It is not allowed, or possible to specify 'dac' in the
security_driver config param, since that is always
enabled.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The security driver loading code in qemu has a flaw that causes it to
register the DAC security driver twice. This causes problems (machines
unable to start) as the two DAC drivers clash together.
This patch refactors the code to allow loading the DAC driver even if
its specified in configuration (it can't be registered as a common
security driver), and does not add the driver twice.
This reverts commit 9f9b7b85c9.
The DAC security driver needs special handling and extra parameters and
can't just be added to regular security drivers.
If cgroups are enabled in general but cpu cgroup is disabled in
qemu.conf or not mounted at all, libvirt would refuse to start any
domain even though scheduler parameters are not set in domain XML.
This patch makes cpu cgroup mandatory only for domains that actually
want to use it.
* src/util/virnetdevopenvswitch.c (virNetDevOpenvswitchAddPort): avoid libvirtd
crash due to derefing a NULL virtVlan->tag.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=852383
Signed-off-by: Alex Jia <ajia@redhat.com>
When starting a machine the DAC security driver tries to set the UID and
GID of the newly spawned process. This worked as desired if the desired
label was set. When the label was missing a logical bug in
virSecurityDACGenLabel() caused that uninitialised values were used as
uid and gid for the new process.
With this patch, default values (from qemu driver configuration)
are used if the label is not found.
getpwuid_r returns success but sets the return structure to NULL when it
fails to deliver data about the requested uid. In our helper code this
created following strange error messages:
" ... cannot getpwuid_r(1234): Success"
This patch creates a more helpful message:
" ... getpwuid_r failed to retrieve data for uid '1234'"
Previous commit 0b4b53bb80 defined 'inline' to prevent broken build on
systems with libnl1 headers. However, it broke build on systems with
libnl3 headers. Therefore we must make that fix conditional.
Ubuntu 10.04 shipped with out-of-the-box libnl1 headers, which
assumed the old gcc semantics of 'extern inline' as a C89 extension:
the function will _always_ be inline if it is used, and that
it may be declared extern inline in headers without a definition,
as long as the definition occurs before any use. But when C99
added 'extern inline' as a mandatory feature of the language, with
slightly different semantics than gcc (the function MUST have
external linkage, and the inline definition MUST be present
alongside any declaration, where the compiler can then choose
which of the two versions to use), this rendered the use of
'inline' in libnl's header obsolete. Most distros already solved
this by removing 'inline' (the resulting 'extern' is correct,
regardless of gcc semantics), and libnl-3 does not have the
problem (where it has switched to 'static inline' instead, again
with the definition present, and again, our hack will result in
plain 'static' with no ill effects). But for the case of building
out of the box, we hack around the broken Ubuntu header.
* src/util/virnetlink.h: Work around libnl issue.
With current flow in qemudDomainDefine we might lose data
when updating an existing domain. We parse given XML and
overwrite the configuration. Then we try to save the new
config. However, this step may fail and we don't perform any
roll back. In fact, we remove the domain from the list of
domains held up by qemu driver. This is okay as long as the
domain was brand new one.
Currently, when guest agent is configured but not responsive
(e.g. due to appropriate service not running in the guest)
we return VIR_ERR_INTERNAL_ERROR. Both are wrong. Therefore
we need to introduce new error code to reflect this case.
When checking for seclabels without security models, def->nseclabels is
already set to n. In the case of an error def->seclabels is freed but
nseclabels is left untouched. This leads to a segmentation fault when
def is freed in virDomainDefParseXML.
With the latest patches libvirt supports qemu agent monitor
passthrough. However, function in qemu driver is called
qemuDrvDomainAgentCommand. s/Drv// as used in all other names.
In my quest for reusing variables I failed to edit one variable when
fixing details between two patch versions. That results in a failure
to start qemu with autoport and spice tls, because qemu is trying to
bind two sockets to the same port.
Commit 4b03d59167 changed the pinning
behavior in a way that makes some machines non-startable.
The comment mentioning that we cannot control each vcpu when there is
not VCPU<-> PID mapping available is true, however, this isn't
necessarily an error, because this can be caused by old QEMU without
support for "query-cpus" command as well as a software emulated
machines that don't create more than one process.
When libvirt_lxc is built, it uses the utility library and #includes
virnetdev.h, which #includes virnetlink.h, which includes
<netlink/msg.h>.
Normally, the netlink include directory would be just off
/usr/include, so that wouldn't create a problem, but on Fedora and
RHEL systems using libnl3, the libnl includes have been moved into
/usr/include/libnl3 (to allow concurrent installation of libnl-1.1).
All other binaries that need it have added $(LIBNL_CFLAGS) to their
CFLAGS, but not libvirt_lxc, so it fails to build on Fedora and RHEL
that have only libnl3-devel installed. This was previously unnoticed
because everyone was building with libnl headers in
/usr/include/netlink (even on systems with the headers in
/usr/include/libnl3/netlink, many people (like me) usually also have
the libnl1.1 headers in /usr/include/netlink).
This patch adds the necessary CFLAGS for libvirt_lxc.
Note that we don't need to add $(LIBNL_LIBS) to the LDADD for this
binary, because it never directly calls libnl functions, but only
calls them indirectly through the util library, which it's already
linking against.
The name 'virDomainDiskSnapshot' didn't fit in with our normal
conventions of using a prefix hinting that it is related to a
virDomainSnapshotPtr. Also, a future patch will reuse the
enum for declaring where the VM memory is stored.
* src/conf/snapshot_conf.h (virDomainDiskSnapshot): Rename...
(virDomainSnapshotLocation): ...to this.
(_virDomainSnapshotDiskDef): Update clients.
* src/conf/domain_conf.h (_virDomainDiskDef): Likewise.
* src/libvirt_private.syms (domain_conf.h): Likewise.
* src/conf/domain_conf.c (virDomainDiskDefParseXML)
(virDomainDiskDefFormat): Likewise.
* src/conf/snapshot_conf.c: (virDomainSnapshotDiskDefParseXML)
(virDomainSnapshotAlignDisks, virDomainSnapshotDefFormat):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotCreateDiskActive, qemuDomainSnapshotCreateXML):
Likewise.
This has several benefits:
1. Future snapshot-related code has a definite place to go (and I
_will_ be adding some)
2. Snapshot errors now use the VIR_FROM_DOMAIN_SNAPSHOT error
classification, which has been underutilized (previously only in
libvirt.c)
* src/conf/domain_conf.h, domain_conf.c: Split...
* src/conf/snapshot_conf.h, snapshot_conf.c: ...into new files.
* src/Makefile.am (DOMAIN_CONF_SOURCES): Build new files.
* po/POTFILES.in: Mark new file for translation.
* src/vbox/vbox_tmpl.c: Update caller.
* src/esx/esx_driver.c: Likewise.
* src/qemu/qemu_command.c: Likewise.
* src/qemu/qemu_domain.h: Likewise.
We were failing to react to allocation failure when initializing
a snapshot object list. Changing things to store a pointer
instead of a complete object adds one more possible point of
allocation failure, but at the same time, will make it easier to
react to failure now, as well as making it easier for a future
patch to split all virDomainSnapshotPtr handling into a separate
file, as I continue to add even more snapshot code.
Luckily, there was only one client outside of domain_conf.c that
was actually peeking inside the object, and a new wrapper function
was easy.
* src/conf/domain_conf.h (_virDomainObj): Use a pointer.
(virDomainSnapshotObjListInit): Rename.
(virDomainSnapshotObjListFree, virDomainSnapshotForEach): New
declarations.
(_virDomainSnapshotObjList): Move definitions...
* src/conf/domain_conf.c: ...here.
(virDomainSnapshotObjListInit, virDomainSnapshotObjListDeinit):
Rename...
(virDomainSnapshotObjListNew, virDomainSnapshotObjListFree): ...to
these.
(virDomainSnapshotForEach): New function.
(virDomainObjDispose, virDomainListPopulate): Adjust callers.
* src/qemu/qemu_domain.c (qemuDomainSnapshotDiscard)
(qemuDomainSnapshotDiscardAllMetadata): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotLoad)
(qemuDomainUndefineFlags, qemuDomainSnapshotCreateXML)
(qemuDomainSnapshotListNames, qemuDomainSnapshotNum)
(qemuDomainListAllSnapshots)
(qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren)
(qemuDomainSnapshotListAllChildren)
(qemuDomainSnapshotLookupByName, qemuDomainSnapshotGetParent)
(qemuDomainSnapshotGetXMLDesc, qemuDomainSnapshotIsCurrent)
(qemuDomainSnapshotHasMetadata, qemuDomainRevertToSnapshot)
(qemuDomainSnapshotDelete): Likewise.
* src/libvirt_private.syms (domain_conf.h): Export new function.
When the XenStore tdb lives persistently and is not cleared between host
reboots, Xend (version 3.4 and 4.1) re-creates the domain information
located in XenStore below /vm/$UUID. (According to the xen-3.2-commit
hg265950e3df69 to fix a problem when locally migrating a domain to the
host itself.)
When doing so a version number is added to the UUID separated by one
dash, which confuses xenStoreDomainIntroduced(): It iterates over all
domains and tries to lookup all inactive domains using
xenStoreDomainGetUUID(), which fails if the running domain is renamed:
virUUIDParse() fails to parse the versioned UUID and the domain is
flagged as missing. When this happens the function delays .2s and
re-tries 20 times again, multiplied by the number of renamed VMs.
14:48:38.878: 4285: debug : xenStoreDomainIntroduced:1354 : Some domains were missing, trying again
This adds a significant delay:
# time virsh list >/dev/null
real 0m6.529s
# xenstore-list /vm
00000000-0000-0000-0000-000000000000
00000000-0000-0000-0000-000000000000-1
00000000-0000-0000-0000-000000000000-2
00000000-0000-0000-0000-000000000000-3
00000000-0000-0000-0000-000000000000-4
00000000-0000-0000-0000-000000000000-5
7c06121e-90c3-93d4-0126-50481d485cca
00000000-0000-0000-0000-000000000000-6
00000000-0000-0000-0000-000000000000-7
144ad19d-dfb4-2f80-8045-09196bb8784f
00000000-0000-0000-0000-000000000000-8
144ad19d-dfb4-2f80-8045-09196bb8784f-1
00000000-0000-0000-0000-000000000000-9
00000000-0000-0000-0000-000000000000-10
00000000-0000-0000-0000-000000000000-11
00000000-0000-0000-0000-000000000000-12
00000000-0000-0000-0000-000000000000-13
00000000-0000-0000-0000-000000000000-14
144ad19d-dfb4-2f80-8045-09196bb8784f-2
00000000-0000-0000-0000-000000000000-15
144ad19d-dfb4-2f80-8045-09196bb8784f-3
00000000-0000-0000-0000-000000000000-16
The patch adds truncation of the UUID as read from the XenStore path
before passing it to virUUIDParse().
The same issue is reported at
<http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=666135>
Signed-off-by: Philipp Hahn <hahn@univention.de>
Currently, if users set 'security_driver="dac"' in qemu.conf libvirtd
fails to initialize as DAC driver is not found because it is missing
in our security drivers array.
The original patch to support firewalld in nwfilter wasn't personally
checking the exit status of firewall-cmd, but was instead sending NULL
in the *exitstatus arg, which meant that virCommandWait would log an
error just for the exit status being non-0 (and a "more scary than
useful" error at that).
We don't want to treat this as an error, though, just as a reason to
use standard (ip|eb)tables commands instead of firewall-cmd.
This patch modifies the virCommandRun in the nwfilter code to request
status back from the caller. This avoids virCommandWait logging an
error message, and allows the caller to do as it likes after examining
the status.
The VIR_DEBUG() logged when firewalld is enabled has also been
reworded and changed to a VIR_INFO, and a similar VIR_INFO has been
added in the case that firewalld is *not* found+enabled.
I noticed this while auditing all calls to virCommandRun that request
an exit status from virCommandRun. Two functions in the openvz driver
openvzDomainGetBarrierLimit
openvzDomainSetBarrierLimit
request an exit status from virCommandRun (thus assuring that
virCommandRun won't log any errors just due to a non-0 exit status),
but then fail to examine that exit status. This could result in the
functions believing that the call to "vzlist" was successful, even
though it may have encountered an error.
The recent virDomainQemuAgentCommand addition is part of 0.10.0;
also, grouping all libvirt-qemu.so callbacks together makes them
easier to identify.
* src/libvirt_qemu.syms: Fix release symbol.
* src/qemu/qemu_driver.c (qemuDriver): Likewise.
* src/remote/remote_driver.c (remote_driver): Likewise.
* src/driver.h (_virDriver): Group qemu-specific callbacks.
libvirt's network config documents that a bridge's STP "forward delay"
(called "delay" in the XML) should be specified in seconds, but
virNetDevBridgeSetSTPDelay() assumes that it is given a delay in
milliseconds (although the comment at the top of the function
incorrectly says "seconds".
This fixes the comment, and converts the delay to milliseconds before
calling virNetDevBridgeSetSTPDelay().
If a domain is pmsuspended then virsh suspend will succeed. Beside
obvious flaw, virsh resume will report success and change domain
state to running which is another mistake. Therefore we must forbid
any attempts for suspend and resume when pmsuspended.
Add qemuDomainAgentCommand() which is generated automatically,
for .qemuDomainArbitraryAgentCommand to remote driver.
Signed-off-by: MATSUDA Daiki <matsudadik@intellilink.co.jp>
Add @seconds variable to qemuAgentSend().
When @timemout is true, @seconds controls how long to wait for a
response (if @seconds is VIR_DOMAIN_QEMU_AGENT_COMMAND_DEFAULT,
default to QEMU_AGENT_WAIT_TIME).
In addition, @seconds must be >= 0 or VIR_DOMAIN_QEMU_AGENT_COMMAND_DEFAULT.
If @timeout is false, @seconds is ignored.
Signed-off-by: MATSUDA Daiki <matsudadik@intellilink.co.jp>
Several VIR_DEBUG()'s were changed to VIR_WARN() while I was testing
the firewalld support patch, and I neglected to change them back
before I pushed.
In the meantime I've decided that it would be useful to have them be
VIR_INFO(), just so there will be logged evidence of which method is
being used (firewall-cmd vs. (eb|ip)tables) without needing to crank
logging to 11. (at most this adds 2 lines to libvirtd's logs per
libvirtd start).
dnsmasq is forwarding a number of queries upstream that should not
be done. There still remains an MX query for a plain name with no
domain specified that will be forwarded is dnsmasq has --domain=xxx
--local=/xxx/ specified. This does not happen with no domain name
and --local=// ... not a libvirt problem.
BTW, thanks again to Claudio Bley!
The bandwidth units for blockpull and blockcopy are in Megabytes per
Second, not Megabits per Second.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
The virNetlinkEventAddClient / virNetlinkEventRemoveClient stub
impls had syntax errors in their parameter lists, using a ')'
after the second-to-last parameter instead of a ','
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch introduce virNetlinkEventServiceStopAll() to stop
all the monitors to receive netlink messages for libvirtd.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
This patch improve all the API in virnetlink.c to support
all kinds of netlink protocols, and make all netlink sockets
be able to join in groups.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Unfortunately libssh2 doesn't support all types of host keys that can be
saved in the known_hosts file. Also it does not report that parsing of
the file failed. This results into truncated known_hosts files where the
standard client stores keys also in other formats (eg.
ecdsa-sha2-nistp256).
This patch changes the default location of the known_hosts file into the
libvirt private configuration directory, where it will be only written
by the libssh2 layer itself. This prevents trashing user's known_host
file.
The libssh2 code wasn't supposed to create the known_hosts file, but
recent findings show, that we can't use the default created by OpenSSH
as libssh2 might damage it. We need to create a private known_hosts file
in the config path.
This patch adds support for skipping error if the known_hosts file is
not present and let libssh2 create a new one.
This patch introduces support of setting emulator's period and
quota to limit cpu bandwidth when the vm starts. Also updates
XML Schema for new entries and docs.
This patch changes the behaviour of xml element cputune.period
and cputune.quota to limit cpu bandwidth only for vcpus, and no
longer limit cpu bandwidth for the whole guest.
The reasons to do this are:
- This matches docs of cputune.period and cputune.quota.
- The other parts excepting vcpus are treated as "emulator",
and there are separate period/quota settings for emulator
in the subsequent patches
Introduce 2 APIs to support emulator threads in remote driver.
1) remoteDomainPinEmulator: call driver api, such as qemudDomainPinEmulator.
2) remoteDomainGetEmulatorPinInfo: call driver api, such as qemudDomainGetEmulatorPinInfo.
They are similar to remoteDomainPinVcpuFlags and remoteDomainGetVcpuPinInfo.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Introduce 2 APIs to support emulator threads pin in qemu driver.
1) qemudDomainPinEmulator: setup emulator threads pin info.
2) qemudDomainGetEmulatorPinInfo: get all emulator threads pin info.
They are similar to qemudDomainPinVcpuFlags and qemudDomainGetVcpuPinInfo.
And also, remoteDispatchDomainPinEmulatorFlags and remoteDispatchDomainGetEmulatorPinInfo
functions are introduced.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Introduce 2 APIs to support emulator threads pin.
1) virDomainEmulatorPinAdd: setup emulator threads pin with a given cpumap string.
2) virDomainEmulatorPinDel: remove all emulator threads pin.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Introduce 2 APIs to set/get physical cpu pinning info of emulator threads.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Emulator threads should also be pinned by sched_setaffinity(), just
the same as vcpu threads.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Introduce qemuSetupCgroupEmulatorPin() function to add emulator
threads pin info to cpuset cgroup, the same as vcpupin.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
This patch adds a new xml element <emulatorpin>, which is a sibling
to the existing <vcpupin> element under the <cputune>, to pin emulator
threads to specified physical CPUs.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
vcpu threads pin are implemented using sched_setaffinity(), but
not controlled by cgroup. This patch does the following things:
1) enable cpuset cgroup
2) reflect all the vcpu threads pin info to cgroup
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Create a new cgroup and move all emulator threads to the new cgroup.
And then we can do the other things:
1. limit only vcpu usage rather than the whole qemu
2. limit for emulator threads(include vhost-net threads)
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Introduce a new API to move tasks of one controller from a cgroup to another cgroup
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Introduce the function virCgroupForEmulator() to create sub directory
for simulator thread(include I/O thread, vhost-net thread)
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Qemu command line generation for geometry override and testcases.
Signed-off-by: J.B. Joret <jb@linux.vnet.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
A hypervisor may allow to override the disk geometry of drives.
Qemu, as an example with cyls=,heads=,secs=[,trans=].
This patch extends the domain config to allow the specification of
disk geometry with libvirt.
Signed-off-by: J.B. Joret <jb@linux.vnet.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Older automake 1.9.6 (hello there, RHEL 5) did not populate
$(builddir), which meant 'make check' failed with:
make[3]: *** No rule to make target `/.libs/libvirt.la', needed by `check-symfile'. Stop.
For that matter, even newer automake doesn't directly emit rules
to build .libs/libvirt.la; we are better off basing our rules
on the public ./libvirt.la.
* src/Makefile.am (check-symfile): Delete useless variable.
Without this patch, RHEL 5 fails to compile, since the dbus
files lives under /usr/include/dbus-1.0/dbus/dbus.h, and
DBUS_CFLAGS contains -I/usr/include/dbus-1.0.
In file included from network/bridge_driver.c:67:
../src/util/virdbus.h:26:25: error: dbus/dbus.h: No such file or directory
* src/Makefile.am (libvirt_driver_network_impl_la_CFLAGS): Add
DBUS_CFLAGS.
When gcc atomic intrinsics are not available (such as on RHEL 5
with gcc 4.1.2), we were getting link errors due to multiple
definitions:
./.libs/libvirt_util.a(libvirt_util_la-virobject.o): In function `virAtomicIntXor':
/home/dummy/l,ibvirt/src/util/viratomoic.h:404: multiple definition of `virAtomicIntXor'
./.libs/libvirt_util.a(libvirt_util_la-viratomic.o):/home/dummy/libvirt/src/util/viratomic.h:404: first defined here
Solve this by conditionally marking the functions static (the
condition avoids falling foul of gcc warnings about unused
static function declarations).
* src/util/viratomic.h: When not using gcc intrinsics, use static
functions to avoid linker errors on duplicate functions.
Building on RHEL 5 warned:
nodeinfo.c: 305: warning: implicit declaration of function 'CPU_COUNT'
This extension macro in <sched.h> was not added until later glibc.
* src/nodeinfo.c (CPU_COUNT): Add fallback implementation.
We already skip out on building the LXC under RHEL 5, because the
kernel is too old (commits 4c18acf, 2dee896); but commit 9612e4b
moved some LXC-only code into common files, resulting in this
build failure:
util/virfile.c: In function 'virFileLoopDeviceAssociate':
util/virfile.c:580: error: 'LO_FLAGS_AUTOCLEAR' undeclared (first use in this function)
Unfortunately, the kernel folks only made it an enum, rather than
also a #define, so we have to modify configure.ac to record when
it is usable.
* configure.ac (with_lxc): Mark when LO_FLAGS_AUTOCLEAR was found.
* src/util/virfile.c (virFileLoopDeviceAssociate): Avoid
compilation when kernel is too old.
Fix possible double close in the child process after the fork in case
infd and outfd are equal, just like they are after being called from
virNetSocketNewConnectCommand.
* configure.ac, spec file: firewalld defaults to enabled if dbus is
available, otherwise is disabled. If --with_firewalld is explicitly
requested and dbus is not available, configure will fail.
* bridge_driver: add dbus filters to get the FirewallD1.Reloaded
signal and DBus.NameOwnerChanged on org.fedoraproject.FirewallD1.
When these are encountered, reload all the iptables reuls of all
libvirt's virtual networks (similar to what happens when libvirtd is
restarted).
* iptables, ebtables: use firewall-cmd's direct passthrough interface
when available, otherwise use iptables and ebtables commands. This
decision is made once the first time libvirt calls
iptables/ebtables, and that decision is maintained for the life of
libvirtd.
* Note that the nwfilter part of this patch was separated out into
another patch by Stefan in V2, so that needs to be revised and
re-reviewed as well.
================
All the configure.ac and specfile changes are unchanged from Thomas'
V3.
V3 re-ran "firewall-cmd --state" every time a new rule was added,
which was extremely inefficient. V4 uses VIR_ONCE_GLOBAL_INIT to set
up a one-time initialization function.
The VIR_ONCE_GLOBAL_INIT(x) macro references a static function called
vir(Ip|Eb)OnceInit(), which will then be called the first time that
the static function vir(Ip|Eb)TablesInitialize() is called (that
function is defined for you by the macro). This is
thread-safe, so there is no chance of any race.
IMPORTANT NOTE: I've left the VIR_DEBUG messages in these two init
functions (one for iptables, on for ebtables) as VIR_WARN so that I
don't have to turn on all the other debug message just to see
these. Even if this patch doesn't need any other modification, those
messages need to be changed to VIR_DEBUG before pushing.
This one-time initialization works well. However, I've encountered
problems with testing:
1) Whenever I have enabled the firewalld service, *all* attempts to
call firewall-cmd from within libvirtd end with firewall-cmd hanging
internally somewhere. This is *not* the case if firewall-cmd returns
non-0 in response to "firewall-cmd --state" (i.e. *that* command runs
and returns to libvirt successfully.)
2) If I start libvirtd while firewalld is stopped, then start
firewalld later, this triggers libvirtd to reload its iptables rules,
however it also spits out a *ton* of complaints about deletion failing
(I suppose because firewalld has nuked all of libvirt's rules). I
guess we need to suppress those messages (which is a more annoying
problem to fix than you might think, but that's another story).
3) I noticed a few times during this long line of errors that
firewalld made a complaint about "Resource Temporarily
unavailable. Having libvirtd access iptables commands directly at the
same time as firewalld is doing so is apparently problematic.
4) In general, I'm concerned about the "set it once and never change
it" method - if firewalld is disabled at libvirtd startup, causing
libvirtd to always use iptables/ebtables directly, this won't cause
*terrible* problems, but if libvirtd decides to use firewall-cmd and
firewalld is later disabled, libvirtd will not be able to recover.
Generating "Unable to add lockspace /lock/space/dir/__LIBVIRT__DISKS__:
No such file or directory" is correct but not exactly clear. This patch
changes the error message to "Unable to create lockspace
/lock/space/dir/__LIBVIRT__DISKS__: parent directory does not exist or
is not a directory".
When running libvirtd from a build directory, libvirtd would load lock
drivers from system directory unless explicitly overridden by setting
LIBVIRT_LOCK_MANAGER_PLUGIN_DIR environment variable. Since we already
autodetect driver directory if libvirt is build with driver modules, we
can use the same trick to automagically set lock driver directory.
Commit 1d22ba95 was complete at the time, but we have since
reintroduced a warning that is fixed in the same manner:
CCLD storagebackendsheepdogtest
*** Warning: Linking the executable storagebackendsheepdogtest against the loadable module
*** libvirt_driver_storage.so is not portable!
* src/Makefile.am (libvirt_driver_storage.la): Factor into new
convenience library libvirt_driver_storage_impl.la.
* tests/Makefile.am (storagebackendsheepdogtest_LDADD): Link to
convenience library, not shared library.
Our existing STRNEQ_NULLABLE() triggered a warning in gcc 4.7 when
used with a literal NULL argument:
qemumonitorjsontest.c: In function 'testQemuMonitorJSONGetMachines':
qemumonitorjsontest.c:289:5: error: null argument where non-null required (argument 1) [-Werror=nonnull]
even though the strcmp is provably dead when a null argument is
present. Squelch the warning by refactoring things so that gcc
never sees strcmp() called with NULL arguments (we still compare
NULL as not equal to "", this rewrite merely aids gcc).
Next, gcc has a valid warning about a literal NULLSTR(NULL):
qemumonitorjsontest.c:289:5: error: invalid application of 'sizeof' to a void type [-Werror=pointer-arith]
Of course, you'd never write NULLSTR(NULL) directly, but it is
handy to use through macros. But the entire part about verify_true()
is unnecessary - gcc already warns about type mismatch with ?:,
without needing to make it more complex.
* src/internal.h (STREQ_NULLABLE, STRNEQ_NULLABLE): Avoid gcc 4.7
stupidity.
(NULLSTR): Simplify, to allow passing compile-time constants.
The DAC security driver uses the virStrToLong_ui function to
parse the uid/gid out of the seclabel string. This works on
Linux where 'uid_t' is an unsigned int, but on Mingw32 it is
just an 'int'. This causes compiler warnings about signed/
unsigned int pointer mis-match.
To avoid this, use explicit 'unsigned int ouruid' local
vars to pass into virStrToLong_ui, and then simply assign
to the 'uid_t' type after parsing
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch adds URI options to support libssh2 transport in the remote
driver.
A new transport sceme is introduced eg. "qemu+libssh2://..." that
utilizes the libssh2 code added in previous patches.
The libssh2 code requires the authentication callback to be able to
perform keyboard-interactive authentication or to ask t passprhases or
add host keys to known hosts database.
Added URI components:
- known_hosts - path to a knownHosts file in OpenSSH format to check
for known ssh host keys
- known_hosts_verify - how to deal with server key verification:
* "normal" (default) - ask to add new keys
* "auto" - automaticaly add new keys
* "ignore" - don't validate host keys
- sshauth - authentication methods to use. Default is
"agent,privkey,keyboard-interactive". It's a comma separated
string of methods to try while authenticating. The order is
preserved. Some of the methods may require additional
parameters.
Locations of the known_hosts file and private keys are set to default
values if they're present. (~/.ssh/known_hosts, ~/.ssh/id_rsa,
~/.ssh/id_dsa)
This patch adds a glue layer to enable using libssh2 code with the
network client code.
As in the original client implementation, shell code is sent to the
server to detect correct options for netcat and connect to libvirt's
unix socket.
This patch enables virNetSocket to be used as an ssh client when
properly configured.
This patch adds function virNetSocketNewConnectLibSSH2() that takes all
needed parameters and creates a libssh2 session and performs steps
needed to open the connection and then create a virNetSocket that
seamlesly encapsulates the communication.
This patch adds helper functions that enable us to use libssh2 in
conjunction with libvirt's virNetSockets for ssh transport instead of
spawning "ssh" client process.
This implemetation supports tunneled plaintext, keyboard-interactive,
private key, ssh agent based and null authentication. Libvirt's Auth
callback is used for interaction with the user. (Keyboard interactive
authentication, adding of host keys, private key passphrases). This
enables seamless integration into the application using libvirt. No
helpers as "ssh-askpass" are needed.
Reading and writing of OpenSSH style "known_hosts" files is supported.
Communication is done using SSH exec channel, where the user may specify
arbitrary command to be executed on the remote side and reads and writes
to/from stdin/out are sent through the ssh channel. Usage of stderr is
not (yet) supported.
Currently the dynamic label generation code will create labels
with a sensitivity of s0, and a category pair in the range
0-1023. This is fine when running a standard MCS policy because
libvirtd will run with a label
system_u:system_r:virtd_t:s0-s0:c0.c1023
With custom policies though, it is possible for libvirtd to have
a different sensitivity, or category range. For example
system_u:system_r:virtd_t:s2-s3:c512.c1023
In this case we must assign the VM a sensitivity matching the
current lower sensitivity value, and categories in the range
512-1023
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The code to refactor sec label handling accidentally changed the
SELinux driver to use the 'domain_context' when generating the
image label instead of the 'file_context'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
After the cleanup of remote display port allocation, I noticed some
messages that didn't make a lot of sense the way they were written. So
I rephrased them.
The defines QEMU_REMOTE_PORT_MIN and QEMU_REMOTE_PORT_MAX were used to
find free port when starting domains. As this was hard-coded to the
same ports as default VNC servers, there were races with these other
programs. This patch includes the possibility to change the default
starting port as well as the maximum port (mostly for completeness) in
qemu config file.
Support for two new config options in qemu.conf is added:
- remote_port_min (defaults to QEMU_REMOTE_PORT_MIN and
must be >= than this value)
- remote_port_max (defaults to QEMU_REMOTE_PORT_MAX and
must be <= than this value)
Port allocations for SPICE and VNC behave almost the same (with
default ports), but there is some mess in the code. This patch clears
these inconsistencies and makes sure the same behavior will be used
when ports for remote displays are changed.
Changes:
- hard-coded number 5900 removed (handled elsewhere like with VNC)
- reservedVNCPorts renamed to reservedRemotePorts (it's not just for
VNC anymore)
- QEMU_VNC_PORT_{MIN,MAX} renamed to QEMU_REMOTE_PORT_{MIN,MAX}
- port allocation unified for VNC and SPICE
This patch updates libvirt's API to allow applications to inspect the
full list of security labels of a domain.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
This patch updates the key "security_driver" in QEMU config to suport
both a sigle default driver or a list of default drivers. This ensures
that it will remain compatible with older versions of the config file.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
These changes make the security drivers able to find and handle the
correct security label information when more than one label is
available. They also update the DAC driver to be used as an usual
security driver.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
This patch updates the domain and capability XML parser and formatter to
support more than one "seclabel" element for each domain and device. The
RNG schema and the tests related to this are also updated by this patch.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
This patch updates the structures that store information about each
domain and each hypervisor to support multiple security labels and
drivers. It also updates all the remaining code to use the new fields.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
This is a fix for the object label generation. It uses a new flag for
virSecuritySELinuxGenNewContext that specifies whether the context is
for an object. If so the context role remains unchanged.
Without this fix it is not possible to start domains with image file or
block device backed storage when selinux is enabled.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
In order to support systemd socket based activation, it needs to
be possible to create virNetSocketPtr and virNetServerServicePtr
instance from a pre-opened file descriptor
In preparation for adding further constructors, refactor
the virNetServerClientNew method to move most of the code
into a common virNetServerClientNewInternal helper API.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the virNetServerDispatchNewClient both creates the
virNetServerClientPtr instance and registers it with the
virNetServerPtr internal state. Split the client registration
code out into a separate virNetServerAddClient method to
allow future reuse from other contexts
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
For network devices allocated from a network with <forward
mode='hostdev'>, there is a need to add the newly minted hostdev to
the hostdevs array.
In this case we also need to call qemuPrepareHostDevices just for this
one device, as the standard call to initialize all the hostdevs that
were defined directly in the domain's configuration has already been
made by the time we allocate a device from a libvirt network, and thus
have something that needs initializing.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
This patch updates the network driver to properly utilize the new
attributes/elements that are now in virNetworkDef
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: Laine Stump <laine@laine.org>
This function is needed by the network driver in a later commit.
It is useful in functions like networkNotifyActualDevice and
networkReleaseActualDevice
The network pool should be able to keep track of both network device
names and PCI addresses, and return the appropriate one in the
actualDevice when networkAllocateActualDevice is called.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
This patch introduces the new forward mode='hostdev' along with
attribute managed. Includes updates to the network RNG and new xml
parser/formatter code.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Existing code that creates a list of forwardIfs from a single PF
was moved to the new utility function networkCreateInterfacePool.
No functional change.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Move the functions the parse/format, and validate PCI addresses to
their own file so they can be conveniently used in other places
besides device_conf.c
Refactoring existing code without causing any functional changes to
prepare for new code.
This patch makes the code reusable.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Change device type of a virtio channel from/to spicevmc is not a user
visible change. However, spicevmc channels use different default target
name than other virtio channels. To maintain ABI stability during this
change target name must be explicitly specified (and equal) in both
configurations.
Add the ability to support VLAN tags for Open vSwitch virtual port
types. To accomplish this, modify virNetDevOpenvswitchAddPort and
virNetDevTapCreateInBridgePort to take a virNetDevVlanPtr
argument. When adding the port to the OVS bridge, setup either a
single VLAN or a trunk port based on the configuration from the
virNetDevVlanPtr.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Setting hard_limit larger than previous swap_hard_limit must fail,
it's not that good if one wants to change the swap_hard_limit
and hard_limit together. E.g.
% virsh memtune rhel6
hard_limit : 1000000
soft_limit : 1000000
swap_hard_limit: 1000000
% virsh memtune rhel6 --hard-limit 1000020 --soft-limit 1000020 \
--swap-hard-limit 1000020 --live
This patch reorder the limits setting to set the swap_hard_limit
first, hard_limit then, and soft_limit last if it's greater than
current swap_hard_limit. And soft_limit first, hard_limit then,
swap_hard_limit last, if not.
'make distcheck' fails because the generated ESX and HyperV files
are (intentionally) marked read-only, but since the stamp file was
missing, make assumes they need to be rebuilt. Shipping the stamp
file solves the problem.
* src/Makefile.am (EXTRA_DIST): Ship stamp files.
The underlying function to set the vlan tag of an SR-IOV network
device was already in place (although an extra patch to save/restore
the original vlan tag was needed), and recent patches added the
ability to configure a vlan tag. This patch just ties those two
together.
An SR-IOV device doesn't support vlan trunking, so if anyone tries to
configure more than a single tag, or set the trunk flag, and error is
logged.
When a network device that is a VF of an SR-IOV card was assigned to a
guest using <interface type='hostdev'>, only the MAC address was being
saved/restored, but the VLAN tag was left untouched. Up to now we
haven't actually used vlan tags on SR-IOV devices, so the guest would
have used whatever was set, and left it the same at the end.
The patch following this one will hook up the <vlan> element from the
interface config, so save/restore of the device state needs to also
include the vlan tag.
MAC address is being saved as a simple ASCII string in a file named
for the device under /var/run. The VLAN tag is now just added at the
end of that file, after a newline. It might be nicer if the file was
XML (in case it ever gets more complicated) but at the moment there's
nothing else on the horizon, and this makes backward compatibility
easier.
The parameter value for cpuset could be in special format like
"0-10,^7", which is not recognized by cgroup. This patch is to
ensure the cpuset is formatted as expected before passing it to
cgroup. As a side effect, after the patch, it parses the cpuset
early before cgroup setting, to avoid the rollback if cpuset
parsing fails afterwards.
Previous commit:
commit 9093ab7734
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Wed Jul 18 17:03:17 2012 +0100
Add lots of internal symbols to libvirt_private.syms
mistakenly put some conditional SASL symbols in libvirt_private.syms
instead of libvirt_sasl.syms
The network driver now looks for the vlan element in network and
portgroup objects, and logs an error at network define time if a vlan
is requested for a network type that doesn't support it. (Currently
vlan configuration is only supported for openvswitch networks, and
networks used to do hostdev assignment of SR-IOV VFs.)
At runtime, the three potential sources of vlan information are
examined in this order: interface, chosen portgroup, network, and the
first that is non-empty is used. Another check for valid network type
is made at this time, since the interface may have requested a vlan (a
legal thing to have in the interface config, since it's not known
until runtime if the chosen network will actually support it).
Since we must also check for domains requesting vlans for unsupported
connection types even if they are type='network', and since
networkAllocateActualDevice() is being called in exactly the correct
places, and has all of the necessary information to check, I slightly
modified the logic of that function so that interfaces that aren't
type='network' don't just return immediately. Instead, they also
perform all the same validation for supported features. Because of
this, it's not necessary to make this identical check in the other
three places that would normally require it: 1) qemu domain startup,
2) qemu device hotplug, 3) lxc domain startup.
This can be seen as a first step in consolidating network-related
functionality into the network driver, rather than having copies of
the same code spread around in multiple places; this will make it
easier to split the network parts off into a separate daemon, as we've
discussed recently.
The following config elements now support a <vlan> subelements:
within a domain: <interface>, and the <actual> subelement of <interface>
within a network: the toplevel, as well as any <portgroup>
Each vlan element must have one or more <tag id='n'/> subelements. If
there is more than one tag, it is assumed that vlan trunking is being
requested. If trunking is required with only a single tag, the
attribute "trunk='yes'" should be added to the toplevel <vlan>
element.
Some examples:
<interface type='hostdev'/>
<vlan>
<tag id='42'/>
</vlan>
<mac address='52:54:00:12:34:56'/>
...
</interface>
<network>
<name>vlan-net</name>
<vlan trunk='yes'>
<tag id='30'/>
</vlan>
<virtualport type='openvswitch'/>
</network>
<interface type='network'/>
<source network='vlan-net'/>
...
</interface>
<network>
<name>trunk-vlan</name>
<vlan>
<tag id='42'/>
<tag id='43'/>
</vlan>
...
</network>
<network>
<name>multi</name>
...
<portgroup name='production'/>
<vlan>
<tag id='42'/>
</vlan>
</portgroup>
<portgroup name='test'/>
<vlan>
<tag id='666'/>
</vlan>
</portgroup>
</network>
<interface type='network'/>
<source network='multi' portgroup='test'/>
...
</interface>
IMPORTANT NOTE: As of this patch there is no backend support for the
vlan element for *any* network device type. When support is added in
later patches, it will only be for those select network types that
support setting up a vlan on the host side, without the guest's
involvement. (For example, it will be possible to configure a vlan for
a guest connected to an openvswitch bridge, but it won't be possible
to do that for one that is connected to a standard Linux host bridge.)
To allow for the possibility of vlan "trunks", which have more than
one vlan tag associated with them, we need a vlan struct. Since it
will be used by multiple files in src/util, src/conf, src/network, and
src/qemu, it must be defined in src/util. Unfortunately there isn't
currently a common file for simple netdev data definitions, so I
created a new file.
This caused compilation of virnetdevvportprofile.c to fail on systems
without IFLA support in netlink (these are netlink commands used to
configure the VF's of SR-IOV network devices).
Currently there is a hook function that is invoked when a
new client connection comes in, which allows an app to
setup private data. This setup will make it difficult to
serialize client state during process re-exec(). Change to
a model where the app registers a callback when creating
the virNetServerPtr instance, which is used to allocate
the client private data immediately during virNetClientPtr
construction.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the virNetClientPtr constructor will always register
the async IO event handler and the keepalive objects. In the
case of the lock manager, there will be no event loop available
nor keepalive support required. Split this setup out of the
constructor and into separate methods.
The remote driver will enable async IO and keepalives, while
the LXC driver will only enable async IO
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the virNetServerServicePtr is responsible for
creating the virNetServerClientPtr instance when accepting
a new connection. Change this so that the virNetServerServicePtr
merely gives virNetServerPtr a virNetSocketPtr instance. The
virNetServerPtr can then create the virNetServerClientPtr
as it desires
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
It is desirable to be able to query the config params of
the thread pool, in order to save the server state. Add
virThreadPoolGetMinWorkers, virThreadPoolGetMaxWorkers
and virThreadPoolGetPriorityWorkers APIs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
While the QEMU monitor/agent do not want JSON strings pretty
printed, other parts of libvirt might. Instead of hardcoding
QEMU's desired behaviour in virJSONValueToString(), add a
boolean flag to control pretty printing
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To allow a virLockManagerPtr to be created directly from a
driver table struct, replace the virLockManagerPluginPtr parameter
with a virLockDriverPtr parameter.
* src/locking/domain_lock.c, src/locking/lock_manager.c,
src/locking/lock_manager.h: Replace plugin param with
a driver in virLockManagerNew
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Do some cleanup of parallelsOpen, STREQ_NULLABLE can replace
a lot of checks.
Also fix error message to be VIR_ERR_INTERNAL_ERROR, the same
as in other drivers.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Let's change URI to parallels:///system. Parallels Server supports
creating VMs from non-privileged accounts, but it's not main usage
scenario and it may be forbidden in the future.
Also containers, which will be supported by the driver, can be managed
only by root, so /system path is more suitable for this driver.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Each interface has a single pointer to a filterref object. That
filterref can itself point to multiple other filterrefs, but at the
toplevel there is only one.
The parser had previously just silently overwritten earlier filterrefs
when a new one was encountered, so the interface was left with
whichever was the last filterref in the xml, ignoring all the
others. This patch logs an error when it sees more than one filterref.
Just as each physical device used by a network has a connections
counter, now each network has a connections counter which is
incremented once for each guest interface that connects using this
network.
The count is output in the live network XML, like this:
<network connections='20'>
...
</network>
It is read-only, and for informational purposes only - it isn't used
internally anywhere by libvirt.
A later patch will be adding a counter that will be
incremented/decremented each time an guest interface starts/stops
using a particular network. For this to work, all types of networks
need to go through a common return sequence rather than returning
early. To setup for this, a new success: label is added (when
necessary), a new error: label is added which does any cleanup
necessary only for error returns and then does goto cleanup, and early
returns are changed to goto error if it's a failure, or goto success
if it's successful. This way the intent of all the gotos is
unambiguous, and a successful return path never encounters the
"error:" label.
It may be useful for management applications to know which physical
network devices are in use by guests. This information is already
available in the network objects, but wasn't output in the XML. This
patch outputs it when the INACTIVE flag isn't set (and if it's non-0).
I want to include this count in the xml output of networks, but
calling it "connections" in the XML sounds better than "usageCount", and it
would be better if the name in the XML matched the variable name.
In a few places, usageCount was being initialized to 0, but this is
unnecessary, because VIR_ALLOC_N zero-fills everything anyway.
This array was originally defined using the existing
virNetworkForwardIfDef, but that struct has a UsageCount field that
isn't used in the case of PFs. This patch just copies that struct and
removes UsageCount. It ends up being a struct with a single field, but
I left it as a struct in case we need to add other fields to it in the
future.
Use of ldexp() requires -lm on some platforms; use gnulib to determine
this for our makefile. Also, optimize virRandomInt() for the case
of a power-of-two limit (actually rather common, given that Daniel
has a pending patch to replace virRandomBits(10) with code that will
default to virRandomInt(1024) on default SELinux settings).
* .gnulib: Update to latest, for ldexp.
* bootstrap.conf (gnulib_modules): Import ldexp.
* src/Makefile.am (libvirt_util_la_CFLAGS): Link with -lm when
needed.
* src/util/virrandom.c (virRandomInt): Optimize powers of 2.
One of the original ideas behind allowing a <virtualport> in an
interface definition as well as in the <network> definition *and*one
or more <portgroup>s within the network, was that guest-specific
parameteres (like instanceid and interfaceid) could be given in the
interface's virtualport, and more general things (portid, managerid,
etc) could be given in the network and/or portgroup, with all the bits
brought together at guest startup time and combined into a single
virtualport to be used by the guest. This was somehow overlooked in
the implementation, though - it simply picks the "most specific"
virtualport, and uses the entire thing, with no attempt to merge in
details from the others.
This patch uses virNetDevVPortProfileMerge3() to combine the three
possible virtualports into one, then uses
virNetDevVPortProfileCheck*() to verify that the resulting virtualport
type is appropriate for the type of network, and that all the required
attributes for that type are present.
An example of usage is this: assuming a <network> definitions on host
ABC of:
<network>
<name>testA</name>
...
<virtualport type='openvswitch'/>
...
<portgroup name='engineering'>
<virtualport>
<parameters profileid='eng'/>
</virtualport>
</portgroup>
<portgroup name='sales'>
<virtualport>
<parameters profileid='sales'/>
</virtualport>
</portgroup>
</network>
and the same <network> on host DEF of:
<network>
<name>testA</name>
...
<virtualport type='802.1Qbg'>
<parameters typeid="1193047" typeidversion="2"/>
</virtualport>
...
<portgroup name='engineering'>
<virtualport>
<parameters managerid="11"/>
</virtualport>
</portgroup>
<portgroup name='sales'>
<virtualport>
<parameters managerid="55"/>
</virtualport>
</portgroup>
</network>
and a guest <interface> definition of:
<interface type='network'>
<source network='testA' portgroup='sales'/>
<virtualport>
<parameters instanceid="09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f"
interfaceid="09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f"\>
</virtualport>
...
</interface>
If the guest was started on host ABC, the <virtualport> used would be:
<virtualport type='openvswitch'>
<parameters interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'
profileid='sales'/>
</virtualport>
but if that guest was started on host DEF, the <virtualport> would be:
<virtualport type='802.1Qbg'>
<parameters instanceid="09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f"
typeid="1193047" typeidversion="2"
managerid="55"/>
</virtualport>
Additionally, if none of the involved <virtualport>s had a specified type
(this includes cases where no virtualport is given at all),
Until now, all attributes in a <virtualport> parameter list that were
acceptable for a particular type, were also required. There were no
optional attributes.
One of the aims of supporting <virtualport> in libvirt's virtual
networks and portgroups is to allow specifying the group-wide
parameters in the network's virtualport, and merge that with the
interface's virtualport, which will have the instance-specific info
(i.e. the interfaceid or instanceid).
Additionally, the guest's interface XML shouldn't need to know what
type of network connection will be used prior to runtime - it could be
openvswitch, 802.1Qbh, 802.1Qbg, or none of the above - but should
still be able to specify instance-specific info just in case it turns
out to be applicable.
Finally, up to now, the parser for virtualport has always generated a
random instanceid/interfaceid when appropriate, making it impossible
to leave it blank (which is what's required for virtualports within a
network/portprofile definition).
This patch modifies the parser and formatter of the <virtualport>
element in the following ways:
* because most of the attributes in a virNetDevVPortProfile are fixed
size binary data with no reserved values, there is no way to embed a
"this value wasn't specified" sentinel into the existing data. To
solve this problem, the new *_specified fields in the
virNetDevVPortProfile object that were added in a previous patch of
this series are now set when the corresponding attribute is present
during the parse.
* allow parsing/formatting a <virtualport> that has no type set. In
this case, all fields are settable, but all are also optional.
* add a GENERATE_MISSING_DEFAULTS flag to the parser - if this flag is
set and an instanceid/interfaceid is expected but not provided, a
random one will be generated. This was previously the default
behavior, but is now done only for virtualports inside an
<interface> definition, not for those in <network> or <portgroup>.
* add a REQUIRE_ALL_ATTRIBUTES flag to the parser - if this flag is
set the parser will call the new
virNetDevVPortProfileCheckComplete() functions at the end of the
parser to check for any missing attributes (based on type), and
return failure if anything is missing. This used to be default
behavior. Now it is only used for the virtualport defined inside an
interface's <actual> element (by the time you've figured out the
contents of <actual>, you should have all the necessary data to fill
in the entire virtualport)
* add a REQUIRE_TYPE flag to the parser - if this flag is set, the
parser will return an error if the virtualport has no type
attribute. This also was previously the default behavior, but isn't
needed in the case of the virtualport for a type='network' interface
(i.e. the exact type isn't yet known), or the virtualport of a
portgroup (i.e. the portgroup just has modifiers for the network's
virtualport, which *does* require a type) - in those cases, the
check will be done at domain startup, once the final virtualport is
assembled (this is handled in the next patch).
This function has several calls to increase the buffer indent by 6,
then decrease it again, then increase, then decrease. Additionally,
there were several printfs that had 6 spaces at the beginning of the
line.
virDomainActualNetDefFormat, which is called by virDomainNetDefFormat,
had similar ugliness.
This patch changes both functions to just increase the indent at the
beginning, decrease it at (well, just before*) the end, and remove all
of the occurences of 6/8 spaces at the beginning of lines.
*The indent had to be reset before the end of the function because
virDomainDeviceInfoFormat assumes a 0 indent and is called from many
other places, and I didn't want to do an overhaul of every caller of
that function. A separate patch to switch all of domain_conf.c would
be a useful exercise, but my current goal is unrelated to that, so
I'll leave it for another day.
There was an error: label that simply did "return ret", but ret was
defaulted to -1, and was never used other than setting it manually to
0 just before a non-error return. Aside from this, some of the error
return paths used "goto error" and others used "return ret".
This patch removes ret and the error: label, and makes all error
returns just consistently do "return -1".
virtPortProfile is now used by 4 different types of network devices
(NETWORK, BRIDGE, DIRECT, and HOSTDEV), and it's getting cumbersome to
replicate so much code in 4 different places just because each type
has the virtPortProfile in a slightly different place. This patch puts
a single virtPortProfile in a common place (outside the type-specific
union) in both virDomainNetDef and virDomainActualNetDef, and adjusts
the parse and format code (and the few other places where it is used)
accordingly.
Note that when a <virtualport> element is found, the parse functions
verify that the interface is of a type that supports one, otherwise an
error is generated (CONFIG_UNSUPPORTED in the case of <interface>, and
INTERNAL in the case of <actual>, since the contents of <actual> are
always generated by libvirt itself).
This patch adds three utility functions that operate on
virNetDevVPortProfile objects.
* virNetDevVPortProfileCheckComplete() - verifies that all attributes
required for the type of the given virtport are specified.
* virNetDevVPortProfileCheckNoExtras() - verifies that there are no
attributes specified which are inappropriate for the type of the
given virtport.
* virNetDevVPortProfileMerge3() - merges 3 virtports into a single,
newly allocated virtport. If any attributes are specified in
more than one of the three sources, and do not exactly match,
an error is logged and the function fails.
These new functions depend on new fields in the virNetDevVPortProfile
object that keep track of whether or not each attribute was
specified. Since the higher level parse function doesn't yet set those
fields, these functions are not actually usable yet (but that's okay,
because they also aren't yet used - all of that functionality comes in
a later patch.)
Note that these three functions return 0 on success and -1 on
failure. This may seem odd for the first two Check functions, since
they could also easily return true/false, but since they actually log
an error when the requested condition isn't met (and should result in
a failure of the calling function), I thought 0/-1 was more
appropriate.
virNetDevVPortProfile has (had) a type field that can be set to one of
several values, and a union of several structs, one for each
type. When a domain's interface object is of type "network", the
domain config may not know beforehand which type of virtualport is
going to be provided in the actual device handed down from the network
driver at runtime, but may want to set some values in the virtualport
that may or may not be used, depending on the type. To support this
usage, this patch replaces the union of structs with toplevel fields
in the struct, making it possible for all of the fields to be set at
the same time.
Both of these functions returned void, but it's convenient for them to
return a const char* of the char* that is passed in. This was you can
call the function and use the result in the same expression/arg.
Commit bb705e25 missed that the appArmor helper file also needs to
resolve the new symbols dragged in by domain_conf.c.
* src/Makefile.am (SECURITY_DRIVER_APPARMOR_HELPER_SOURCES): Pull
in datatypes.c.
openvzOpen fucntion must leave unlocked virDomainObj objects in
driver->domains.
Now even simple commands like list or domain lookup hang,
for example virsh -c openvz:///system list --all.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
The code for picking a MCS label is about to get significantly
more complicated, so it deserves to be in a standlone method,
instead of a switch/case body.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When generating an SELinux context for a VM from the template
"system_u:system_r:svirt_t:s0", copy the role + user from the
current process instead of the template context. So if the
current process is
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
then the VM context ends up as
unconfined_u:unconfined_r:svirt_t:s0:c386,c703
instead of
system_u:system_r:svirt_t:s0:c177,c424
Ideally the /etc/selinux/targeted/contexts/virtual_domain_context
file would have just shown the 'svirt_t' type, and not the full
context, but that can't be changed now for compatibility reasons.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virSecuritySELinuxGenNewContext method was not reporting any
errors, leaving it up to the caller to report a generic error.
In addition it could potentially trigger a strdup(NULL) in an
OOM scenario. Move all error reporting into the
virSecuritySELinuxGenNewContext method where accurate info
can be provided
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There is currently no way to distinguish the case that a requested
security driver was disabled, from the case where no security driver
was available. Use VIR_ERR_CONFIG_UNSUPPORTED as the error when an
explicitly requested security driver was disabled
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The security_manager.h header is not self-contained because it
uses the virDomainDefPtr without first including domain_conf.h
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current virRandomBits() API is only usable if the caller wants
a random number in the range [0, n-1) where n is a power of two.
This adds a virRandom() API which generates a double in the
range [0.0,1.0) with 48 bits of entropy. It then also adds a
virRandomInt(uint32_t max) API which generates an unsigned
in the range [0,@max)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As the consensus in:
https://www.redhat.com/archives/libvir-list/2012-July/msg01692.html,
this patch is to destroy conf/virdomainlist.[ch], folding the
helpers into conf/domain_conf.[ch].
* src/Makefile.am:
- Various indention fixes incidentally
- Add macro DATATYPES_SOURCES (datatypes.[ch])
- Link datatypes.[ch] for libvirt_lxc
* src/conf/domain_conf.c:
- Move all the stuffs from virdomainlist.c into it
- Use virUnrefDomain and virUnrefDomainSnapshot instead of
virDomainFree and virDomainSnapshotFree, which are defined
in libvirt.c, and we don't want to link to it.
- Remove "if" before "free" the object, as virObjectUnref
is in the list "useless_free_options".
* src/conf/domain_conf.h:
- Move all the stuffs from virdomainlist.h into it
- s/LIST_FILTER/LIST_DOMAINS_FILTER/
* src/libxl/libxl_driver.c:
- s/LIST_FILTER/LIST_DOMAINS_FILTER/
- no (include "virdomainlist.h")
* src/libxl/libxl_driver.c: Likewise
* src/lxc/lxc_driver.c: Likewise
* src/openvz/openvz_driver.c: Likewise
* src/parallels/parallels_driver.c: Likewise
* src/qemu/qemu_driver.c: Likewise
* src/test/test_driver.c: Likewise
* src/uml/uml_driver.c: Likewise
* src/vbox/vbox_tmpl.c: Likewise
* src/vmware/vmware_driver.c: Likewise
* tools/virsh-domain-monitor.c: Likewise
* tools/virsh.c: Likewise
libvirt creates invalid commands if wrong locale is selected. For
example with locale that uses comma as a decimal point, JSON commands
created with decimal numbers are invalid because comma separates the
entries in JSON. Fortunately even when decimal point is affected,
thousands grouping is not, because for grouping to be enabled with
*printf, there has to be an apostrophe flag specified (and supported).
This patch adds specific internal function for converting doubles to
strings with C locale.
This is a patch for bug 847848
If registering an existing lockspace with the sanlock daemon
returns an error, libvirt should not proceed to unlink the lockspace.
Signed-off-by: Asad Saeed <asad.saeed@acidseed.com>
Otherwise distcheck can fail with:
GEN check-symfile
Can't open perl script "../../src/check-symfile.pl": No such file or directory
make[4]: *** [check-symfile] Error 2
This is a patch for bug 826704
All sanlock resources get released when hot-dettaching a disk from the domain
because virLockManagerSanlockRelease uses the wrong function parameters/flags.
With the patch only the resources that should be released are cleaned up.
Signed-off-by: Frido Roose <frido.roose@gmail.com>
This patch introduces a new error code VIR_ERR_OPERATION_UNSUPPORTED to
mark error messages regarding operations that failed due to lack of
support on the hypervisor or other than libvirt issues.
The code is first used in reporting error if qemu does not support block
IO tuning variables yielding error message:
error: Unable to get block I/O throttle parameters
error: Operation not supported: block_io_throttle field
'total_bytes_sec' missing in qemu's output
instead of:
error: Unable to get block I/O throttle parameters
error: internal error cannot read total_bytes_sec
libvirt_qemu_probes.stp stopped working after switching to a build
that used --with-driver-modules. This was because the symbols listed
int libvirt_qemu_probes.stp are no longer in $(bindir)/libvirtd, but
are now in $(libdir)/connection-driver/libvirt_driver_qemu.so.
This patch enhances dtrace2systemtap.pl (which generates the .stp
files from .d files) to look for a new "module" setting in the
comments of the .d file (similar to the existing "binary" setting),
and to look for a --with-modules option. If the --with-modules option
is set *and* a "module" setting is present in the .d file, the process
name for the stap line is set to
$libdir/$module
If either of these isn't true, it reverts to the old behavior.
src/Makefile.am was also modified to add the --with-modules option
when the build calls for it, and src/libvirt_qemu_probes.d has added a
"module" line pointing to the correct .so file for the qemu driver.
The meat of this patch is just moving the calls to
virNWFilterRegisterCallbackDriver from each hypervisor's "register"
function into its "initialize" function. The rest is just code
movement to allow that, and a new virNWFilterUnRegisterCallbackDriver
function to undo what the register function does.
The long explanation:
There is an array in nwfilter called callbackDrvArray that has
pointers to a table of functions for each hypervisor driver that are
called by nwfilter. One of those function pointers is to a function
that will lock the hypervisor driver. Entries are added to the table
by calling each driver's "register" function, which happens quite
early in libvirtd's startup.
Sometime later, each driver's "initialize" function is called. This
function allocates a driver object and stores a pointer to it in a
static variable that was previously initialized to NULL. (and here's
the important part...) If the "initialize" function fails, the driver
object is freed, and that pointer set back to NULL (but the entry in
nwfilter's callbackDrvArray is still there).
When the "lock the driver" function mentioned above is called, it
assumes that the driver was successfully loaded, so it blindly tries
to call virMutexLock on "driver->lock".
BUT, if the initialize never happened, or if it failed, "driver" is
NULL. And it just happens that "lock" is always the first field in
driver so it is also NULL.
Boom.
To fix this, the call to virNWFilterRegisterCallbackDriver for each
driver shouldn't be called until the end of its (*already guaranteed
successful*) "initialize" function, not during its "register" function
(which is currently the case). This implies that there should also be
a virNWFilterUnregisterCallbackDriver() function that is called in a
driver's "shutdown" function (although in practice, that function is
currently never called).
Otherwise, in locations like virobject.c where PROBE is used,
for certain configure options, the compiler warns:
util/virobject.c:110:1: error: 'intptr_t' undeclared (first use in this function)
As long as we are making this header always available, we can
clean up several other files.
* src/internal.h (includes): Pull in <stdint.h>.
* src/conf/nwfilter_conf.h: Rely on internal.h.
* src/storage/storage_backend.c: Likewise.
* src/storage/storage_backend.h: Likewise.
* src/util/cgroup.c: Likewise.
* src/util/sexpr.h: Likewise.
* src/util/virhashcode.h: Likewise.
* src/util/virnetdevvportprofile.h: Likewise.
* src/util/virnetlink.h: Likewise.
* src/util/virrandom.h: Likewise.
* src/vbox/vbox_driver.c: Likewise.
* src/xenapi/xenapi_driver.c: Likewise.
* src/xenapi/xenapi_utils.c: Likewise.
* src/xenapi/xenapi_utils.h: Likewise.
* src/xenxs/xenxs_private.h: Likewise.
* tests/storagebackendsheepdogtest.c: Likewise.
An ESX server has one or more PhysicalNics that represent the actual
hardware NICs. Those can be listed via the interface driver.
A libvirt virtual network is mapped to a HostVirtualSwitch. On the
physical side a HostVirtualSwitch can be connected to PhysicalNics.
On the virtual side a HostVirtualSwitch has HostPortGroups that are
mapped to libvirt virtual network's portgroups. Typically there is
HostPortGroups named 'VM Network' that is used to connect virtual
machines to a HostVirtualSwitch. A second HostPortGroup typically
named 'Management Network' is used to connect the hypervisor itself
to the HostVirtualSwitch. This one is not mapped to a libvirt virtual
network's portgroup. There can be more HostPortGroups than those
typical two on a HostVirtualSwitch.
+---------------+-------------------+
...---| | | +-------------+
| HostPortGroup | |---| PhysicalNic |
| VM Network | | | vmnic0 |
...---| | | +-------------+
+---------------+ HostVirtualSwitch |
| vSwitch0 |
+---------------+ |
| HostPortGroup | |
...---| Management | |
| Network | |
+---------------+-------------------+
The virtual counterparts of the PhysicalNic is the HostVirtualNic for
the hypervisor and the VirtualEthernetCard for the virtual machines
that are grouped into HostPortGroups.
+---------------------+ +---------------+---...
| VirtualEthernetCard |---| |
+---------------------+ | HostPortGroup |
+---------------------+ | VM Network |
| VirtualEthernetCard |---| |
+---------------------+ +---------------+
|
+---------------+
+---------------------+ | HostPortGroup |
| HostVirtualNic |---| Management |
+---------------------+ | Network |
+---------------+---...
The currently implemented network driver can list, define and undefine
HostVirtualSwitches including HostPortGroups for virtual machines.
Existing HostVirtualSwitches cannot be edited yet. This will be added
in a followup patch.
esxVI_LookupHostSystemProperties guarantees that hostSystem is non-NULL.
Remove redundant NULL checks from callers.
Also prefer esxVI_GetStringValue over open-coding the logic.
The static deep copy allocates storage for the copy. The dynamic
version injected the dynamic dispatch after the allocation. This
triggered the invalid argument check in the dynamically dispatched
deep copy call. The deep copy function expects its dest parameter
to be a pointer to a NULL-pointer. This expectation wasn't met due
to the dispatching deep copy doing the allocation before the call.
Fix this by dynamically dispatching to the correct type before the
allocation.
Lists available PhysicalNic devices. A PhysicalNic is always active
and can neither be defined nor undefined.
A PhysicalNic is used to bridge a HostVirtualSwitch to the physical
network.
Remove the target table before renaming a table to it, i.e.,
remove table B before renaming A to B. This makes the
renaming more robust against unconnected left-over tables.
Both LVM volumes and SCSI LUNs have a globally unique
identifier associated with them. It is useful to be able
to query this identifier to then perform disk locking,
rather than try to figure out a stable pathname.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When entering "confirm" phase, we are interested in the value of
cancelled rather then ret variable which was interesting before "finish"
phase and didn't change since then.
Previously, qemu did not respond to monitor commands during migration if
the limit was too high. This prevented us from raising the limit
earlier. The qemu issue seems to be fixed (according to my testing) and
we may remove the 32Mb/s limit.
This patch refactors the JSON parsing function that extracts the block
IO tuning parameters from qemu's output. The most impacting change
concerns the error message that is returned if the reply from qemu does
not contain the needed data. The data for IO parameter tuning were added
in qemu 1.1 and the previous error message was confusing.
This patch also breaks long lines and extracts a multiple time used code
pattern to a macro.
Remove spaces before function calls and some other coding nits in some
parts of the remote driver and refactor getting of URI argument
components into variables used by libvirt later on.
Prevents libvirt from treating RBD backing stores as files. Without this
patch, creating a domain with a qcow2 overlay on an RBD would fail.
This patch essentially extends 9c7c4a4fc5,
which allows nbd backing stores, to allow rbd backing stores.
From man poll(2), poll does not set errno=EAGAIN on interrupt, however
it does set errno=EINTR. Have libvirt retry on the appropriate errno.
Under heavy load, a program of mine kept getting libvirt errors 'poll on
socket failed: Interrupted system call'. The signals were SIGCHLD from
processes forked by threads unrelated to those using libvirt.
Rename qemuDefaultScsiControllerModel to qemuCheckScsiControllerModel.
When scsi model is given explicitly in XML(model > 0) checking if the
underlying QEMU supports it or not first, raise an error on checking
failure.
When the model is not given(mode <= 0), return LSI by default, if
the QEMU doesn't support it, raise an error.
QEMU_CAPS_SCSI_LSI
set the flag when "lsi53c895a", bus PCI, alias "lsi" in
the output of "qemu -device ?"
-device lsi in qemu command line
QEMU_CAPS_VIRTIO_SCSI_PCI
set the flag when "name "virtio-scsi-pci", bus PCI" in
the output of qemu devices query.
-device virtio-scsi-pci in qemu command line
This patch is in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=818467
If a caller to virCommandRun doesn't ask for the exitstatus of the
program it's running, the virCommand functions assume that they should
log an error message and return failure if the exit code isn't
0. However, only the commandline and exit status are logged, while
potentially useful information sent by the program to stderr is
discarded.
Fortunately, virCommandRun is already checking if the caller had asked
for stderr to be saved and, if not, sets things up to save it in
*cmd->errbuf. This makes it fairly simple for virCommandWait to
include *cmd->errbuf in the error log (there are still other callers
that don't setup errbuf, and even virCommandRun won't set it up if the
command is being daemonized, so we have to check that it's non-zero).
Switch virDomainObjPtr to use the virObject APIs for reference
counting. The main change is that virObjectUnref does not return
the reference count, merely a bool indicating whether the object
still has any refs left. Checking the return value is also not
mandatory.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This converts the following public API datatypes to use the
virObject infrastructure:
virConnectPtr
virDomainPtr
virDomainSnapshotPtr
virInterfacePtr
virNetworkPtr
virNodeDevicePtr
virNWFilterPtr
virSecretPtr
virStreamPtr
virStorageVolPtr
virStoragePoolPtr
The code is significantly simplified, since the mutex in the
virConnectPtr object now only needs to be held when accessing
the per-connection virError object instance. All other operations
are completely lock free.
* src/datatypes.c, src/datatypes.h, src/libvirt.c: Convert
public datatypes to use virObject
* src/conf/domain_event.c, src/phyp/phyp_driver.c,
src/qemu/qemu_command.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c, src/storage/storage_driver.c,
src/vbox/vbox_tmpl.c, src/xen/xend_internal.c,
tests/qemuxml2argvtest.c, tests/qemuxmlnstest.c,
tests/sexpr2xmltest.c, tests/xmconfigtest.c: Convert
to use virObjectUnref/virObjectRef
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This introduces a fairly basic reference counted virObject type
and an associated virClass type, that use atomic operations for
ref counting.
In a global initializer (recommended to be invoked using the
virOnceInit API), a virClass type must be allocated for each
object type. This requires a class name, a "dispose" callback
which will be invoked to free memory associated with the object's
fields, and the size in bytes of the object struct.
eg,
virClassPtr connclass = virClassNew("virConnect",
sizeof(virConnect),
virConnectDispose);
The struct for the object, must include 'virObject' as its
first member
eg
struct _virConnect {
virObject object;
virURIPtr uri;
};
The 'dispose' callback is only responsible for freeing
fields in the object, not the object itself. eg a suitable
impl for the above struct would be
void virConnectDispose(void *obj) {
virConnectPtr conn = obj;
virURIFree(conn->uri);
}
There is no need to reset fields to 'NULL' or '0' in the
dispose callback, since the entire object will be memset
to 0, and the klass pointer & magic integer fields will
be poisoned with 0xDEADBEEF before being free()d
When creating an instance of an object, one needs simply
pass the virClassPtr eg
virConnectPtr conn = virObjectNew(connclass);
if (!conn)
return NULL;
conn->uri = virURIParse("foo:///bar")
Object references can be manipulated with
virObjectRef(conn)
virObjectUnref(conn)
The latter returns a true value, if the object has been
freed (ie its ref count hit zero)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch adds the support to run the QEMU network helper
under unprivileged user. It also adds the support for
attach-interface option in virsh to run under unprivileged
user.
Signed-off-by: Richa Marwaha <rmarwah@linux.vnet.ibm.com>
Signed-off-by: Corey Bryant<coreyb@linux.vnet.ibm.com>
This patch adds the capability in libvirt to check if
-netdev bridge option is supported or not.
Signed-off-by: Richa Marwaha <rmarwah@linux.vnet.ibm.com>
Signed-off-by: Corey Bryant<coreyb@linux.vnet.ibm.com>
All callers used the same initialization seed (well, the new
viratomictest forgot to look at getpid()); so we might as well
make this value automatic. And while it may feel like we are
giving up functionality, I documented how to get it back in the
unlikely case that you actually need to debug with a fixed
pseudo-random sequence. I left that crippled by default, so
that a stray environment variable doesn't cause a lack of
randomness to become a security issue.
* src/util/virrandom.c (virRandomInitialize): Rename...
(virRandomOnceInit): ...and make static, with one-shot call.
Document how to do fixed-seed debugging.
* src/util/virrandom.h (virRandomInitialize): Drop prototype.
* src/libvirt_private.syms (virrandom.h): Don't export it.
* src/libvirt.c (virInitialize): Adjust caller.
* src/lxc/lxc_controller.c (main): Likewise.
* src/security/virt-aa-helper.c (main): Likewise.
* src/util/iohelper.c (main): Likewise.
* tests/seclabeltest.c (main): Likewise.
* tests/testutils.c (virtTestMain): Likewise.
* tests/viratomictest.c (mymain): Likewise.
Commit 1f6f723 missed a step. At first I was worried that scrubbing
the conditionals would lead to a runtime failure when compiled without
avahi, but my testing makes it appear that the runtime error will only
occur if the .conf files in /etc request mdns advertisement; and the
old behavior was to silently ignore the request, so this is actually
a better behavior of only failing when the config requests the
impossible.
* src/rpc/virnetserver.c: Drop HAVE_AVAHI conditionals; all
callers already passed NULL if mdns_adv was not configured.
If there's a memory leak in qemu or qemu is exploited the host's
system will sooner or later start trashing instead of killing
the bad process. This however has impact on performance and other
guests as well. Therefore we should set a reasonable RSS limit
even when user hasn't set any. It's better to be secure by default.
Parsing xen-xm format configuration will fail if UUID is not
specified, e.g.
virsh domxml-from-native xen-xm some-config-without-uuid
error: internal error parsing xm config failed
Initially I thought to skip parsing the UUID in xenParseXM() when
not present in the configuration, but this results in a UUID of
all zeros since it is never set
virsh domxml-from-native xen-xm /tmp/jim/bug-773621_pierre-test
<domain type='xen'>
<name>test</name>
<uuid>00000000-0000-0000-0000-000000000000</uuid>
...
which certainly can't be correct since this is the UUID the xen
tools use for dom0.
This patch takes the approach of generating a UUID when it is not
specified in the configuration.
Commit ba226d334a tried to fix crash of
the daemon when a domain with an open console was destroyed. The fix was
wrong as it tried to remove the callback also when the stream was
aborted, where at that point the fd stream driver was already freed and
removed.
This patch clears the callbacks with a helper right before the hash is
freed, so that it doesn't interfere with other codepaths where the
stream object is freed.
Without this patch, the English phrase 'no name' would appear
literally within the remaining translated message.
* src/parallels/parallels_driver.c (parallelsCreateVm)
(parallelsDomainDefineXML): Tweak error message.
The remote driver did not fill the required snapshot parent argument in
the RPC call structure that caused a client crash when trying to use
this new API.
* src/conf/domain_conf.c:
- Add virDomainControllerFind to find controller device by type
and index.
- Add virDomainControllerRemove to remove the controller device
from maintained controler list.
* src/conf/domain_conf.h:
- Declare the two new helpers.
* src/libvirt_private.syms:
- Expose private symbols for the two new helpers.
* src/qemu/qemu_driver.c:
- Support attach/detach controller device persistently
* src/qemu/qemu_hotplug.c:
- Use the two helpers to simplify the codes.
The access, birth, modification and change times are added to
storage volumes and corresponding xml representations. This
shows up in the XML in this format:
<timestamps>
<atime>1341933637.027319099</atime>
<mtime>1341933637.027319099</mtime>
</timestamps>
Signed-off-by: Eric Blake <eblake@redhat.com>
capability.rng: Guest features can be in any order.
nodedev.rng: Added <driver> element, <capability> phys_function and
virt_functions for PCI devices.
storagepool.rng: Owner or group ID can be -1.
schema tests: New capabilities and nodedev files; changed owner and
group to -1 in pool-dir.xml.
storage_conf: Print uid_t and gid_t as signed to storage pool XML.
The recent changes to the testsuite to validate exported symbols
flushed out a case of unconditionally exporting symbols that
were only conditionally compiled under HAVE_AVAHI.
* src/Makefile.am (libvirt_net_rpc_server_la_SOURCES): Compile
virnetservermdns unconditionally.
* configure.ac (HAVE_AVAHI): Drop unused automake conditional.
* src/rpc/virnetservermdns.c: Add fallbacks when Avahi is not
present.
One of our latest commits fbe87126 introduced this nasty typo:
func(vmdef, ...); where func() dereference vmdef->ncontrollers,
and vmdef was initialized to NULL. This leaves us with unconditional
immediate segfault. It should be vm->def instead.
Security manager is not a dynamically loadable driver. Let's avoid the
confusion by renaming libvirt_driver_security library as
libvirt_security_manager.
Security manager is not a dynamically loadable driver, it's a common
infrastructure similar to util, conf, cpu, etc. used by individual
drivers. Such code is allowed to be linked into libvirt.so.
This reverts commit ec5b7bd2ec and most of
aae5cfb699.
This patch is supposed to fix virdrivermoduletest failures for qemu and
lxc drivers as well as libvirtd's ability to load qemu and lxc drivers.
With 0.10.0-rc0 out the door, we are committed to the next version
number.
* src/libvirt_public.syms (LIBVIRT_0.9.14): Rename...
(LIBVIRT_0.10.0): ...to this.
* docs/formatdomain.html.in: Fix fallout.
* src/openvz/openvz_driver.c (openvzDriver): Likewise.
* src/remote/remote_driver.c (remote_driver): Likewise.
There are a few issues with the current virAtomic APIs
- They require use of a virAtomicInt struct instead of a plain
int type
- Several of the methods do not implement memory barriers
- The methods do not implement compiler re-ordering barriers
- There is no Win32 native impl
The GLib library has a nice LGPLv2+ licensed impl of atomic
ops that works with GCC, Win32, or pthreads.h that addresses
all these problems. The main downside to their code is that
the pthreads impl uses a single global mutex, instead of
a per-variable mutex. Given that it does have a Win32 impl
though, we don't expect anyone to seriously use the pthread.h
impl, so this downside is not significant.
* .gitignore: Ignore test case
* configure.ac: Check for which atomic ops impl to use
* src/Makefile.am: Add viratomic.c
* src/nwfilter/nwfilter_dhcpsnoop.c: Switch to new atomic
ops APIs and plain int datatype
* src/util/viratomic.h: inline impls of all atomic ops
for GCC, Win32 and pthreads
* src/util/viratomic.c: Global pthreads mutex for atomic
ops
* tests/viratomictest.c: Test validate to validate safety
of atomic ops.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove the use of a manually run virLogStartup and
virNodeSuspendInitialize methods. Instead make sure they
are automatically run using VIR_ONCE_GLOBAL_INIT
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch enables the "none" USB controller for qemu guests and adds
valdiation on hot-plugged devices if the guest has USB disabled.
This patch also adds a set of tests to check parsing of domain XMLs that
use the "none" controller and some forbidden situations concerning it.
This patch adds helpers that validate domain's device configuration.
This will be needed later on to verify devices being hot-plugged to
guests. If the guest has no USB bus, then it's not valid to plug a USB
device to that guest.
Libvirt adds a USB controller to the guest even if the user does not
specify any in the XML. This is due to back-compat reasons.
To allow disabling USB for a guest this patch adds a new USB controller
type "none" that disables USB support for the guest.
The option 'srcSpec' to virsh command find-storage-pool-sources
is optional for logical type of storage pool, but mandatory for
netfs and iscsi type.
When missing the option for netfs and iscsi, libvirt reports XML
parsing error due to null string option srcSpec.
before
error: Failed to find any netfs pool sources
error: (storage_source_specification):1: Document is empty
(null)
after:
error: pool type 'iscsi' requires option --srcSpec for source discovery
To create a new VM in Parallels Clud Server we should issue
"prlctl create" command, and give path to the directory,
where VM should be created. VM's storage will be in that
directory later. So in this first version find out location
of first VM's hard disk and create VM there.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Parallels Cloud Server has one serious discrepancy with libvirt:
libvirt stores domain configuration files in one place, and storage
files in other places (with the API of storage pools and storage volumes).
Parallels Cloud Server stores all domain data in a single directory,
for example, you may have domain with name fedora-15, which will be
located in '/var/parallels/fedora-15.pvm', and it's hard disk image will be
in '/var/parallels/fedora-15.pvm/harddisk1.hdd'.
I've decided to create storage driver, which produces pseudo-volumes
(xml files with volume description), and they will be 'converted' to
real disk images after attaching to a VM.
So if someone creates VM with one hard disk using virt-manager,
at first virt-manager creates a new volume, and then defines a
domain. We can lookup a volume by path in XML domain definition
and find out location of new domain and size of its hard disk.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Add parallelsDomainDefineXML function, it works only for existing
domains for the present.
It's too hard to convert libvirt's XML domain configuration into
Parallel's one, so I've decided to compare virDomainDef structures:
current domain definition and the one created from XML, given to
the function. And change only different parameters.
Currently only name, description, number of cpus, memory amount
and video memory can be changed.
Video device and console added, because libvirt supposes that
VM must always have one video device, if there are some
graphics and one console.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Add support of collecting information about serial
ports. This change is needed mostly as an example,
support of other devices will be added later.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Parallels driver is 'stateless', like vmware or openvz drivers.
It collects information about domains during startup using
command-line utility prlctl. VMs in Parallels are identified by UUIDs
or unique names, which can be used as respective fields in
virDomainDef structure. Currently only basic info, like
description, virtual cpus number and memory amount, is implemented.
Querying devices information will be added in the next patches.
Parallels doesn't support non-persistent domains - you can't run
a domain having only disk image, it must always be registered
in system.
Functions for querying domain info have been just copied from
test driver with some changes - they extract needed data from
previously created list of virDomainObj objects.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Add function virCommandNewVAList which is equivalent to the
virCommandNewArgList but with va_list instead of a variable number
of arguments.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Parallels Cloud Server is a cloud-ready virtualization
solution that allows users to simultaneously run multiple virtual
machines and containers on the same physical server.
More information can be found here: http://www.parallels.com/products/pcs/
Also beta version of Parallels Cloud Server can be downloaded there.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
The 'check-symfile' test case was checking the contents of
libvirt.syms against libvirt.so + all of libvirt_driver_XXX.so
This was in fact bogus - libvirt.syms should only refer to
stuff in libvirt.so, but it had some symbols from the various
driver modules in it too. Now that libvirt.syms has been
fixed, the check-symfile test can be simplified to only
consider libvirt.so
The nwfilter and secrets drivers are both stateful and are already
linked directly to libvirtd. Linking them to libvirt.so is thus
wrong, likewise exporting their symbols in libvirt.so is wrong
The network driver is stateful, so it is linked directly to libvirtd,
rather than libvirt.so. Thus there are no network symbols to be exported
in libvirt.so, and libvirt_network.syms can be deleted
Otherwise, a build may fail with:
lxc/lxc_conatiner.c: In function 'lxcContainerDropCapabilities':
lxc/lxc_container.c:1662:46: error: unused parameter 'keepReboot' [-Werror=unused-parameter]
* src/lxc/lxc_container.c (lxcContainerDropCapabilities): Mark
parameter unused.
If an LXC container is using a virtual network and that network
is not active, currently the user gets a rather unhelpful
error message about tap device setup failure. Add an explicit
check for whether the network is active, in exactly the same
way as the QEMU driver
The cfg.mk file rule to check for tab characters was not
applied to perl files. Much of our Perl code is full of
tabs as a result. Kill them, kill them all !
The reboot() syscall is allowed by new kernels for LXC containers.
The LXC controller can detect whether a reboot was requested
(instead of a normal shutdown) by looking at the "init" process
exit status. If a reboot was triggered, the exit status will
record SIGHUP as the kill reason.
The LXC controller has cleared all its capabilities, and the
veth network devices will no longer exist at this time. Thus
it cannot restart the container init process itself. Instead
it emits an event which is picked up by the LXC driver in
libvirtd. This will then re-create the container, using the
same configuration as it was previously running with (ie it
will not activate 'newDef').
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Check whether the reboot() system call is virtualized, and if
it is, then allow the container to keep CAP_SYS_REBOOT.
Based on an original patch by Serge Hallyn
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This defines a new RPC protocol to be used between the LXC
controller and the libvirtd LXC driver. There is only a
single RPC message defined thus far, an asynchronous "EXIT"
event that is emitted just before the LXC controller process
exits. This provides the LXC driver with details about how
the container shutdown - normally, or abnormally (crashed),
thus allowing the driver to emit better libvirt events.
Emitting the event in the LXC controller requires a few
little tricks with the RPC service. Simply calling the
virNetServiceClientSendMessage does not work, since this
merely queues the message for asynchronous processing.
In addition the main event loop is no longer running at
the point the event is emitted, so no I/O is processed.
Thus after invoking virNetServiceClientSendMessage it is
necessary to mark the client as being in "delayed close"
mode. Then the event loop is run again, until the client
completes its close - this happens only after the queued
message has been fully transmitted. The final complexity
is that it is not safe to run virNetServerQuit() from the
client close callback, since that is invoked from a
context where the server is locked. Thus a zero-second
timer is used to trigger shutdown of the event loop,
causing the controller to finally exit.
* src/Makefile.am: Add rules for generating RPC protocol
files and dispatch methods
* src/lxc/lxc_controller.c: Emit an RPC event immediately
before exiting
* src/lxc/lxc_domain.h: Record the shutdown reason
given by the controller
* src/lxc/lxc_monitor.c, src/lxc/lxc_monitor.h: Register
RPC program and event handler. Add callback to let
driver receive EXIT event.
* src/lxc/lxc_process.c: Use monitor exit event to decide
what kind of domain event to emit
* src/lxc/lxc_protocol.x: Define wire protocol for LXC
controller monitor.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the gendispatch.pl script to get a little closer to
being able to generate code for the LXC monitor, by passing
in the struct prefix separately from the procedure prefix.
Also allow method names using virCapitalLetters instead
of vir_underscore_separator
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the code that handles the LXC monitor out of the
lxc_process.c file and into lxc_monitor.{c,h}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the LXC driver to use the virNetClient APIs for
connecting to the libvirt_lxc monitor, instead of the
low-level socket APIs. This is a step towards running
a full RPC protocol with libvirt_lxc
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Rename the lxc_driver_t struct typedef to virLXCDriver to more
closely follow normal libvirt naming conventions
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
For consistency all the APIs in the lxc_domain.c file should
have a virLXCDomain prefix in their name
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
For consistency all the APIs in the lxc_process.c file should
have a virLXCProcess prefix in their name
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In the socket event handler for the RPC client we must deal
with read/write events, before checking for EOF, otherwise
we might close the socket before we've read & acted upon the
last RPC messages
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the remote driver to use the virNetClient close callback
to trigger the virConnectPtr close callbacks
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Allow detection of socket close in virNetClient via a callback
function, triggered on any condition that causes the socket to
be closed.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently if the keepalive timer triggers, the 'markClose'
flag is set on the virNetClient. A controlled shutdown will
then be performed. If an I/O error occurs during read or
write of the connection an error is raised back to the
caller, but the connection isn't marked for close. This
patch ensures that all I/O error scenarios always result
in the connection being marked for close.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Define new virConnect{Register,Unregister}CloseCallback() public APIs
which allows registering/unregistering a callback to be invoked when
the connection to a hypervisor is closed. The callback is provided
with the reason for the close, which may be 'error', 'eof', 'client'
or 'keepalive'.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If a domain is explicitly configured with <seclabel type="none"/> we
correctly ensure that no labeling will be done by setting
norelabel=true. However, if no seclabel element is present in domain XML
and hypervisor is configured not to confine domains by default, we only
set type to "none" without turning off relabeling. Thus if such a domain
is being started, security driver wants to relabel resources with
default label, which doesn't make any sense.
Moreover, with SELinux security driver, the generated image label lacks
"s0" sensitivity, which causes setfilecon() fail with EINVAL in
enforcing mode.
This is a follow up patch of commit f9ce7dad6, it modifies all
the files which declare the copyright like "See COPYING.LIB for
the License of this software" to use the detailed/consistent one.
And deserts the outdated comments like:
* libvirt-qemu.h:
* Summary: qemu specific interfaces
* Description: Provides the interfaces of the libvirt library to handle
* qemu specific methods
*
* Copy: Copyright (C) 2010, 2012 Red Hat, Inc.
Uses the more compact style like:
* libvirt-qemu.h: Interfaces specific for QEMU/KVM driver
*
* Copyright (C) 2010, 2012 Red Hat, Inc.
During refactoring of code, it has proved common to forget to
remove old symbols from the .syms file. While the Win32 linker
will complain about this, the Linux ELF linker does not. The
new test case validates that every symbol listed in the .syms
file actually exists in the built ELF libraries.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
virNWFilterGetIpAddrForIfname and virNWFilterDelIpAddrForIfname
do not exist, so remove them from libvirt_nwfilter.syms
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Any time we have a string with no % passed through gettext, a
translator can inject a % to cause a stack overread. When there
is nothing to format, it's easier to ask for a string that cannot
be used as a formatter, by using a trivial "%s" format instead.
In the past, we have used --disable-nls to catch some of the
offenders, but that doesn't get run very often, and many more
uses have crept in. Syntax check to the rescue!
The syntax check can catch uses such as
virReportError(code,
_("split "
"string"));
by using a sed script to fold context lines into one pattern
space before checking for a string without %.
This patch is just mechanical insertion of %s; there are probably
several messages touched by this patch where we would be better
off giving the user more information than a fixed string.
* cfg.mk (sc_prohibit_diagnostic_without_format): New rule.
* src/datatypes.c (virUnrefConnect, virGetDomain)
(virUnrefDomain, virGetNetwork, virUnrefNetwork, virGetInterface)
(virUnrefInterface, virGetStoragePool, virUnrefStoragePool)
(virGetStorageVol, virUnrefStorageVol, virGetNodeDevice)
(virGetSecret, virUnrefSecret, virGetNWFilter, virUnrefNWFilter)
(virGetDomainSnapshot, virUnrefDomainSnapshot): Add %s wrapper.
* src/lxc/lxc_driver.c (lxcDomainSetBlkioParameters)
(lxcDomainGetBlkioParameters): Likewise.
* src/conf/domain_conf.c (virSecurityDeviceLabelDefParseXML)
(virDomainDiskDefParseXML, virDomainGraphicsDefParseXML):
Likewise.
* src/conf/network_conf.c (virNetworkDNSHostsDefParseXML)
(virNetworkDefParseXML): Likewise.
* src/conf/nwfilter_conf.c (virNWFilterIsValidChainName):
Likewise.
* src/conf/nwfilter_params.c (virNWFilterVarValueCreateSimple)
(virNWFilterVarAccessParse): Likewise.
* src/libvirt.c (virDomainSave, virDomainSaveFlags)
(virDomainRestore, virDomainRestoreFlags)
(virDomainSaveImageGetXMLDesc, virDomainSaveImageDefineXML)
(virDomainCoreDump, virDomainGetXMLDesc)
(virDomainMigrateVersion1, virDomainMigrateVersion2)
(virDomainMigrateVersion3, virDomainMigrate, virDomainMigrate2)
(virStreamSendAll, virStreamRecvAll)
(virDomainSnapshotGetXMLDesc): Likewise.
* src/nwfilter/nwfilter_dhcpsnoop.c (virNWFilterSnoopReqLeaseDel)
(virNWFilterDHCPSnoopReq): Likewise.
* src/openvz/openvz_driver.c (openvzUpdateDevice): Likewise.
* src/openvz/openvz_util.c (openvzKBPerPages): Likewise.
* src/qemu/qemu_cgroup.c (qemuSetupCgroup): Likewise.
* src/qemu/qemu_command.c (qemuBuildHubDevStr, qemuBuildChrChardevStr)
(qemuBuildCommandLine): Likewise.
* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Likewise.
* src/rpc/virnetsaslcontext.c (virNetSASLSessionGetIdentity):
Likewise.
* src/rpc/virnetsocket.c (virNetSocketNewConnectUNIX)
(virNetSocketSendFD, virNetSocketRecvFD): Likewise.
* src/storage/storage_backend_disk.c
(virStorageBackendDiskBuildPool): Likewise.
* src/storage/storage_backend_fs.c
(virStorageBackendFileSystemProbe)
(virStorageBackendFileSystemBuild): Likewise.
* src/storage/storage_backend_rbd.c
(virStorageBackendRBDOpenRADOSConn): Likewise.
* src/storage/storage_driver.c (storageVolumeResize): Likewise.
* src/test/test_driver.c (testInterfaceChangeBegin)
(testInterfaceChangeCommit, testInterfaceChangeRollback):
Likewise.
* src/vbox/vbox_tmpl.c (vboxListAllDomains): Likewise.
* src/xenxs/xen_sxpr.c (xenFormatSxprDisk, xenFormatSxpr):
Likewise.
* src/xenxs/xen_xm.c (xenXMConfigGetUUID, xenFormatXMDisk)
(xenFormatXM): Likewise.
Change the permissible minimum value of nodesuspend duration time
to 60 seconds. If option is less than the value, reports error.
Update virsh help and manpage the infomation.
No check for conn->uri being NULL in virAuthGetConfigFilePath (valid
state) made the client segfault. This happens for example with these
settings:
- no virtualbox driver installed (modifies conn->uri)
- no default URI set (VIRSH_DEFAULT_CONNECT_URI="",
LIBVIRT_DEFAULT_URI="", uri_default="")
- auth_sock_rw="sasl"
- virsh run as root
That are unfortunately the settings with fresh Fedora 17 installation
with VDSM.
The check ought to be enough as conn->uri being NULL is valid in later
code and is handled properly.
If from a clean GIT checkout 'make -j 8' is run, the ESX
and Hyper-V code will be generated multiple times over.
This is because there are multiple files being generated
from one invocation of the generator script. make does not
realize this and so invokes the generator once per file.
This doesn't matter with serialized builds, but with
parallel builds multiple instances of the generator get
run at once.
make[2]: Entering directory `/home/berrange/src/virt/libvirt/src'
GEN util/virkeymaps.h
GEN remote/remote_protocol.h
GEN remote/remote_client_bodies.h
GEN remote/qemu_protocol.h
GEN remote/qemu_client_bodies.h
GEN esx/esx_vi_methods.generated.c
GEN esx/esx_vi_methods.generated.h
GEN esx/esx_vi_methods.generated.macro
GEN esx/esx_vi_types.generated.c
GEN esx/esx_vi_types.generated.h
GEN esx/esx_vi_types.generated.typedef
GEN esx/esx_vi_types.generated.typedef
GEN esx/esx_vi_types.generated.typeenum
GEN esx/esx_vi_types.generated.typetostring
GEN esx/esx_vi_types.generated.typefromstring
GEN esx/esx_vi_types.generated.h
GEN esx/esx_vi_types.generated.c
GEN esx/esx_vi_methods.generated.h
GEN esx/esx_vi_methods.generated.c
GEN esx/esx_vi_methods.generated.macro
GEN esx/esx_vi.generated.h
GEN esx/esx_vi.generated.c
GEN esx/esx_vi_types.generated.typeenum
GEN esx/esx_vi_types.generated.typedef
GEN esx/esx_vi_types.generated.typeenum
GEN esx/esx_vi_types.generated.typetostring
GEN esx/esx_vi_types.generated.typefromstring
GEN esx/esx_vi_types.generated.h
GEN esx/esx_vi_types.generated.c
GEN esx/esx_vi_methods.generated.h
...snip...
GEN hyperv/hyperv_wmi.generated.h
GEN libvirt_qemu_probes.h
GEN locking/qemu-sanlock.conf
GEN hyperv/hyperv_wmi.generated.c
GEN rpc/virnetprotocol.h
GEN hyperv/hyperv_wmi_classes.generated.typedef
GEN hyperv/hyperv_wmi_classes.generated.h
GEN hyperv/hyperv_wmi_classes.generated.c
GEN rpc/virkeepaliveprotocol.h
GEN remote/remote_protocol.c
GEN remote/qemu_protocol.c
GEN rpc/virkeepaliveprotocol.c
GEN rpc/virnetprotocol.c
GEN libvirt.def
Prevent this using a timestamp file to control generation,
as was previously done for the python bindings in commit
a7868e0131
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Per the FSF address could be changed from time to time, and GNU
recommends the following now: (http://www.gnu.org/licenses/gpl-howto.html)
You should have received a copy of the GNU General Public License
along with Foobar. If not, see <http://www.gnu.org/licenses/>.
This patch removes the explicit FSF address, and uses above instead
(of course, with inserting 'Lesser' before 'General').
Except a bunch of files for security driver, all others are changed
automatically, the copyright for securify files are not complete,
that's why to do it manually:
src/security/security_selinux.h
src/security/security_driver.h
src/security/security_selinux.c
src/security/security_apparmor.h
src/security/security_apparmor.c
src/security/security_driver.c
Fix addresses two issues:
1. Fix generator code to allow deep copy operation for objects with
Dynamic_Cast capabilities.
2. Add missing deep copy routine to Long datatype.
Signed-off-by: Ata E Husain Bohra <ata.husain@hotmail.com>
to query a guests's hostname. Containers like LXC and OpenVZ allow to
set a hostname different from the hosts name and QEMU's guest agent
could provide similar functionality.
When reporting a system error (VIR_ERR_SYSTEM_ERROR) via
virReportSystemError, we should copy the errno value into
the 'int1' field of the virErrorPtr struct. This allows
callers to detect certain errno conditions & discard the
error
* src/util/virterror.c: Place errno value in int1 field
The previous check for YAJL would have many undesirable
consequences, the most important being that it caused the
capabilities XML to lose all <guest> elements. There is
no user visible feedback as to what is wrong in this respect,
merely a syslog message. The empty capabilities causes
libvirtd to then throw away all guest XML configs that are
stored.
This changes the code so that the check for YAJL is only
performed at the time we attempt to spawn a QEMU process
error: Failed to start domain vm-vnc
error: unsupported configuration: this qemu binary requires libvirt to be compiled with yajl
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of using an O(n) efficiency linked list for storing
MCS labels, use a hash table. Instead of having the list
be global, put it in the SELinux driver private data struct
to ensure uniqueness across different instances of the driver.
This also ensures thread safety when multiple hypervisor
drivers are used in the same libvirtd process
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When adding MCS labels, OOM was not being handled correctly.
In addition when reserving an existing label, no check was
made to see if it was already reserved
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The function names in the SELinux driver all start with
SELinux or 'mcs' as a prefix. Sanitize this so that they
all use 'virSecuritySELinux' as the prefix
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Running libvirtd unprivileged results in a warning message from
the NWFilter driver
virNWFilterSnoopLeaseFileRefresh:1882 : open("/var/run/libvirt/network/nwfilter.ltmp"): No such file or directory
Since it requires privileged network access, this driver should
not even run when unprivileged.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the legacy Xen drivers to use virReportError instead of
the statsError, virXenInotifyError, virXenStoreError,
virXendError, xenUnifiedError, xenXMError custom macros
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The xenHypervisorInit method was called from two different
locations, during initial driver registration and also while
opening a Xen connection. The former can't report any useful
errors to the end user/app, so remove it. To ensure thread
safety use a VIR_ONCE_GLOBAL_INIT call to invoke
xenHypervisorInit from the xenHypervisorOpen method.
As per the comment, the Xen hypervisor driver is considered to
be mandatory when running privileged. When it fails to open,
we should thus return an error, not ignore it.
ensures that initialization will always take place when it is
needed, and guarantees it only occurs once. The problem is that
the code to setup a global initializer with proper error
propagation is tedious. This introduces VIR_ONCE_GLOBAL_INIT
macro to simplify this.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the libvirtd dispatch code to use virReportError
instead of the virNetError custom macro
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the nodeinfo helper code to use virReportError instead
of the nodeReportError custom macro
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the security drivers to use virReportError instead of
the virSecurityReportError custom macro
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the Power-Hypervisor driver to use virReportError
instead of the PHYP_ERROR custom macro
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the ESX driver to use virReportError instead of
the ESX_ERROR & ESX_VI_ERROR custom macros
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The 'lxc_driver' global variable is now used from several of
the LXC sources files. Thus it needs to be non-static to
avoid runtime linkage errors
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
The Xen driver had a number of error reports which passed a
constant string without format specifiers and was missing
"%s". Furthermore the errors were related to failing system
calls, but virReportSystemError was not used. So the only
useful piece of info (the errno) was being discarded
Move all the code that manages stop/start of LXC processes
into separate lxc_process.{c,h} file to make the lxc_driver.c
file smaller
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the cgroup setup code out of the lxc_controller.c file
and into lxc_cgroup.{c,h}. This reduces the size of the
lxc_controller.c file and paves the way to invoke cgroup
setup from lxc_driver.c instead of lxc_controller.c in the
future
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the LXC driver code related to the virDomainObjPtr
private data into separate lxc_domain.{c,h} files
to reduce the size of lxc_driver.c
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Make sure that libvirt_private.syms has all the internal symbols
from APIs in src/rpc/*.h and src/util/cgroup.h, since the LXC
controller/driver will shortly need them
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To allow virNetServerRun/virNetServerQuit to be invoked multiple
times, we must reset the 'quit' flag in virNetServerRun
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In the delayed close mode, we're just waiting for final data to
be written back to the client. While waiting, we should not
bother to read more data from the client.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When sending SIGHUP to libvirtd, it will trigger the virStateDriver
reload operation. This is intended to reload the configuration files
for guests. For unknown historical reasons this is also triggering
autostart of all guests. Autostart is generally expected to be
something that happens on OS startup. Starting VMs on SIGHUP will
violate that expectation and potentially cause dangerous scenarios
if the admin has explicitly shutdown a misbehaving VM that has
been marked as autostart
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the linux bridge driver to use virReportError instead
of the networkReportError custom macro
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of only removing the ending newline character, it is
better to remove all of standard whitespace character for the
sake of log format.
One example that we have to do this is:
After three times incorrect password input, virsh command
virsh -c qemu://remoteserver/system will report error like:
: Connection reset by peerey,gssapi-keyex,gssapi-with-mic,password).
But it should be:
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
: Connection reset by peer
The reason is that we dropped the newline, but have a '\r' left.
The terminal interprets it as "move the cursor back to the start
of the current line", so the error string is messed up.
It was broken since forever as it expected a libxml2
XML_ELEMENT_NODE containing a XML_TEXT_NODE instead of
just a XML_TEXT_NODE.
This problem was not discovered for so long because
esxVI_String_Deserialize was not used until now.
Reported by Ata Bohra
Commit 80533ca forgot to think about offline cpus. When a node
cpu is offline, then its topology/ subdirectory is not present,
leading to spurious error messages leaked to the user such as:
libvir: error : cannot open /home/dummy/libvirt/tests/nodeinfodata/linux-nodeinfo-sysfs-test-6/node/node0/cpu7/topology/physical_package_id: No such file or directory
Fix that, as well as test it; the test data is gathered from a
machine with one NUMA node, hyperthreading, and with 2 of the
8 cpus offline.
* src/nodeinfo.c (virNodeParseNode): Don't parse topology of
offline cpus.
* tests/nodeinfotest.c (mymain): Run new test.
* tests/nodeinfodata/linux-nodeinfo-sysfs-test-6*: New data.
Update the network filter driver to use virReportError instead
of the virNWFilterReportError custom macro
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch brings support to manage sheepdog pools and volumes to libvirt.
It uses the "collie" command-line utility that comes with sheepdog for that.
A sheepdog pool in libvirt maps to a sheepdog cluster.
It needs a host and port to connect to, which in most cases
is just going to be the default of localhost on port 7000.
A sheepdog volume in libvirt maps to a sheepdog vdi.
To create one specify the pool, a name and the capacity.
Volumes can also be resized later.
In the volume XML the vdi name has to be put into the <target><path>.
To use the volume as a disk source for virtual machines specify
the vdi name as "name" attribute of the <source>.
The host and port information from the pool are specified inside the host tag.
<disk type='network'>
...
<source protocol="sheepdog" name="vdi_name">
<host name="localhost" port="7000"/>
</source>
</disk>
To work right this patch parses the output of collie,
so it relies on the raw output option. There recently was a bug which caused
size information to be reported wrong. This is fixed upstream already and
will be in the next release.
Signed-off-by: Sebastian Wiedenroth <wiedi@frubar.net>
Basically within a Secure Linux Container (virt-sandbox) we want all content
that the process within the container can write to be labeled the same. We
are labeling the physical disk correctly but when we create "RAM" based file
systems
libvirt is not labeling them, and they are defaulting to tmpfs_t, which will
will not allow the processes to write. This patch labels the RAM based file
systems correctly.
Update the node device driver to use virReportError instead of
the virNodeDeviceReportError custom macro
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the secret driver to use virReportError instead of the
virSecretReportError custom macro
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the storage driver to use virReportError instead of
the virStorageReportError custom macro
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When passing a const message string to the error reporting APIs
RBD forgot to use "%s" to avoid GCC format string warnings
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This removes nearly all the per-file error reporting macros
from the code in src/util/. A few custom macros remain for the
case, where the file needs to report errors with a variety of
different codes or parameters
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virNetServerMDNSTimeoutNew method was casting a long long
to an int when reporting errors. This should just be using
%lld instead of %d, avoiding the need to cast
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virnetdevtap.c and viruri.c files had two error report
messages which were not annotated with _(...)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Nearly every source file does something like
#define VIR_FROM_THIS VIR_FROM_FOO
#define virFooReportErorr(code, ...) \
virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \
__FUNCTION__, __LINE__, \
__VA_ARGS__)
This creates needless duplication and inconsistent error
reporting function names in each file. It is trivial to
just have virterror_internal.h provide a virReportError
macro that is equivalent
* src/util/virterror_internal.h: Define virReportError(code, ...)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remote driver needs to make sure the driver lock is released before
entering client IO loop as that may block indefinitely in poll(). As a
direct consequence of not following this in stream APIs, tunneled
migration to a destination host which becomes non-responding may block
qemu driver. Luckily, if keepalive is turned for p2p migrations, both
remote and qemu drivers will get automagically unblocked after keepalive
timeout.
The previous commit (387117ad92) was incomplete leaving those
who does not use libpcap with uncompilable sources beacuse
of incomplete conversion of virNWFilterDHCPSnoopReq function.
Introduce new members in the virMacAddr 'class'
- virMacAddrSet: set virMacAddr from a virMacAddr
- virMacAddrSetRaw: setting virMacAddr from raw 6 byte MAC address buffer
- virMacAddrGetRaw: writing virMacAddr into raw 6 byte MAC address buffer
- virMacAddrCmp: comparing two virMacAddr
- virMacAddrCmpRaw: comparing a virMacAddr with a raw 6 byte MAC address buffer
then replace raw MAC addresses by replacing
- 'unsigned char *' with virMacAddrPtr
- 'unsigned char ... [VIR_MAC_BUFLEN]' with virMacAddr
and introduce usage of above functions where necessary.
Even though qemu-kvm binaries can be used in TCG mode, libvirt would
only detect them if /dev/kvm was available. Thus, one would need to make
a /usr/bin/qemu symlink to be able to use TCG mode with qemu-kvm in an
environment without KVM support.
And even though QEMU is able to make use of KVM, libvirt would not
advertise KVM support unless there was a qemu-kvm symlink available.
This patch fixes both issues.
If QEMU supports the BALLOON_EVENT QMP event, then we can
avoid invoking 'query-balloon' when returning XML or the
domain info.
* src/qemu/qemu_capabilities.c, src/qemu/qemu_capabilities.h:
Add QEMU_CAPS_BALLOON_EVENT
* src/qemu/qemu_driver.c: Skip query-balloon in
qemudDomainGetInfo and qemuDomainGetXMLDesc if we have
QEMU_CAPS_BALLOON_EVENT set
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Check
for BALLOON_EVENT at connect to monitor. Add callback
for balloon change notifications
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h:
Add handling of BALLOON_EVENT and impl 'query-events'
check
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When the guest changes its memory balloon applications may want
to know what the new value is, without having to periodically
poll on XML / domain info. Introduce a "balloon change" event
to let apps see this
* include/libvirt/libvirt.h.in: Define the
virConnectDomainEventBalloonChangeCallback callback
and VIR_DOMAIN_EVENT_ID_BALLOON_CHANGE constant
* python/libvirt-override-virConnect.py,
python/libvirt-override.c: Wire up helpers for new event
* daemon/remote.c: Helper for serializing balloon event
* examples/domain-events/events-c/event-test.c,
examples/domain-events/events-python/event-test.py: Add
example of balloon event usage
* src/conf/domain_event.c, src/conf/domain_event.h: Handling
of balloon events
* src/remote/remote_driver.c: Add handler of balloon events
* src/remote/remote_protocol.x: Define wire protocol for
balloon events
* src/remote_protocol-structs: Likewise.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When building with --disable-debug, VIR_DEBUG expands to a nop.
But parameters to VIR_DEBUG can be variables that are passed only
to VIR_DEBUG. In the case the building system complains about unused
variables.
When --direct is used when migrating a domain running on a hypervisor
that does not support direct migration (such as QEMU), the caller would
get the following error message:
this function is not supported by the connection driver:
virDomainMigrateToURI2
which is a complete nonsense since qemu driver implements
virDomainMigrateToURI2. This patch would emit a more sensible error in
this case:
Requested operation is not valid: direct migration is not supported
by the connection driver
Commit 32a9aac switched libvirt to use the XDG base directories
to locate most of its data/config. In particular, the per-user socket
for qemu:///session is now stored in the XDG runtime directory.
This directory is located by looking at the XDG_RUNTIME_DIR environment
variable, with a fallback to ~/.cache/libvirt if this variable is not
set.
When the daemon is autospawned because a client application wants
to use qemu:///session, the daemon is ran in a clean environment
which does not contain XDG_RUNTIME_DIR. It will create its socket
in ~/.cache/libvirt. If the client application has XDG_RUNTIME_DIR
set, it will not look for the socket in the fallback place, and will
fail to connect to the autospawned daemon.
This patch adds XDG_RUNTIME_DIR to the daemon environment before
auto-starting it. I've done this in virNetSocketForkDaemon rather
than in virCommandAddEnvPassCommon as I wasn't sure we want to pass
these variables to other commands libvirt spawns. XDG_CACHE_HOME
and XDG_CONFIG_HOME are also added to the daemon env as it makes use
of those as well.
When calling 'lvcreate' if specifying both the '-L' and
'--virtualsize' options, the latter will be treated as
the capacity and the former as the allocation. This can
be used to support sparse volume creation. In addition,
when listing volumes it is necessary to include the 'size'
field in lvs output, so that we can detect sparse volume
allocation correctly.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To make it easier to dynamically change the command line ARGV,
switch all storage code over to use virCommandPtr APIs for
running programs
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Fix the virStorageBackendFileSystemVolDelete method to not use
unlink() unconditionally. It must use rmdir() for volumes which
are directories. It should also raise an error if given a volume
which has the network/block type.
Per the typical use of libvirt is to fork the qemu process with
qemu:qemu. Setting the pool permission mode as 0700 by default
will prevent the guest start with permission reason.
Define macro for the default pool and vol permission modes
incidentally.
Since we are not yet using the virNetServerPtr object for running
the event loop, we can't use virNetServerQuit(). Instead set the
global 'quit' flag in libvirt_lxc
This patch changes the way data to fill the nodeinfo structure are
gathered. We've gathere the test data by iterating processors an sockets
separately from nodes. The reported data was based solely on information
about core id. Problems arise when eg cores in mulit-processor machines
don't have same id's on both processors or maybe one physical processor
contains more NUMA nodes.
This patch changes the approach how we detect processors and nodes. Now
we start at enumerating nodes and for each node processors, sockets and
threads are enumerated separately. This approach provides acurate data
that comply to docs about the nodeinfo structure. This also enables to
get rid of hacks: see commits 10d9038b74,
ac9dd4a676. (Those changes in nodeinfo.c
are efectively reverted by this patch).
This patch also changes output of one of the tests, as the processor
topology is now acquired more precisely.
The s390(x) architecture doesn't feature a PCI bus. For the purpose of
supporting virtio devices a virtual bus called virtio-s390 is used.
A new address type VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390 is used to
distinguish the virtio devices on s390 from PCI-based virtio devices.
V3 Change: updated QEMU_CAPS_VIRTIO_S390 to fit upstream.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
This is in preparation of the enablement of s390 guests with virtio devices.
The assignment of device addresses happens in different places, i.e. the
qemu driver and process modules as well as in the unit tests in slightly
different flavors. Currently, these are PPC spapr-vio and PCI
devices, virtio-s390 (not PCI based) will follow.
By optionally passing to qemuDomainAssignAddresses the domain
object and the capabilities it is now possible to call the function
from most of the places (except for hotplug) where address assignment
is done.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
This makes the driver fail with a clear error message in case of UUID
collisions (for example if somebody copied a container configuration
without updating the UUID) and also raises an error on other hash map
failures.
OpenVZ itself doesn't complain about duplicate UUIDs since this
parameter is only used by libvirt.
Commit 5e6ce1 moved down detection of the ACPI feature in
qemuParseCommandLine. However, when ACPI is detected, it clears
all feature flags in def->features to only set ACPI. This used to
be fine because this was the first place were def->features was set,
but after the move this is no longer necessarily true because this
block comes before the ACPI check:
if (strstr(def->emulator, "kvm")) {
def->virtType = VIR_DOMAIN_VIRT_KVM;
def->features |= (1 << VIR_DOMAIN_FEATURE_PAE);
}
Since def is allocated in qemuParseCommandLine using VIR_ALLOC, we
can always use |= when modifying def->features
The only useful translation of "%s" as a format string is "%s" (I
suppose you could claim "%1$s" is also valid, but why bother). So
it is not worth translating; fixing this exposes some instances
where we were failing to translate real error messages. This makes
the fix of commit 097da1ab more generic, as well as ensuring no
future regressions.
* cfg.mk (sc_prohibit_useless_translation): New rule.
* src/lxc/lxc_driver.c (lxcSetVcpuBWLive): Fix offender.
* src/openvz/openvz_conf.c (openvzReadFSConf): Likewise.
* src/qemu/qemu_cgroup.c (qemuSetupCgroupForVcpu): Likewise.
* src/qemu/qemu_driver.c (qemuSetVcpusBWLive): Likewise.
* src/xenapi/xenapi_utils.c (xenapiSessionErrorHandle): Likewise.
Instead of changing the existed virFileMakePath to accept mode
argument and modifying a pile of its uses, this patch introduces
virFileMakePathWithMode, and use it instead of mkdir() to create
the readline history dir.
Commit 122fa379de introduces option to
store more than one host entry in a storage pool source definition. That
commit causes a regression, where a check is added that only one host
entry should be present (that actualy is not present as the source
structure was just allocated and zeroed) instead of allocating memory
for the host entry.
As the storage pool sources are stored in a list of structs, the pointer
returned by virStoragePoolSourceListNewSource() shouldn't be freed as it
points in the middle of a memory block. This combined with a regression
that takes the error path every time on caused a double-free abort on
the src struct in question.
Previous commits added code to unmount the existing /proc,
/sys and /dev hierarchies on the root filesystem of the
container. This should only have been done if the container's
root filesystem was the same as the host's root. ie if
the root source is '/'. As it is, this causes LXC containersr
to fail to start if their root source is not '/'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In preparation for introducing a full RPC protocol for
libvirt_lxc, switch over to using the virNetServer APIs
for the monitor connection
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
While it is not currently used elsewhere in libvirt, the code
for finding a free loop device & associating a file with it
is not LXC specific. Move it into the viffile.{c,h} file where
potentially shared code is more commonly kept.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the cgroup object into virLXCControllerPtr and rename
all the setup methods to include 'Cgroup' in their name
if appropriate
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the monitor FDs into the virLXCControllerPtr object
removing the need for the 'struct lxcMonitor' object
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virLXCControllerRun method is getting a little too large,
and about 50% of its code is related to setting up a /dev/pts
mount. Move the latter out into a dedicated method
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the security manager object into the virLXCControllerPtr
object. Also simplify the code creating it in the first place
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the list of loop device FDs into the virLXCControllerPtr
object and make sure that virLXCControllerStopInit will
close them all
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Turn 'struct lxc_console' into virLXCControllerConsolePtr and make it
a part of virLXCControllerPtr
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Keep a record of the init PID in the virLXCController object
and create a virLXCControllerStopInit method for killing this
process
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move the veth device name state into the virLXCControllerPtr
object and stop passing it around. Also use size_t instead
of unsigned int for the array length parameters.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The LXC controller code is having to pass around an ever increasing
number of parameters between methods. To make the code more managable
introduce a virLXCControllerPtr to hold all this state, starting with
the container name and virDomainDefPtr object
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the build of libvirt_lxc will cause recompilation
of all sources under src/util, src/conf, src/security and
more. Switch the libvirt_lxc process to link against the
libtool convenience libraries that are already built as
part of the main libvirt.os & libvirtd build process
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Refactor the RPC server dispatcher code so that if 'max_workers==0'
the entire server will run single threaded. This is useful for
use cases where there will only ever be 1 client connected
which serializes its requests
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The callback that is invoked when a new RPC client is
initialized does not have any opaque parameter. Add
one so that custom data can be passed into the callback
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Recently the Ceph project defaulted auth_supported from 'none' to 'cephx'.
When no auth information was set for Ceph disks this would lead to librados defaulting to
'cephx', but there would be no additional authorization information.
We now explicitly set auth_supported to none when passing down arguments to Qemu.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
This patch adds an internal function vmwareUpdateVMStatus to
update the real state of the domain. This function is used in
various places in the driver, in particular to detect when
the domain has been shut down by the user with the "halt"
command.
QEMU domains were marked as having managed save image even if they were
saved using the regular save. With this patch, domains are marked so
only when using managed save API.
QEMU (and librbd) flush the cache on the source before the
destination starts, and the destination does not read any
changeable data before that, so live migration with rbd caching
is safe.
This makes 'virsh migrate' work with rbd and caching without the
--unsafe flag.
Reported-by: Vladimir Bashkirtsev <vladimir@bashkirtsev.com>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
virDomainBlockStatsFlags can't collect total_time_ns for read/write/flush
because of key typo when retriveing from qemu cmd result
Signed-off-by: lvroyce <lvroyce@linux.vnet.ibm.com>
Reported by Jason Helfman as a build-breaker on FreeBSD.
* src/conf/domain_conf.c (virDomainFSDefParseXML): Use POSIX
spelling.
* src/openvz/openvz_conf.c (openvzReadFSConf): Likewise.
Below patch fixes this coverity report:
/libvirt/src/conf/nwfilter_conf.c:382:
leaked_storage: Variable "varAccess" going out of scope leaks the storage it points to.
Since we are mounting a new /dev in the container, we must
remove any sub-mounts like /dev/shm, /dev/mqueue, etc,
otherwise they'll be recorded in /proc/mounts, but not be
accessible to applications.
Hello,
This is a patch to fix vm's outbound traffic control problem.
Currently, vm's outbound traffic control by libvirt doesn't go well.
This problem was previously discussed at libvir-list ML, however
it seems that there isn't still any answer to the problem.
http://www.redhat.com/archives/libvir-list/2011-August/msg00333.html
I measured Guest(with virtio-net) to Host TCP throughput with the
command "netperf -H".
Here are the outbound QoS parameters and the results.
outbound average rate[kilobytes/s] : Guest to Host throughput[Mbit/s]
======================================================================
1024 (8Mbit/s) : 4.56
2048 (16Mbit/s) : 3.29
4096 (32Mbit/s) : 3.35
8192 (64Mbit/s) : 3.95
16384 (128Mbit/s) : 4.08
32768 (256Mbit/s) : 3.94
65536 (512Mbit/s) : 3.23
The outbound traffic goes down unreasonably and is even not controled.
The cause of this problem is too large mtu value in "tc filter" command run by
libvirt. The command uses burst value to set mtu and the burst is equal to
average rate value if it's not set. This value is too large. For example
if the average rate is set to 1024 kilobytes/s, the mtu value is set to 1024
kilobytes. That's too large compared to the size of network packets.
Here libvirt applies tc ingress filter to Host's vnet(tun) device.
Tc ingress filter is implemented with TBF(Token Buckets Filter) algorithm. TBF
uses mtu value to calculate the amount of token consumed by each packet. With too
large mtu value, the token consumption rate is set too large. This leads to
token starvation and deterioration of TCP throughput.
Then, should we use the default mtu value 2 kilobytes?
The anser is No, because Guest with virtio-net device uses 65536 bytes
as mtu to transmit packets to Host, and the tc filter with the default mtu
value 2k drops packets whose size is larger than 2k. So, the most packets
is droped and again leads to deterioration of TCP throughput.
The appropriate mtu value is 65536 bytes which is equal to the maximum value
of network interface device defined in <linux/netdevice.h>. The value is
not so large that it causes token starvation and not so small that it
drops most packets.
Therefore this patch set the mtu value to 64kb(== 65535 bytes).
Again, here are the outbound QoS parameters and the TCP throughput with
the libvirt patched.
outbound average rate[kilobytes/s] : Guest to Host throughput[Mbit/s]
======================================================================
1024 (8Mbit/s) : 8.22
2048 (16Mbit/s) : 16.42
4096 (32Mbit/s) : 32.93
8192 (64Mbit/s) : 66.85
16384 (128Mbit/s) : 133.88
32768 (256Mbit/s) : 271.01
65536 (512Mbit/s) : 547.32
The outbound traffic conforms to the given limit.
Thank you,
Signed-off-by: Eiichi Tsukata <eiichi.tsukata.xh@hitachi.com>
If the user specified invalid protocol type in a network's SRV record
the error path ended up in freeing uninitialized pointers causing a
daemon crash.
*network_conf.c: virNetworkDNSSrvDefParseXML(): initialize local
variables
VirtualBox doesn't use the common virDomainObj implementation so this
patch adds a separate implementation using the VirtualBox API.
This driver implementation supports all currently defined flags. As
VirtualBox does not support transient guests, managed save images and
autostarting we assume all guests are persistent, don't have a managed
save image and are not autostarted. Filtering for existence of those
properities results in empty list.
mnt_fsname can not be the same, as we check the duplicate pool
sources earlier before, means it can't be the same pool, moreover,
a pool can't be started if it's already active anyway. So no reason
to act as success.
virConnectDomainEventRegisterAny() takes a domain as an argument.
So it should be possible to register the same event (be it
VIR_DOMAIN_EVENT_ID_LIFECYCLE for example) for two different domains.
That is, we need to take domain into account when searching for
duplicate event being already registered.
In order to retrieve some sysinfo data we need to parse /proc/sysinfo and
/proc/cpuinfo.
Signed-off-by: Thang Pham <thang.pham@us.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
For the s390x architecture the sysfs core_id alone is not unique. As a
result it can happen that libvirt thinks there are less host CPUs available
than really present.
Currently, a logical CPU is equivalent to a core for s390x. We therefore
produce a fake core id from the CPU number.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Minimal CPU "parser" for s390 to avoid compile time warning.
Signed-off-by: Thang Pham <thang.pham@us.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Adding CPU encoder/decoder for s390 to avoid runtime error messages.
Signed-off-by: Thang Pham <thang.pham@us.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Starting a KVM guest on s390 fails immediately. This is because
"qemu --help" reports -no-acpi even for the s390(x) architecture but
-no-acpi isn't supported there.
Workaround is to remove QEMU_CAPS_NO_ACPI from the capability set
after the version/capability extraction.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
We used to prefix 'rbd:' to volume names, this is not necessary.
Qemu takes RBD devices in this way, like: qemu -drive rbd:pool/image
When attaching a network disk like RBD to a guest we however do not use this prefix.
Currently you can't map a RBD volume name directly to a domain without removing the prefix.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
If no 'listen' attribute or <listen> element is set in the
guest XML, the default driver configured listen address is
used. There is no way to client applications to determine
what this address is though. When starting the guest, we
should update the live XML to include this default listen
address
Storage is one of the last domains in libvirt where we don't fully
utilize inactive and live XML. Okay, it might be because we don't
have support for that. So implement such support. However, we need
to fallback when talking to old daemon which doesn't support this
new flag called VIR_STORAGE_XML_INACTIVE.
Currently, we share the idea of old & new def with domains. Users can
*-edit an object (domain, pool) which spawns a new internal
representation for them. This is referenced via
{domainObj,poolObj}->newDef [compared to ->def]. However, for pool we
were never overwriting def with newDef. This must be done on
pool-destroy (like we do analogically in domain detroy).
Currently libvirt-lxc checks to see if the destination exists and is a
directory. If it is not a directory then the mount fails. Since
libvirt-lxc can bind mount files on an inode, this patch is needed to
allow us to bind mount files on files. Currently we want to bind mount
on top of /etc/machine-id, and /etc/adjtime
If the destination of the mount point does not exists, it checks if the
src is a directory and then attempts to create a directory, otherwise it
creates an empty file for the destination. The code will then bind mount
over the destination.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some GNULIB headers (eg unistd.h) will often need to include
winsock2.h for various symbols. There is a rule that winsock2.h
must be included before windows.h. This means that any file
which does
#ifdef WIN32
#include <windows.h>
#endif
#include <unistd.h>
is potentially broken. A simple rule is that /all/ includes of
windows.h must be matched with a preceding include of winsock2.h
regardless of whether unistd.h is used currently
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A sanlock lease can be marked as shared (rather
than exclusive) using SANLK_RES_SHARED flag. This
adds support for that flag and ensures that in auto
disk mode, any shared disks use shared leases. This
also makes any read-only disks be completely
ignored.
These changes remove the need for the option
ignore_readonly_and_shared_disks
so that is removed
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently you can configure LXC to bind a host directory to
a guest directory, but not to bind a guest directory to a
guest directory. While the guest container init could do
this itself, allowing it in the libvirt XML means a stricter
SELinux policy can be written
Introduce a new syntax for filesystems to allow use of a RAM
filesystem
<filesystem type='ram'>
<source usage='10' units='MiB'/>
<target dir='/mnt'/>
</filesystem>
The usage units default to KiB to limit consumption of host memory.
* docs/formatdomain.html.in: Document new syntax
* docs/schemas/domaincommon.rng: Add new attributes
* src/conf/domain_conf.c: Parsing/formatting of RAM filesystems
* src/lxc/lxc_container.c: Mounting of RAM filesystems
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The test of ref count is not protected by lock, which is unsafe because
the ref count may have been changed by other threads during the test.
This patch fixes this.
When shutting down libvirtd, the virNetServer shutdown can deadlock
if there are in-flight jobs being handled by virNetServerHandleJob().
virNetServerFree() will acquire the virNetServer lock and call
virThreadPoolFree() to terminate the workers, waiting for the workers
to finish. But in-flight workers will attempt to acquire the
virNetServer lock, resulting in deadlock.
Fix the deadlock by unlocking the virNetServer lock before calling
virThreadPoolFree(). This is safe since the virNetServerPtr object
is ref-counted and only decrementing the ref count needs to be
protected. Additionally, there is no need to re-acquire the lock
after virThreadPoolFree() completes as all the workers have
terminated.
The lxc contoller eventually makes use of virRandomBits(), which was
segfaulting since virRandomInitialize() is never invoked.
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff554d560 in random_r () from /lib64/libc.so.6
(gdb) bt
0 0x00007ffff554d560 in random_r () from /lib64/libc.so.6
1 0x0000000000469eaa in virRandomBits (nbits=32) at util/virrandom.c:80
2 0x000000000045bf69 in virHashCreateFull (size=256,
dataFree=0x4aa2a2 <hashDataFree>, keyCode=0x45bd40 <virHashStrCode>,
keyEqual=0x45bdad <virHashStrEqual>, keyCopy=0x45bdfa <virHashStrCopy>,
keyFree=0x45be37 <virHashStrFree>) at util/virhash.c:134
3 0x000000000045c069 in virHashCreate (size=0, dataFree=0x4aa2a2 <hashDataFree>)
at util/virhash.c:164
4 0x00000000004aa562 in virNWFilterHashTableCreate (n=0)
at conf/nwfilter_params.c:686
5 0x00000000004aa95b in virNWFilterParseParamAttributes (cur=0x711d30)
at conf/nwfilter_params.c:793
6 0x0000000000481a7f in virDomainNetDefParseXML (caps=0x702c90, node=0x7116b0,
ctxt=0x7101b0, bootMap=0x0, flags=0) at conf/domain_conf.c:4589
7 0x000000000048cc36 in virDomainDefParseXML (caps=0x702c90, xml=0x710040,
root=0x7103b0, ctxt=0x7101b0, expectedVirtTypes=16, flags=0)
at conf/domain_conf.c:8658
8 0x000000000048f011 in virDomainDefParseNode (caps=0x702c90, xml=0x710040,
root=0x7103b0, expectedVirtTypes=16, flags=0) at conf/domain_conf.c:9360
9 0x000000000048ee30 in virDomainDefParse (xmlStr=0x0,
filename=0x702ae0 "/var/run/libvirt/lxc/x.xml", caps=0x702c90,
expectedVirtTypes=16, flags=0) at conf/domain_conf.c:9310
10 0x000000000048ef00 in virDomainDefParseFile (caps=0x702c90,
filename=0x702ae0 "/var/run/libvirt/lxc/x.xml", expectedVirtTypes=16, flags=0)
at conf/domain_conf.c:9332
11 0x0000000000425053 in main (argc=5, argv=0x7fffffffe2b8)
at lxc/lxc_controller.c:1773
The comment says:
/* Now create the final dir in the path with the uid/gid/mode
* requested in the config. If the dir already exists, just set
* the perms.
*/
However, virDirCreate is only invoked if the target path doesn't
exist yet (which is opposite with the comment), or the uid from
the config is not -1 (I don't understand why, think it's just
another mistake). And the result is the perms of the pool won't
be changed if one tries to build the pool with different perms
again.
Besides these logic error fix, if no uid and gid are specified in
the config, the practical used uid, gid are reflected.
The two new APIs are rather trivial; based on bits and pieces of
other existing APIs. But rather than blindly return 0 or 1 for
HasMetadata, I chose to first validate that the snapshot in
question in fact exists.
* src/esx/esx_driver.c (esxDomainSnapshotIsCurrent)
(esxDomainSnapshotHasMetadata): New functions.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotIsCurrent)
(vboxDomainSnapshotHasMetadata): Likewise.
Blindly returning success is misleading if the object no longer
exists; it is a bit better to check for existence up front before
returning information about that object. This pattern matches the
fact that most of our other APIs check for existence as a side
effect prior to getting at the real piece of information being
queried.
* src/esx/esx_driver.c (esxDomainIsUpdated, esxDomainIsPersistent):
Add existence checks.
* src/vbox/vbox_tmpl.c (vboxDomainIsPersistent)
(vboxDomainIsUpdated): Likewise.
This patch adds support for listing all domains into drivers that use
the common virDomainObj implementation: libxl, lxc, openvz, qemu, test,
uml, vmware.
For drivers that don't support managed save images the guests are
treated as if they had none, so filtering guests that do have such an
image on this driver succeeds and produces 0 results.
The two new functions are very similar to the existing functions;
just a matter of different arguments and a call to a different
helper function.
* src/qemu/qemu_driver.c (qemuDomainSnapshotListNames)
(qemuDomainSnapshotNum, qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren): Support new flags.
(qemuDomainListAllSnapshots): New functions.
Wraps the conversion from 'char *name' to virDomainSnapshotPtr in
a reusable manner.
* src/conf/virdomainlist.h (virDomainListSnapshots): New declaration.
* src/conf/virdomainlist.c (virDomainListSnapshots): Implement it.
* src/libvirt_private.syms (virdomainlist.h): Export it.
The generator doesn't handle lists of virDomainSnapshotPtr, so
this commit requires a bit more work than some RPC additions.
* src/remote/remote_protocol.x
(REMOTE_PROC_DOMAIN_LIST_ALL_SNAPSHOTS)
(REMOTE_PROC_DOMAIN_SNAPSHOT_LIST_ALL_CHILDREN): New RPC calls,
with corresponding structs.
* daemon/remote.c (remoteDispatchDomainListAllSnapshots)
(remoteDispatchDomainSnapshotListAllChildren): New functions.
* src/remote/remote_driver.c (remoteDomainListAllSnapshots)
(remoteDomainSnapshotListAllChildren): Likewise.
* src/remote_protocol-structs: Regenerate.
There was an inherent race between virDomainSnapshotNum() and
virDomainSnapshotListNames(), where an additional snapshot could
be created in the meantime, or where a snapshot could be deleted
before converting the name back to a virDomainSnapshotPtr. It
was also an awkward name: the function operates on domains, not
domain snapshots. virDomainSnapshotListChildrenNames() suffered
from the same inherent race, although its naming was nicer.
This patch makes things nicer by grabbing a snapshot list
atomically, in the format most useful to the user.
* include/libvirt/libvirt.h.in (virDomainListAllSnapshots)
(virDomainSnapshotListAllChildren): New declarations.
* src/libvirt.c (virDomainSnapshotListNames)
(virDomainSnapshotListChildrenNames): Add cross-references.
(virDomainListAllSnapshots, virDomainSnapshotListAllChildren):
New functions.
* src/libvirt_public.syms (LIBVIRT_0.9.13): Export them.
* src/driver.h (virDrvDomainListAllSnapshots)
(virDrvDomainSnapshotListAllChildren): New callbacks.
* python/generator.py (skip_function): Prepare for later
hand-written versions.
It turns out that one-bit filtering makes it hard to select the inverse
set, so it is easier to provide filtering groups. For back-compat,
omitting all bits within a group means the group is not used for
filtering, and by definition of a group (each snapshot matches exactly
one bit within the group, and the set of bits in the group covers all
snapshots), selecting all bits also makes the group useless.
Unfortunately, virDomainSnapshotListChildren defined the bit
VIR_DOMAIN_SNAPSHOT_LIST_DESCENDANTS as an expansion rather than a
filter, so we cannot make it part of a filter group, so that bit
(and its counterpart VIR_DOMAIN_SNAPSHOT_LIST_ROOTS for
virDomainSnapshotList) remains a single control bit.
* include/libvirt/libvirt.h.in (virDomainSnapshotListFlags): Add a
couple more flags.
* src/libvirt.c (virDomainSnapshotNum)
(virDomainSnapshotNumChildren): Document them.
(virDomainSnapshotListNames, virDomainSnapshotListChildrenNames):
Likewise, and add thread-safety caveats.
* src/conf/virdomainlist.h (VIR_DOMAIN_SNAPSHOT_FILTERS_*): New
convenience macros.
* src/conf/domain_conf.c (virDomainSnapshotObjListCopyNames)
(virDomainSnapshotObjListCount): Support the new flags.
Until now, it was possible to crash libvirtd when defining domain with
channel device with missing source element.
When creating new virDomainChrDef, target.port is set to -1, but
unfortunately it is an union with addresses that virDomainChrDefFree
tries to free in case the deviceType is channel. Having the port set
to -1 is intended, however the cleanest way to get around the problems
with the crash seems to be renumbering the VIR_DOMAIN_CHR_CHANNEL_
target types to cover new NONE type (with value 0) being the default
(no target type yet).
Macro virCheckNullArgGoto is supposed to check for NULL argument but
checks non-NULL instead.
Macro virCheckNonNullArgReturn reports error as if the argument should
be NULL when it shouldn't.
when lxcContainerIdentifyCGroups failed, the memory it allocated
has been freed, so we should not free this memory again in
lxcContainerSetupPivortRoot and lxcContainerSetupExtraMounts.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Another case where we can do the same amount of work with fewer
lines of redundant code, which will make adding new filters easier.
* src/conf/domain_conf.c (virDomainSnapshotNameData): Adjust
struct.
(virDomainSnapshotObjListCount): Delete, now taken care of...
(virDomainSnapshotObjListCopyNames): ...here.
(virDomainSnapshotObjListGetNames): Adjust caller to handle
counting.
(virDomainSnapshotObjListNum): Simplify.
Now that domain listing is a thin wrapper around child listing,
it's easier to have a common entry point. This restores the
hashForEach optimization lost in the previous patch when there
are no snapshots being filtered out of the entire list.
* src/conf/domain_conf.h (virDomainSnapshotObjListGetNames)
(virDomainSnapshotObjListNum): Add parameter.
(virDomainSnapshotObjListGetNamesFrom)
(virDomainSnapshotObjListNumFrom): Delete.
* src/libvirt_private.syms (domain_conf.h): Drop deleted functions.
* src/conf/domain_conf.c (virDomainSnapshotObjListGetNames):
Merge, and (re)add an optimization.
* src/qemu/qemu_driver.c (qemuDomainUndefineFlags)
(qemuDomainSnapshotListNames, qemuDomainSnapshotNum)
(qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren): Update callers.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Likewise.
* src/conf/virdomainlist.c (virDomainListPopulate): Likewise.
This idea was first suggested by Daniel Veillard here:
https://www.redhat.com/archives/libvir-list/2011-October/msg00353.html
Now that I am about to add more complexity to snapshot listing, it
makes sense to avoid code duplication and special casing for domain
listing (all snapshots) vs. snapshot listing (descendants); adding
a metaroot reduces the number of code lines by having the domain
listing turn into a descendant listing of the metaroot.
Note that this has one minor pessimization - if we are going to list
ALL snapshots without filtering, then virHashForeach is more efficient
than recursing through the child relationships; restoring that minor
optimization will occur in the next patch.
* src/conf/domain_conf.h (_virDomainSnapshotObj)
(_virDomainSnapshotObjList): Repurpose some fields.
(virDomainSnapshotDropParent): Drop unused parameter.
* src/conf/domain_conf.c (virDomainSnapshotObjListGetNames)
(virDomainSnapshotObjListCount): Simplify.
(virDomainSnapshotFindByName, virDomainSnapshotSetRelations)
(virDomainSnapshotDropParent): Match new field semantics.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML)
(qemuDomainSnapshotReparentChildren, qemuDomainSnapshotDelete):
Adjust clients.
This patch adds common code to list domains in fashion used by
virListAllDomains with all currently supported flags. The header file
also contains macros that group filters together that are used to
shorten filter conditions.
This patch stores existence of the image in the object. At start of the
daemon the state is checked and then updated in key moments in domain
lifecycle.
This patch wires up the RPC protocol handlers for
virConnectListAllDomains(). The RPC generator has no support for the way
how virConnectListAllDomains() returns the results so the handler code
had to be done manually.
The new api is handled by REMOTE_PROC_CONNECT_LIST_ALL_DOMAINS, with
number 273 and marked with high priority.
This patch adds a new public api that lists domains. The new approach is
different from those used before. There are key points to this:
1) The list is acquired atomically and contains both active and inactive
domains (guests). This eliminates the need to call two different list
APIs, where the state might change in between the calls.
2) The returned list consists of virDomainPtrs instead of names or ID's
that have to be converted to virDomainPtrs anyways using separate calls
for each one of them. This is more convenient and saves hypervisor calls.
3) The returned list is auto-allocated. This saves a lot of hassle for
the users.
4) Built in support for filtering. The API call supports various
filtering flags that modify the output list according to user needs.
Available filter groups:
Domain status:
VIR_CONNECT_LIST_DOMAINS_ACTIVE, VIR_CONNECT_LIST_DOMAINS_INACTIVE
Domain persistence:
VIR_CONNECT_LIST_DOMAINS_PERSISTENT,
VIR_CONNECT_LIST_DOMAINS_TRANSIENT
Domain state:
VIR_CONNECT_LIST_DOMAINS_RUNNING, VIR_CONNECT_LIST_DOMAINS_PAUSED,
VIR_CONNECT_LIST_DOMAINS_SHUTOFF, VIR_CONNECT_LIST_DOMAINS_OTHER
Existence of managed save image:
VIR_CONNECT_LIST_DOMAINS_MANAGEDSAVE,
VIR_CONNECT_LIST_DOMAINS_NO_MANAGEDSAVE
Auto-start option:
VIR_CONNECT_LIST_DOMAINS_AUTOSTART,
VIR_CONNECT_LIST_DOMAINS_NO_AUTOSTART
Existence of snapshot:
VIR_CONNECT_LIST_DOMAINS_HAS_SNAPSHOT,
VIR_CONNECT_LIST_DOMAINS_NO_SNAPSHOT
5) The python binding returns a list of domain objects that is very neat
to work with.
The only problem with this approach is no support from code generators
so both RPC code and python bindings had to be written manually.
*include/libvirt/libvirt.h.in: - add API prototype
- clean up whitespace mistakes nearby
*python/generator.py: - inhibit generation of the bindings for the new
api
*src/driver.h: - add driver prototype
- clean up some whitespace mistakes nearby
*src/libvirt.c: - add public implementation
*src/libvirt_public.syms: - export the new symbol
when libvirt_lxc trigger oom error in lxcContainerGetSubtree
we should free the alloced memory for mounts.
so when lxcContainerGetSubtree failed,we should do some
memory cleanup in lxcContainerUnmountSubtree.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
we alloc the memory for format in lxcContainerMountDetectFilesystem
but without free it in lxcContainerMountFSBlockHelper.
this patch just call VIR_FREE to free it.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
With latest changes to qemu-ga success on some commands is not reported
anymore, e.g. guest-shutdown or guest-suspend-*. However, errors are
still being reported. Therefore, we need to find different source of
indication if operation was successful. Events.
Commit 6e769eba made it a runtime error if libvirt was compiled
without yajl support but targets a new enough qemu. But enough
users are hitting this on self-compiled libvirt that it is worth
erroring out at compilation time, rather than an obscure failure
when trying to use the built executable.
* configure.ac: If qemu is requested and -version works, require
yajl when qemu version is new enough.
* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags): Add
comment.
When libpcap is not available, the NWFilter driver provides a
no-op stub for the DHCP snooping initialization. This was
mistakenly returning '-1' instead of '0', so the entire driver
initialization failed
dump-guest-memory is a new dump mechanism, and it can work when the
guest uses host devices. This patch adds a API to use this new
monitor command.
We will always use json mode if qemu's version is >= 0.15, so I
don't implement the API for text mode.
This reverts
commit c16b4c43fc
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Fri May 11 15:09:27 2012 +0100
Avoid LXC pivot root in the root source is still /
This commit broke setup of /dev, because the code which
deals with setting up a private /dev and /dev/pts only
works if you do a pivotroot.
The original intent of avoiding the pivot root was to
try and ensure the new root has a minimumal mount
tree. The better way todo this is to just unmount the
bits we don't want (ie old /proc & /sys subtrees.
So apply the logic from
commit c529b47a75
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Fri May 11 11:35:28 2012 +0100
Trim /proc & /sys subtrees before mounting new instances
to the pivot_root codepath as well
Libvirt updates the configuration of SPICE server only when something
changes. This is unfortunate when the user wants to disconnect a
existing spice session when the connected attribute is already
"disconnect".
This patch modifies the conditions for calling the password updater to
be called when nothing changes, but the connected attribute is already
"disconnect".
While unescaping the commands the commands passed through to the monitor
function qemuMonitorUnescapeArg() initialized lenght of the input string
to strlen()+1 which is fine for alloc but not for iteration of the
string.
This patch fixes the off-by-one error and drops the pointless check for
a single trailing slash that is automaticaly handled by the default
branch of switch.
commit 52d064f42d added
VIR_NETWORK_XML_INACTIVE in order to allow suppressing the
auto-generated list of VFs in network definitions, and a --inactive
flag to virsh net-dumpxml to take advantage of the flag. However, it
missed out on two opportunities:
1) Use INACTIVE to get the current config of the network as it
exists on disk, rather than the currently active config.
2) Add INACTIVE to the flags used for the virsh net-edit command, so
that it won't include the forward-pool interfaces that were
autogenerated, and so that a re-edit of the network prior to
restarting it will show any other edits made since the last restart
of the network. (prior to this patch, if you edited a network a 2nd
time without restarting, all of the previous edits would magically
disappear).
In order to fit with the new #define-based generic edit function in
virsh.c, a new function vshNetworkGetXMLDesc() was added. This
function first tries to call virNetworkGetXMLDesc with the INACTIVE
flag added, then retries without if the first attempt fails (in the
manner expected when the server doesn't support it).
A core use case of the hook scripts is to be able to do things
to a guest's network configuration. It is possible to hook into
the 'start' operation for a QEMU guest which runs just before
the guest is started. The TAP devices will exist at this point,
but the QEMU process will not. It can be desirable to have a
'started' hook too, which runs once QEMU has started.
If libvirtd is restarted it will re-populate firewall rules,
but there is no QEMU hook to trigger for existing domains.
This is solved with a 'reconnect' hook.
Finally, if attaching to an external QEMU process there needs
to be an 'attach' hook script.
This all also applies to the LXC driver
* docs/hooks.html.in: Document new operations
* src/util/hooks.c, src/util/hooks.c: Add 'started', 'reconnect'
and 'attach' operations for QEMU. Add 'prepare', 'started',
'release' and 'reconnect' operations for LXC
* src/lxc/lxc_driver.c: Add hooks for 'prepare', 'started',
'release' and 'reconnect' operations
* src/qemu/qemu_process.c: Add hooks for 'started', 'reconnect'
and 'reconnect' operations
First 'poll' can't return EWOULDBLOCK, and second, we're checking errno
so far away from the poll() call that we've probably already trashed the
original errno value.
In addition to keepalive responses, we also need to send keepalive
requests from client IO loop to properly detect dead connection in case
a libvirt API is called from the main loop, which prevents any timers to
be called.
The previous commit removed the only usage of ``all'' parameter in
virKeepAliveStopInternal, which was actually the only reason for having
virKeepAliveStopInternal. This effectively reverts most of commit
6446a9e20c.
When a libvirt API is called from the main event loop (which seems to be
common in event-based glib apps), the client IO loop would properly
handle keepalive requests sent by a server but will not actually send
them because the main event loop is blocked with the API. This patch
gets rid of response timer and the thread which is processing keepalive
requests is also responsible for queueing responses for delivery.
Add virKeepAliveTimeout and virKeepAliveTrigger APIs that can be used to
set poll timeouts and trigger keepalive timer. virKeepAliveTrigger
checks if it is called to early and does nothing in that case.
The code that needs to be run every keepalive interval of inactivity was
only called from a timer and thus from the main event loop. We will need
to call the code directly from another place.
As non-blocking calls are no longer dropped, we don't really need to
care that much about their fate and wait for the thread with the buck
to process them. If another thread has the buck, we can just push a
non-blocking call to the queue and be done with it.
So far, we were dropping non-blocking calls whenever sending them would
block. In case a client is sending lots of stream calls (which are not
supposed to generate any reply), the assumption that having other calls
in a queue is sufficient to get a reply from the server doesn't work. I
tried to fix this in b1e374a7ac but
failed and reverted that commit.
With this patch, non-blocking calls are never dropped (unless the
connection is being closed) and will always be sent.
Normally, when every call has a thread associated with it, the thread
may get the buck and be in charge of sending all calls until its own
call is done. When we introduced non-blocking calls, we had to add
special handling of new non-blocking calls. This patch uses event loop
to send data if there is no thread to get the buck so that any
non-blocking calls left in the queue are properly sent without having to
handle them specially. It also avoids adding even more cruft to client
IO loop in the following patches.
With this change in, non-blocking calls may see unpredictable delays in
delivery when the client has no event loop registered. However, the only
non-blocking calls we have are keepalives and we already require event
loop for them, which makes this a non-issue until someone introduces new
non-blocking calls.
'make dist' was depending on *protocol-structs files, which are
stored in git but in turn depended on generated files. We still
want to ship the protocol-structs files, but by renaming the
tests to something not matching a file name, we separate 'make
check' (which depends on the generated file) from 'make dist'
(which only depends on the git files). After all, the tarball
should never depend on a generated file not stored in git.
I found one more case of a git file depending on a generated
file, in a bogus virkeycode.c listing; but at least this one
had no associated rules so it never broke 'make dist'.
Reported by Wen Congyang. Latent bug has been present since
commit 62dee6f, but only recently exposed by commit 7bff56a.
* src/Makefile.am ($(srcdir)/util/virkeycode.c): Drop useless
dependency.
(BUILT_SOURCES): ...and build virkeymaps.h sooner.
(PROTOCOL_STRUCTS): Rather than depend on the struct file...
(check-local): ...convert things into a phony target of...
(check-protocol): ...a new check.
($(srcdir)/remote_protocol-struct): Rename to isolate the distributed
file from the conditional test.
(PDWTAGS): Deal with rename. Swap to compare 'expected actual'.
Currently, if qemuProcessStart fail at some point, e.g. because
domain being started wants a PCI/USB device already assigned to
a different domain, we jump to cleanup label where qemuProcessStop
is performed. This unconditionally calls virSecurityManagerRestoreAllLabel
which is wrong because the other domain is still using those devices.
However, once we successfully label all devices/paths in
qemuProcessStart() from that point on, we have to perform a rollback
on failure - that is - we have to virSecurityManagerRestoreAllLabel.
The two APIs are rather trivial; based on bits and pieces of other
existing APIs. It leaves the door open for future extension to
qemu to report snapshots without metadata based on reading qcow2
internal snapshot names.
* src/qemu/qemu_driver.c (qemuDomainSnapshotIsCurrent)
(qemuDomainSnapshotHasMetadata): New functions.
Right now, starting from just a virDomainSnapshotPtr, and wanting to
know if it is the current snapshot for its respective domain, you have
to use virDomainSnapshotGetDomain(), then virDomainSnapshotCurrent(),
then compare the two names returned by virDomainSnapshotGetName().
It is a bit easier if we can directly query this information from the
snapshot itself.
Right now, it is possible to filter a snapshot listing based on
whether snapshots have metadata that would prevent domain deletion,
but the only way to learn if an individual snapshot has metadata is
to see if that snapshot appears in the list returned by a listing.
Additionally, I hope to expand the qemu driver in a future patch to
use qemu-img to reconstruct snapshot XML corresponding to internal
qcow2 snapshot names not otherwise tracked by libvirt (in part, so
that libvirt can guarantee that new snapshots are not created with
a name that would silently corrupt the existing portion of the qcow2
file); if I ever get that in, then it would no longer be an all-or-none
decision on whether snapshots have metadata, and becomes all the more
important to be able to directly determine that information from a
particular snapshot.
Other query functions (such as virDomainIsActive) do not have a flags
argument, but since virDomainHasCurrentSnapshot takes a flags argument,
I figured it was safer to provide a flags argument here as well.
* include/libvirt/libvirt.h.in (virDomainSnapshotIsCurrent)
(virDomainSnapshotHasMetadata): New declarations.
* src/libvirt.c (virDomainSnapshotIsCurrent)
(virDomainSnapshotHasMetadata): New functions.
* src/libvirt_public.syms (LIBVIRT_0.9.13): Export them.
* src/driver.h (virDrvDomainSnapshotIsCurrent)
(virDrvDomainSnapshotHasMetadata): New driver callbacks.
Right now, the only way to get at the contents of a virBuffer is
to destroy it. But there are cases in my upcoming patches where
peeking at the contents makes life easier. I suppose this does
open up the potential for bad code to dereference a stale pointer,
by disregarding the docs that the return value is invalid on the
next virBuf operation, but such is life.
* src/util/buf.h (virBufferCurrentContent): New declaration.
* src/util/buf.c (virBufferCurrentContent): Implement it.
* src/libvirt_private.syms (buf.h): Export it.
* tests/virbuftest.c (testBufAutoIndent): Test it.
My latest patch for RPC rework (a2c304f687) introduced a memory leak.
virNetMessageEncodeHeader() is calling VIR_ALLOC_N(msg->buffer, ...)
despite fact, that msg->buffer isn't VIR_FREE()'d on all paths calling
the function. Therefore, rather than injecting free statement switch to
VIR_REALLOC_N().
when do remount,the source and target should be the same
values specified in the initial mount() call.
So change fs->dst to src.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
virDomainSnapshotPtr has a refcount member, but no one was able
to use it. Furthermore, all of our other vir*Ptr objects have
a *Ref method to match their *Free method. Thankfully, this is
client-side only, so we can use this new function regardless of
how old the server side is! (I have future patches to virsh
that want to use it.)
* include/libvirt/libvirt.h.in (virDomainSnapshotRef): Declare.
* src/libvirt.c (virDomainSnapshotRef): Implement it.
* src/libvirt_public.syms (LIBVIRT_0.9.13): Export it.
When libvirtd forks off a new child, the child then calls virLogReset(),
which ends up closing file descriptors used as log outputs. However, we
recently started logging closed file descriptors, which means we need to
lock logging mutex which was already locked by virLogReset(). We don't
really want to log anything when we are in the process of closing log
outputs.
For pseries guest, spapr-vlan and spapr-vty is based
on spapr-vio address. According to model of network
device, the address type should be assigned automatically.
For serial device, serial pty device is recognized as
spapr-vty device, which is also on spapr-vio.
So this patch is to correct the address type of
spapr-vlan and spapr-vty, and build correct
command line of spapr-vty.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman<michaele@au1.ibm.com>
There is a theoretical problem of an extreme bug where we can get
into deadlock due to command handshaking. Thanks to a pair of pipes,
we have a situation where the parent thinks the child reported an
error and is waiting for a message from the child to explain the
error; but at the same time the child thinks it reported success
and is waiting for the parent to acknowledge the success; so both
processes are now blocked.
Thankfully, I don't think this deadlock is possible without at
least one other bug in the code, but I did see exactly that sort
of situation prior to commit da831af - I saw a backtrace where a
double close bug in the parent caused the parent to read from the
wrong fd and assume the child failed, even though the child really
sent success.
This potential deadlock is not quite like commit 858c247 (a deadlock
due to multiple readers on one pipe preventing a write from completing),
although the solution is similar - always close unused pipe fds before
blocking, rather than after.
* src/util/command.c (virCommandHandshakeWait): Close unused fds
sooner.
When libvirtd is started and there is an unusable/not-connectable
leftover from earlier started machine, it's more reasonable to say
that the machine "crashed" if we know it was started with
"-no-shutdown".
This patch fixes that and also changes the other result (when machine
was started without "-no-shutdown") to "unknown", because the previous
"failed" reason means (according to include/libvirt/libvirt.h.in:174),
that the machine failed to start.
Commit 7bff56a worked in an incremental build, but fails for a
fresh clone; apparently, if make sees both an actual file
spelling and an inference rule, only the exact spelling is used.
CCLD libvirt_driver_test.la
CC libvirt_driver_remote_la-remote_driver.lo
remote/remote_driver.c:4707:34: fatal error: remote_client_bodies.h: No such file or directory
compilation terminated.
BUILT_SOURCES to the rescue, instead of trying to mess with .lo
dependencies directly.
* src/Makefile.am (REMOTE_DRIVER_PREREQS, %remote_driver.lo): Drop...
(BUILT_SOURCES): ...and add here instead.
Commit 1c275e9a accidentally dropped the storage driver from
libvirtd, because it depended on a C preprocessor macro that
was not defined. Furthermore, if you do './configure
--without-storage-dir --with-storage-disk' or any other combination
where you explicitly build a subset of storage backends excluding
the dir backend, then the build is broken.
Based on analysis by Osier Yang.
* configure.ac (WITH_STORAGE): Define top-level conditional.
* src/Makefile.am (mod_LTLIBRARIES): Build driver even when
storage_dir is disabled.
* daemon/libvirtd.c: Pick up storage driver for any backend, not
just dir.
* daemon/Makefile.am (libvirtd_LDADD): Likewise.
Since we are allocating RPC buffer dynamically, we can increase limits
for max. size of RPC message and RPC string. This is needed to cover
some corner cases where libvirt is run on such huge machines that their
capabilities XML is 4 times bigger than our current limit. This leaves
users with inability to even connect.
Currently, we are allocating buffer for RPC messages statically.
This is not such pain when RPC limits are small. However, if we want
ever to increase those limits, we need to allocate buffer dynamically,
based on RPC message len (= the first 4 bytes). Therefore we will
decrease our mem usage in most cases and still be flexible enough in
corner cases.
We had a distributed file (remote_protocol.h, which in turn was
a prereq to remote_driver.c) depending on a generated file
(libvirt_probes.h), which is a no-no for a VPATH build from a
read-only source tree (no wonder 'make distcheck' tests precisely
that situation):
File `libvirt_driver_remote.la' does not exist.
File `libvirt_driver_remote_la-remote_driver.lo' does not exist.
Prerequisite `libvirt_probes.h' is newer than target `../../src/remote/remote_protocol.h'.
Must remake target `../../src/remote/remote_protocol.h'.
Invoking recipe from Makefile:7464 to update target `../../src/remote/remote_protocol.h'.
make[3]: Entering directory `/home/remote/eblake/libvirt-tmp2/build/libvirt-0.9.12/_build/src'
GEN ../../src/remote/remote_protocol.h
cannot create ../../src/remote/remote_protocol.h: Permission denied at ../../src/rpc/genprotocol.pl line 31.
make[3]: *** [../../src/remote/remote_protocol.h] Error 13
Rather than making distributed .c files depend on generated files, we
really want to ensure that compilation into .lo files is not attempted
until the generated files are present, done by this patch. Since there
were two different sets of conditionally generated files that both
feed the .lo file, I had to introduce a new variable REMOTE_DRIVER_PREREQS
to keep automake happy.
After that fix, the next issue was that make treats './foo' and 'foo'
differently in determining whether an implicit %foo rule is applicable,
with the result that locking/qemu-sanlock.conf wasn't properly being
built at the right times. Also, the output for using the .aug test
files was a bit verbose.
After fixing the src directory, the next error is related to the docs
directory, where the tarball is missing a stamp file and thus tries to
regenerate files that are already present:
GEN ../../docs/apibuild.py.stamp
Traceback (most recent call last):
File "../../docs/apibuild.py", line 2511, in <module>
rebuild("libvirt")
File "../../docs/apibuild.py", line 2495, in rebuild
builder.serialize()
File "../../docs/apibuild.py", line 2424, in serialize
output = open(filename, "w")
IOError: [Errno 13] Permission denied: '../../docs/libvirt-api.xml'
make[5]: *** [../../docs/apibuild.py.stamp] Error 1
and fixing that exposed another case of a distributed file (generated
html) depending on a built file (libvirt.h), but only when doing an
in-tree build, because of a file glob.
* src/Makefile.am ($(srcdir)/remote/remote_driver.c): Change...
(libvirt_driver_remote_la-remote_driver.lo): ...to the real
dependency.
($(builddir)/locking/%-sanlock.conf): Drop $(builddir), so that
rule gets run in time for test_libvirt_sanlock.aug.
(test_libvir*.aug): Cater to silent build.
(conf_DATA): Don't ship qemu-sanlock.conf in the tarball, since it
is trivial to regenerate.
* docs/Makefile.am (EXTRA_DIST): Ship our stamp file.
($(APIBUILD_STAMP)): Don't depend on generated file.
I came across a bug that the command line generated for passthrough
of the host parallel port /dev/parport0 by libvirt for QEMU is incorrect.
It currently produces:
-chardev tty,id=charparallel0,path=/dev/parport0
-device isa-parallel,chardev=charparallel0,id=parallel0
The first parameter is "tty". It sould be "parport".
If I launch qemu with -chardev parport,... it works as expected.
I have already filled a bug report (
https://bugzilla.redhat.com/show_bug.cgi?id=823879 ), the topic was
already on the list some months ago:
https://www.redhat.com/archives/libvirt-users/2011-September/msg00095.html
Signed-off-by: Eric Blake <eblake@redhat.com>
It is possible to deadlock libvirt by having a domain with XML
longer than PIPE_BUF, and by writing a hook script that closes
stdin early. This is because libvirt was keeping a copy of the
child's stdin read fd open, which means the write fd in the
parent will never see EPIPE (remember, libvirt should always be
run with SIGPIPE ignored, so we should never get a SIGPIPE signal).
Since there is no error, libvirt blocks waiting for a write to
complete, even though the only reader is also libvirt. The
solution is to ensure that only the child can act as a reader
before the parent does any writes; and then dealing with the
fallout of dealing with EPIPE.
Thankfully, this is not a security hole - since the only way to
trigger the deadlock is to install a custom hook script, anyone
that already has privileges to install a hook script already has
privileges to do any number of other equally disruptive things
to libvirt; it would only be a security hole if an unprivileged
user could install a hook script to DoS a privileged user.
* src/util/command.c (virCommandRun): Close parent's copy of child
read fd earlier.
(virCommandProcessIO): Don't let EPIPE be fatal; the child may
be done parsing input.
* tests/commandhelper.c (main): Set up a SIGPIPE situation.
* tests/commandtest.c (test20): Trigger it.
* tests/commanddata/test20.log: New file.
Although src/util/viratomic.h has been added to the repo, up until now
it hasn't been used. Stefan Berger is using it in his proposed dhcp
snooping patches, and an rpm build with those patches failed due to
viratomic.h not being packed up with the rest of the sources.
EBADF errors are logged as warnings as they normally indicate a double
close bug. This patch also provides VIR_MASS_CLOSE helper to be user in
the only case of mass close after fork when EBADF should rather be
ignored.
With support for multiple IP addresses per interface in place, this patch
now adds support for multiple IP addresses per interface for the DHCP
snooping code.
Testing:
Since the infrastructure I tested this with does not provide multiple IP
addresses per MAC address (anymore), I either had to plug the VM's interface
from the virtual bride connected directly to the infrastructure to virbr0
to get a 2nd IP address from dnsmasq (kill and run dhclient inside the VM)
or changed the lease file (/var/run/libvirt/network/nwfilter.leases) and
restart libvirtd to have a 2nd IP address on an existing interface.
Note that dnsmasq can take a lease timeout parameter as part of the --dhcp-range
command line parameter, so that timeouts can be tested that way
(--dhcp-range 192.168.122.2,192.168.122.254,120). So, terminating and restarting
dnsmasq with that parameter is another choice to watch an IP address disappear
after 120 seconds.
Regards,
Stefan
The goal of this patch is to prepare for support for multiple IP
addresses per interface in the DHCP snooping code.
Move the code for the IP address map that maps interface names to
IP addresses into their own file. Rename the functions on the way
but otherwise leave the code as-is. Initialize this new layer
separately before dependent layers (iplearning, dhcpsnooping)
and shut it down after them.
This patch adds DHCP snooping support to libvirt. The learning method for
IP addresses is specified by setting the "CTRL_IP_LEARNING" variable to one of
"any" [default] (existing IP learning code), "none" (static only addresses)
or "dhcp" (DHCP snooping).
Active leases are saved in a lease file and reloaded on restart or HUP.
The following interface XML activates and uses the DHCP snooping:
<interface type='bridge'>
<source bridge='virbr0'/>
<filterref filter='clean-traffic'>
<parameter name='CTRL_IP_LEARNING' value='dhcp'/>
</filterref>
</interface>
All filters containing the variable 'IP' are automatically adjusted when
the VM receives an IP address via DHCP. However, multiple IP addresses per
interface are silently ignored in this patch, thus only supporting one IP
address per interface. Multiple IP address support is added in a later
patch in this series.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Currently, monitoring QEMU virtual machines with standard Unix
sysadmin tools is harder than it has to be. The QEMU command line is
often miles long and mostly redundant, it's hard to tell which process
is which.
This patch reorders the QEMU -name argument to be the first, so it's
immediately visible in "ps x", htop and "atop -c" output.
This patch resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=827519
The problem is that an interface with type='hostdev' will have an
alias of the form "hostdev%d", while the function that looks through
existing netdevs to determine the name to use for a new addition will
fail if there's an existing entry that does not match the form
"net%d".
This is another of the handful of places that need an exception due to
the hybrid nature of <interface type='hostdev'> (which is not exactly
an <interface> or a <hostdev>, but is both at the same time).
If we migrate to fd, spec->fwdType is not MIGRATION_FWD_DIRECT,
we will close spec->dest.fd.local in qemuMigrationRun(). So we
should set spec->dest.fd.local to -1 in qemuMigrationRun().
Bug present since 0.9.5 (commit 326176179).
Wen Congyang reported that we have a double-close bug if we fail
virFDStreamOpenInternal, since childfd duplicated one of the fds[]
array contents. In truth, since we always transfer both members
of fds to other variables, we should close the fds through those
other names, and just use fds[] for pipe().
Bug present since 0.9.0 (commit e886237a).
* src/fdstream.c (virFDStreamOpenFileInternal): Swap scope of
childfd and fds[], to avoid a double close.
KAMEZAWA Hiroyuki reported a nasty double-free bug when virCommand
is used to convert a string into input to a child command. The
problem is that the poll() loop of virCommandProcessIO would close()
the write end of the pipe in order to let the child see EOF, then
the caller virCommandRun() would also close the same fd number, with
the second close possibly nuking an fd opened by some other thread
in the meantime. This in turn can have all sorts of bad effects.
The bug has been present since the introduction of virCommand in
commit f16ad06f.
This is based on his first attempt at a patch, at
https://bugzilla.redhat.com/show_bug.cgi?id=823716
* src/util/command.c (_virCommand): Drop inpipe member.
(virCommandProcessIO): Add argument, to avoid closing caller's fd
without informing caller.
(virCommandRun, virCommandNewArgs): Adjust clients.
Apart from the non-sanlock check build, there is also a little fix for
qemu (EXTRA_DIST had qemu.conf and others inside even if the build was
supposed to be without qemu).
Some of our rules used $(PERL), while others used 'perl'. Always
using the variable allows a developer to point to a different (often
better) perl than the default one found on $PATH.
* daemon/Makefile.am ($(srcdir)/remote_dispatch.h): s/perl/$(PERL).
* src/Makefile.am ($(srcdir)/remote/remote_client_bodies.h)
(PDWTAGS, %protocol.c, %_probes.stp): Likewise.
Without this fix, a VPATH build (such as used by ./autobuild.sh)
fails with messages like:
make[3]: Entering directory `/home/remote/eblake/libvirt-tmp2/build/daemon'
../../build-aux/augeas-gentest.pl libvirtd.conf ../../daemon/test_libvirtd.aug.in test_libvirtd.aug
cannot read libvirtd.conf: No such file or directory at ../../build-aux/augeas-gentest.pl line 38.
Since the test files are not part of the tarball, we can generate
them into the build dir, but rather than create a subdirectory
just for the test file, it is easier to test them directly in
libvirt.git/src.
* daemon/Makefile.am (AUG_GENTEST): Factor out definition.
(test_libvirtd.aug): Look for correct file.
* src/Makefile.am (AUG_GENTEST): Use $(PERL).
(qemu/test_libvirtd_qemu.aug, lxc/test_libvirtd_lxc.aug)
(locking/test_libvirt_sanlock.aug): Rename to avoid subdirectories.
(check-augeas-qemu, check-augeas-lxc, check-augeas-sanlock): Reflect
location of built tests.
* configure.ac (PERL): Substitute perl.
Currently, we are logging only one side of pipes we
create in virCommandRequireHandshake(); This is enough
in cases where pipe2() returns two consecutive FDs. However,
it is not guaranteed and it may return any FDs.
Therefore, it's wise to log the other ends as well.
When getting number of CPUs the host has assigned, there was always
number "1" returned. Even though all lxc domains with no pinning
launched by libvirt run on all pCPUs (by default, no matter what's the
number), we should at least return the same number as the user
specified when creating the domain.
I added libvirt_qemu_probes.h into BUILT_SOURCES. That makes it
generated, but most probably it is not the clearest way how to do
that, but it fixes the build.
The previous patch fixed an incremental build, but missed that on
a fresh checkout, we now have nothing left that stops make from
nuking libvirt_qemu_probes.o.
* src/Makefile.am ($(libvirt_driver_qemu_la_SOURCES)): Delete,
since this variable is empty.
(.PRECIOUS): Add %_probes.o, so they don't get nuked as an
intermediate by-product after creating %_probes.lo.
The moment you specify a _DEPENDENCIES, older automake (stupidly)
assumes that you will specify _all_ dependencies for that target.
This stupidity has been fixed in automake 1.12, but we cannot rely on
newer automake everywhere. For libvirt_la_DEPENDENCIES, we took
care of providing the full list, but for libvirt_qemu_la_DEPENDENCIES,
we were missing the dependency on libvirt_qemu_impl.la, which resulted
in a failed build:
make[3]: Entering directory `/home/ajia/Workspace/libvirt/src'
CCLD libvirt_driver_qemu.la
libtool: link: `libvirt_qemu_probes.lo' is not a valid libtool object
* src/Makefile.am (libvirt_driver_qemu_la_DEPENDENCIES): Delete;
automake does a better job if it does the entire job.
==3240== 23 bytes in 1 blocks are definitely lost in loss record 242 of 744
==3240== at 0x4C2A4CD: malloc (vg_replace_malloc.c:236)
==3240== by 0x8077537: __vasprintf_chk (vasprintf_chk.c:82)
==3240== by 0x509C677: virVasprintf (stdio2.h:199)
==3240== by 0x509C733: virAsprintf (util.c:1912)
==3240== by 0x1906583A: qemudStartup (qemu_driver.c:679)
==3240== by 0x511991D: virStateInitialize (libvirt.c:809)
==3240== by 0x40CD84: daemonRunStateInit (libvirtd.c:751)
==3240== by 0x5098745: virThreadHelper (threads-pthread.c:161)
==3240== by 0x7953D8F: start_thread (pthread_create.c:309)
==3240== by 0x805FF5C: clone (clone.S:115)
To ensure consistent error reporting of invalid arguments,
provide a number of predefined helper methods & macros.
- An arg which must not be NULL:
virCheckNonNullArgReturn(argname, retvalue)
virCheckNonNullArgGoto(argname, label)
- An arg which must be NULL
virCheckNullArgGoto(argname, label)
- An arg which must be positive (ie 1 or greater)
virCheckPositiveArgGoto(argname, label)
- An arg which must not be 0
virCheckNonZeroArgGoto(argname, label)
- An arg which must be zero
virCheckZeroArgGoto(argname, label)
- An arg which must not be negative (ie 0 or greater)
virCheckNonNegativeArgGoto(argname, label)
* src/libvirt.c, src/libvirt-qemu.c,
src/nodeinfo.c, src/datatypes.c: Update to use
virCheckXXXX macros
* po/POTFILES.in: Add libvirt-qemu.c and virterror_internal.h
* src/internal.h: Define macros for checking invalid args
* src/util/virterror_internal.h: Define macros for reporting
invalid args
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Libtool is picky about linking against a module library (aka a .so);
giving lots of warnings like this in the tests directory:
CCLD networkxml2argvtest
*** Warning: Linking the executable networkxml2argvtest against the loadable module
*** libvirt_driver_network.so is not portable!
Fix that by splitting things into a convenience library which can
be used directly by the tests, and making the real .so just wrap
the convenience library.
Based on a suggestion by Daniel P. Berrange.
* configure.ac (--with-driver-modules): Fix help test.
* src/Makefile.am (libvirt_driver_xen.la, libvirt_driver_libxl.la)
(libvirt_driver_qemu.la, libvirt_driver_lxc.la)
(libvirt_driver_uml.la): Factor into new convenience libraries.
* tests/Makefile.am (xen_LDADDS, qemu_LDADDS, lxc_LDADDS)
(networkxml2argvtest_LDADD): Link to convenience libraries, not
shared libraries.
There was no rule forcing libvirt_qemu_probes.o to be built
before libvirt_qemu_probes.lo was used. Also libvirtd was
still referencing the .o file, rather than the .lo file.
Both the .lo and .o file must be listed as DEPENDENCIES,
otherwise libtool will unhelpfully delete the .o file
once the .lo file is created.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The CoTaskMemFree function requires the ole32 DLL to be
linked against. Currently this is only done for the
VirtualBox driver. Also add it to libvirt_util.la
* configure.ac: Unconditionally add ole32 DLL to Win32
* src/Makefile.am: Link old32 to libvirt_util.la
When adding new config file parameters, the corresponding
additions to the augeas lens' are constantly forgotten.
Also there are augeas test cases, these don't catch the
error, since they too are never updated.
To address this, the augeas test cases need to be auto-generated
from the example config files.
* build-aux/augeas-gentest.pl: Helper to generate an
augeas test file, substituting in elements from the
example config files
* src/Makefile.am, daemon/Makefile.am: Switch to
auto-generated augeas test cases
* daemon/test_libvirtd.aug, daemon/test_libvirtd.aug.in,
src/locking/test_libvirt_sanlock.aug,
src/locking/test_libvirt_sanlock.aug.in,
src/lxc/test_libvirtd_lxc.aug,
src/lxc/test_libvirtd_lxc.aug.in,
src/qemu/test_libvirtd_qemu.aug,
src/qemu/test_libvirtd_qemu.aug.in: Remove example
config file data, replacing with a ::CONFIG:: placeholder
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently all the config options are listed under a 'vnc_entry'
group. Create a bunch of new groups & move options to the
right place
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add nmissing 'host_uuid' entry to libvirtd.conf lens and
rename spice_passwd to spice_password in qemu.conf lens
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of doing
# example_config
use
#example_config
so it is possible to programatically uncomment example config
options, as distinct from their comment/descriptions
Also delete rogue trailing comma not allowed by lens
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add an impl of +virGetUserRuntimeDirectory, virGetUserCacheDirectory
virGetUserConfigDirectory and virGetUserDirectory for Win32 platform.
Also create stubs for non-Win32 platforms which lack getpwuid_r()
In adding these two helpers were added virFileIsAbsPath and
virFileSkipRoot, along with some macros VIR_FILE_DIR_SEPARATOR,
VIR_FILE_DIR_SEPARATOR_S, VIR_FILE_IS_DIR_SEPARATOR,
VIR_FILE_PATH_SEPARATOR, VIR_FILE_PATH_SEPARATOR_S
All this code was adapted from GLib2 under terms of LGPLv2+ license.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove the uid param from virGetUserConfigDirectory,
virGetUserCacheDirectory, virGetUserRuntimeDirectory,
and virGetUserDirectory
These functions were universally called with the
results of getuid() or geteuid(). To make it practical
to port to Win32, remove the uid parameter and hardcode
geteuid()
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When you try to connect to a socket in the abstract namespace,
the error will be ECONNREFUSED for a non-listening daemon. With
the non-abstract namespace though, you instead get ENOENT. Add
a check for this extra errno when auto-spawning the daemon
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove a number of pointless checks against PATH_MAX and
add a syntax-check rule to prevent its use in future
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Libtool supports linking directly against .o files on some platforms
(such as Linux), which happens to be the only place where we are
actually doing that (for the dtrace-generated probes.o files). However,
it raises a big stink about the non-portability, even though we don't
attempt it on platforms where it would actually fail:
CCLD libvirt_driver_qemu.la
*** Warning: Linking the shared library libvirt_driver_qemu.la against
the non-libtool
*** objects libvirt_qemu_probes.o is not portable!
This shuts libtool up by creating a proper .lo file that matches
what libtool normally expects.
* src/Makefile.am (%_probes.lo): New rule.
(libvirt_probes.stp, libvirt_qemu_probes.stp): Simplify into...
(%_probes.stp): ...shorter rule.
(CLEANFILES): Clean new .lo files.
(libvirt_la_BUILT_LIBADD, libvirt_driver_qemu_la_LIBADD)
(libvirt_lxc_LDADD, virt_aa_helper_LDADD): Link against .lo file.
* tests/Makefile.am (PROBES_O, qemu_LDADDS): Likewise.
If vdsm is installed and configured in Fedora 17, we add the following
items into qemu.conf:
spice_tls=1
spice_tls_x509_cert_dir="/etc/pki/vdsm/libvirt-spice"
However, after this changes, augtool cannot identify qemu.conf anymore.
Add a VIR_ERR_DOMAIN_LAST sentinel for virErrorDomain and
replace the virErrorDomainName function by a VIR_ENUM_IMPL
In the process the naming of error domains is sanitized
* src/util/virterror.c: Use VIR_ENUM_IMPL for converting
error domains to strings
* include/libvirt/virterror.h: Add VIR_ERR_DOMAIN_LAST
The libvirt_private.syms file exports virNetlinkEventServiceLocalPid
so there needs to be a no-op stub for Win32 to avoid linker errors
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
* daemon/libvirtd.c: Set custom driver module dir if the current
binary name is 'lt-libvirtd' (indicating execution directly
from GIT checkout)
* src/driver.c, src/driver.h, src/libvirt_driver_modules.syms: Add
virDriverModuleInitialize to allow driver module location to
be changed
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When building as driver modules, it is not possible for the QEMU
driver module to reference the DTrace/SystemTAP probes linked into
the main libvirt.so. Thus we need to move the QEMU probes into a
separate file 'libvirt_qemu_probes.d'. Also rename the existing
file from 'probes.d' to 'libvirt_probes.d' while we're at it
* daemon/Makefile.am, src/internal.h: Include libvirt_probes.h
instead of probes.h
* src/Makefile.am: Add rules for libvirt_qemu_probes.d
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor_json.c,
src/qemu/qemu_monitor_text.c: Include libvirt_qemu_probes.h
* src/libvirt_probes.d: Rename from probes.d
* src/libvirt_qemu_probes.d: QEMU specific probes formerly
in probes.d
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since we have drivers which depend on each other (ie QEMU/LXC
depend on the network driver APIs), we need to use RTLD_GLOBAL
instead of RTLD_LOCAL. While this pollutes the calling binary
with many more symbols, this is no worse than if we directly
link to the drivers, and this only applies to libvirtd
* src/driver.c: s/RTLD_LOCAL/RTLD_GLOBAL/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Only libvirt_driver_storage.la links to libblkid currently. If
we are running in a scenario with driver modules, LXC must
directly link to it, since it can't assume the storage driver
is present
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The libvirt_test.la library was introduced to allow test suites
to reference internal-only symbols. These days, nearly every
symbol we care about is in src/libvirt_private.syms, so there
is no need for libvirt_test.la to continue to exist
* src/Makefile.am: Delete libvirt_test.la & add new .syms files
* src/libvirt_private.syms: Export symbols needed by test suite
* tests/Makefile.am: Link to libvirt_test.la. Ensure LXC tests link
to network_driver.la
* src/libvirt_esx.syms, src/libvirt_openvz.syms: Add exports needed
by test suite
libvirt_driver_nodedev.la should not link against either
libvirt_util.la or gnulib.la, since libvirt.so brings
in those deps.
* src/Makefile.am: Fix broken linkage of libvirt_driver_nodedev.la
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The driver modules all use symbols which are defined in libvirt.so.
Thus for loading of modules to work, the binary that libvirt.so
is linked to must export its symbols back to modules. If the
libvirt.so itself is dlopen()d then the RTLD_GLOBAL flag must
be set. Unfortunately few, if any, programming languages use
the RTLD_GLOBAL flag when loading modules :-( This means is it
not practical to use driver modules for any libvirt client side
drivers (OpenVZ, VMWare, Hyper-V, Remote client, test).
This patch changes the build process so only server side drivers
are built as modules (Xen, QEMU, LXC, UML)
* daemon/libvirtd.c: Add missing load of 'interface' driver
* src/Makefile.am: Only build server side drivers as modules
* src/libvirt.c: Don't load any driver modules
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
and use it for virDomainParseMemory. This allows to parse arbitrary
scaled value, not only memory related values as needed for the
filesystem limits code following later in this series.
This reverts commit b1e374a7ac, which was
rather bad since I failed to consider all sides of the issue. The main
things I didn't consider properly are:
- a thread which sends a non-blocking call waits for the thread with
the buck to process the call
- the code doesn't expect non-blocking calls to remain in the queue
unless they were already partially sent
Thus, the reverted patch actually breaks more than what it fixes and
clients (which may even be libvirtd during p2p migrations) will likely
end up in a deadlock.
The pciDevice structure corresponding to the device being hot-unplugged
was freed after it was "stolen" from activeList. The pointer was still
used for eg-inactive list. This patch removes the free of the structure
and frees it only if reset fails on the device.
I'm tired of writing:
bool sep = false;
while (...) {
if (sep)
virBufferAddChar(buf, ',');
sep = true;
virBufferAdd(buf, str);
}
This makes it easier, allowing one to write:
while (...)
virBufferAsprintf(buf, "%s,", str);
virBufferTrim(buf, ",", -1);
to trim any remaining comma.
* src/util/buf.h (virBufferTrim): Declare.
* src/util/buf.c (virBufferTrim): New function.
* tests/virbuftest.c (testBufTrim): Test it.
This patch adds support for a new storage backend with RBD support.
RBD is the RADOS Block Device and is part of the Ceph distributed storage
system.
It comes in two flavours: Qemu-RBD and Kernel RBD, this storage backend only
supports Qemu-RBD, thus limiting the use of this storage driver to Qemu only.
To function this backend relies on librbd and librados being present on the
local system.
The backend also supports Cephx authentication for safe authentication with
the Ceph cluster.
For storing credentials it uses the built-in secret mechanism of libvirt.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
When the last reference to a virConnectPtr is released by
libvirtd, it was possible for a deadlock to occur in the
virDomainEventState functions. The virDomainEventStatePtr
holds a reference on virConnectPtr for each registered
callback. When removing a callback, the virUnrefConnect
function is run. If this causes the last reference on the
virConnectPtr to be released, then virReleaseConnect can
be run, which in turns calls qemudClose. This function has
a call to virDomainEventStateDeregisterConn which is intended
to remove all callbacks associated with the virConnectPtr
instance. This will try to grab a lock on virDomainEventState
but this lock is already held. Deadlock ensues
Thread 1 (Thread 0x7fcbb526a840 (LWP 23185)):
Since each callback associated with a virConnectPtr holds a
reference on virConnectPtr, it is impossible for the qemudClose
method to be invoked while any callbacks are still registered.
Thus the call to virDomainEventStateDeregisterConn must in fact
be a no-op. Thus it is possible to just remove all trace of
virDomainEventStateDeregisterConn and avoid the deadlock.
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Delete virDomainEventStateDeregisterConn
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
src/qemu/qemu_driver.c, src/uml/uml_driver.c: Remove
calls to virDomainEventStateDeregisterConn
This patch adds support for the recent ipset iptables extension
to libvirt's nwfilter subsystem. Ipset allows to maintain 'sets'
of IP addresses, ports and other packet parameters and allows for
faster lookup (in the order of O(1) vs. O(n)) and rule evaluation
to achieve higher throughput than what can be achieved with
individual iptables rules.
On the command line iptables supports ipset using
iptables ... -m set --match-set <ipset name> <flags> -j ...
where 'ipset name' is the name of a previously created ipset and
flags is a comma-separated list of up to 6 flags. Flags use 'src' and 'dst'
for selecting IP addresses, ports etc. from the source or
destination part of a packet. So a concrete example may look like this:
iptables -A INPUT -m set --match-set test src,src -j ACCEPT
Since ipset management is quite complex, the idea was to leave ipset
management outside of libvirt but still allow users to reference an ipset.
The user would have to make sure the ipset is available once the VM is
started so that the iptables rule(s) referencing the ipset can be created.
Using XML to describe an ipset in an nwfilter rule would then look as
follows:
<rule action='accept' direction='in'>
<all ipset='test' ipsetflags='src,src'/>
</rule>
The two parameters on the command line are also the two distinct XML attributes
'ipset' and 'ipsetflags'.
FYI: Here is the man page for ipset:
https://ipset.netfilter.org/ipset.man.html
Regards,
Stefan
We were being lazy - virnetlink.c was getting uint32_t as a
side-effect from glibc 2.14's <unistd.h>, but older glibc 2.11
does not provide uint32_t from <unistd.h>. In fact, POSIX states
that <unistd.h> need only provide intptr_t, not all of <stdint.h>,
so the bug really is ours. Reported by Jonathan Alescio.
* src/util/virnetlink.h: Include <stdint.h>.
This involves setting the cpuacct cgroup to a per-vcpu granularity,
as well as summing the each vcpu accounting into a common array.
Now that we are reading more than one cgroup file, we double-check
that cpus weren't hot-plugged between reads to invalidate our
summing.
Signed-off-by: Eric Blake <eblake@redhat.com>
If qemuPrepareHostdevUSBDevices fail it will roll back devices added
to the driver list of used devices. However, if it may fail because
the device is being used already. But then again - with roll back.
Therefore don't try to remove a usb device manually if the function
fail. Although, we want to remove the device if any operation
performed afterwards fail.
Allow the logging APIs to be called with a va_list for format
args, instead of requiring var-args usage.
* src/util/logging.h, src/util/logging.c: Add virLogVMessage
The use of readlink() in lxc_container.c is intentional; we don't
want an absolute pathname there.
* src/util/cgroup.h (VIR_CGROUP_SYSFS_MOUNT): Indent properly.
* cfg.mk (exclude_file_name_regexp--sc_prohibit_readlink): Add
exemption.
One of our latest USB device handling patches
05abd1507d introduced a regression.
That is, we first create a temporary list of all USB devices that
are to be used by domain just starting up. Then we iterate over and
check if a device from the list is in the global list of currently
assigned devices (activeUsbHostdevs). If not, we add it there and
continue with next iteration then. But if a device from temporary
list is either taken already or adding to the activeUsbHostdevs fails,
we remove all devices in temp list from the activeUsbHostdevs list.
Therefore, if a device is already taken we remove it from
activeUsbHostdevs even if we should not. Thus, next time we allow
the device to be assigned to another domain.
Most versions of libselinux do not contain the function
selinux_lxc_contexts_path() that the security driver
recently started using for LXC. We must add a conditional
check for it in configure and then disable the LXC security
driver for builds where libselinux lacks this function.
* configure.ac: Check for selinux_lxc_contexts_path
* src/security/security_selinux.c: Disable LXC security
if selinux_lxc_contexts_path() is missing
Normal practice is for cgroups controllers to be mounted at
/sys/fs/cgroup. When setting up a container, /sys is mounted
with a new sysfs instance, thus we must re-mount all the
cgroups controllers. The complexity is that we must mount
them in the same layout as the host OS. ie if 'cpu' and 'cpuacct'
were mounted at the same location in the host we must preserve
this in the container. Also if any controllers are co-located
we must setup symlinks from the individual controller name to
the co-located mount-point
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Both /proc and /sys may have sub-mounts in them from the host
OS. We must explicitly unmount them all before mounting the
new instance over that location. If we don't then /proc/mounts
will show the sub-mounts as existing, even though nothing will
be able to access them, due to the over-mount.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the LXC config has a filesystem
<filesystem>
<source dir='/'/>
<target dir='/'/>
</filesystem>
then there is no need to go down the pivot root codepath.
We can simply use the existing root as needed.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently to make sysfs readonly, we remount the existing
instance and then bind it readonly. Unfortunately this means
sysfs is still showing device objects wrt the host OS namespace.
We need it to reflect the container namespace, so we must mount
a completely new instance of it. Do the same for selinuxfs since
there is no benefit to bind mounting & this lets us simplify
the code.
* src/lxc/lxc_container.c: Mount fresh sysfs instance
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of hardcoding use of SELinux contexts in the LXC driver,
switch over to using the official security driver API.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some security drivers require special options to be passed to
the mount system call. Add a security driver API for handling
this data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The SELinux policy for LXC uses a different configuration file
than the traditional svirt one. Thus we need to load
/etc/selinux/targeted/contexts/lxc_contexts which contains
something like this:
process = "system_u:system_r:svirt_lxc_net_t:s0"
file = "system_u:object_r:svirt_lxc_file_t:s0"
content = "system_u:object_r:virt_var_lib_t:s0"
cleverly designed to be parsable by virConfPtr
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the SELinux driver stores its state in a set of global
variables. This switches it to use a private data struct instead.
This will enable different instances to have their own data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The AppArmour driver does not currently have support for LXC
so ensure that when probing, it claims to be disabled
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To allow the security drivers to apply different configuration
information per hypervisor, pass the virtualization driver name
into the security manager constructor.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Thanks to this new option we are now able to use modern CPU models (such
as Westmere) defined in external configuration file.
The qemu-1.1{,-device} data files for qemuhelptest are filled in with
qemu-1.1-rc2 output for now. I will update those files with real
qemu-1.1 output once it is released.
The uhci1, uhci2, uhci3 companion controllers for ehci1 must
have a master start port set. Since this value is predictable
we should set it automatically if the app does not supply it
Currently each USB2 companion controller gets put on a separate
PCI slot. Not only is this wasteful of PCI slots, but it is not
in compliance with the spec for USB2 controllers. The master
echi1 and all companion controllers should be in the same slot,
with echi1 in function 7, and uhci1-3 in functions 0-2 respectively.
* src/qemu/qemu_command.c: Special case handling of USB2 controllers
to apply correct pci slot assignment
* tests/qemuxml2argvdata/qemuxml2argv-usb-ich9-ehci-addr.args,
tests/qemuxml2argvdata/qemuxml2argv-usb-ich9-ehci-addr.xml: Expand
test to cover automatic slot assignment
The virDomainDeviceInfoIsSet API was only checking if an
address or alias was set in the struct. Thus if only a
rom bar setting / filename, boot index, or USB master
value was set, they could be accidentally dropped when
formatting XML
Callers of virGetUser{Config,Runtime,Cache}Directory all
append further path component. We should not be
adding a trailing slash in the return path otherwise we
get paths containing '//'
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Sometimes it is useful to see the callpath for log messages.
This change enhances the log filter syntax so that stack traces
can be show by setting '1:+NAME' instead of '1:NAME'.
This results in output like:
2012-05-09 14:18:45.136+0000: 13314: debug : virInitialize:414 : register drivers
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virInitialize+0xd6)[0x7f89188ebe86]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x431921]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3a21e21735]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x40a279]
2012-05-09 14:18:45.136+0000: 13314: debug : virRegisterDriver:775 : driver=0x7f8918d02760 name=Test
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virRegisterDriver+0x6b)[0x7f89188ec717]
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(+0x11b3ad)[0x7f891891e3ad]
/home/berrange/src/virt/libvirt/src/.libs/libvirt.so.0(virInitialize+0xf3)[0x7f89188ebea3]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x431921]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3a21e21735]
/home/berrange/src/virt/libvirt/tools/.libs/lt-virsh[0x40a279]
* docs/logging.html.in: Document new syntax
* configure.ac: Check for execinfo.h
* src/util/logging.c, src/util/logging.h: Add support for
stack traces
* tests/testutils.c: Adapt to API change
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The current unprivileged user libvirtd sockets are in the abstract
namespace. This has a number of problems
- You can't connect to them remotely using the nc/ssh tunnel
- This is not portable for OS-X, BSD & probably others
- Parent directory permissions don't apply
"Instead of developing one CPU with 12 cores, the Magny Cours is
actually two 6 core “Bulldozer” CPUs combined in to one package"
I.e, each package has two NUMA nodes, and the two numa nodes share
the same core ID set (0-6), which means parsing the cores number
from sysfs doesn't work in this case.
And the wrong CPU number could cause three problems for libvirt:
1) performance lost
A domain without "cpuset" or "placement='auto'" (to drive numad)
specified will be only pinned to part of the CPUs.
2) domain can be started
If a domain uses numad, and the advisory nodeset returned from
numad contains node which exceeds the range of wrong total CPU
number. The domain will fail to start, as the bitmask passed to
sched_setaffinity could be fully filled with zero.
3) wrong CPU number affects lots of stuffs.
E.g. for command "virsh vcpuinfo", "virsh vcpupin", it will always
output with the truncated CPU list.
For more details:
https://www.redhat.com/archives/libvir-list/2012-May/msg00607.html
This patch is to fix the problem by parsing /proc/cpuinfo to get
the value of field "cpu cores", and use it as nodeinfo->cores if
it's greater than the cores number from sysfs.
Like for 'static' placement, when the memory policy mode is
'strict', set the memory policy by writing the advisory nodeset
returned from numad to cgroup file cpuset.mems,
On some of the NUMA platforms, the CPU index in each NUMA node
grows non-consecutive. While on other platforms, it can be inconsecutive,
E.g.
% numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 4 8 12 16 20 24 28
node 0 size: 131058 MB
node 0 free: 86531 MB
node 1 cpus: 1 5 9 13 17 21 25 29
node 1 size: 131072 MB
node 1 free: 127070 MB
node 2 cpus: 2 6 10 14 18 22 26 30
node 2 size: 131072 MB
node 2 free: 127758 MB
node 3 cpus: 3 7 11 15 19 23 27 31
node 3 size: 131072 MB
node 3 free: 127226 MB
node distances:
node 0 1 2 3
0: 10 20 20 20
1: 20 10 20 20
2: 20 20 10 20
3: 20 20 20 10
This patch is to fix the problem by using the CPU index in
caps->host.numaCell[i]->cpus[i] to set the bitmask instead of
assuming the CPU index of the NUMA nodes are always sequential.
For pseries guest, the default controller model is
ibmvscsi controller, this controller only can work
on spapr-vio address.
This patch is to assign spapr-vio address type to
ibmvscsi controller and correct vscsi test case.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
We had previously weakened our nodeinfotest in order to ignore parsed
node values, because the parse function was mistakenly relying on
host files. A better fix is to avoid using the numactl library, but
to instead parse the same files that numactl would read, all while
allowing the files to be relative to our choice of directory.
* src/nodeinfo.c (CPU_SYS_PATH, NODE_SYS_PATH): Replace with...
(SYSFS_SYSTEM_PATH): ...parent directory.
(linuxNodeInfoCPUPopulate): Check NUMA nodes from requested
directory (by inlining numactl code).
(nodeGetCPUmap, nodeGetMemoryStats): Adjust macro use.
* tests/nodeinfotest.c (linuxTestCompareFiles, linuxTestNodeInfo):
Update test to match.
We were wasting time to malloc a copy of a constant string, then
copy it into static storage, for every call to nodeGetInfo. At
least we were lucky that it was a constant source, and thus not
subject to even worse issues with one thread clobbering the static
storage while another was using it. This gets rid of the waste,
by passing the string through the stack instead, as well as renaming
internal functions to better match our conventions.
* src/nodeinfo.c (sysfs_path): Delete.
(get_cpu_value, count_thread_siblings, parse_socket): Add
parameter, and rename...
(virNodeGetCpuValue, virNodeCountThreadSiblings)
(virNodeParseSocket): ... into a common namespace.
(cpu_online, parse_core): Inline into callers.
(linuxNodeInfoCPUPopulate): Update caller.
(nodeGetInfo): Drop a useless malloc.
Commit cdce2f42d tried to silence a compiler warning on 32-bit builds,
but the gcc shipped with RHEL 5 is old enough that the type conversion
via multiplication by 1 was insufficient for the task.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Previous attempt
didn't get past all gcc versions.
As defined in:
http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
This offers a number of advantages:
* Allows sharing a home directory between different machines, or
sessions (eg. using NFS)
* Cleanly separates cache, runtime (eg. sockets), or app data from
user settings
* Supports performing smart or selective migration of settings
between different OS versions
* Supports reseting settings without breaking things
* Makes it possible to clear cache data to make room when the disk
is filling up
* Allows us to write a robust and efficient backup solution
* Allows an admin flexibility to change where data and settings are stored
* Dramatically reduces the complexity and incoherence of the
system for administrators
Appending an item to a list transfers ownership of that item to the
list owner. But an error can occur in between item allocation and
appending it to the list. In this case the item has to be freed
explicitly. This was not done in some special cases resulting in
possible memory leaks.
Reported by Coverity.
This patch lifts the limit of calling thread detection code only on KVM
guests. With upstream qemu the thread mappings are reported also on
non-KVM machines.
QEMU adopted the thread_id information from the kvm branch.
To remain compatible with older upstream versions of qemu the check is
attempted but the failure to detect threads (or even run the monitor
command - on older versions without SMP support) is treated non-fatal
and the code reports one vCPU with pid of the hypervisor (in same
fashion this was done on non-KVM guests).
After a cpu hotplug the qemu driver did not refresh information about
virtual processors used by qemu and their corresponding threads. This
patch forces a re-detection as is done on start of QEMU.
This ensures that correct information is reported by the
virDomainGetVcpus API and "virsh vcpuinfo".
A failure to obtain the thread<->vcpu mapping is treated non-fatal and
the mapping is not updated in a case of failure as not all versions of
QEMU report this in the info cpus command.
This patch changes a switch statement into ifs when handling live vs.
configuration modifications getting rid of redundant code in case when
both live and persistent configuration gets changed.
when failing to attach another usb device to a domain for some reason
which has one use device attached before, the libvirtd crashed.
The crash is caused by null-pointer dereference error in invoking
usbDeviceListSteal passed in NULL value usb variable.
commit 05abd1507d introduces the bug.
Detected by valgrind. Leaks are introduced in commit 122fa379.
src/conf/storage_conf.c: fix memory leaks.
How to reproduce?
$ make && make -C tests check TESTS=storagepoolxml2xmltest
$ cd tests && valgrind -v --leak-check=full ./storagepoolxml2xmltest
actual result:
==28571== LEAK SUMMARY:
==28571== definitely lost: 40 bytes in 5 blocks
==28571== indirectly lost: 0 bytes in 0 blocks
==28571== possibly lost: 0 bytes in 0 blocks
==28571== still reachable: 1,054 bytes in 21 blocks
==28571== suppressed: 0 bytes in 0 blocks
Signed-off-by: Alex Jia <ajia@redhat.com>
No useful error was being reported when an invalid character device
target type is specified in the domainXML. E.g.
...
<console type="pty">
<source path="/dev/pts/2"/>
<target type="kvm" port="0"/>
</console>
...
resulted in
error: Failed to define domain from x.xml
error: An error occurred, but the cause is unknown
With this small patch, the error is more helpful
error: Failed to define domain from x.xml
error: XML error: unknown target type 'kvm' specified for character device
<vcpu> is not an optional node. The value for its 'placement'
actually always defaults to 'static' in the underlying codes.
(Even no 'cpuset' and 'placement' is specified, the domain
process will be pinned to all the available pCPUs).
Though numad will manage the memory allocation of task dynamically,
it wants management application (libvirt) to pre-set the memory
policy according to the advisory nodeset returned from querying numad,
(just like pre-bind CPU nodeset for domain process), and thus the
performance could benefit much more from it.
This patch introduces new XML tag 'placement', value 'auto' indicates
whether to set the memory policy with the advisory nodeset from numad,
and its value defaults to the value of <vcpu> placement, or 'static'
if 'nodeset' is specified. Example of the new XML tag's usage:
<numatune>
<memory placement='auto' mode='interleave'/>
</numatune>
Just like what current "numatune" does, the 'auto' numa memory policy
setting uses libnuma's API too.
If <vcpu> "placement" is "auto", and <numatune> is not specified
explicitly, a default <numatume> will be added with "placement"
set as "auto", and "mode" set as "strict".
The following XML can now fully drive numad:
1) <vcpu> placement is 'auto', no <numatune> is specified.
<vcpu placement='auto'>10</vcpu>
2) <vcpu> placement is 'auto', no 'placement' is specified for
<numatune>.
<vcpu placement='auto'>10</vcpu>
<numatune>
<memory mode='interleave'/>
</numatune>
And it's also able to control the CPU placement and memory policy
independently. e.g.
1) <vcpu> placement is 'auto', and <numatune> placement is 'static'
<vcpu placement='auto'>10</vcpu>
<numatune>
<memory mode='strict' nodeset='0-10,^7'/>
</numatune>
2) <vcpu> placement is 'static', and <numatune> placement is 'auto'
<vcpu placement='static' cpuset='0-24,^12'>10</vcpu>
<numatune>
<memory mode='interleave' placement='auto'/>
</numatume>
A follow up patch will change the XML formatting codes to always output
'placement' for <vcpu>, even it's 'static'.
It turns out that when cgroups are enabled, the use of a block device
for a snapshot target was failing with EPERM due to libvirt failing
to add the block device to the cgroup whitelist. See also
https://bugzilla.redhat.com/show_bug.cgi?id=810200
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive): Account for cgroup.
(qemuDomainSnapshotCreateDiskActive): Update caller.
qemu's behavior in this case is to change the spice server behavior to
require secure connection to any channel not otherwise specified as
being in plaintext mode. libvirt doesn't currently allow requesting this
(via plaintext-channel=<channel name>).
RHBZ: 819499
Signed-off-by: Alon Levy <alevy@redhat.com>
Until now, the nl_pid of the source address of every message sent by
virNetlinkCommand has been set to the value of getpid(). Most of the
time this doesn't matter, and in the one case where it does
(communication with lldpad), it previously was the proper thing to do,
because the netlink event service (which listens on a netlink socket
for unsolicited messages from lldpad) coincidentally always happened
to bind with a local nl_pid == getpid().
With the fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
that particular nl_pid is now effectively a reserved value, so the
netlink event service will always bind to something else
(coincidentally "getpid() + (1 << 22)", but it really could be
anything). The result is that communication between lldpad and
libvirtd is broken (lldpad gets a "disconnected" error when it tries
to send a directed message).
The solution to this problem caused by a solution, is to query the
netlink event service's nlhandle for its "local_port", and send that
as the source nl_pid (but only when sending to lldpad, of course - in
other cases we maintain the old behavior of sending getpid()).
There are two cases where a message is being directed at lldpad - one
in virNetDevLinkDump, and one in virNetDevVPortProfileOpSetLink.
The case of virNetDevVPortProfileOpSetLink is simplest to explain -
only if !nltarget_kernel, i.e. the message isn't targetted for the
kernel, is the dst_pid set (by calling
virNetDevVPortProfileGetLldpadPid()), so only in that case do we call
virNetlinkEventServiceLocalPid() to set src_pid.
For virNetDevLinkDump, it's a bit more complicated. The call to
virNetDevVPortProfileGetLldpadPid() was effectively up one level (in
virNetDevVPortProfileOpCommon), although obscured by an unnecessary
passing of a function pointer. This patch removes the function
pointer, and calls virNetDevVPortProfileGetLldpadPid() directly in
virNetDevVPortProfileOpCommon - if it's doing this, it knows that it
should also call virNetlinkEventServiceLocalPid() to set src_pid too;
then it just passes src_pid and dst_pid down to
virNetDevLinkDump. Since (src_pid == 0 && dst_pid == 0) implies that
the kernel is the destination, there is no longer any need to send
nltarget_kernel as an arg to virNetDevLinkDump, so it's been removed.
The disparity between src_pid being int and dst_pid being uint32_t may
be a bit disconcerting to some, but I didn't want to complicate
virNetlinkEventServiceLocalPid() by having status returned separately
from the value.
This value will be needed to set the src_pid when sending netlink
messages to lldpad. It is part of the solution to:
https://bugzilla.redhat.com/show_bug.cgi?id=816465
Note that libnl's port generation algorithm guarantees that the
nl_socket_get_local_port() will always be > 0 (since it is "getpid() +
(n << 22>" where n is always < 1024), so it is okay to cast the
uint32_t to int (thus allowing us to use -1 as an error sentinel).
Until now, virNetlinkCommand has assumed that the nl_pid in the source
address of outgoing netlink messages should always be the return value
of getpid(). In most cases it actually doesn't matter, but in the case
of communication with lldpad, lldpad saves this info and later uses it
to send netlink messages back to libvirt. A recent patch to fix Bug
816465 changed the order of the universe such that the netlink event
service socket is no longer bound with nl_pid == getpid(), so lldpad
could no longer send unsolicited messages to libvirtd. Adding src_pid
as an argument to virNetlinkCommand() is the first step in notifying
lldpad of the proper address of the netlink event service socket.
src/qemu/qemu_hostdev.c:
refactor qemuPrepareHostdevUSBDevices function, make it focus on
adding usb device to activeUsbHostdevs after check. After that,
the usb hotplug function qemuDomainAttachHostDevice also could use
it.
expand qemuPrepareHostUSBDevices to perform the usb search,
rollback on failure.
src/qemu/qemu_hotplug.c:
If there are multiple usb devices available with same vendorID and productID,
but with different value of "bus, device", we give an error to let user
use <address> to specify the desired one.
usbFindDevice():get usb device according to
idVendor, idProduct, bus, device
it is the exact match of the four parameters
usbFindDeviceByBus():get usb device according to bus, device
it returns only one usb device same as usbFindDevice
usbFindDeviceByVendor():get usb device according to idVendor,idProduct
it probably returns multiple usb devices.
usbDeviceSearch(): a helper function to do the actual search
When we added the default USB controller into domain XML, we efficiently
broke migration to older versions of libvirt that didn't support USB
controllers at all (0.9.4 and earlier) even for domains that don't use
anything that the older libvirt can't provide. We still want to present
the default USB controller in any XML seen by a user/app but we can
safely remove it from the domain XML used during migration. If we are
migrating to a new enough libvirt, it will add the controller XML back,
while older libvirt won't be confused with it although it will still
tell qemu to create the controller.
Similar approach can be used in the future whenever we find out we
always enabled some kind of device without properly advertising it in
domain XML.
Commit 4c82f09e added a capability check for qemu per-device io
throttling, but only applied it to domain startup. As mentioned
in the previous commit (98cec05), the user can still get an 'internal
error' message during a hotplug attempt, when the monitor command
doesn't exist. It is confusing to allow tuning on inactive domains
only to then be rejected when starting the domain.
* src/qemu/qemu_driver.c (qemuDomainSetBlockIoTune): Reject
offline tuning if online can't match it.
If you have a qemu build that lacks the blockio tune monitor command,
then this command:
$ virsh blkdeviotune rhel6u2 hda --total_bytes_sec 1000
error: Unable to change block I/O throttle
error: internal error Unexpected error
fails as expected (well, the error message is lousy), but the next
dumpxml shows that the domain was modified anyway. Worse, that means
if you save the domain then restore it, the restore will likely fail
due to throttling being unsupported, even though no throttling should
even be active because the monitor command failed in the first place.
* src/qemu/qemu_driver.c (qemuDomainSetBlockIoTune): Check for
error before making modification permanent.
These two functions are called from main() on all platforms, and
always return success on platforms that don't support libnl. They
still log an error message, though, which doesn't make sense - they
should just be NOPs on those platforms. (Per a suggestion during
review, I've turned the logs into debug messages rather than removing
them completely).
Error: STRING_NULL:
/libvirt/src/node_device/node_device_linux_sysfs.c:80:
string_null_argument: Function "saferead" does not terminate string "*buf".
/libvirt/src/util/util.c:101:
string_null_argument: Function "read" fills array "*buf" with a non-terminated string.
/libvirt/src/node_device/node_device_linux_sysfs.c:87:
string_null: Passing unterminated string "buf" to a function expecting a null-terminated string.
Error: STRING_NULL:
/libvirt/src/util/uuid.c:273:
string_null_argument: Function "getDMISystemUUID" does not terminate string "*dmiuuid".
/libvirt/src/util/uuid.c:241:
string_null_argument: Function "saferead" fills array "*uuid" with a non-terminated string.
/libvirt/src/util/util.c:101:
string_null_argument: Function "read" fills array "*buf" with a non-terminated string.
/libvirt/src/util/uuid.c:274:
string_null: Passing unterminated string "dmiuuid" to a function expecting a null-terminated string.
/libvirt/src/util/uuid.c:138:
var_assign_parm: Assigning: "cur" = "uuidstr". They now point to the same thing.
/libvirt/src/util/uuid.c:164:
string_null_sink_loop: Searching for null termination in an unterminated array "cur".
Error: RESOURCE_LEAK:
/libvirt/src/qemu/qemu_driver.c:6968:
alloc_fn: Calling allocation function "calloc".
/libvirt/src/qemu/qemu_driver.c:6968:
var_assign: Assigning: "nodeset" = storage returned from "calloc(1UL, 1UL)".
/libvirt/src/qemu/qemu_driver.c:6977:
noescape: Variable "nodeset" is not freed or pointed-to in function "virTypedParameterAssign".
/libvirt/src/qemu/qemu_driver.c:6997:
leaked_storage: Variable "nodeset" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/libvirt/src/vmx/vmx.c:2431:
alloc_fn: Calling allocation function "calloc".
/libvirt/src/vmx/vmx.c:2431:
var_assign: Assigning: "networkName" = storage returned from "calloc(1UL, 1UL)".
/libvirt/src/vmx/vmx.c:2495:
leaked_storage: Variable "networkName" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/nodeinfo.c:629: alloc_fn: Calling allocation function "fopen".
/builddir/build/BUILD/libvirt-0.9.10/src/nodeinfo.c:629: var_assign: Assigning: "cpuinfo" = storage returned from "fopen("/proc/cpuinfo", "r")".
/builddir/build/BUILD/libvirt-0.9.10/src/nodeinfo.c:638: leaked_storage: Variable "cpuinfo" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/test/test_driver.c:1041: alloc_arg: Calling allocation function "virXPathNodeSet" on "devs".
/builddir/build/BUILD/libvirt-0.9.10/src/util/xml.c:621: alloc_arg: "virAllocN" allocates memory that is stored into "*list".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:129: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:129: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(count, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/util/xml.c:625: noescape: Variable "*list" is not freed or pointed-to in function "memcpy".
/builddir/build/BUILD/libvirt-0.9.10/src/test/test_driver.c:1098: leaked_storage: Variable "devs" going out of scope leaks the storage it points to.
Coverity logs:
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_inotify.c:103: alloc_fn: Calling allocation function "xenDaemonLookupByUUID".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xend_internal.c:2534: alloc_fn: Storage is returned from allocation function "virGetDomain".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:191: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:210: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xend_internal.c:2534: var_assign: Assigning: "ret" = "virGetDomain(conn, name, uuid)".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xend_internal.c:2541: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_inotify.c:103: var_assign: Assigning: "dom" = storage returned from "xenDaemonLookupByUUID(conn, rawuuid)".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_inotify.c:126: leaked_storage: Variable "dom" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2742: alloc_fn: Calling allocation function "fopen".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2742: var_assign: Assigning: "cpuinfo" = storage returned from "fopen("/proc/cpuinfo", "r")".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2763: noescape: Variable "cpuinfo" is not freed or pointed-to in function "xenHypervisorMakeCapabilitiesInternal".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2574:45: noescape: "xenHypervisorMakeCapabilitiesInternal" does not free or save its pointer parameter "cpuinfo".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2768: leaked_storage: Variable "cpuinfo" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2752: alloc_fn: Calling allocation function "fopen".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2752: var_assign: Assigning: "capabilities" = storage returned from "fopen("/sys/hypervisor/properties/capabilities", "r")".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2763: noescape: Variable "capabilities" is not freed or pointed-to in function "xenHypervisorMakeCapabilitiesInternal".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2574:60: noescape: "xenHypervisorMakeCapabilitiesInternal" does not free or save its pointer parameter "capabilities".
/builddir/build/BUILD/libvirt-0.9.10/src/xen/xen_hypervisor.c:2768: leaked_storage: Variable "capabilities" going out of scope leaks the storage it points to.
Coverity logs:
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:523: alloc_fn: Calling allocation function "fopen".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:523: var_assign: Assigning: "fd" = storage returned from "fopen(local_file, "rb")".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:540: noescape: Variable "fd" is not freed or pointed-to in function "fread".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:542: noescape: Variable "fd" is not freed or pointed-to in function "feof".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:575: leaked_storage: Variable "fd" going out of scope leaks the storage it points to.
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:585: leaked_storage: Variable "fd" going out of scope leaks the storage it points to.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2088: alloc_fn: Calling allocation function "phypVolumeLookupByName".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2026: alloc_fn: Storage is returned from allocation function "virGetStorageVol".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:724: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:753: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2026: var_assign: Assigning: "vol" = "virGetStorageVol(pool->conn, pool->name, volname, key)".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2030: return_alloc: Returning allocated memory "vol".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2088: leaked_storage: Failing to save storage allocated by "phypVolumeLookupByName(pool, voldef->name)" leaks it.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2725: alloc_fn: Calling allocation function "phypGetStoragePoolLookUpByUUID".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2689: alloc_fn: Storage is returned from allocation function "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:592: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:610: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2689: var_assign: Assigning: "sp" = "virGetStoragePool(conn, pools[i], uuid)".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2694: return_alloc: Returning allocated memory "sp".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2725: leaked_storage: Failing to save storage allocated by "phypGetStoragePoolLookUpByUUID(conn, def->uuid)" leaks it.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2719: alloc_fn: Calling allocation function "phypStoragePoolLookupByName".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2254: alloc_fn: Storage is returned from allocation function "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:592: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:610: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2254: return_alloc_fn: Directly returning storage allocated by "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2719: leaked_storage: Failing to save storage allocated by "phypStoragePoolLookupByName(conn, def->name)" leaks it.
Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2270: alloc_fn: Calling allocation function "phypStoragePoolLookupByName".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2254: alloc_fn: Storage is returned from allocation function "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:592: alloc_arg: "virAlloc" allocates memory that is stored into "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: alloc_fn: Storage is returned from allocation function "calloc".
/builddir/build/BUILD/libvirt-0.9.10/src/util/memory.c:101: var_assign: Assigning: "*((void **)ptrptr)" = "calloc(1UL, size)".
/builddir/build/BUILD/libvirt-0.9.10/src/datatypes.c:610: return_alloc: Returning allocated memory "ret".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2254: return_alloc_fn: Directly returning storage allocated by "virGetStoragePool".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2270: var_assign: Assigning: "sp" = storage returned from "phypStoragePoolLookupByName(vol->conn, vol->pool)".
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2324: leaked_storage: Variable "sp" going out of scope leaks the storage it points to.
/builddir/build/BUILD/libvirt-0.9.10/src/phyp/phyp_driver.c:2327: leaked_storage: Variable "sp" going out of scope leaks the storage it points t
On 32-bit platforms, gcc warns that the comparison between a long
and (ULLONG_MAX/1024/1024) is always false; throwing in a type
conversion shuts up the warning.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Shut gcc up.
configure.ac: check for libnl-3 in addition to libnl-1
src/Makefile.am: link against libnl when needed
src/util/virnetlink.c:
support libnl3 api. To minimize impact on code flow, wrap the
differences under the virNetlink* namespace.
Unfortunately libnl3 moves netlink/msg.h to
/usr/include/libnl3/netlink/msg.h, so the LIBNL_CFLAGS need to be added
to a bunch of places where they weren't needed with libnl1.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Add function virJSONValueObjectKeysNumber, virJSONValueObjectGetKey
and virJSONValueObjectGetValue, which allow you to iterate over all
fields of json object: you can get number of fields and then get
name and value, stored in field with that name by index.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
The ATTRIBUTE_NONNULL(m) macro normally resolves to the gcc builtin
__attribute__((__nonnull__(m))). The effect of this in gcc is
unfortunately only to make gcc believe that "m" can never possibly be
NULL, *not* to add in any checks to guarantee that it isn't ever NULL
(i.e. it is an optimization aid, *not* something to verify code
correctness.) - see the following gcc bug report for more details:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17308
Static source analyzers such as clang and coverity apparently can use
ATTRIBUTE_NONNULL(), though, to detect dead code (in the case that the
arg really is guaranteed non-NULL), as well as situations where an
obviously NULL arg is given to the function.
https://bugzilla.redhat.com/show_bug.cgi?id=815270 is a good example
of a bug caused by erroneous application of ATTRIBUTE_NONNULL().
Several people spent a long time staring at this code and not finding
the problem, because the problem wasn't in the function itself, but in
the prototype that specified ATTRIBUTE_NONNULL() for an arg that
actually *wasn't* always non-NULL, and caused a segv when dereferenced
(even though the code that dereferenced the pointer was inside an if()
that checked for a NULL pointer, that code was optimized out by gcc).
There may be some very small gain to be had from the optimizations
that can be inferred from ATTRIBUTE_NONNULL(), but it seems safer to
err on the side of generating code that behaves as expected, while
turning on the attribute for static analyzers.
Once lxcContainerSetStdio is invoked, logging will not work as
expected in libvirt_lxc. So make sure this is the last thing to
be called, in particular after setting the security process label
The virLogSetFromEnv call was done too late in startup to
catch many log messages (eg from security driver initialization).
To assist debugging also explicitly log the security details
at startup
The driver->securityDriverName field may be NULL, if automatic
probing is used to determine security driver. This meant that
unless selinux was explicitly requested in lxc.conf, it was
not being sent to the libvirt_lxc process.
The driver->securityManager field is guaranteed non-NULL, since
there will always be the 'none' security driver present if
nothing else exists. So use that to set the driver name for
libvirt_lxc
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the libvirt_lxc process uses VIR_DOMAIN_XML_INACTIVE
when loading the XML for the container. This means it loses
any dynamic data such as the, just allocated, SELinux label.
Further there is an inconsistency in the libvirt LXC driver
whereby it saves the live config XML and then later overwrites
the file with the live status XML instead. Add a comment about
this for future reference.
* src/lxc/lxc_controller.c: Remove VIR_DOMAIN_XML_INACTIVE
when loading XML
* src/lxc/lxc_driver.c: Add comment about inconsistent
config file formats
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This works with newer qemu that doesn't allow escaping spaces.
It's backwards compatible as well.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
In fact, the 'tapfd' is always NULL, the function 'virNetDevTapCreate()' hasn't
assign 'fd' to 'tapfd', when the function 'virNetDevSetMAC()' is failed then
goto 'error' label, finally, the VIR_FORCE_CLOSE() will deref a NULL 'tapfd'.
* util/virnetdevtap.c (virNetDevTapCreateInBridgePort): fix a NULL pointer derefing.
* How to reproduce?
$ cat > /tmp/net.xml <<EOF
<network>
<name>test</name>
<forward mode='nat'/>
<bridge name='br1' stp='off' delay='1' />
<mac address='00:00:00:00:00:00'/>
<ip address='192.168.100.1' netmask='255.255.255.0'>
<dhcp>
<range start='192.168.100.2' end='192.168.100.254' />
</dhcp>
</ip>
</network>
EOF
$ virsh net-define /tmp/net.xml
$ virsh net-start test
error: Failed to start network brTest
error: End of file while reading data: Input/output error
Signed-off-by: Alex Jia <ajia@redhat.com>
The previous storage patch missed an instance affected by the struct
member rename. It also had some botched whitespace detected by
'make check'.
* src/storage/storage_backend_iscsi.c
(virStorageBackendISCSIFindPoolSources): Adjust to new struct.
* src/conf/storage_conf.c (virStoragePoolSourceFormat): Fix
indentation.
It doesn't break out the "for" loop even if duplicate pool is
found, and thus the "matchpool" could be overriden as NULL again
if there is different pool afterwards.
To address the problem in libvirt-user list:
https://www.redhat.com/archives/libvirt-users/2012-April/msg00150.html
The current storage pools for NFS and iSCSI only require one host to
connect to. Future storage pools like RBD and Sheepdog will require
multiple hosts.
This patch allows multiple source hosts and rewrites the current
storage drivers.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
When libvirtd is started, we create "libvirt/qemu" directories under
hugetlbfs mount point. Only the "qemu" subdirectory is chowned to qemu
user and "libvirt" remains owned by root. If umask was too restrictive
when libvirtd started, qemu user may lose access to "qemu"
subdirectory. Let's explicitly grant search permissions to "libvirt"
directory for all users.
Currently, qemu GA is not providing 'desc' field for errors like
we are used to from qemu monitor. Therefore, we fall back to this
general 'unknown error' string. However, GA is reporting 'class' which
is not perfect, but much more helpful than generic error string.
Thus we should fall back to class firstly and if even no class
is presented, then we can fall back to that generic string.
Before this patch:
virsh # dompmsuspend --target mem f16
error: Domain f16 could not be suspended
error: internal error unable to execute QEMU command
'guest-suspend-ram': unknown QEMU command error
After this patch:
virsh # dompmsuspend --target mem f16
error: Domain f16 could not be suspended
error: internal error unable to execute QEMU command
'guest-suspend-ram': The command has not been found
More bug extermination in the category of:
Error: CHECKED_RETURN:
/libvirt/src/conf/network_conf.c:595:
check_return: Calling function "virAsprintf" without checking return value (as is done elsewhere 515 out of 543 times).
/libvirt/src/qemu/qemu_process.c:2780:
unchecked_value: No check of the return value of "virAsprintf(&msg, "was paused (%s)", virDomainPausedReasonTypeToString(reason))".
/libvirt/tests/commandtest.c:809:
check_return: Calling function "setsid" without checking return value (as is done elsewhere 4 out of 5 times).
/libvirt/tests/commandtest.c:830:
unchecked_value: No check of the return value of "virTestGetDebug()".
/libvirt/tests/commandtest.c:831:
check_return: Calling function "virTestGetVerbose" without checking return value (as is done elsewhere 41 out of 42 times).
/libvirt/tests/commandtest.c:833:
check_return: Calling function "virInitialize" without checking return value (as is done elsewhere 18 out of 21 times).
One note about the error in commandtest line 809: setsid() seems to fail when running the test -- could be removed ?
With RHEL 6.2, virDomainBlockPull(dom, dev, bandwidth, 0) has a race
with non-zero bandwidth: there is a window between the block_stream
and block_job_set_speed monitor commands where an unlimited amount
of data was let through, defeating the point of a throttle.
This race was first identified in commit a9d3495e, and libvirt was
able to reduce the size of the window for that race. In the meantime,
the qemu developers decided to fix things properly; per this message:
https://lists.gnu.org/archive/html/qemu-devel/2012-04/msg03793.html
the fix will be in qemu 1.1, and changes block-job-set-speed to use
a different parameter name, as well as adding a new optional parameter
to block-stream, which eliminates the race altogether.
Since our documentation already mentioned that we can refuse a non-zero
bandwidth for some hypervisors, I think the best solution is to do
just that for RHEL 6.2 qemu, so that the race is obvious to the user
(anyone using stock RHEL 6.2 binaries won't have this patch, and anyone
building their own libvirt with this patch for RHEL can also rebuild
qemu to get the modern semantics, so it is no real loss in behavior).
Meanwhile the code must be fixed to honor actual qemu 1.1 naming.
Rename the parameter to 'modern', since the naming difference now
covers more than just 'async' block-job-cancel. And while at it,
fix an unchecked integer overflow.
* src/qemu/qemu_monitor.h (enum BLOCK_JOB_CMD): Drop unused value,
rename enum to match conventions.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Reflect enum rename.
* src/qemu_qemu_monitor_json.h (qemuMonitorJSONBlockJob): Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Likewise,
and support difference between RHEL 6.2 and qemu 1.1 block pull.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Reject
bandwidth during pull with too-old qemu.
* src/libvirt.c (virDomainBlockPull, virDomainBlockRebase):
Document this.
Error: UNINIT:
/libvirt/src/lxc/lxc_driver.c:1412:
var_decl: Declaring variable "fd" without initializer.
/libvirt/src/lxc/lxc_driver.c:1460:
uninit_use_in_call: Using uninitialized value "fd" when calling "virFileClose".
/libvirt/src/util/virfile.c:50:
read_parm: Reading a parameter value.
Error: DEADCODE:
/libvirt/src/lxc/lxc_controller.c:960:
dead_error_condition: On this path, the condition "ret == 4" cannot be true.
/libvirt/src/lxc/lxc_controller.c:959:
at_most: After this line, the value of "ret" is at most -1.
/libvirt/src/lxc/lxc_controller.c:959:
new_values: Noticing condition "ret < 0".
/libvirt/src/lxc/lxc_controller.c:961:
dead_error_line: Execution cannot reach this statement "continue;".
Error: UNINIT:
/libvirt/src/lxc/lxc_controller.c:1104:
var_decl: Declaring variable "consoles" without initializer.
/libvirt/src/lxc/lxc_controller.c:1237:
uninit_use: Using uninitialized value "consoles".
QEMU binary is called several times when we probe different kinds of
capabilities the binary supports. This patch introduces new common
helper so that all probes use a consistent way of invoking qemu.
https://bugzilla.redhat.com/show_bug.cgi?id=816662 pointed out
that attempting 'virsh blockpull' on an offline domain gave a
misleading error message about qemu lacking support for the
operation, even when qemu was specifically updated to support it.
The real problem is that we have several capabilities that are
only determined when starting a domain, and therefore are still
clear when first working with an inactive domain (namely, any
capability set by qemuMonitorJSONCheckCommands).
While this patch was able to hoist an existing check in one of the
three culprits, it had to add redundant checks in the other two
places (because you always have to check for an active domain after
obtaining a VM job lock, but the capability bits were being checked
prior to obtaining the job lock).
Someday it would be nice to patch libvirt to cache the set of
capabilities per qemu binary (as determined by inode and timestamp),
rather than re-probing the binary every time a domain is started,
and to teach the cache how to query the monitor during the one
time the probe is made rather than having to wait until a guest
is started; then, a capability probe would succeed even for offline
guests because it just refers to the cache, and the single check for
an active domain after grabbing the job lock would be sufficient.
But since that will involve a lot more coding, I'm happy to go
with this simpler solution for an immediate solution.
* src/qemu/qemu_driver.c (qemuDomainPMSuspendForDuration)
(qemuDomainSnapshotCreateXML, qemuDomainBlockJobImpl): Check for
offline state before checking an online-only cap.
Below patch fixes the following coverity findings
Error: OVERRUN_STATIC:
/libvirt/src/qemu/qemu_command.c:152:
overrun-buffer-val: Overrunning static array "net->mac" of size 6 bytes by passing it as an argument to a function which indexes it at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:948:
access_dbuff_const: Calling "virNetDevMacVLanVPortProfileRegisterCallback" indexes array "macaddress" at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:773:
access_dbuff_const: Calling "memcpy" indexes array "macaddress" with index "16UL" at byte position 15.
Error: OVERRUN_STATIC:
/libvirt/src/qemu/qemu_migration.c:2744:
overrun-buffer-val: Overrunning static array "net->mac" of size 6 bytes by passing it as an argument to a function which indexes it at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:773:
access_dbuff_const: Calling "memcpy" indexes array "macaddress" with index "16UL" at byte position 15.
Error: OVERRUN_STATIC:
/libvirt/src/qemu/qemu_driver.c:435:
overrun-buffer-val: Overrunning static array "net->mac" of size 6 bytes by passing it as an argument to a function which indexes it at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:1036:
access_dbuff_const: Calling "virNetDevMacVLanVPortProfileRegisterCallback" indexes array "macaddress" at byte position 15.
/libvirt/src/util/virnetdevmacvlan.c:773:
access_dbuff_const: Calling "memcpy" indexes array "macaddress" with index "16UL" at byte position 15.
This patch addresses the following coverity findings:
/libvirt/src/conf/nwfilter_params.c:390:
var_assigned: Assigning: "varValue" = null return value from "virHashLookup".
/libvirt/src/conf/nwfilter_params.c:392:
dereference: Dereferencing a pointer that might be null "varValue" when calling "virNWFilterVarValueGetNthValue".
/libvirt/src/conf/nwfilter_params.c:399:
dereference: Dereferencing a pointer that might be null "tmp" when calling "virNWFilterVarValueGetNthValue".
This patch addresses the following coverity findings:
/libvirt/src/conf/nwfilter_params.c:157:
deref_parm: Directly dereferencing parameter "val".
/libvirt/src/conf/nwfilter_params.c:473:
negative_returns: Using variable "iterIndex" as an index to array "res->iter".
/libvirt/src/nwfilter/nwfilter_ebiptables_driver.c:2891:
unchecked_value: No check of the return value of "virAsprintf(&protostr, "-d 01:80:c2:00:00:00 ")".
/libvirt/src/nwfilter/nwfilter_ebiptables_driver.c:2894:
unchecked_value: No check of the return value of "virAsprintf(&protostr, "-p 0x%04x ", l3_protocols[protoidx].attr)".
/libvirt/src/nwfilter/nwfilter_ebiptables_driver.c:3590:
var_deref_op: Dereferencing null variable "inst".
Some of the error messages in this function should have been
virReportSystemError (since they have an errno they want to log), but
were mistakenly written as netlinkError, which expects a libvirt error
code instead. The result was that when one of the errors was
encountered, "No error message provided" would be printed instead of
something meaningful (see
https://bugzilla.redhat.com/show_bug.cgi?id=816465 for an example).
Once qemu monitor reports migration has completed, we just closed our
end of the pipe and let migration tunnel die. This generated bogus error
in case we did so before the thread saw EOF on the pipe and migration
was aborted even though it was in fact successful.
With this patch we first wake up the tunnel thread and once it has read
all data from the pipe and finished the stream we close the
filedescriptor.
A small additional bonus of this patch is that real errors reported
inside qemuMigrationIOFunc are not overwritten by virStreamAbort any
more.
When QEMU reported failed or canceled migration, we correctly detected
it but didn't really consider it as an error condition and migration
protocol just went on. Luckily, some of the subsequent steps eventually
failed end we reported an (unrelated and mostly random) error back to
the caller.
Currently, non-blocking calls are either sent immediately or discarded
in case sending would block. This was implemented based on the
assumption that the non-blocking keepalive call is not needed as there
are other calls in the queue which would keep the connection alive.
However, if those calls are no-reply calls (such as those carrying
stream data), the remote party knows the connection is alive but since
we don't get any reply from it, we think the connection is dead.
This is most visible in tunnelled migration. If it happens to be longer
than keepalive timeout (30s by default), it may be unexpectedly aborted
because the connection is considered to be dead.
With this patch, we only discard non-blocking calls when the last call
with a thread is completed and thus there is no thread left to keep
sending the remaining non-blocking calls.
In some cases (spotted with broken connection during tunneled migration)
we were overwriting the original error with worse or even misleading
errors generated when we were cleaning up after failed migration.
The docs for virConnectSetKeepAlive() advertise that this function
should be able to disable keepalives on negative or zero interval time.
This patch removes the check that prohibited this and adds code to
disable keepalives on negative/zero interval.
* src/libvirt.c: virConnectSetKeepAlive(): - remove check for negative
values
* src/rpc/virnetclient.c
* src/rpc/virnetclient.h: - add virNetClientKeepAliveStop() to disable
keepalive messages
* src/remote/remote_driver.c: remoteSetKeepAlive(): -add ability to
disable keepalives
This patch resolves https://bugzilla.redhat.com/show_bug.cgi?id=815270
The function virNetDevMacVLanVPortProfileRegisterCallback() takes an
arg "virtPortProfile", and was checking it for non-NULL before using
it. However, the prototype for
virNetDevMacVLanPortProfileRegisterCallback had marked that arg with
ATTRIBUTE_NONNULL(). Contrary to what one may think,
ATTRIBUTE_NONNULL() does not provide any guarantee that an arg marked
as such really is always non-null; the only effect to the code
generated by gcc, is that gcc *assumes* it is non-NULL; this results
in, for example, the check for a non-NULL value being optimized out.
(Unfortunately, this code removal only occurs when optimization is
enabled, and I am in the habit of doing local builds with optimization
off to ease debugging, so the bug didn't show up in my earlier local
testing).
In general, virPortProfile might always be NULL, so it shouldn't be
marked as ATTRIBUTE_NONNULL. One other function prototype made this
same error, so this patch fixes it as well.
Add 2 new functions to the virSocketAddr 'class':
- virSocketAddrEqual: tests whether two IP addresses and their ports are equal
- virSocketaddSetIPv4Addr: set a virSocketAddr given a 32 bit int
This patch improves the previously added virAtomicInt implementation
by using gcc-builtins if possible. The needed builtins are available
since GCC >= 4.1. At least the 4.0 docs don't mention them.
vboxArray is not castable to a COM item type. vboxArray is a
wrapper around the XPCOM and MSCOM specific array handling.
In this case we can avoid passing NULL as an empty array to
IMachine::Delete by passing a dummy IMedium* array with a single
NULL item.
In order to track a block copy job across libvirtd restarts, we
need to save internal XML that tracks the name of the file
holding the mirror. Displaying this name in dumpxml might also
be useful to the user, even if we don't yet have a way to (re-)
start a domain with mirroring enabled up front. This is done
with a new <mirror> sub-element to <disk>, as in:
<disk type='file' device='disk'>
<driver name='qemu' type='raw'/>
<source file='/var/lib/libvirt/images/original.img'/>
<mirror file='/var/lib/libvirt/images/copy.img' format='qcow2' ready='yes'/>
...
</disk>
For now, the element is output-only, in live domains; it is ignored
when defining a domain or hot-plugging a disk (since those contexts
use VIR_DOMAIN_XML_INACTIVE in parsing). The 'ready' attribute appears
when libvirt knows that the job has changed from the initial pulling
phase over to the mirroring phase, although absence of the attribute
is not a sure indicator of the current phase. If we come up with a way
to make qemu start with mirroring enabled, we can relax the xml
restriction, and allow <mirror> (but not attribute 'ready') on input.
Testing active-only XML meant tweaking the testsuite slightly, but it
was worth it.
* docs/schemas/domaincommon.rng (diskspec): Add diskMirror.
* docs/formatdomain.html.in (elementsDisks): Document it.
* src/conf/domain_conf.h (_virDomainDiskDef): New members.
* src/conf/domain_conf.c (virDomainDiskDefFree): Clean them.
(virDomainDiskDefParseXML): Parse them, but only internally.
(virDomainDiskDefFormat): Output them.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: New test file.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-mirror.xml: Likewise.
* tests/qemuxml2xmltest.c (testInfo): Alter members.
(testCompareXMLToXMLHelper): Allow more test control.
(mymain): Run new test.
This patch introduces a new block job, useful for live storage
migration using pre-copy streaming. Justification for including
this under virDomainBlockRebase rather than adding a new command
includes: 1) there are now two possible block jobs in qemu, with
virDomainBlockRebase starting either type of command, and
virDomainBlockJobInfo and virDomainBlockJobAbort working to end
either type; 2) reusing this command allows distros to backport
this feature to the libvirt 0.9.10 API without a .so bump.
Note that a future patch may add a more powerful interface named
virDomainBlockJobCopy, dedicated to just the block copy job, in
order to expose even more options (such as setting an arbitrary
format type for the destination without having to probe it from a
pre-existing destination file); adding a new command for targetting
just block copy would be similar to how we already have
virDomainBlockPull for targetting just the block pull job.
Using a live VM with the backing chain:
base <- snap1 <- snap2
as the starting point, we have:
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY)
creates /path/to/copy with the same format as snap2, with no backing
file, so entire chain is copied and flattened
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
creates /path/to/copy as a raw file, so entire chain is copied and
flattened
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_SHALLOW)
creates /path/to/copy with the same format as snap2, but with snap1 as
a backing file, so only snap2 is copied.
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT)
reuse existing /path/to/copy (must have empty contents, and format is
probed[*] from the metadata), and copy the full chain
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT|
VIR_DOMAIN_BLOCK_REBASE_SHALLOW)
reuse existing /path/to/copy (contents must be identical to snap1,
and format is probed[*] from the metadata), and copy only the contents
of snap2
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT|
VIR_DOMAIN_BLOCK_REBASE_SHALLOW|VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
reuse existing /path/to/copy (must be raw volume with contents
identical to snap1), and copy only the contents of snap2
Less useful combinations:
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_SHALLOW|
VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
fail if source is not raw, otherwise create /path/to/copy as raw and
the single file is copied (no chain involved)
- virDomainBlockRebase(dom, disk, "/path/to/copy", 0,
VIR_DOMAIN_BLOCK_REBASE_COPY|VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT|
VIR_DOMAIN_BLOCK_REBASE_COPY_RAW)
makes little sense: the destination must be raw but have no contents,
meaning that it is an empty file, so there is nothing to reuse
The other three flags are rejected without VIR_DOMAIN_BLOCK_COPY.
[*] Note that probing an existing file for its format can be a security
risk _if_ there is a possibility that the existing file is 'raw', in
which case the guest can manipulate the file to appear like some other
format. But, by virtue of the VIR_DOMAIN_BLOCK_REBASE_COPY_RAW flag,
it is possible to avoid probing of raw files, at which point, probing
of any remaining file type is no longer a security risk.
It would be nice if we could issue an event when pivoting from phase 1
to phase 2, but qemu hasn't implemented that, and we would have to poll
in order to synthesize it ourselves. Meanwhile, qemu will give us a
distinct job info and completion event when we either cancel or pivot
to end the job. Pivoting is accomplished via the new:
virDomainBlockJobAbort(dom, disk, VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)
Management applications can pre-create the copy with a relative
backing file name, and use the VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT
flag to have qemu reuse the metadata; if the management application
also copies the backing files to a new location, this can be used
to perform live storage migration of an entire backing chain.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_JOB_TYPE_COPY):
New block job type.
(virDomainBlockJobAbortFlags, virDomainBlockRebaseFlags): New enums.
* src/libvirt.c (virDomainBlockRebase): Document the new flags,
and implement general restrictions on flag combinations.
(virDomainBlockJobAbort): Document the new flag.
(virDomainSaveFlags, virDomainSnapshotCreateXML)
(virDomainRevertToSnapshot, virDomainDetachDeviceFlags): Document
restrictions.
* include/libvirt/virterror.h (VIR_ERR_BLOCK_COPY_ACTIVE): New
error.
* src/util/virterror.c (virErrorMsg): Define it.
This patch modifies the CPU comparrison function to report the
incompatibilities in more detail to ease identification of problems.
* src/cpu/cpu.h:
cpuGuestData(): Add argument to return detailed error message.
* src/cpu/cpu.c:
cpuGuestData(): Add passthrough for error argument.
* src/cpu/cpu_x86.c
x86FeatureNames(): Add function to convert a CPU definition to flag
names.
x86Compute(): - Add error message parameter
- Add macro for reporting detailed error messages.
- Improve error reporting.
- Simplify calculation of forbidden flags.
x86DataIteratorInit():
x86cpuidMatchAny(): Remove functions that are no longer needed.
* src/qemu/qemu_command.c:
qemuBuildCpuArgStr(): - Modify for new function prototype
- Add detailed error reports
- Change error code on incompatible processors
to VIR_ERR_CONFIG_UNSUPPORTED instead of
internal error
* tests/cputest.c:
cpuTestGuestData(): Modify for new function prototype
virThreadSelf tries to access the virThreadPtr stored in TLS for the
current thread via TlsGetValue. When virThreadSelf is called on a thread
that was not created via virThreadCreate (e.g. the main thread) then
TlsGetValue returns NULL as TlsAlloc initializes TLS slots to NULL.
virThreadSelf can be called on the main thread via this call chain from
virsh
vshDeinit
virEventAddTimeout
virEventPollAddTimeout
virEventPollInterruptLocked
virThreadIsSelf
triggering a segfault as virThreadSelf unconditionally dereferences the
return value of TlsGetValue.
Fix this by making virThreadSelf check the TLS slot value for NULL and
setting the given virThreadPtr accordingly.
Reported by Marcel Müller.
POSIX says that sa_sigaction is only safe to use if sa_flags
includes SA_SIGINFO; conversely, sa_handler is only safe to
use when flags excludes that bit. Gnulib doesn't guarantee
an implementation of SA_SIGINFO, but does guarantee that
if SA_SIGINFO is undefined, we can safely define it to 0 as
long as we don't dereference the 2nd or 3rd argument of
any handler otherwise registered via sa_sigaction.
Based on a report by Wen Congyang.
* src/rpc/virnetserver.c (SA_SIGINFO): Stub for mingw.
(virNetServerSignalHandler): Avoid bogus dereference.
(virNetServerFatalSignal, virNetServerNew): Set flags properly.
(virNetServerAddSignalHandler): Drop unneeded #ifdef.
I almost copied-and-pasted some redundant () into my new code,
and figured a general cleanup prereq patch would be better instead.
No semantic change.
* src/conf/domain_conf.c (virDomainLeaseDefParseXML)
(virDomainDiskDefParseXML, virDomainFSDefParseXML)
(virDomainActualNetDefParseXML, virDomainNetDefParseXML)
(virDomainGraphicsDefParseXML, virDomainVideoAccelDefParseXML)
(virDomainVideoDefParseXML, virDomainHostdevFind)
(virDomainControllerInsertPreAlloced, virDomainDefParseXML)
(virDomainObjParseXML, virDomainCpuSetFormat)
(virDomainCpuSetParse, virDomainDiskDefFormat)
(virDomainActualNetDefFormat, virDomainNetDefFormat)
(virDomainTimerDefFormat, virDomainGraphicsListenDefFormat)
(virDomainDefFormatInternal, virDomainNetGetActualHostdev)
(virDomainNetGetActualBandwidth, virDomainGraphicsGetListen):
Reduce extra ().
Ensure we don't introduce any more lousy integer parsing in new
code, while avoiding a scrub-down of existing legacy code.
Note that we also need to enable sc_prohibit_atoi_atof (see cfg.mk
local-checks-to-skip) before we are bulletproof, but that also
entails scrubbing I'm not ready to do at the moment.
* src/util/util.c (virStrToLong_i, virStrToLong_ui)
(virStrToLong_l, virStrToLong_ul, virStrToLong_ll)
(virStrToLong_ull, virStrToDouble): Mark exemptions.
* src/util/virmacaddr.c (virMacAddrParse): Likewise.
* cfg.mk (sc_prohibit_strtol): New syntax check.
(exclude_file_name_regexp--sc_prohibit_strtol): Ignore files that
I'm not willing to fix yet.
(local-checks-to-skip): Re-enable sc_prohibit_atoi_atof.
https://bugzilla.redhat.com/show_bug.cgi?id=617711 reported that
even with my recent patched to allow <memory unit='G'>1</memory>,
people can still get away with trying <memory>1G</memory> and
silently get <memory unit='KiB'>1</memory> instead. While
virt-xml-validate catches the error, our C parser did not.
Not to mention that it's always fun to fix bugs while reducing
lines of code. :)
* src/conf/domain_conf.c (virDomainParseMemory): Check for parse error.
(virDomainDefParseXML): Avoid strtoll.
* src/conf/storage_conf.c (virStorageDefParsePerms): Likewise.
* src/util/xml.c (virXPathLongBase, virXPathULongBase)
(virXPathULongLong, virXPathLongLong): Likewise.
Commit 78345c68 makes at least gcc 4.1.2 on RHEL 5 complain:
cc1: warnings being treated as errors
In file included from vbox/vbox_V4_0.c:13:
vbox/vbox_tmpl.c: In function 'vboxDomainUndefineFlags':
vbox/vbox_tmpl.c:5298: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
* src/vbox/vbox_tmpl.c (vboxDomainUndefineFlags): Use union to
avoid compiler warning.
DBus connection. The HAL device code further requires that
the DBus connection is integrated with the event loop and
provides such glue logic itself.
The forthcoming FirewallD integration also requires a
dbus connection with event loop integration. Thus we need
to pull the current event loop glue out of the HAL driver.
Thus we create src/util/virdbus.{c,h} files. This contains
just one method virDBusGetSystemBus() which obtains a handle
to the single shared system bus instance, with event glue
automagically setup.
Fix the support for trusted DHCP server in the ebtables code's
hard-coded function applying DHCP only filtering rules:
Rather than using a char * use the more flexible
virNWFilterVarValuePtr that contains the trusted DHCP server(s)
IP address. Process all entries.
Since all callers so far provided NULL as parameter, no changes
are necessary in any other code.
Currently upon a migration a callback is created when a 802.1qbg link
is set to PREASSOCIATE, this should not happen because this is a no-op
on most switches, and does not lead to an ASSOCIATE state. This patch
only creates callbacks when CREATE or RESTORE is requested. Migration
and libvirtd restart scenarios are already handled elsewhere.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
The below patch fixes the following memory leak.
==20624== 24 bytes in 2 blocks are definitely lost in loss record 532 of 1,867
==20624== at 0x4A05E46: malloc (vg_replace_malloc.c:195)
==20624== by 0x38EC27FC01: strdup (strdup.c:43)
==20624== by 0x4EB6BA3: virDomainChrSourceDefCopy (domain_conf.c:1122)
==20624== by 0x495D76: qemuProcessFindCharDevicePTYs (qemu_process.c:1497)
==20624== by 0x498321: qemuProcessWaitForMonitor (qemu_process.c:1258)
==20624== by 0x49B5F9: qemuProcessStart (qemu_process.c:3652)
==20624== by 0x468B5C: qemuDomainObjStart (qemu_driver.c:4753)
==20624== by 0x469171: qemuDomainStartWithFlags (qemu_driver.c:4810)
==20624== by 0x4F21735: virDomainCreate (libvirt.c:8153)
==20624== by 0x4302BF: remoteDispatchDomainCreateHelper (remote_dispatch.h:852)
==20624== by 0x4F72C14: virNetServerProgramDispatch (virnetserverprogram.c:416)
==20624== by 0x4F6D690: virNetServerHandleJob (virnetserver.c:164)
==20624== by 0x4E8F43D: virThreadPoolWorker (threadpool.c:144)
==20624== by 0x4E8EAB5: virThreadHelper (threads-pthread.c:161)
==20624== by 0x38EC606CCA: start_thread (pthread_create.c:301)
==20624== by 0x38EC2E0C2C: clone (clone.S:115)
Most of our errors complaining about an inability to support a
particular action due to qemu limitations used CONFIG_UNSUPPORTED,
but we had a few outliers. Reported by Jiri Denemark.
* src/qemu/qemu_command.c (qemuBuildDriveDevStr): Prefer
CONFIG_UNSUPPORTED.
* src/qemu/qemu_driver.c (qemuDomainReboot)
(qemuDomainBlockJobImpl): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainAttachPciControllerDevice):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorTransaction)
(qemuMonitorBlockJob, qemuMonitorSystemWakeup): Likewise.
So that a domain xml which doesn't have "placement" specified, but
"cpuset" is specified, could be parsed. And in this case, the
"placement" mode will be set as "static".
The objects (domain, pool, network, etc) for testing are defined/
started each time when opening a connect to test driver, and thus
the UUID for the objects will be generated each time, with different
values. e.g.
% for i in {1..3}; do ./tools/virsh --connect \
test:///default dumpxml test | grep uuid; done
<uuid>a1b6ee1f-97de-f0ee-617a-0cdb74947df5</uuid>
<uuid>ee68d7d2-3eb9-593e-2769-797ce1f4c4aa</uuid>
<uuid>fecb1d3a-918a-8412-e534-76192cf32b18</uuid>
It's the potential bug which can cause operations like below to fail:
$ virsh -c test:///default dumpxml test > test.xml
[ Some modificatons, though it's not supported, but it should work ]
$ virsh -c test:///default define test.xml
This patch set fixed UUID for objects which support it. (domain,
pool, network).
A "ide-drive" device can be either a hard disk or a CD-ROM,
if there is ",media=cdrom" specified for the backend, it's
a CD-ROM, otherwise it's a hard disk.
Upstream qemu splitted "ide-drive" into "ide-hd" and "ide-cd"
since commit 1f56e32, and ",media=cdrom" is not required for
ide-cd anymore. "ide-drive" is still supported for backwards
compatibility, but no doubt we should go foward.
A "scsi-disk" device can be either a hard disk or a CD-ROM,
if there is ",media=cdrom" specified for the backend, it's
a CD-ROM, otherwise it's a hard disk.
But upstream qemu splitted "scsi-disk" into "scsi-hd" and
"scsi-cd" since commit b443ae, and ",media=cdrom" is not
required for scsi-cd anymore. "scsi-disk" is still supported
for backwards compatibility, but no doubt we should go
foward.
If console[0] is an alias for serial[0], do not enforce the former to
have a PTY source type. This breaks serial consoles on stdio and makes
no sense.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
When using the xm/xend stack to manage instances there is a bug
that causes the emulated interfaces to be unusable when the vif
config contains type=ioemu.
The current code already has a special quirk to not use this
keyword if no specific model is given for the emulated NIC
(defaulting to rtl8139).
Essentially it works because regardless of the type argument,i
the Xen stack always creates emulated and paravirt interfaces and
lets the guest decide which one to use. So neither xl nor xm stack
actually require the type keyword for emulated NICs.
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Currently, we have 3 boolean arguments we have to pass
to qemuProcessStart(). As libvirt grows it is harder and harder
to remember them and their position. Therefore we should
switch to flags instead.
lvcreate want's the parent pool's name, not the pool path
lvchange and lvremove want lv specified as $vgname/$lvname
This largely worked before because these commands strip off a
starting /dev. But https://bugzilla.redhat.com/show_bug.cgi?id=714986
is from a user using a 'nested VG' that was having problems.
I couldn't find any info on nested LVM and the reporter never responded,
but I reproduced with XML that specified a valid source name, and
set target path to a symlink.
As explained in previous patch, numad will balance the affinity
dynamically, so reflecting the cpuset from numad at the first
time doesn't make much case, and may just could cause confusion.
Instead of returning a CPUs list, numad returns NUMA node
list instead, this patch is to convert the node list to
cpumap before affinity setting. Otherwise, the domain
processes will be pinned only to CPU[$numa_cell_num],
which will cause significiant performance losses.
Also because numad will balance the affinity dynamically,
reflecting the cpuset from numad back doesn't make much
sense then, and it may just could produce confusion for
the users. Thus the better way is not to reflect it back
to XML. And in this case, it's better to ignore the cpuset
when parsing XML.
The codes to update the cpuset is removed in this patch
incidentally, and there will be a follow up patch to ignore
the manually specified "cpuset" if "placement" is "auto",
and document will be updated too.
The linux-2.6.32 kernel header does not yet define IFLA_VF_MAX and others,
which breaks compiling a new libvirt on old systems like Debian Squeeze.
(I also have to add --without-macvtap --disable-werror --without-virtualport to
./configure to get it to compile.)
Signed-off-by: Philipp Hahn <hahn@univention.de>
Although it should be harmless to do:
disk = disk = def->disks[i]
some not-so-wise compilers may fool around.
Besides, such assignment is useless here.
On newer xend (v3.x and after) there is no state and domid reported
for inactive domains. When initially creating connections this is
handled in various places by assigning domain->id = -1.
But once an instance has been running, the id is set to the current
domain id. And it does not change when the instance is shut down.
So when querying the domain info, the hypervisor driver, which gets
asked first will indicate it cannot find information, then the
xend driver is asked and will set the status to NOSTATE because it
checks for the -1 domain id.
Checking domain/status for 0 seems to be more reliable for that.
One note: I am not sure whether the domain->id also should get set
back to -1 whenever any sub-driver thinks the instance is no longer
running.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=746007
BugLink: http://bugs.launchpad.net/bugs/929626
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
This patch adds a netlink callback when migrating a VEPA enabled
virtual machine. It fixes a Bug where a VM would not request a port
association when it was cleared by lldpad.
This patch requires the latest git version of lldpad to work.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
If dynamic_ownership is off and we are creating a file on NFS
we force chown. This will fail as chown/chmod are not supported
on NFS. However, with no dynamic_ownership we are not required
to do any chown.
In my testing, I was able to provoke an odd block pull failure:
$ virsh blockpull dom vda --bandwidth 10000
error: Requested operation is not valid: No active operation on device: drive-virtio-disk0
merely by using gdb to artifically wait to do the block job set speed
until after the pull had already finished. But in reality, that should
be a success, since the pull finished before we had a chance to set
speed. Furthermore, using a double job lock is not only annoying, but
a bug in itself - if you do parallel virDomainBlockRebase, and hit
the race window just right, the first call grabs the VM job to start
a fast block job, then the second call grabs the VM job to start
a long-running job with unspecified speed, then the first call finally
regrabs the VM job and sets the speed, which ends up running the
second job under the speed from the first call. By consolidating
things into a single job, we avoid opening that race, as well as reduce
the time between starting the job and changing the speed, for less
likelihood of the speed change happening after block job completion
in the first place.
* src/qemu/qemu_monitor.h (BLOCK_JOB_CMD): Add new mode.
* src/qemu/qemu_driver.c (qemuDomainBlockRebase): Move secondary
job call...
(qemuDomainBlockJobImpl): ...here, for fewer locks.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Change
return value on new internal mode.
Without the VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC flag, libvirt will internally
poll using qemu's "query-block-jobs" API and will not return until the
operation has been completed. API users are advised that this operation
is unbounded and further interaction with the domain during this period
may block. Future patches may refactor things to allow other queries in
parallel with this polling. For older qemu, we synthesize the cancellation
event, since qemu won't generate it.
The choice of polling duration copies from the code in qemu_migration.c.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Probably in the noise, but this will let us scale more efficiently
as we learn to recognize even more qemu events.
* src/qemu/qemu_monitor_json.c (eventHandlers): Sort.
(qemuMonitorEventCompare): New helper function.
(qemuMonitorJSONIOProcessEvent): Optimize event lookup.
Block job cancellation can take a while. Now that upstream qemu 1.1
has asynchronous block cancellation, we want to expose that to the user.
Therefore, the following updates are made to the virDomainBlockJob API:
A new block job event type VIR_DOMAIN_BLOCK_JOB_CANCELED is managed by
libvirt. Regardless of the flags used with virDomainBlockJobAbort, this
event will be raised: 1. when using synchronous block_job_cancel (the
event will be synthesized by libvirt), and 2. whenever it is received
from qemu (via asynchronous block-job-cancel). Note that the event
may be detected by libvirt even before the virDomainBlockJobAbort
completes (always true when it is synthesized, but also possible if
cancellation was fast).
A new extension flag VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC is added to the
virDomainBlockJobAbort API. When enabled, this function will allow
(but not require) asynchronous operation (ie, it returns as soon as
possible, which might be before the job has actually been canceled).
When the API is used in this mode, it is the responsibility of the
caller to wait for a VIR_DOMAIN_BLOCK_JOB_CANCELED event or poll via
the virDomainGetBlockJobInfo API to check the cancellation status.
This patch also exposes the new flag through virsh, and makes virsh
slightly easier to use (--async implies --abort, and lack of any options
implies --info), although it leaves the qemu implementation for later
patches.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
RHEL 6.2 was released with an early version of block jobs, which only
worked on the qed file format, where the commands were spelled with
underscore (contrary to QMP style), and where 'block_job_cancel' was
synchronous and did not trigger an event.
The upcoming qemu 1.1 release has fixed these short-comings [1][2]:
the commands now work on multiple file types, are spelled with dash,
and 'block-job-cancel' is asynchronous and emits an event upon conclusion.
[1]qemu commit 370521a1d6f5537ea7271c119f3fbb7b0fa57063
[2]https://lists.gnu.org/archive/html/qemu-devel/2012-04/msg01248.html
This patch recognizes the new spellings, and fixes virDomainBlockRebase
to give a graceful error when talking to a too-old qemu on a partial
rebase attempt. Fixes for the new semantics will come later. This
patch also removes a bogus ATTRIBUTE_NONNULL mistakenly added in
commit 10ec36e2.
* src/qemu/qemu_capabilities.h (QEMU_CAPS_BLOCKJOB_SYNC)
(QEMU_CAPS_BLOCKJOB_ASYNC): New bits.
* src/qemu/qemu_capabilities.c (qemuCaps): Name them.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set
them.
(qemuMonitorJSONBlockJob): Manage both command names.
(qemuMonitorJSONDiskSnapshot): Minor formatting fix.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Alter signature.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob): Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Pass through
capability bit.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Update callers.
The new safe console handling introduced a possibility to deadlock the
qemu driver when a new console connection forcibly disconnects a
previous console stream that belongs to an already closed connection.
The virStreamFree function calls subsequently a the virReleaseConnect
function that tries to lock the driver while discarding the connection,
but the driver was already locked in qemuDomainOpenConsole.
Backtrace of the deadlocked thread:
0 0x00007f66e5aa7f14 in __lll_lock_wait () from /lib64/libpthread.so.0
1 0x00007f66e5aa3411 in _L_lock_500 () from /lib64/libpthread.so.0
2 0x00007f66e5aa322a in pthread_mutex_lock () from/lib64/libpthread.so.0
3 0x0000000000462bbd in qemudClose ()
4 0x00007f66e6e178eb in virReleaseConnect () from/usr/lib64/libvirt.so.0
5 0x00007f66e6e19c8c in virUnrefStream () from /usr/lib64/libvirt.so.0
6 0x00007f66e6e3d1de in virStreamFree () from /usr/lib64/libvirt.so.0
7 0x00007f66e6e09a5d in virConsoleHashEntryFree () from/usr/lib64/libvirt.so.0
8 0x00007f66e6db7282 in virHashRemoveEntry () from/usr/lib64/libvirt.so.0
9 0x00007f66e6e09c4e in virConsoleOpen () from /usr/lib64/libvirt.so.0
10 0x00000000004526e9 in qemuDomainOpenConsole ()
11 0x00007f66e6e421f1 in virDomainOpenConsole () from/usr/lib64/libvirt.so.0
12 0x00000000004361e4 in remoteDispatchDomainOpenConsoleHelper ()
13 0x00007f66e6e80375 in virNetServerProgramDispatch () from/usr/lib64/libvirt.so.0
14 0x00007f66e6e7ae11 in virNetServerHandleJob () from/usr/lib64/libvirt.so.0
15 0x00007f66e6da897d in virThreadPoolWorker () from/usr/lib64/libvirt.so.0
16 0x00007f66e6da7ff6 in virThreadHelper () from/usr/lib64/libvirt.so.0
17 0x00007f66e5aa0c5c in start_thread () from /lib64/libpthread.so.0
18 0x00007f66e57e7fcd in clone () from /lib64/libc.so.6
* src/qemu/qemu_driver.c: qemuDomainOpenConsole()
-- unlock the qemu driver right after acquiring the domain
object
In case an API fails with "cannot acquire state change lock", searching
for the API that possibly forgot to end its job is not always easy.
Let's keep track of the job owner and print it out for easier
identification.
As reported by Daniel Berrangé, we have a huge performance regression
for virDomainGetInfo() due to the change which makes virDomainEndJob()
save the XML status file every time it is called. Previous to that
change, 2000 calls to virDomainGetInfo() took ~2.5 seconds. After that
change, 2000 calls to virDomainGetInfo() take 2 *minutes* 45 secs.
We made the change to be able to recover from libvirtd restart in the
middle of a job. However, only destroy and async jobs are taken care of.
Thus it makes more sense to only save domain state XML when these jobs
are started/stopped.
I noticed these compiler warnings when building for the s390 architecture.
* src/node_device/node_device_udev.c (udevDeviceMonitorStartup):
Mark unused variable.
* src/nodeinfo.c (linuxNodeInfoCPUPopulate): Avoid unused variable.
* src/qemu/qemu_command.c: Wire up -bios with <loader>
* tests/qemuxml2argvdata/qemuxml2argv-bios.args,
tests/qemuxml2argvdata/qemuxml2argv-bios.xml: Expand
existing BIOS test case to cover <loader>
Leak introduced in commit 0436d32. If we allocate an actions array,
but fail early enough to never consume it with the qemu monitor
transaction call, we leaked memory.
But our semantics of making the transaction command free the caller's
memory is awkward; avoiding the memory leak requires making every
intermediate function in the call chain check for error. It is much
easier to fix things so that the function that allocates also frees,
while the call chain leaves the caller's data intact. To do that,
I had to hack our JSON data structure to make it easy to protect a
portion of an arbitrary JSON tree from being freed.
* src/util/json.h (virJSONType): Name the enum.
(_virJSONValue): New field.
* src/util/json.c (virJSONValueFree): Use it to protect a portion
of an array.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONTransaction): Avoid
freeing caller's data.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive):
Free actions array on failure.
We can tell qemuDomainSnapshotFSThaw if we want it to report errors or
not. However, if we don't want to and an error has been already set by
previous qemuReportError() we must keep copy of that error not just a
pointer to it. Otherwise, it get overwritten if FSThaw reports an error.
Detected by valgrind. Leaks are introduced in commit b22eaa7.
* src/conf/domain_conf.c (virDomainDiskDefParseXML): fix memory leaks.
How to reproduce?
% make && make -C tests check TESTS=qemuxml2argvtest
% cd tests && valgrind -v --leak-check=full ./qemuxml2argvtest
actual result:
==2143== 12 bytes in 2 blocks are definitely lost in loss record 74 of 179
==2143== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==2143== by 0x39D90A67DD: xmlStrndup (xmlstring.c:45)
==2143== by 0x4F5EC0: virDomainDiskDefParseXML (domain_conf.c:3438)
==2143== by 0x502F00: virDomainDefParseXML (domain_conf.c:8304)
==2143== by 0x505FE3: virDomainDefParseNode (domain_conf.c:9080)
==2143== by 0x5069AE: virDomainDefParse (domain_conf.c:9030)
==2143== by 0x41CBF4: testCompareXMLToArgvHelper (qemuxml2argvtest.c:105)
==2143== by 0x41E5DD: virtTestRun (testutils.c:145)
==2143== by 0x416FA3: mymain (qemuxml2argvtest.c:399)
==2143== by 0x41DCB7: virtTestMain (testutils.c:700)
==2143== by 0x39CF01ECDC: (below main) (libc-start.c:226)
Signed-off-by: Alex Jia <ajia@redhat.com>
If the daemon is restarted it will lose list of active
USB devices assigned to active domains. Therefore we need
to rebuild this list on qemuProcessReconnect().
To prevent assigning one USB device to two domains,
we keep a list of assigned USB devices. On domain
startup - qemuProcessStart() - we insert devices
used by domain into the list but remove them only
on detach-device. Devices are, however, released
on qemuProcessStop() as well.
Originally, qemuDomainCheckEjectableMedia was entering monitor with qemu
driver lock. Commit 2067e31bf9, which I
made to fix that, revealed another issue we had (but didn't notice it
since the driver was locked): we didn't set nested job when
qemuDomainCheckEjectableMedia is called during migration. Thus the
original fix I made was wrong.
XenD-3.1 introduced managed domains. HV-domains have rtc_timeoffset
(hgd24f37b31030 from 2007-04-03), which tracks the offset between the
hypervisors clock and the domains RTC, and is persisted by XenD.
In combination with localtime=1 this had a bug until XenD-3.4
(hg5d701be7c37b from 2009-04-01) (I'm not 100% sure how that bug
manifests, but at least for me in TZ=Europe/Berlin I see the previous
offset relative to utc being applied to localtime again, which manifests
in an extra hour being added)
XenD implements the following variants for clock/@offset:
- PV domains don't have a RTC → 'localtime' | 'utc'
- <3.1: no managed domains → 'localtime' | 'utc'
- ≥3.1: the offset is tracked for HV → 'variable'
due to the localtime=1 bug → 'localtime' | 'utc'
- ≥3.4: the offset is tracked for HV → 'variable'
Current libvirtd still thinks XenD only implements <clock offset='utc'/>
and <clock offset='localtime'/>, which is wrong, since the semantic of
'utc' and 'localtime' specifies, that the offset will be reset on
domain-restart, while with 'variable' the offset is kept. (keeping the
offset over "virsh edit" is important, since otherwise the clock might
jump, which confuses certain guest OSs)
xendConfigVersion was last incremented to 4 by the xen-folks for
xen-3.1.0. I know of no way to reliably detect the version of XenD
(user space tools), which may be different from the version of the
hypervisor (kernel) version! Because of this only the change from
'utc'/'localtime' to 'variable' in XenD-3.1 is handled, not the buggy
behaviour of XenD-3.1 until XenD-3.4.
For backward compatibility with previous versions of libvirt Xen-HV
still accepts 'utc' and 'localtime', but they are returned as 'variable'
on the next read-back from Xend to libvirt, since this is what XenD
implements: The RTC is NOT reset back to the specified time on next
restart, but the previous offset is kept.
This behaviour can be turned off by adding the additional attribute
adjustment='reset', in which case libvirt will report an error instead
of doing the conversion. The attribute can also be used as a shortcut to
offset='variable' with basis='...'.
With these changes, it is also necessary to adjust the xen tests:
"localtime = 0" is always inserted, because otherwise on updates the
value is not changed within XenD.
adjustment='reset' is inserted for all cases, since they're all <
XEND_CONFIG_VERSION_3_1_0, only 3.1 introduced persistent
rtc_timeoffset.
Some statements change their order because code was moved around.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Since Xen 3.1 the clock=variable semantic is supported. In addition to
qemu/kvm Xen also knows about a variant where the offset is relative to
'localtime' instead of 'utc'.
Extends the libvirt structure with a flag 'basis' to specify, if the
offset is relative to 'localtime' or 'utc'.
Extends the libvirt structure with a flag 'reset' to force the reset
behaviour of 'localtime' and 'utc'; this is needed for backward
compatibility with previous versions of libvirt, since they report
incorrect XML.
Adapt the only user 'qemu' to the new name.
Extend the RelaxNG schema accordingly.
Document the new 'basis' attribute in the HTML documentation.
Adapt test for the new attribute.
Signed-off-by: Philipp Hahn <hahn@univention.de>
https://bugzilla.redhat.com/show_bug.cgi?id=808979
The leak is really in virProcessInfoGetAffinity, as shown in the
valgrind output given in the above bug report - it calls CPU_ALLOC(),
but then fails to call CPU_FREE().
This leak has existed in every version of libvirt since 0.7.5.
Commit 1b1402b introduced a regression. Since older libvirt versions
would silently round memory up (until the previous patch), but populated
current memory based on querying the guest, it was possible to have
dumpxml show cur > max by the amount of the rounding. For example, if
a user requested 1048570 KiB memory (just shy of 1GiB), the qemu
driver would actually run with 1048576 KiB, and libvirt 0.9.10 would
output a current that was 6KiB larger than the maximum. Situations
where this could have an impact include, but are not limited to,
migration from old to new libvirt, managedsave in old libvirt and
start in new libvirt, snapshot creation in old libvirt and revert in
new libvirt - without this patch, the new libvirt would reject the
VM because of the rounding discrepancy.
Fix things by adding a fuzz factor, and silently clamp current down to
maximum in that case, rather than failing to reparse XML for an existing
VM. From a practical standpoint, this has no user impact: 'virsh
dumpxml' will continue to query the running guest rather than rely on
the incoming xml, which will see the currect current value, and even if
clamping down occurs during parsing, it will be by at most the fuzz
factor of a megabyte alignment, and rounded back up when passed back to
the hypervisor.
Meanwhile, we continue to reject cur > max if the difference is beyond
the fuzz factor of nearest megabyte. But this is not a real change in
behavior, since with 0.9.10, even though the parser allowed it, later
in the processing stream we would reject it at the qemu layer; so
rejecting it in the parser just moves error detection to a nicer place.
* src/conf/domain_conf.c (virDomainDefParseXML): Don't reject
existing XML.
Based on a report by Zhou Peng.
If we round up a user's memory request, we should update the XML
to reflect the actual value in use by the VM, rather than giving
an artificially small value back to the user.
* src/qemu/qemu_command.c (qemuBuildNumaArgStr)
(qemuBuildCommandLine): Reflect rounding back to XML.
This patch was created to resolve this upstream bug:
https://bugzilla.redhat.com/show_bug.cgi?id=784767
and is at least a partial solution to this RHEL RFE:
https://bugzilla.redhat.com/show_bug.cgi?id=805071
Previously the only attribute of a network device that could be
modified by virUpdateDeviceFlags() ("virsh update-device") was the
link state; attempts to change any other attribute would log an error
and fail.
This patch adds recognition of a change in bridge device name, and
supports reconnecting the guest's interface to the new device.
Standard audit logs for detaching and attaching a network device are
also generated. Although the current auditing function doesn't log the
bridge being attached to, this will later be changed in a separate
patch.
Regression introduced when we changed types in commit 3e2c3d8f6.
We've done this sort of cleanup before (see commit c685993d7).
* src/conf/storage_conf.c (virStoragePoolDefFormat)
(virStorageVolTargetDefFormat): Cast gid_t and uid_t.
We are so close to a release that we don't want to pull in a
gnulib submodule update and risk regressions, since there has
been a lot of other gnulib churn upstream. However, there are
a couple of gnulib issues that are worth fixing in isolation,
by applying local patches to gnulib.
There was an upstream gnulib bug in maint.mk that rendered most
of our syntax checks ineffective (and fixing it flushed out a
minor bug in our code):
https://lists.gnu.org/archive/html/bug-gnulib/2012-03/msg00194.html
There is still an upstream bug where gnulib uses the wrong type
for ssize_t on mingw; we need the fix now even though it has not
yet been accepted into gnulib:
https://lists.gnu.org/archive/html/bug-gnulib/2012-03/msg00188.html
* gnulib/local/top/maint.mk.diff: Pick up upstream gnulib
maint.mk.
* gnulib/local/m4/ssize_t.m4.diff: Work around gnulib bug.
* src/libvirt.c: Remove unused header.
* cfg.mk
(exclude_file_name_regexp--sc_prohibit_empty_lines_at_EOF): Exempt
gnulib local files.
qemuBuildHostNetStr had a switch-within-a-switch where both were
looking at the same variable. This was apparently to take advantage of
code common to three different cases (while also taking care of some
code that was different). However, there were only 2 lines common to
all, one of those can be eliminated by merging it into the
virAsprintfs that are in each case. On top of that, all the extra
empty cases cause Coverity complaints (because they are unreachable),
but absence of the empty cases causes a compile error due to
"enumeration value not handled in switch".
The solution is to just make each toplevel case independent, folding
in the common code to each.
commit b0e2bb33 set a default value for the SPICE agent channel by
inserting it during parsing of the channel XML. That method of setting
a default is problematic because it makes a format/parse roundtrip
unclean, and experience with setting other values as a side effect of
parsing has led to headaches (e.g. automatically setting a MAC address
in the parser when one isn't specified in the input XML).
This patch does not revert commit b0e2bb33 (it will be reverted in a
separate patch) but adds the alternate implementation of simply
inserting the default value in the appropriate place on the qemu
commandline when no value is provided.
If we issue guest command and GA is not running, the issuing thread
will block endlessly. We can check for GA presence by issuing
guest-sync with unique ID (timestamp). We don't want to issue real
command as even if GA is not running, once it is started, it process
all commands written to GA socket.
With latest gnulib we are checking even the lowest level functions
whether they check flags. Moreover, we are shadowing the real error
on system without TUNSETIFF support.
The code is splattered with a mix of
sizeof foo
sizeof (foo)
sizeof(foo)
Standardize on sizeof(foo) and add a syntax check rule to
enforce it
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A handful of places used %zd for format specifiers even
though the args was size_t, not ssize_t.
* src/remote/remote_driver.c, src/util/xml.c: s/%zd/%zu/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
* src/conf/domain_conf.c (virDomainChannelDefCheckABIStability): avoid
crashing libvirtd due to derefing a NULL pointer.
For details, please see bug:
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=808371
Signed-off-by: Alex Jia <ajia@redhat.com>
When qemu cannot start, we may call qemuProcessStop() twice.
We have check whether the vm is running at the beginning of
qemuProcessStop() to avoid libvirt deadlock. We call
qemuProcessStop() with driver and vm locked. It seems that
we can avoid libvirt deadlock. But unfortunately we may
unlock driver and vm in the function qemuProcessKill() while
vm->def->id is not -1. So qemuProcessStop() will be run twice,
and monitor will be freed unexpectedly. So we should set
vm->def->id to -1 at the beginning of qemuProcessStop().
An upstream gnulib bug[1] meant that some of our syntax checks
weren't being run. Fix up our offenders before we upgrade to
a newer gnulib.
[1] https://lists.gnu.org/archive/html/bug-gnulib/2012-03/msg00194.html
* src/util/virnetdevtap.c (virNetDevTapCreate): Use flags.
* tests/lxcxml2xmltest.c (mymain): Strip useless ().
In the current V3 migration protocol, Libvirt does not
check the result of the function
qemuMigrationVPAssociatePortProfiles
This means that it is possible for a migration to complete
successfully even when the VM loses network connectivity on
the destination host.
With this change libvirt aborts the migration
(during the "finish" step) when the above function fails, that
is to say when at least one of the port profile associations fails.
Signed-off by: Christian Benvenuti <benve@cisco.com>
libvirt documentation for channels with type 'spicevmc' says that the
'target' child node has:
"an optional attribute name controls how the guest will have access
to the channel, and defaults to name='com.redhat.spice.0'."
However, this default value is never set in libvirt code base,
there's only a check in qemu_command.c to error out if the name
attribute doesn't have the expected value (if it's set).
This commit sets a default target name for spicevmc channels during
the domain configuration parsing so that the code agrees with the
documentation.
Commit d42a2ff caused a regression in creating a disk-only snapshot
of a qcow2 disk; by passing the wrong variable to the monitor call,
libvirt ended up creating JSON that looked like "format":null instead
of the intended "format":"qcow2".
To make it easier to diagnose this in the future, make JSON creation
error out if "s:arg" is paired with NULL (it is still possible to
use "n:arg" in the rare cases where qemu will accept a null).
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Pass correct value.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMakeCommandRaw):
Improve error message.
Pass argv to the init binary of LXC, using a new <initarg> element.
* docs/formatdomain.html.in: Document <os> usage for containers
* docs/schemas/domaincommon.rng: Add <initarg> element
* src/conf/domain_conf.c, src/conf/domain_conf.h: parsing and
formatting of <initarg>
* src/lxc/lxc_container.c: Setup LXC argv
* tests/Makefile.am, tests/lxcxml2xmldata/lxc-systemd.xml,
tests/lxcxml2xmltest.c, tests/testutilslxc.c,
tests/testutilslxc.h: Test parsing/formatting of LXC related
XML parts
The SELinux mount point moved from /selinux to /sys/fs/selinux
when systemd came along.
* configure.ac: Probe for SELinux mount point
* src/lxc/lxc_container.c: Use SELinux mount point determined
by configure.ac
When libvirtd is restarted, also restart the netlink event
message callbacks for existing VEPA connections and send
a message to lldpad for these existing links, so it learns
the new libvirtd pid.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
This avoids possible deadlock of the qemu driver in case a domain is
begin migrated (in Begin phase) and unrelated connection to qemu driver
is closed at the right time.
I checked all callers of qemuDomainCheckEjectableMedia() and they are
calling this function with qemu driver locked.
Found when attempting to build on Fedora 17 alpha with:
./autogen.sh --system --enable-compile-warnings=error
(this same build command works without problem on Fedora 16). Since
the consumer of the qemuProcessReconnectData doesn't assume that the
other fields of the struct are initialized (although it uses them
internally), the simpler solution is to just switch to C99-style
struct initialization (which doesn't require specification of all
fields).
libvirt always adds -Werror-frame-larger-than=4096 to the flags when
it builds. When building on Fedora 17, two functions with multiple
1024 buffers declared inside if {} blocks would generate frame size
errors; apparently the version of gcc on Fedora 16 will merge these
multiple buffers into a single buffer even when optimization is off,
but Fedora 17 won't.
The fix is to declare a single 1024 buffer at the top of the two
offending functions, and reuse the single buffer throughout the
functions.
Return statements with parameter enclosed in parentheses were modified
and parentheses were removed. The whole change was scripted, here is how:
List of files was obtained using this command:
git grep -l -e '\<return\s*([^()]*\(([^()]*)[^()]*\)*)\s*;' | \
grep -e '\.[ch]$' -e '\.py$'
Found files were modified with this command:
sed -i -e \
's_^\(.*\<return\)\s*(\(\([^()]*([^()]*)[^()]*\)*\))\s*\(;.*$\)_\1 \2\4_' \
-e 's_^\(.*\<return\)\s*(\([^()]*\))\s*\(;.*$\)_\1 \2\3_'
Then checked for nonsense.
The whole command looks like this:
git grep -l -e '\<return\s*([^()]*\(([^()]*)[^()]*\)*)\s*;' | \
grep -e '\.[ch]$' -e '\.py$' | xargs sed -i -e \
's_^\(.*\<return\)\s*(\(\([^()]*([^()]*)[^()]*\)*\))\s*\(;.*$\)_\1 \2\4_' \
-e 's_^\(.*\<return\)\s*(\([^()]*\))\s*\(;.*$\)_\1 \2\3_'
When qparams support was dropped in commit bc1ff160, we forgot
to add tests to ensure that viruri can do the same round trip
handling of a URI. This round trip was broken, due to use
of the old 'query' field of xmlUriPtr, instead of the new
'query_raw'
Also, we forgot to report an OOM error.
* tests/viruritest.c (mymain): Add tests based on just-deleted
qparamtest.
(testURIParse): Allow difference in input and expected output.
* src/util/viruri.c (virURIFormat): Add missing error. Use
query_raw, instead of query for xmlUriPtr object.
The oVirt developers have stated that the real reasons they want
to have qemu reuse existing volumes when creating a snapshot are:
1. the management framework is set up so that creation has to be
done from a central node for proper resource tracking, and having
libvirt and/or qemu create things violates the framework, and
2. qemu defaults to creating snapshots with an absolute path to
the backing file, but oVirt wants to manage a backing chain that
uses just relative names, to allow for easier migration of a chain
across storage locations.
When 0.9.10 added VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT (commit
4e9953a4), it only addressed point 1, but libvirt was still using
O_TRUNC which violates point 2. Meanwhile, the new qemu
'transaction' monitor command includes a new optional mode argument
that will force qemu to reuse the metadata of the file it just
opened (with the burden on the caller to have valid metadata there
in the first place). So, this tweaks the meaning of the flag to
cover both points as intended for use by oVirt. It is not strictly
backward-compatible to 0.9.10 behavior, but it can be argued that
the O_TRUNC of 0.9.10 was a bug.
Note that this flag is all-or-nothing, and only selects between
'existing' and the default 'absolute-paths'. A more flexible
approach that would allow per-disk selections, as well as adding
support for the 'no-backing-file' mode, would be possible by
extending the <domainsnapshot> xml to have a per-disk mode, but
until we have a management application expressing a need for that
additional complexity, it is not worth doing.
* src/libvirt.c (virDomainSnapshotCreateXML): Tweak documentation.
* src/qemu/qemu_monitor.h (qemuMonitorDiskSnapshot): Add
parameters.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): Pass them
through.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot): Use
new monitor command arguments.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive)
(qemuDomainSnapshotCreateSingleDiskActive): Adjust callers.
(qemuDomainSnapshotDiskPrepare): Allow qed, modify rules on reuse.
The hardest part about adding transactions is not using the new
monitor command, but undoing the partial changes we made prior
to a failed transaction.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive): Use
transaction when available.
(qemuDomainSnapshotUndoSingleDiskActive): New function.
(qemuDomainSnapshotCreateSingleDiskActive): Pass through actions.
(qemuDomainSnapshotCreateXML): Adjust caller.
QEmu 1.1 is adding a 'transaction' command to the JSON monitor.
Each element of a transaction corresponds to a top-level command,
with the additional guarantee that the transaction flushes all
pending I/O, then guarantees that all actions will be successful
as a group or that failure will roll back the state to what it
was before the monitor command. The difference between a
top-level command:
{ "execute": "blockdev-snapshot-sync", "arguments":
{ "device": "virtio0", ... } }
and a transaction:
{ "execute": "transaction", "arguments":
{ "actions": [
{ "type": "blockdev-snapshot-sync", "data":
{ "device": "virtio0", ... } } ] } }
is just a couple of changed key names and nesting the shorter
command inside a JSON array to the longer command. This patch
just adds the framework; the next patch will actually use a
transaction.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONMakeCommand): Move
guts...
(qemuMonitorJSONMakeCommandRaw): ...into new helper. Add support
for array element.
(qemuMonitorJSONTransaction): New command.
(qemuMonitorJSONDiskSnapshot): Support use in a transaction.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot): Add
argument.
(qemuMonitorJSONTransaction): New declaration.
* src/qemu/qemu_monitor.h (qemuMonitorTransaction): Likewise.
(qemuMonitorDiskSnapshot): Add argument.
* src/qemu/qemu_monitor.c (qemuMonitorTransaction): New wrapper.
(qemuMonitorDiskSnapshot): Pass argument on.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Update caller.
Taking an external snapshot of just one disk is atomic, without having
to pause and resume the VM. This also paves the way for later patches
to interact with the new qemu 'transaction' monitor command.
The various scenarios when requesting atomic are:
online, 1 disk, old qemu - safe, allowed by this patch
online, more than 1 disk, old qemu - failure, this patch
offline snapshot - safe, once a future patch implements offline disk snapshot
online, 1 or more disks, new qemu - safe, once future patch uses transaction
Taking an online system checkpoint snapshot is atomic, since it is
done via a single 'savevm' monitor command. Taking an offline system
checkpoint snapshot is atomic, thanks to the previous patch.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Support
new flag for single-disk setups.
(qemuDomainSnapshotDiskPrepare): Check for atomic here.
(qemuDomainSnapshotCreateDiskActive): Skip pausing the VM when
atomic supported.
(qemuDomainSnapshotIsAllowed): Use bool instead of int.
Offline internal snapshots can be rolled back with just a little
bit of refactoring, meaning that we are now automatically atomic.
* src/qemu/qemu_domain.c (qemuDomainSnapshotForEachQcow2): Move
guts...
(qemuDomainSnapshotForEachQcow2Raw): ...to new helper, to allow
rollbacks.
Right now, it is appallingly easy to cause qemu disk snapshots
to alter a domain then fail; for example, by requesting a two-disk
snapshot where the second disk name resides on read-only storage.
In this failure scenario, libvirt reports failure, but modifies
the live domain XML in-place to record that the first disk snapshot
was taken; and places a difficult burden on the management app
to grab the XML and reparse it to see which disks, if any, were
altered by the partial snapshot.
This patch adds a new flag where implementations can request that
the hypervisor make snapshots atomically; either no changes to
XML occur, or all disks were altered as a group. If you request
the flag, you either get outright failure up front, or you take
advantage of hypervisor abilities to make an atomic snapshot. Of
course, drivers should prefer the atomic means even without the
flag explicitly requested.
There's no way to make snapshots 100% bulletproof - even if the
hypervisor does it perfectly atomic, we could run out of memory
during the followup tasks of updating our in-memory XML, and report
a failure. However, these sorts of catastrophic failures are rare
and unlikely, and it is still nicer to know that either all
snapshots happened or none of them, as that is an easier state to
recover from.
* include/libvirt/libvirt.h.in
(VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC): New flag.
* src/libvirt.c (virDomainSnapshotCreateXML): Document it.
* tools/virsh.c (cmdSnapshotCreate, cmdSnapshotCreateAs): Expose it.
* tools/virsh.pod (snapshot-create, snapshot-create-as): Document
it.
We need a capability bit to gracefully error out if some of the
additions in future patches can't be implemented by the running qemu.
* src/qemu/qemu_capabilities.h (QEMU_CAPS_TRANSACTION): New cap.
* src/qemu/qemu_capabilities.c (qemuCaps): Name it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONCheckCommands): Set
it.
Recent changes have caused build failures on systems where pdwtags works:
commit a26a196 mistakenly exported a public variable
commits a26a196, 57ddcc2, 487c063 all had copy-paste bugs in
hand-updating the golden API rather than rerunning pdwtags
* include/libvirt/libvirt.h.in (virDomainEventTrayChangeReason):
Make this a typedef, not external storage.
* src/remote_protocol-structs (remote_procedure): Fix spelling.
This introduces a new running reason VIR_DOMAIN_RUNNING_WAKEUP,
and new suspend event type VIR_DOMAIN_EVENT_STARTED_WAKEUP.
While a wakeup event is emitted, the domain which entered into
VIR_DOMAIN_PMSUSPENDED will be transferred to "running"
with reason VIR_DOMAIN_RUNNING_WAKEUP, and a new domain lifecycle
event emitted with type VIR_DOMAIN_EVENT_STARTED_WAKEUP.
This introduces a new domain state pmsuspended to represent
the domain which has been suspended by guest power management,
e.g. (entered itno s3 state). Because a "running" state could
be confused in this case, one will see the guest is paused
actually while playing. And state "paused" is for the domain
which was paused by virDomainSuspend.
This patch introduces a new event type for the QMP event
SUSPEND:
VIR_DOMAIN_EVENT_ID_PMSUSPEND
The event doesn't take any data, but considering there might
be reason for wakeup in future, the callback definition is:
typedef void
(*virConnectDomainEventSuspendCallback)(virConnectPtr conn,
virDomainPtr dom,
int reason,
void *opaque);
"reason" is unused currently, always passes "0".
This patch introduces a new event type for the QMP event
WAKEUP:
VIR_DOMAIN_EVENT_ID_PMWAKEUP
The event doesn't take any data, but considering there might
be reason for wakeup in future, the callback definition is:
typedef void
(*virConnectDomainEventWakeupCallback)(virConnectPtr conn,
virDomainPtr dom,
int reason,
void *opaque);
"reason" is unused currently, always passes "0".
This is similiar with physical world, one will be surprised if the
box starts with medium exists while the tray is open.
New tests are added, tests disk-{cdrom,floppy}-tray are for the qemu
supports "-device" flag, and disk-{cdrom,floppy}-no-device-cap are
for old qemu, i.e. which doesn't support "-device" flag.
This patch introduces a new event type for the QMP event
DEVICE_TRAY_MOVED, which occurs when the tray of a removable
disk is moved (i.e opened or closed):
VIR_DOMAIN_EVENT_ID_TRAY_CHANGE
The event's data includes the device alias and the reason
for tray status' changing, which indicates why the tray
status was changed. Thus the callback definition for the event
is:
enum {
VIR_DOMAIN_EVENT_TRAY_CHANGE_OPEN = 0,
VIR_DOMAIN_EVENT_TRAY_CHANGE_CLOSE,
\#ifdef VIR_ENUM_SENTINELS
VIR_DOMAIN_EVENT_TRAY_CHANGE_LAST
\#endif
} virDomainEventTrayChangeReason;
typedef void
(*virConnectDomainEventTrayChangeCallback)(virConnectPtr conn,
virDomainPtr dom,
const char *devAlias,
int reason,
void *opaque);
Libvirt on x86 parses 'dmidecode' to gather characteristics of host
system. On PowerPC, this is now implemented by reading /proc/cpuinfo
NOTE: memory-DIMM information is not presently implemented.
Acked-by: Daniel Veillard <veillard@redhat.com>
Acked-by: Daniel P Berrange <berrange@redhat.com>
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
When SASL requests auth credentials, try to look them up in the
config file first. If any are found, remove them from the list
that the user is prompted for
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
SASL may prompt for credentials after either a 'start' or 'step'
invocation. In both cases the code to handle this is the same.
Refactor this code into a separate method to reduce the duplication,
since the complexity is about to grow
* src/remote/remote_driver.c: Refactor interaction with SASL
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that the functions in virauth.h have names matching the file
prefix, by renaming virRequest{Username,Password} to
virAuthGet{Username,Password}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
To follow latest naming conventions, rename src/util/authhelper.[ch]
to src/util/virauth.[ch].
* src/util/authhelper.[ch]: Rename to src/util/virauth.[ch]
* src/esx/esx_driver.c, src/hyperv/hyperv_driver.c,
src/phyp/phyp_driver.c, src/xenapi/xenapi_driver.c: Update
for renamed include files
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The '.ini' file format is a useful alternative to the existing
config file style, when you need to have config files which
are hashes of hashes. The 'virKeyFilePtr' object provides a
way to parse these file types.
* src/Makefile.am, src/util/virkeyfile.c,
src/util/virkeyfile.h: Add .ini file parser
* tests/Makefile.am, tests/virkeyfiletest.c: Test
basic parsing capabilities
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Convert drivers currently using the qparams APIs, to instead
use the virURIPtr query parameters directly.
* src/esx/esx_util.c, src/hyperv/hyperv_util.c,
src/remote/remote_driver.c, src/xenapi/xenapi_utils.c: Remove
use of qparams
* src/util/qparams.h, src/util/qparams.c: Delete
* src/Makefile.am, src/libvirt_private.syms: Remove qparams
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Avoid the need for each driver to parse query parameters itself
by storing them directly in the virURIPtr struct. The parsing
code is a copy of that from src/util/qparams.c The latter will
be removed in a later patch
* src/util/viruri.h: Add query params to virURIPtr
* src/util/viruri.c: Parse query parameters when creating virURIPtr
* tests/viruritest.c: Expand test to cover params
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of just typedef'ing the xmlURIPtr struct for virURIPtr,
use a custom libvirt struct. This allows us to fix various
problems with libxml2. This initially just fixes the query vs
query_raw handling problems.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The parameter in the virURIFormat impl mistakenly used the
xmlURIPtr type, instead of virURIPtr. Since they will soon
cease to be identical, this needs fixing
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since we defined a custom virURIPtr type, we should use a
virURIFree method instead of assuming it will always be
a typedef for xmlURIPtr
* src/util/viruri.c, src/util/viruri.h, src/libvirt_private.syms:
Add a virURIFree method
* src/datatypes.c, src/esx/esx_driver.c, src/libvirt.c,
src/qemu/qemu_migration.c, src/vmx/vmx.c, src/xen/xend_internal.c,
tests/viruritest.c: s/xmlFreeURI/virURIFree/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When a client which started non-p2p migration dies in a bad time, the
source libvirtd never clears the migration job and almost nothing can be
done with the domain without restarting the daemon. This patch makes use
of connection close callbacks and ensures that migration job is properly
discarded when the client disconnects.
Destination daemon should not rely on the client or source daemon
(depending on the type of migration) to call Finish when migration
fails, because the client may crash before it can do so. The domain
prepared for incoming migration is set to be destroyed (and migration
job cleaned up) when connection with the client closes but this is not
enough. If the associated qemu process crashes after Prepare step and
the domain is cleaned up before the connection gets closed, autodestroy
is not called for the domain and migration jobs remains set. In case the
domain is defined on destination host (i.e., it is not completely
removed once destroyed) we keep the job set for ever. To fix this, we
register a cleanup callback which is responsible to clean migration-in
job when a domain dies anywhere between Prepare and Finish steps. Note
that we can't blindly clean any job when spotting EOF on monitor since
normally an API is running at that time.
This reverts commit 61f2b6ba5f and most of
commit d8916dc8e2, which effectively
brings back commit ef1065cf5a written by
Jim Fehlig:
The qemu migration speed default is 32MiB/s as defined in migration.c
/* Migration speed throttling */
static int64_t max_throttle = (32 << 20);
There's no need to throttle migration when targeting a file, so set
migration speed to unlimited prior to migration, and restore to libvirt
default value after migration.
Default units is MB for migrate_set_speed monitor command, so
(INT64_MAX / (1024 * 1024)) is used for unlimited migration speed.
This was reverted because migration to file could not be canceled and
even monitored since qemu was not processing any monitor commands until
the migration finished. This is now different as we make sure the
file descriptor we pass to qemu is able to properly report EAGAIN.
Recent qemu changes might have helped as well.
I tested managedsave with this patch in and indeed, it is 10x faster
while I can still monitor its progress.
A few times libvirt users manually setting mac addresses have
complained of a networking failure that ends up being due to a multicast
mac address being used for a guest interface. This patch prevents that
by logging an error and failing if a multicast mac address is
encountered in each of the three following cases:
1) domain xml <interface> mac address.
2) network xml bridge mac address.
3) network xml dhcp/host mac address.
There are several other places where a mac address can be input that
aren't controlled in this manner because failure to do so has no
consequences (e.g., if the address will be used to search through
existing interfaces for a match).
The RNG has been updated to add multiMacAddr and uniMacAddr along with
the existing macAddr, and macAddr was switched to uniMacAddr where
appropriate.
If an error was encountered parsing a dhcp host entry mac address or
name, parsing would continue and log a less descriptive error that
might make it more difficult to notice the true nature of the problem.
This patch returns immediately on logging the first error.
This patch is in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=798467
If a guest's tap device is created using the same MAC address the
guest uses for its own network card (which connects to the tap
device), the Linux kernel will log the following message and traffic
will not pass:
kernel: vnet9: received packet with own address as source address
This patch disallows MAC addresses with a first byte of 0xFE, but only in
the case that the MAC address is used for a guest interface that's
connected by way of a standard tap device. (In other words, the
validation is done at runtime at the same place the MAC address is
modified for the tap device, rather than when mac address is parsed,
the idea being that it is then we know for sure the address will be
problematic.)
Using inheritance, this patch cleans up the cpu_map.xml file and also
sorts all CPU features according to the feature and registry
values. Model features are sorted the same way as foeatures in the
specification.
Also few models that are related were organized together and parts of
the XML are marked with comments
If a guest is paused, we were silently ignoring the quiesce flag,
which results in unclean snapshots, contrary to the intent of the
flag. Since we can't quiesce without guest agent support, we should
instead fail if the guest is not running.
Meanwhile, if we attempt a quiesce command, but the guest agent
doesn't respond, and we time out, we may have left the command
pending on the guest's queue, and when the guest resumes parsing
commands, it will freeze even though our command is no longer
around to issue a thaw. To be safe, we must _always_ pair every
quiesce call with a counterpart thaw, even if the quiesce call
failed due to a timeout, so that if a guest wakes up and starts
processing a command backlog, it will not get stuck in a frozen
state.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive):
Always issue thaw after a quiesce, even if quiesce failed.
(qemuDomainSnapshotFSThaw): Add a parameter.
This patch fixes a NULL pointer check that was causing SegFault on
some specific configurations. It also reverts commit 59d0c9801c
that was checking for this value in one place.
A common coding pattern for changing blkio parameters is
1. virDomainGetBlkioParameters
2. change one or more params
3. virDomainSetBlkioParameters
For this to work, it must be possible to roundtrip through
the methods without error. Unfortunately virDomainGetBlkioParameters
will return "" for the deviceWeight parameter for guests by default,
which virDomainSetBlkioParameters will then reject as invalid.
This fixes the handling of "" to be a no-op, and also improves the
error message to tell you what was invalid
How to reproduce:
% valgrind -v --leak-check=full virsh migrate mig \
qemu+ssh://$dest/system --unsafe
== 8 bytes in 1 blocks are definitely lost in loss record 1 of 28
== at 0x4A04A28: calloc (vg_replace_malloc.c:467)
== by 0x3EB7115FB8: xdr_reference (in /lib64/libc-2.12.so)
== by 0x3EB7115F10: xdr_pointer (in /lib64/libc-2.12.so)
== by 0x4D1EA84: xdr_remote_string (remote_protocol.c:40)
== by 0x4D1EAD8: xdr_remote_domain_migrate_prepare3_ret (remote_protocol.c:4772)
== by 0x4D2FFD2: virNetMessageDecodePayload (virnetmessage.c:382)
== by 0x4D2789C: virNetClientProgramCall (virnetclientprogram.c:382)
== by 0x4D0707D: callWithFD (remote_driver.c:4549)
== by 0x4D070FB: call (remote_driver.c:4570)
== by 0x4D12AEE: remoteDomainMigratePrepare3 (remote_driver.c:4138)
== by 0x4CF7BE9: virDomainMigrateVersion3 (libvirt.c:4815)
== by 0x4CF9432: virDomainMigrate2 (libvirt.c:5454)
==
== LEAK SUMMARY:
== definitely lost: 8 bytes in 1 blocks
== indirectly lost: 0 bytes in 0 blocks
== possibly lost: 0 bytes in 0 blocks
== still reachable: 126,995 bytes in 1,343 blocks
== suppressed: 0 bytes in 0 blocks
This patch also fixes the leaks in remoteDomainMigratePrepare and
remoteDomainMigratePrepare2.
* src/libvirt.c (virStorageVolResize): correct comment typo according to
virStorageVolResizeFlags enum definition.
Signed-off-by: Alex Jia <ajia@redhat.com>
If no <interface> elements are included in an LXC guest XML
description, then the LXC guest will just see the host's
network interfaces. It is desirable to be able to hide the
host interfaces, without having to define any guest interfaces.
This patch introduces a new feature flag <privnet/> to allow
forcing of a private network namespace for LXC. In the future
I also anticipate that we will add <privuser/> to force a
private user ID namespace.
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add support
for <privnet/> feature. Auto-set <privnet> if any <interface>
devices are defined
* src/lxc/lxc_container.c: Honour request for private network
namespace
Commit e457d5ef20 adds ability to pass the
default URI using the client configuration file. If the file is not
present, it still accesses the NULL config object causing a segfault.
Caught running "make check".
Wire up the domain graphics event notifications for SPICE. Adapted
from a RHEL-only patch written by Dan Berrange that used custom
__com.redhat_SPICE events - equivalent events are now available in
upstream QEMU (including a SPICE_CONNECTED event, which was missing in
the __COM.redhat_SPICE version).
* src/qemu/qemu_monitor_json.c: Wire up SPICE graphics events
Currently if the URI passed to virConnectOpen* is NULL, then we
- Look for LIBVIRT_DEFAULT_URI env var
- Probe for drivers
This changes it so that
- Look for LIBVIRT_DEFAULT_URI env var
- Look for 'uri_default' in $HOME/.libvirt/libvirt.conf
- Probe for drivers
numad is an user-level daemon that monitors NUMA topology and
processes resource consumption to facilitate good NUMA resource
alignment of applications/virtual machines to improve performance
and minimize cost of remote memory latencies. It provides a
pre-placement advisory interface, so significant processes can
be pre-bound to nodes with sufficient available resources.
More details: http://fedoraproject.org/wiki/Features/numad
"numad -w ncpus:memory_amount" is the advisory interface numad
provides currently.
This patch add the support by introducing a new XML attribute
for <vcpu>. e.g.
<vcpu placement="auto">4</vcpu>
<vcpu placement="static" cpuset="1-10^6">4</vcpu>
The returned advisory nodeset from numad will be printed
in domain's dumped XML. e.g.
<vcpu placement="auto" cpuset="1-10^6">4</vcpu>
If placement is "auto", the number of vcpus and the current
memory amount specified in domain XML will be used for numad
command line (numad uses MB for memory amount):
numad -w $num_of_vcpus:$current_memory_amount / 1024
The advisory nodeset returned from numad will be used to set
domain process CPU affinity then. (e.g. qemuProcessInitCpuAffinity).
If the user specifies both CPU affinity policy (e.g.
(<vcpu cpuset="1-10,^7,^8">4</vcpu>) and placement == "auto"
the specified CPU affinity will be overridden.
Only QEMU/KVM drivers support it now.
See docs update in patch for more details.
With current code, we pass true iff domain is cold booting. However,
if disk is inaccessible and startupPolicy for that disk is set to
'requisite' we have to fail iff cold booting.
AMD Bulldozer (or Opteron_G4 as called in QEMU) was added to the list
of cpu models, flags were taken from upstream qemu cpu specifications
and should be sorted by bit values (or first occurence in the feature
specification part of cpu_map.xml).
Based on QEMU upstream commit 885bb0369a4f0abe2c0185178f3cb347cb02cdf1.
Even though we say in documentation setting (tls-)port to -1 is legacy
compat style for enabling autoport, we're roughly doing this for VNC.
However, in case of SPICE auto enable autoport iff both port & tlsPort
are equal -1 as documentation says autoport plays with both.
In qemuDomainDetachNetDevice, detach was being used before it had been
validated. If no matching device was found, this resulted in a
dereference of a NULL pointer.
This behavior was a regression introduced in commit
cf90342be0, so it has not been a part of
any official libvirt release.
When host-model and host-passthrouh CPU modes were introduced, qemu
driver was properly modify to update guest CPU definition during
migration so that we use the right CPU at the destination. However,
similar treatment is needed for (managed)save and snapshots since they
need to save the exact CPU so that a domain can be properly restored.
To avoid repetition of such situation, all places that need live XML
share the code which generates it.
As a side effect, this patch fixes error reporting from
qemuDomainSnapshotWriteMetadata().
Thanks to cgroups, providing user vs. system time of the overall
guest is easy to add to our existing API.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_CPU_STATS_USERTIME)
(VIR_DOMAIN_CPU_STATS_SYSTEMTIME): New constants.
* src/util/virtypedparam.h (virTypedParameterArrayValidate)
(virTypedParameterAssign): Enforce checking the result.
* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Fix offender.
(qemuDomainGetTotalcpuStats): Implement new parameters.
* tools/virsh.c (cmdCPUStats): Tweak output accordingly.
As documented in linux.git/Documentation/cgroups/cpuacct.txt,
cpuacct.stat returns user and system time in ticks (the same
unit used in times(2)). It would be a bit nicer if it were like
getrusage(2) and reported timeval contents, or like cpuacct.usage
and in nanoseconds, but we can't be picky.
* src/util/cgroup.h (virCgroupGetCpuacctStat): New function.
* src/util/cgroup.c (virCgroupGetCpuacctStat): Implement it.
(virCgroupGetValueStr): Allow for multi-line files.
* src/libvirt_private.syms (cgroup.h): Export it.
If there is a disk file with a comma in the name, QEmu expects a double
comma instead of a single one (e.g., the file "virtual,disk.img" needs
to be specified as "virtual,,disk.img" in QEmu's command line). This
patch fixes libvirt to work with that feature. Fix RHBZ #801036.
Based on an initial patch by Crístian Viana.
* src/util/buf.h (virBufferEscape): Alter signature.
* src/util/buf.c (virBufferEscape): Add parameter.
(virBufferEscapeSexpr): Fix caller.
* src/qemu/qemu_command.c (qemuBuildRBDString): Likewise. Also
escape commas in file names.
(qemuBuildDriveStr): Escape commas in file names.
* docs/schemas/basictypes.rng (absFilePath): Relax RNG to allow
commas in input file names.
* tests/qemuxml2argvdata/*-disk-drive-network-sheepdog.*: Update
test.
Signed-off-by: Eric Blake <eblake@redhat.com>
We found few more AMD-specific features in cpu64-rhel* models that
made it impossible to start qemu guest on Intel host (with this
setting) even though qemu itself starts correctly with them.
This impacts one test, thus the fix in tests/cputestdata/.
virNetworkDNSHostsDefParseXML was calling VIR_ALLOC(def->hosts) if
def->hosts was NULL. This is a waste of time, though, since
VIR_REALLOC_N is called a few lines further down, prior to any use of
def->hosts. (initializing def->nhosts to 0 is also redundant, because
the newly allocated memory will always be cleared to all 0's anyway).
If user hasn't supplied any tlsPort we default to setting it
to zero in our internal structure. However, when building command
line we test it against -1 which is obviously wrong.
This is nearly identical to an earlier patch for virnetlink.c.
There are special stub versions of all public functions in this file
that are compiled when the platform isn't linux. Each of these
functions had an almost identical message, differing only in the
function name included in the message. Since log messages already
contain the function name, we can just define a const char* with the
common part of the string, and use that same string for all the log
messages.
If nothing else, this at least makes for less strings that need
translating...
This function was freeing a virDomainNetDef with
VIR_FREE(). virDomainNetDef is a complex structure with many pointers
to other dynamically allocated data; to properly free it
virDomainNetDefFree() must be called instead, otherwise several
strings (and potentially other things) will be leaked.
For some reason, although live hotplug of <hostdev> devices is
supported, persistent hotplug is not. This patch adds the proper
VIR_DOMAIN_DEVICE_HOSTDEV cases to the switches in
qemuDomainAttachDeviceConfig and qemuDomainDetachDeviceConfig.
There are several functions that call virNetlinkCommand, and they all
follow a common pattern, with three exit labels: err_exit (or
cleanup), malformed_resp, and buffer_too_small. All three of these
labels do their own cleanup and have their own return. However, the
malformed_resp label usually frees the same items as the
cleanup/err_exit label, and the buffer_too_small label just doesn't
free recvbuf (because it's known to always be NULL at the time we goto
buffer_too_small.
In order to simplify and standardize the code, I've made the following
changes to all of these functions:
1) err_exit is replaced with the more libvirt-ish "cleanup", which
makes sense because in all cases this code is also executed in the
case of success, so labelling it err_exit may be confusing.
2) rc is initialized to -1, and set to 0 just before the cleanup
label. Any code that currently sets rc = -1 is made to instead goto
cleanup.
3) malformed_resp and buffer_too_small just log their error and goto
cleanup. This gives us a single return path, and a single place to
free up resources.
4) In one instance, rather then logging an error immediately, a char*
msg was pointed to an error string, then goto cleanup (and cleanup
would log an error if msg != NULL). It takes no more lines of code
to just log the message as we encounter it.
This patch should have 0 functional effects.
There are several functions in domain_conf.c that remove a device
object from the domain's list of that object type, but don't free the
object or return it to the caller to free. In many cases this isn't a
problem because the caller already had a pointer to the object and
frees it afterward, but in several cases the removed object was just
left floating around with no references to it.
In particular, the function qemuDomainDetachDeviceConfig() calls
functions to locate and remove net (virDomainNetRemoveByMac), disk
(virDomainDiskRemoveByName()), and lease (virDomainLeaseRemove())
devices, but neither it nor its caller qemuDomainModifyDeviceConfig()
ever obtain a pointer to the device being removed, much less free it.
This patch modifies the following "remove" functions to return a
pointer to the device object being removed from the domain device
arrays, to give the caller the option of freeing the device object
using that pointer if needed. In places where the object was
previously leaked, it is now freed:
virDomainDiskRemove
virDomainDiskRemoveByName
virDomainNetRemove
virDomainNetRemoveByMac
virDomainHostdevRemove
virDomainLeaseRemove
virDomainLeaseRemoveAt
The functions that had been leaking:
libxlDomainDetachConfig - leaked a virDomainDiskDef
qemuDomainDetachDeviceConfig - could leak a virDomainDiskDef,
a virDomainNetDef, or a
virDomainLeaseDef
qemuDomainDetachLease - leaked a virDomainLeaseDef
There were certain paths through the hostdev detach code that could
lead to the lower level function failing (and not removing the object
from the domain's hostdevs list), but the higher level function
free'ing the hostdev object anyway. This would leave a stale
hostdevdef pointer in the list, which would surely cause a problem
eventually.
This patch relocates virDomainHostdevRemove from the lower level
functions qemuDomainDetachThisHostDevice and
qemuDomainDetachHostPciDevice, to their caller
qemuDomainDetachThisHostDevice, placing it just before the call to
virDomainHostdevDefFree. This makes it easy to verify that either both
operations are done, or neither.
NB: The "dangling pointer" part of this problem was introduced in
commit 13d5a6, so it is not present in libvirt versions prior to
0.9.9. Earlier versions would return failure in certain cases even
though the the device object was removed/deleted, but the removal and
deletion operations would always both happen or neither.
There are special stub versions of all public functions in this file
that are compiled when either libnl isn't available or the platform
isn't linux. Each of these functions had two almost identical message,
differing only in the function name included in the message. Since log
messages already contain the function name, we can just define a const
char* with the common part of the string, and use that same string for
all the log messages.
Also, rather than doing #if defined ... #else ... #endif *inside the
error log macro invocation*, this patch does #if defined ... just
once, using it to decide which single string to define. This turns the
error log in each function from 6 lines, to 1 line.
This patch will allow OpenFlow controllers to identify which interface
belongs to a particular VM by using the Domain UUID.
ovs-vsctl get Interface vnet0 external_ids
{attached-mac="52:54:00:8C:55:2C", iface-id="83ce45d6-3639-096e-ab3c-21f66a05f7fa", iface-status=active, vm-id="142a90a7-0acc-ab92-511c-586f12da8851"}
V2 changes:
Replaced vm-uuid with vm-id. There was a discussion in Open vSwitch
mailinglist that we should stick with the same DB key postfixes for the
sake of consistency (e.g iface-id, vm-id ...).
The indentation on the final lines of the function was off by four
spaces, making me wonder for a second if there was something
missing. (There wasn't.)
Commit 5d4b0c4c80 tried to fix certain classes of VPATH builds,
but was too limited. In particular, Guannan Ren reported:
> For example: The libvirt source code resides in /home/testuser,
> I make dist in /tmp/buildvpath, the XDR routine .c file will
> include full path of the header file like:
>
> #include "/home/testuser/src/rpc/virnetprotocol.h"
> #include "internal.h"
> #include <arpa/inet.h>
>
> If we distribute the tarball to another machine to compile,
> it will report error as follows:
>
> rpc/virnetprotocol.c:7:59: fatal error:
> /home/testuser/src/rpc/virnetprotocol.h: No such file or directory
* src/rpc/genprotocol.pl: Fix more include lines.
If we need to virFork() to check assess() under different
UID+GID we need to translate returned status via WEXITSTATUS().
Otherwise, we may return values greater than 255 which is
obviously wrong.
The function sanlock_inquire can return NULL in the state string if the
message consists only of a header. The return value is arbitrary and
sent by the server. We should proceed carefully while touching such
pointers.
Some members are generated during XML parse (e.g. MAC address of
an interface); However, with current implementation, if we
are plugging a device both to persistent and live config,
we parse given XML twice: first time for live, second for config.
This is wrong then as the second time we are not guaranteed
to generate same values as we did for the first time.
To prevent that we need to create a copy of DeviceDefPtr;
This is done through format/parse process instead of writing
functions for deep copy as it is easier to maintain:
adding new field to any virDomain*DefPtr doesn't require change
of copying function.
Currently, startupPolicy='requisite' was determining cold boot
by migrateFrom != NULL. That means, if domain was started up
with migrateFrom set we didn't require disk source path and allowed
it to be dropped. However, on snapshot-revert domain wasn't migrated
but according to documentation, requisite should drop disk source
as well.
Output is still in kibibytes, but input can now be in different
scales for ease of typing.
* src/conf/domain_conf.c (virDomainParseMemory): New helper.
(virDomainDefParseXML): Use it when parsing.
* docs/schemas/domaincommon.rng: Expand XML; rename memoryKBElement
to memoryElement and update callers.
* docs/formatdomain.html.in (elementsMemoryAllocation): Document
scaling.
* tests/qemuxml2argvdata/qemuxml2argv-memtune.xml: Adjust test.
* tests/qemuxml2xmltest.c: Likewise.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-memtune.xml: New file.
Using 'unsigned long' for memory values is risky on 32-bit platforms,
as a PAE guest can have more than 4GiB memory. Our API is
(unfortunately) locked at 'unsigned long' and a scale of 1024, but
the rest of our system should consistently use 64-bit values,
especially since the previous patch centralized overflow checking.
* src/conf/domain_conf.h (_virDomainDef): Always use 64-bit values
for memory. Change hugepage_backed to a bool.
* src/conf/domain_conf.c (virDomainDefParseXML)
(virDomainDefCheckABIStability, virDomainDefFormatInternal): Fix
clients.
* src/vmx/vmx.c (virVMXFormatConfig): Likewise.
* src/xenxs/xen_sxpr.c (xenParseSxpr, xenFormatSxpr): Likewise.
* src/xenxs/xen_xm.c (xenXMConfigGetULongLong): New function.
(xenXMConfigGetULong, xenXMConfigSetInt): Avoid truncation.
(xenParseXM, xenFormatXM): Fix clients.
* src/phyp/phyp_driver.c (phypBuildLpar): Likewise.
* src/openvz/openvz_driver.c (openvzDomainSetMemoryInternal):
Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainDefineXML): Likewise.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Likewise.
* src/qemu/qemu_process.c (qemuProcessStart): Likewise.
* src/qemu/qemu_monitor.h (qemuMonitorGetBalloonInfo): Likewise.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONGetBalloonInfo):
Likewise.
* src/qemu/qemu_driver.c (qemudDomainGetInfo)
(qemuDomainGetXMLDesc): Likewise.
* src/uml/uml_conf.c (umlBuildCommandLine): Likewise.
On 64-bit platforms, unsigned long and unsigned long long are
identical, so we don't have to worry about overflow checks.
On 32-bit platforms, anywhere we narrow unsigned long long back
to unsigned long, we have to worry about overflow; it's easier
to do this in one place by having most of the code use the same
or wider types, and only doing the narrowing at the last minute.
Therefore, the memory set commands remain unsigned long, and
the memory get command now centralizes the overflow check into
libvirt.c, so that drivers don't have to repeat the work.
This also fixes a bug where xen returned the wrong value on
failure (most APIs return -1 on failure, but getMaxMemory
must return 0 on failure).
* src/driver.h (virDrvDomainGetMaxMemory): Use long long.
* src/libvirt.c (virDomainGetMaxMemory): Raise overflow.
* src/test/test_driver.c (testGetMaxMemory): Fix driver.
* src/rpc/gendispatch.pl (name_to_ProcName): Likewise.
* src/xen/xen_hypervisor.c (xenHypervisorGetMaxMemory): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainGetMaxMemory): Likewise.
* src/xen/xend_internal.c (xenDaemonDomainGetMaxMemory):
Likewise.
* src/xen/xend_internal.h (xenDaemonDomainGetMaxMemory):
Likewise.
* src/xen/xm_internal.c (xenXMDomainGetMaxMemory): Likewise.
* src/xen/xm_internal.h (xenXMDomainGetMaxMemory): Likewise.
* src/xen/xs_internal.c (xenStoreDomainGetMaxMemory): Likewise.
* src/xen/xs_internal.h (xenStoreDomainGetMaxMemory): Likewise.
* src/xenapi/xenapi_driver.c (xenapiDomainGetMaxMemory):
Likewise.
* src/esx/esx_driver.c (esxDomainGetMaxMemory): Likewise.
* src/libxl/libxl_driver.c (libxlDomainGetMaxMemory): Likewise.
* src/qemu/qemu_driver.c (qemudDomainGetMaxMemory): Likewise.
* src/lxc/lxc_driver.c (lxcDomainGetMaxMemory): Likewise.
* src/uml/uml_driver.c (umlDomainGetMaxMemory): Likewise.
The test domain allows <memory>0</memory>, but the RNG was stating
that memory had to be at least 4096000 bytes. Hypervisors should
enforce their own limits, rather than complicating the RNG.
Meanwhile, some copy and paste had introduced some fishy constructs
in various unit tests.
* docs/schemas/domaincommon.rng (memoryKB, memoryKBElement): Drop
limit that isn't enforced in code.
* src/conf/domain_conf.c (virDomainDefParseXML): Require current
<= maximum.
* tests/qemuxml2argvdata/*.xml: Fix offenders.
Disk manufacturers are fond of quoting sizes in powers of 10,
rather than powers of 2 (after all, 2.1 GB sounds larger than
2.0 GiB, even though the exact opposite is true). So, we might
as well follow coreutils' lead in supporting three types of
suffix: single letter ${u} (which we already had) and ${u}iB
for the power of 2, and ${u}B for power of 10.
Additionally, it is impossible to create a file with more than
2**63 bytes, since off_t is signed (if you have enough storage
to even create one 8EiB file, I'm jealous). This now reports
failure up front rather than down the road when the kernel
finally refuses an impossible size.
* docs/schemas/basictypes.rng (unit): Add suffixes.
* src/conf/storage_conf.c (virStorageSize): Use new function.
* docs/formatstorage.html.in: Document it.
* tests/storagevolxml2xmlin/vol-file-backing.xml: Test it.
* tests/storagevolxml2xmlin/vol-file.xml: Likewise.
Make it obvious to 'dumpxml' readers what unit we are using,
since our default of KiB for memory (1024) differs from qemu's
default of MiB; and differs from our use of bytes for storage.
Tests were updated via:
$ find tests/*data tests/*out -name '*.xml' | \
xargs sed -i 's/<\(memory\|currentMemory\|hard_limit\|soft_limit\|min_guarantee\|swap_hard_limit\)>/<\1 unit='"'KiB'>/"
$ find tests/*data tests/*out -name '*.xml' | \
xargs sed -i 's/<\(capacity\|allocation\|available\)>/<\1 unit='"'bytes'>/"
followed by a few fixes for the stragglers.
Note that with this patch, the RNG for <memory> still forbids
validation of anything except unit='KiB', since the code silently
ignores the attribute; a later patch will expand <memory> to allow
scaled input in the code and update the RNG to match.
* docs/schemas/basictypes.rng (unit): Add 'bytes'.
(scaledInteger): New define.
* docs/schemas/storagevol.rng (sizing): Use it.
* docs/schemas/storagepool.rng (sizing): Likewise.
* docs/schemas/domaincommon.rng (memoryKBElement): New define; use
for memory elements.
* src/conf/storage_conf.c (virStoragePoolDefFormat)
(virStorageVolDefFormat): Likewise.
* src/conf/domain_conf.h (_virDomainDef): Document unit used
internally.
* src/conf/storage_conf.h (_virStoragePoolDef, _virStorageVolDef):
Likewise.
* tests/*data/*.xml: Update all tests.
* tests/*out/*.xml: Likewise.
* tests/define-dev-segfault: Likewise.
* tests/openvzutilstest.c (testReadNetworkConf): Likewise.
* tests/qemuargv2xmltest.c (blankProblemElements): Likewise.
Scaling an integer based on a suffix is something we plan on reusing
in several contexts: XML parsing, virsh CLI parsing, and possibly
elsewhere. Make it easy to reuse, as well as adding in support for
powers of 1000.
* src/util/util.h (virScaleInteger): New function.
* src/util/util.c (virScaleInteger): Implement it.
* src/libvirt_private.syms (util.h): Export it.
Overflow can be user-induced, so it deserves more than being called
an internal error. Note that in general, 32-bit platforms have
far more places to trigger this error (anywhere the public API
used 'unsigned long' but the other side of the connection is a
64-bit server); but some are possible on 64-bit platforms (where
the public API computes the product of two numbers).
* include/libvirt/virterror.h (VIR_ERR_OVERFLOW): New error.
* src/util/virterror.c (virErrorMsg): Translate it.
* src/libvirt.c (virDomainSetVcpusFlags, virDomainGetVcpuPinInfo)
(virDomainGetVcpus, virDomainGetCPUStats): Use it.
* daemon/remote.c (HYPER_TO_TYPE): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockResize): Likewise.
Yes, I like kilobytes better than kibibytes (when I say kilobytes,
I generally mean 1024). But since the term is ambiguous, it can't
hurt to say what we mean, by using both the correct name and
calling out the numeric equivalent.
* src/libvirt.c (virDomainGetMaxMemory, virDomainSetMaxMemory)
(virDomainSetMemory, virDomainSetMemoryFlags)
(virNodeGetFreeMemory): Tweak wording.
* docs/formatdomain.html.in: Likewise.
* docs/formatstorage.html.in: Likewise.
ATTRIBUTE_UNUSED was accidentally forgotten on one arg of a stub
function for functionality that's not present on non-linux
platforms. This causes a non-linux build with
--enable-compile-warnings=error to fail.
The RPC code assumed that the array returned by the driver would be
fully populated; that is, ncpus on entry resulted in ncpus * return
value on exit. However, while we don't support holes in the middle
of ncpus, we do want to permit the case of ncpus on entry being
longer than the array returned by the driver (that is, it should be
safe for the caller to pass ncpus=128 on entry, and the driver will
stop populating the array when it hits max_id).
Additionally, a successful return implies that the caller will then
use virTypedParamArrayClear on the entire array; for this to not
free uninitialized memory, the driver must ensure that all skipped
entries are explicitly zeroed (the RPC driver did this, but not
the qemu driver).
There are now three cases:
server 0.9.10 and client 0.9.10 or newer: No impact - there were no
hypervisor drivers that supported cpu stats
server 0.9.11 or newer and client 0.9.10: if the client calls with
ncpus beyond the max, then the rpc call will fail on the client side
and disconnect the client, but the server is no worse for the wear
server 0.9.11 or newer and client 0.9.11: the server can return a
truncated array and the client will do just fine
I reproduced the problem by using a host with 2 CPUs, and doing:
virsh cpu-stats $dom --start 1 --count 2
* daemon/remote.c (remoteDispatchDomainGetCPUStats): Allow driver
to omit tail of array.
* src/remote/remote_driver.c (remoteDomainGetCPUStats):
Accommodate driver that omits tail of array.
* src/libvirt.c (virDomainGetCPUStats): Document this.
* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Clear all
unpopulated entries.
* For now, only "cpu_time" is supported.
* cpuacct cgroup is used for providing percpu cputime information.
* src/qemu/qemu.conf - take care of cpuacct cgroup.
* src/qemu/qemu_conf.c - take care of cpuacct cgroup.
* src/qemu/qemu_driver.c - added an interface
* src/util/cgroup.c/h - added interface for getting percpu cputime
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
These changes are applied only if the hostdev has a parent net device
(i.e. if it was defined as "<interface type='hostdev'>" rather than
just "<hostdev>"). If the parent netdevice has virtual port
information, the original virtualport associate functions are called
(these set and restore both mac and port profile on an
interface). Otherwise, only mac address is set on the device.
Note that This is only supported for SR-IOV Virtual Functions (not for
standard PCI or USB netdevs), and virtualport association is only
supported for 802.1Qbh. For all other types of cards and types of
virtualport, a "Config Unsupported" error is returned and the
operation fails.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
This patch includes the following changes to virnetdevmacvlan.c and
virnetdevvportprofile.c:
- removes some netlink functions which are now available in
virnetdev.c
- Adds a vf argument to all port profile functions.
For 802.1Qbh devices, the port profile calls can use a vf argument if
passed by the caller. If the vf argument is -1 it will try to derive the vf
if the device passed is a virtual function.
For 802.1Qbg devices, This patch introduces a null check for the device
argument because during port profile assignment on a hostdev, this argument
can be null.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
This patch adds the following:
- functions to set and get vf configs
- Functions to replace and store vf configs (Only mac address is handled today.
But the functions can be easily extended for vlans and other vf configs)
- function to dump link dev info (This is moved from virnetdevvportprofile.c)
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
pciDeviceGetVirtualFunctionInfo returns pf netdevice name and virtual
function index for a given vf. This is just a wrapper around existing functions
to return vf's pf and vf_index with one api call
pciConfigAddressToSysfsfile returns the sysfile pci device link
from a 'struct pci_config_address'
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
qemuDomainAttachNetDevice
- re-ordered some things at start of function because
networkAllocateActualDevice should always be run and a slot
in def->nets always allocated, but host_net_add isn't needed
if the actual type is hostdev.
- if actual type is hostdev, defer to
qemuDomainAttachHostDevice (which will reach up to the NetDef
for things like MAC address when necessary). After return
from qemuDomainAttachHostDevice, slip directly to cleanup,
since the rest of the function is specific to emulated net
devices.
- put assignment of new NetDef into expanded def->nets down
below cleanup: (but only on success) since it is also needed
for emulated and hostdev net devices.
qemuDomainDetachHostDevice
- after locating the exact device to detach, check if it's a
network device and, if so, use toplevel
qemuDomainDetachNetDevice instead so that the def->nets list
is properly updated, and 'actual device' properly returned to
network pool if appropriate. Otherwise, for normal hostdevs,
call the lower level qemuDomainDetachThisDevice.
qemuDomainDetachNetDevice
- This is where it gets a bit tricky. After locating the device
on the def->nets list, if the network device type == hostdev,
call the *lower level* qemuDomainDetachThisDevice (which will
reach back up to the parent net device for MAC address /
virtualport when appropriate, then clear the device out of
def->hostdevs) before skipping past all the emulated
net-device-specific code to cleanup:, where the network
device is removed from def->nets, and the network device
object is freed.
In short, any time a hostdev-type network device is detached, we must
go through the toplevel virDomaineDetachNetDevice function first and
last, to make sure 1) the def->nnets list is properly managed, and 2)
any device allocated with networkAllocateActualDevice is properly
freed. At the same time, in the middle we need to go through the
lower-level vidDomainDetach*This*HostDevice to be sure that 1) the
def->hostdevs list is properly managed, 2) the PCI device is properly
detached from the guest and reattached to the host (if appropriate),
and 3) any higher level teardown is called at the appropriate time, by
reaching back up to the NetDef config (part (3) will be covered in a
separate patch).
This patch makes sure that each network device ("interface") of
type='hostdev' appears on both the hostdevs list and the nets list of
the virDomainDef, and it modifies the qemu driver startup code so that
these devices will be presented to qemu on the commandline as hostdevs
rather than as network devices.
It does not add support for hotplug of these type of devices, or code
to honor the <mac address> or <virtualport> given in the config (both
of those will be done in separate patches).
Once each device is placed on both lists, much of what this patch does
is modify places in the code that traverse all the device lists so
that these hybrid devices are only acted on once - either along with
the other hostdevs, or along with the other network interfaces. (In
many cases, only one of the lists is traversed / a specific operation
is performed on only one type of device. In those instances, the code
can remain unchanged.)
There is one special case - when building the commandline, interfaces
are allowed to proceed all the way through
networkAllocateActualDevice() before deciding to skip the rest of
netdev-specific processing - this is so that (once we have support for
networks with pools of hostdev devices) we can get the actual device
allocated, then rely on the loop processing all hostdevs to generate
the correct commandline.
(NB: <interface type='hostdev'> is only supported for PCI network
devices that are SR-IOV Virtual Functions (VF). Standard PCI[e] and
USB devices, and even the Physical Functions (PF) of SR-IOV devices
can only be assigned to a guest using the more basic <hostdev> device
entry. This limitation is mostly due to the fact that non-SR-IOV
ethernet devices tend to lose mac address configuration whenever the
card is reset, which happens when a card is assigned to a guest;
SR-IOV VFs fortunately don't suffer the same problem.)
This is the new interface type that sets up an SR-IOV PCI network
device to be assigned to the guest with PCI passthrough after
initializing some network device-specific things from the config
(e.g. MAC address, virtualport profile parameters). Here is an example
of the syntax:
<interface type='hostdev' managed='yes'>
<source>
<address type='pci' domain='0' bus='0' slot='4' function='3'/>
</source>
<mac address='00:11:22:33:44:55'/>
<address type='pci' domain='0' bus='0' slot='7' function='0'/>
</interface>
This would assign the PCI card from bus 0 slot 4 function 3 on the
host, to bus 0 slot 7 function 0 on the guest, but would first set the
MAC address of the card to 00:11:22:33:44:55.
NB: The parser and formatter don't care if the PCI card being
specified is a standard single function network adapter, or a virtual
function (VF) of an SR-IOV capable network adapter, but the upcoming
code that implements the back end of this config will work *only* with
SR-IOV VFs. This is because modifying the mac address of a standard
network adapter prior to assigning it to a guest is pointless - part
of the device reset that occurs during that process will reset the MAC
address to the value programmed into the card's firmware.
Although it's not supported by any of libvirt's hypervisor drivers,
usb network hostdevs are also supported in the parser and formatter
for completeness and consistency. <source> syntax is identical to that
for plain <hostdev> devices, except that the <address> element should
have "type='usb'" added if bus/device are specified:
<interface type='hostdev'>
<source>
<address type='usb' bus='0' device='4'/>
</source>
<mac address='00:11:22:33:44:55'/>
</interface>
If the vendor/product form of usb specification is used, type='usb'
is implied:
<interface type='hostdev'>
<source>
<vendor id='0x0012'/>
<product id='0x24dd'/>
</source>
<mac address='00:11:22:33:44:55'/>
</interface>
Again, the upcoming patch to fill in the backend of this functionality
will log an error and fail with "Unsupported Config" if you actually
try to assign a USB network adapter to a guest using <interface
type='hostdev'> - just use a standard <hostdev> entry in that case
(and also for single-port PCI adapters).
This refactoring is necessary to support hotplug detach of
type=hostdev network devices, but needs to be in a separate patch to
make potential debugging of regressions more practical.
Rather than the lowest level functions searching for a matching
device, the search is now done in the toplevel function, and an
intermediate-level function (qemuDomainDetachThisHostDevice()), which
expects that the device's entry is already found, is called (this
intermediate function will be called by qemuDomainDetachNetDevice() in
order to support detach of type=hostdev net devices)
This patch should result in 0 differences in functionality.
Three new functions useful in other files:
virDomainHostdevInsert:
Add a new hostdev at the end of the array. This would more sensibly be
called virDomainHostdevAppend, but the existing functions for other
types of devices are called Insert.
virDomainHostdevRemove:
Eliminates one entry from the hostdevs array, but doesn't free it;
patterned after the code at the end of the two
qemuDomainDetachHostXXXDevice functions (and also other pre-existing
virDomainXXXRemove functions for other device types).
virDomainHostdevFind:
This function is patterned from the search loops at the top of
qemuDomainDetachHostPciDevice and qemuDomainDetachHostUsbDevice, and
will be used to re-factor those (and other detach-related) functions.
To shorten some new code that accesses the many fields within the
subsys struct of a hostdev, create a separate toplevel, typedefed
virDomainHostdevSubsys struct so that we can define temporary pointers
to the subsys part.
The parent can be any type of device. It defaults to type=none, and a
NULL pointer. The intent is that if a hostdevdef is contained in the
def for a higher level device (e.g. virDomainNetDef), hostdev->parent
will point to the higher level device, and type will be set to that
type of device. This way, during attach and detach of the device,
parent can be checked, and appropriate callouts made to do higher
level device initialization (e.g. setting MAC address).
Also, although these hostdevs with parents will be added to a domain's
hostdevs list, they will be treated slightly differently when
traversing the list, e.g. virDomainHostdefDefFree for a hostdev that
has a parent doesn't need to be called (and will be a NOP); it will
simply be removed from the list (since the parent device object is in
its own type-specific list, and will be freed from there).
In an upcoming patch, virDomainNetDef will acquire a
virDomainHostdevDef, and the <interface> XML will take on some of the
elements of a <hostdev>. To avoid duplicating the code for parsing and
formatting the <source> element (which will be nearly identical in
these two cases), this patch factors those parts out of the
HostdevDef's parse and format functions, and puts them into separate
helper functions that are now called by the HostdevDef
parser/formatter, and will soon be called by the NetDef
parser/formatter.
One change in behavior - previously virDomainHostdevDefParseXML() had
diverged from current common coding practice by logging an error and
failing if it found any subelements of <hostdev> other than those it
understood (standard libvirt practice is to ignore/discard unknown
elements and attributes during parse). The new helper function ignores
unknown elements, and thus so does the new
virDomainHostdevDefParseXML.
In order to allow for a virDomainHostdevDef that uses the
virDomainDeviceInfo of a "higher level" device (such as a
virDomainNetDef), this patch changes the virDomainDeviceInfo in the
HostdevDef into a virDomainDeviceInfoPtr. Rather than adding checks
all over the code to check for a null info, we just guarantee that it
is always valid. The new function virDomainHostdevDefAlloc() allocates
a virDomainDeviceInfo and plugs it in, and virDomainHostdevDefFree()
makes sure it is freed.
There were 4 places allocating virDomainHostdevDefs, all of them
parsers of one sort or another, and those have all had their
VIR_ALLOC(hostdev) changed to virDomainHostdevDefAlloc(). Other than
that, and the new functions, all the rest of the changes are just
mechanical removals of "&" or changing "." to "->".
There will be cases where the iterator callback will need to know the
type of the device whose info is being operated on, and possibly even
need to use some of the device's config. This patch adds a
virDomainDeviceDefPtr to the args of every callback, and fills it in
appropriately as the devices are iterated through.
The virDomainDeviceInfoPtrs in qemuCollectPCIAddress and
qemuComparePCIDevice are named "dev" and "dev1", but those functions
will be changed (in order to match a change in the args sent to
virDomainDeviceInfoIterate() callback args) to contain a
virDomainDeviceDefPtr device.
This patch renames "dev" to "info" (and "dev[n]" to "info[n]") to
avoid later confusion.
This patch is only code movement + adding some forward definitions of
typedefs.
virDomainHostdevDef (not just a pointer to it, but an actual object)
will be needed in virDomainNetDef and virDomainActualNetDef, so it
must be relocated earlier in the file.
Likewise, virDomainDeviceDef will be needed in virDomainHostdevDef, so
it must be moved up even earlier. This, in turn, creates a forward
reference problem, but fortunately only with pointers to other device
types, so their typedefs can be moved up in the file, eliminating the
problem.
Not all device types were represented in virDomainDeviceType, so some
types of devices couldn't be represented in a virDomainDeviceDef
(which requires a different type of pointer in the union for each
different kind of device).
Since serial, parallel, channel, and console devices are all
virDomainChrDef, and the virDomainDeviceType is never used to produce
a string from the type (and only used in the other direction
internally to code, never to produce XML), I only added one "CHR"
type, which is associated with "virDomainChrDefPtr chr" in the union.
Commit 723d5c (added after the release of 0.9.10) adds a
NetlinkEventClient for each interface sent to
virNetDevMacVLanCreateWithVPortProfile. This should only be done if
the interface actually *has* a virtPortProfile, otherwise the event
handler would be a NOP. The bigger problem is that part of the setup
to create the NetlinkEventClient is to do a memcpy of virtPortProfile
- if it's NULL, this triggers a segv.
This patch just qualifies the code that adds the client - if
virtPortProfile is NULL, it's skipped.
Qemu supports sizing by bytes; we shouldn't force the user to
round up if they really wanted an unaligned total size.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BLOCK_RESIZE_BYTES):
New flag.
* src/libvirt.c (virDomainBlockResize): Document it.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockResize): Take
size in bytes.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextBlockResize):
Likewise. Pass bytes, not megabytes, to monitor.
* src/qemu/qemu_driver.c (qemuDomainBlockResize): Implement new
flag.
No matter what cache mode is used, readonly disks are always safe wrt
migration. Shared disks are required to be readonly or to disable
host-side cache, which makes them safe as well.
A multi-threaded client with event loop may crash if one of its threads
closes a connection while event loop is in the middle of sending
keep-alive message (either request or response). The right place for it
is inside virNetClientIOEventLoop() between poll() and
virNetClientLock(). We should only close a connection directly if no-one
is using it and defer the closing to the last user otherwise. So far we
only did so if the close was initiated by keep-alive timeout.
Building virt-aa-helper with dtrace probes enabled, ldd complained about
undefined references:
./.libs/libvirt_util.a(libvirt_util_la-event_poll.o):(.note.stapsdt+0x24):
undefined reference to `libvirt_event_poll_purge_timeout_semaphore'
...
Lets say I got a volume with '1G' allocation and '10G' capacity. The
available space in the parent pool is '5G'. With the current check for
overcapacity, I can only try to resize to <= '6G'. You see the problem?
With an additional new bool added to determine whether or not to
discourage the use of the supplied MAC address by the bridge itself,
virNetDevTapCreateInBridgePort had three booleans (well, 2 bools and
an int used as a bool) in the arg list, which made it increasingly
difficult to follow what was going on. This patch combines those three
into a single flags arg, which not only shortens the arg list, but
makes it more self-documenting.
When a tap device for a domain is created and attached to a bridge,
the first byte of the tap device MAC address is set to 0xFE, while the
rest is set to match the MAC address that will be presented to the
guest as its network device MAC address. Setting this high value in
the tap's MAC address discourages the bridge from using the tap
device's MAC address as the bridge's own MAC address (Linux bridges
always take on the lowest numbered MAC address of all attached devices
as their own).
In one case within libvirt, a tap device is created and attached to
the bridge with the intent that its MAC address be taken on by the
bridge as its own (this is used to assure that the bridge has a fixed
MAC address to prevent network outages created by the bridge MAC
address "flapping" as guests are started and stopped). In this case,
the first byte of the mac address is *not* altered to 0xFE.
In the current code, callers to virNetDevTapCreateInBridgePort each
make the MAC address modification themselves before calling, which
leads to code duplication, and also prevents lower level functions
from knowing the real MAC address being used by the guest. The problem
here is that openvswitch bridges must be informed about this MAC
address, or they will be unable to pass traffic to/from the guest.
This patch centralizes the location of the MAC address "0xFE fixup"
into virNetDevTapCreateInBridgePort(), meaning 1) callers of this
function no longer need the extra strange bit of code, and 2)
bitNetDevTapCreateBridgeInPort itself now is called with the guest's
unaltered MAC address, and can pass it on, unmodified, to
virNetDevOpenvswitchAddPort.
There is no other behavioral change created by this patch.
Nuke the last vestiges of printing pid_t values with the wrong
types, at least in code compiled on mingw64. There may be other
places, but for now they are only compiled on systems where the
existing %d doesn't trigger gcc warnings.
* src/rpc/virnetsocket.c (virNetSocketNew): Use %lld and casting,
rather than assuming any particular int type for pid_t.
* src/util/command.c (virCommandRunAsync, virPidWait)
(virPidAbort): Likewise.
(verify): Drop a now stale assertion.
No thanks to 64-bit windows, with 64-bit pid_t, we have to avoid
constructs like 'int pid'. Our API in libvirt-qemu cannot be
changed without breaking ABI; but then again, libvirt-qemu can
only be used on systems that support UNIX sockets, which rules
out Windows (even if qemu could be compiled there) - so for all
points on the call chain that interact with this API decision,
we require a different variable name to make it clear that we
audited the use for safety.
Adding a syntax-check rule only solves half the battle; anywhere
that uses printf on a pid_t still needs to be converted, but that
will be a separate patch.
* cfg.mk (sc_correct_id_types): New syntax check.
* src/libvirt-qemu.c (virDomainQemuAttach): Document why we didn't
use pid_t for pid, and validate for overflow.
* include/libvirt/libvirt-qemu.h (virDomainQemuAttach): Tweak name
for syntax check.
* src/vmware/vmware_conf.c (vmwareExtractPid): Likewise.
* src/driver.h (virDrvDomainQemuAttach): Likewise.
* tools/virsh.c (cmdQemuAttach): Likewise.
* src/remote/qemu_protocol.x (qemu_domain_attach_args): Likewise.
* src/qemu_protocol-structs (qemu_domain_attach_args): Likewise.
* src/util/cgroup.c (virCgroupPidCode, virCgroupKillInternal):
Likewise.
* src/qemu/qemu_command.c(qemuParseProcFileStrings): Likewise.
(qemuParseCommandLinePid): Use pid_t for pid.
* daemon/libvirtd.c (daemonForkIntoBackground): Likewise.
* src/conf/domain_conf.h (_virDomainObj): Likewise.
* src/probes.d (rpc_socket_new): Likewise.
* src/qemu/qemu_command.h (qemuParseCommandLinePid): Likewise.
* src/qemu/qemu_driver.c (qemudGetProcessInfo, qemuDomainAttach):
Likewise.
* src/qemu/qemu_process.c (qemuProcessAttach): Likewise.
* src/qemu/qemu_process.h (qemuProcessAttach): Likewise.
* src/uml/uml_driver.c (umlGetProcessInfo): Likewise.
* src/util/virnetdev.h (virNetDevSetNamespace): Likewise.
* src/util/virnetdev.c (virNetDevSetNamespace): Likewise.
* tests/testutils.c (virtTestCaptureProgramOutput): Likewise.
* src/conf/storage_conf.h (_virStoragePerms): Use mode_t, uid_t,
and gid_t rather than int.
* src/security/security_dac.c (virSecurityDACSetOwnership): Likewise.
* src/conf/storage_conf.c (virStorageDefParsePerms): Avoid
compiler warning.
Commit 7c90026 added #include "conf/domain_conf.h" to
util/virrandom.c. Fortunately it didn't actually use anything from
domain_conf.h, since as far as I'm aware, files in util aren't allowed
to reference anything in conf (although the opposite is allowed). So
this #include is unnecessary.
I verified it still compiles with the line removed, but have placed a
one day moratorium on me doing any "trivial rule" pushes, so will
wait for someone else to verify/ACK before pushing.
This actually wires up the new optional parameter to block_stream:
http://wiki.qemu.org/Features/LiveBlockMigration/ImageStreamingAPI
The error checking is still sparse, since libvirt must not use
qemu-img or header probing on a qcow2 file in use by qemu to
check if the backing file name is valid; so for now, libvirt is
relying on qemu to diagnose an incorrect backing name. Fixing this
will require libvirt to track the entire backing file chain at the
time qemu is started and keeps it updated with snapshot and pull
operations.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONBlockJob): Add
parameter, and update callers.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONBlockJob): Update
signature.
* src/qemu/qemu_monitor.h (qemuMonitorBlockJob): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Update caller.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Likewise.
Block job commands are not part of upstream qemu until 1.1; and
proper support of job completion and cancellation depends on being
able to receive QMP events, which implies the JSON monitor.
Additionally, some early versions of block job commands were
backported to RHEL qemu, but these versions lacked asynchronous
job cancellation and partial block pull, so there are several
patches that will still be needed in this area of libvirt code
to support both flavors of block job commands.
Due to earlier patches in libvirt, we are guaranteed that all versions
of qemu that support block job commands already require libvirt to
use the JSON monitor. That means that the text version of block jobs
will not be used, and having to refactor two copies of the block job
handlers makes no sense. So instead, we delete the text handlers.
* src/qemu/qemu_monitor.c (qemuMonitorBlockJob): Drop text monitor
support.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextBlockJob): Delete.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextParseBlockJobOne)
(qemuMonitorTextParseBlockJob, qemuMonitorTextBlockJob):
Likewise.
Add de-association handling for 802.1qbg (vepa) via lldpad
netlink messages. Also adds the possibility to perform an
association request without waiting for a confirmation.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
This code adds a netlink event interface to libvirt.
It is based upon the event_poll code and makes use of
it. An event is generated for each netlink message sent
to the libvirt pid.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
In qemu there are 2 cpu models (cpu64-rhel5 and cpu64-rhel6) not
supported by libvirt. This patch adds the support with the flags
specifications from /usr/share/qemu-kvm/cpu-model/cpu-x86_64.conf
The only difference is that AMD-specific features are removed so
the processor type is not vendor-specific. Those features are either
emulated or ignored by qemu if host CPU doesn't support them.
This call to virDomainDeviceDefParse is both unnecessary (since
it will again be called at the top of the immediately following if(),
and if not there, then at the top of the if following that), but it
also creates a leak of one virDomainDeviceDef and one [whatever type
of device the DeviceDef is pointing to; probably a virDomainDiskDef]
in the case that the function has been called with
VIR_DOMAIN_DEVICE_MODIFY_CONFIG (the second parse will overwrite the
devicedef that was just created).
For any disk controller model which is not "lsilogic", the command
line will be like:
-drive file=/dev/sda,if=none,id=drive-scsi0-0-3-0,format=raw \
-device scsi-disk,bus=scsi0.0,channel=0,scsi-id=3,lun=0,i\
drive=drive-scsi0-0-3-0,id=scsi0-0-3-0
The relationship between the libvirt address attrs and the qdev
properties are (controller model is not "lsilogic"; strings
inside <> represent libvirt adress attrs):
bus=scsi<controller>.0
channel=<bus>
scsi-id=<target>
lun=<unit>
* src/qemu/qemu_command.h: (New param "virDomainDefPtr def"
for function qemuBuildDriveDevStr; new param "virDomainDefPtr
vmdef" for function qemuAssignDeviceDiskAlias. Both for
virDomainDiskFindControllerModel's use).
* src/qemu/qemu_command.c:
- New param "virDomainDefPtr def" for qemuAssignDeviceDiskAliasCustom.
For virDomainDiskFindControllerModel's use, if the disk bus is "scsi"
and the controller model is not "lsilogic", "target" is one part of
the alias name.
- According change on qemuAssignDeviceDiskAlias and qemuBuildDriveDevStr
* src/qemu/qemu_hotplug.c:
- Changes to be consistent with declarations of qemuAssignDeviceDiskAlias
qemuBuildDriveDevStr, and qemuBuildControllerDevStr.
* tests/qemuxml2argvdata/qemuxml2argv-pseries-vio-user-assigned.args,
tests/qemuxml2argvdata/qemuxml2argv-pseries-vio.args: Update the
generated command line.
* src/conf/domain_conf.h: Add new member "target" to struct
_virDomainDeviceDriveAddress.
* src/conf/domain_conf.c: Parse and format "target"
* Lots of tests (.xml) in tests/domainsnapshotxml2xmlout,
tests/qemuxml2argvdata, tests/qemuxml2xmloutdata, and
tests/vmx2xmldata/ are modified for newly introduced
attribute "target" for address of "drive" type.
KVM will be able to use a PCI SCSI controller even on POWER. Let
the user specify the vSCSI controller by other means than a default.
After this patch, the QEMU driver will actually look at the model
and reject anything but auto, lsilogic and ibmvscsi.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Osier Yang <jyang@redhat.com>
In qemuDomainAttachNetDevice, the guest's tap interface has only been
attached to the bridge if iface_connected is true. It's possible for
an error to occur prior to that happening, and previously we would
attempt to remove the tap interface from the bridge even if it hadn't
been attached.
QMP commands don't need to be escaped since converting them to json
also escapes special characters. When a QMP command fails, however,
libvirt falls back to HMP commands. These fallback functions
(qemuMonitorText*) do their own escaping, and pass the result directly
to qemuMonitorHMPCommandWithFd. If the monitor is in json mode, these
pre-escaped commands will be escaped again when converted to json,
which can result in the wrong arguments being sent.
For example, a filename test\file would be sent in json as
test\\file.
This prevented attaching an image file with a " or \ in its name in
qemu 1.0.50, and also broke rbd attachment (which uses backslashes to
escape some internal arguments.)
Reported-by: Masuko Tomoya <tomoya.masuko@gmail.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
This patch fixes console corruption, that happens if two concurrent
sessions are opened for a single console on a domain. Result of this
corruption was that each of the console streams recieved just a part
of the data written to the pipe so every console rendered unusable.
New helper function for safe console handling is used to establish the
console stream connection. This function ensures that no other libvirt
client is using the console (with the ability to disconnect consoles of
libvirt clients) and that no UUCP style lockfile is placed on the PTY
device.
* src/qemu/qemu_domain.h
- add data structure to domain's private data dealing with
console connections
* src/qemu/qemu_domain.c:
- allocate/free domain's console data structure
* src/qemu/qemu_driver.c
- use the new helper function for console handling
This patch adds a set of functions used in creating console streams for
domains using PTYs and ensures mutually exclusive access to the PTYs.
If mutually exclusive access is not used, two clients may open the same
console, which results in corruption on both clients as both of them
race to read data from the PTY.
Two approaches are used to ensure this:
1) Internal data structure holding open PTYs.
This is used internally and enables the user to forcibly
terminate another console connection eg. when somebody leaves
the console open on another host.
2) UUCP style lock files:
This uses UUCP lock files according to the FHS
( http://www.pathname.com/fhs/pub/fhs-2.3.html#VARLOCKLOCKFILES )
to check if other programs (like minicom) are not using the pty
device of the console.
This feature is disabled by default and may be enabled using
configure parameter
--with-console-lock-files=/path/to/lock/file/directory
or --with-console-lock-files=auto (which tries to infer the
location from OS used (currently only linux).
On usual linux systems, normal users may not write to the
/var/lock directory containing the locks. This poses problems
while in session mode. If the current user has no access to the
lockfile directory, check for presence of the file is still
done, but no lock file is created. This does NOT result in an
error.
This patch adds another callback to a FDstream object. The original
callback is used by the daemon stream driver to handle events.
This callback is called if and only if the stream is about to be closed.
This might be used to handle cleanup steps after a fdstream exits. This
will be used later on in ensuring mutually exclusive access to consoles.
* src/fdstream.c:
- emit the callback, when stream is being closed
- add data structures needed to handle the callback
- add function to register callback
* src/fdstream.h:
- define function prototypes for the callback
This patch causes the fdstream driver to call the stream event callback
if virStreamAbort() is called on a stream using this driver.
A remote handler for a stream can only detect changes via stream events,
so this event callback is necessary in order to enable a daemon to abort
a stream in such a way that the client will see the change.
* src/fdstream.c:
- modify close function to call stream event callback
This patch adds a set of flags to be used with the virDomainOpenConsole
API call to specify if the user wishes to interrupt an existing console
session or just to try open a new one.
VIR_DOMAIN_CONSOLE_SAFE - specifies that the console connection should
be opened only if the hypervisor supports
mutually exclusive access to console devices
VIR_DOMAIN_CONSOLE_FORCE - specifies that the caller wishes to interrupt
existing session and force a creation of a
new one.
This patch changes behavior of virPidFileRead to enable passing NULL as
path to the binary the pid file should be checked against to skip this
check. This enables using this function for reading files that have same
semantics as pid files, but belong to unknown processes.
using 'system-wakeup' monitor command. It is supported only in JSON,
as we are enabling it if possible. Moreover, this command is available
in qemu-1.1+ which definitely has JSON.
Function xmlParseURI does not remove square brackets around IPv6
address when parsing. One of the solutions is making wrappers around
functions working with xmlURI*. This assures that uri->server will be
always properly assigned and it doesn't have to be changed when used
on some new place in the code.
For this purpose, functions virParseURI and virSaveURI were
added. These function are wrappers around xmlParseURI and xmlSaveUri
respectively.
Also there is one new syntax check function to prohibit these functions
anywhere else.
File changes:
- src/util/viruri.h -- declaration
- src/util/viruri.c -- definition
- src/libvirt_private.syms -- symbol export
- src/Makefile.am -- added source and header files
- cfg.mk -- added sc_prohibit_xmlURI
- all others -- ID name and include fixes
The /usr/include/python/pyconfig.h file pollutes the global
namespace with a huge number of HAVE_XXX and WITH_XXX
defines. These change what we detected in our own config.h
In particular if you try to build without DTrace, python's
headers turn it back on with predictable fail.
THe hack to workaround this is to rename WITH_DTRACE to
WITH_DTRACE_PROBES to avoid the namespace clash
It's possible to disable SPICE TLS in qemu.conf. When this happens,
libvirt ignores any SPICE TLS port or x509 directory that may have
been set when it builds the qemu command line to use. However, it's
not ignoring the secure channels that may have been set and adds
tls-channel arguments to qemu command line.
Current qemu versions don't report an error when this happens, and try to use
TLS for the specified channels.
Before this patch
<domain type='kvm'>
<name>auto-tls-port</name>
<memory>65536</memory>
<os>
<type arch='x86_64' machine='pc'>hvm</type>
</os>
<devices>
<graphics type='spice' port='5900' tlsPort='-1' autoport='yes' listen='0' ke
<listen type='address' address='0'/>
<channel name='main' mode='secure'/>
<channel name='inputs' mode='secure'/>
</graphics>
</devices>
</domain>
generates
-spice port=5900,addr=0,disable-ticketing,tls-channel=main,tls-channel=inputs
and starts QEMU.
After this patch, an error is reported if a TLS port is set in the XML
or if secure channels are specified but TLS is disabled in qemu.conf.
This is the behaviour the oVirt people (where I spotted this issue) said
they would expect.
This fixes bug #790436
This patch adds support for vmx files with empty networkName
values (which is the case for vmx generated by Workstation).
It also adds support for vmx containing NATed network interfaces.
Update test suite accordingly
[forwarding this here from RH bug #796732]
When creating a network (virsh net-create) with an erroneous XML
containing an empty <name> element, the error message is misleading:
error: Failed to create network from foo.xml
error: missing domain name information
It took me a bit of time to figure out that it was the *network* name
that was missing (I generate this xml and didn't look at it, first).
I realized that the same message is used for missing name when creating
a domain, network, or device node.
https://bugzilla.redhat.com/show_bug.cgi?id=795656 mentions
that a graceful destroy request can time out, meaning that the
error message is user-visible and should be more appropriate
than just internal error.
* src/qemu/qemu_driver.c (qemuDomainDestroyFlags): Swap error type.
Migrating domains with disks using cache != none is unsafe unless the
disk images are stored on coherent clustered filesystem. Thus we forbid
migrating such domains unless VIR_MIGRATE_UNSAFE flags is used.
This patch adds VIR_MIGRATE_UNSAFE flag for migration APIs and new
VIR_ERR_MIGRATION_UNSAFE error code. The error code should be returned
whenever migrating a domain is considered unsafe (e.g., it's configured
in a way that does not ensure data integrity once it is migrated).
VIR_MIGRATE_UNSAFE flag may be used to force migration even though it
would normally be considered unsafe and forbidden.
AC_CHECK_PROG checks for program in given path. However, if it doesn't
exists, [variable] is set to [value-if-not-found]. We don't want this
to be the empty string in case of 'modprobe' and 'scrub' as we want to
fallback to runtime detection.
Adding "Expect:" to the header list stops libcurl from sending a
Expect header at all.
Before, a dummy Expect header was added that might confuse HTTP
proxies and result in HTTP error code 417 being reported.
Previously we would have:
"os type 'hvm' & arch 'idontexist' combination is not supported"
Now we get
"No guest options available for arch 'idontexist'"
or if options available but guest OS type not applicable:
"No os type 'xen' available for arch 'x86_64'"
* src/util/virfile.h: the virFileWrapperFdFlags being defined as
a globa variable instead of a type ended up generating a duplicate
symbol error.
* AUTHORS: added Lincoln Myers
* src/qemu/qemu_process.c (qemuFindAgentConfig): avoid crash libvirtd due to
deref a NULL pointer.
* How to reproduce?
1. virsh edit the following xml into guest configuration:
<channel type='pty'>
<target type='virtio'/>
</channel>
2. virsh start <domain>
or
% virt-install -n foo -r 1024 --disk path=/var/lib/libvirt/images/foo.img,size=1 \
--channel pty,target_type=virtio -l <installation tree>
Signed-off-by: Alex Jia <ajia@redhat.com>
When migrating a qemu domain, we enter the monitor, send some commands,
try to connect to destination qemu, send other commands, end exit the
monitor. However, if we couldn't connect to destination qemu we forgot
to exit the monitor.
Bug introduced by commit d9d518b1c8.
In case libvirtd cannot detect host CPU model (which may happen if it
runs inside a virtual machine), the daemon is likely to segfault when
starting a new qemu domain. It segfaults when domain XML asks for host
(either model or passthrough) CPU or does not ask for any specific CPU
model at all.
Currently, if scrub (used for wiping algorithms) is not present
at compile time, we don't support any other wiping algorithms than
zeroing, even if it was installed later. Switch to runtime detection
instead.
Bug introduced in commit 35abced. On an inactive domain,
$ virsh snapshot-create-as dom snap
$ virsh snapshot-create dom
$ virsh snapshot-create dom
$ virsh snapshot-delete --children dom snap
could crash libvirtd, due to a use-after-free that results
when the callback freed the current element in the iteration.
* src/conf/domain_conf.c (virDomainSnapshotForEachChild)
(virDomainSnapshotActOnDescendant): Allow iteration to delete
current child.
This patch allows libvirt to add interfaces to already
existing Open vSwitch bridges. The following syntax in
domain XML file can be used:
<interface type='bridge'>
<mac address='52:54:00:d0:3f:f2'/>
<source bridge='ovsbr'/>
<virtualport type='openvswitch'>
<parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'/>
</virtualport>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
or if libvirt should auto-generate the interfaceid use
following syntax:
<interface type='bridge'>
<mac address='52:54:00:d0:3f:f2'/>
<source bridge='ovsbr'/>
<virtualport type='openvswitch'>
</virtualport>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
It is also possible to pass an optional profileid. To do that
use following syntax:
<interface type='bridge'>
<source bridge='ovsbr'/>
<mac address='00:55:1a:65:a2:8d'/>
<virtualport type='openvswitch'>
<parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'
profileid='test-profile'/>
</virtualport>
</interface>
To create Open vSwitch bridge install Open vSwitch and
run the following command:
ovs-vsctl add-br ovsbr
The current default method of terminating the qemu process is to send
a SIGTERM, wait for up to 1.6 seconds for it to cleanly shutdown, then
send a SIGKILL and wait for up to 1.4 seconds more for the process to
terminate. This is problematic because occasionally 1.6 seconds is not
long enough for the qemu process to flush its disk buffers, so the
guest's disk ends up in an inconsistent state.
Since this only occasionally happens when the timeout prior to SIGKILL
is 1.6 seconds, this patch increases that timeout to 10 seconds. At
the very least, this should reduce the occurrence from "occasionally"
to "extremely rarely". (Once SIGKILL is sent, it waits another 5
seconds for the process to die before returning).
Note that in the cases where it takes less than this for qemu to
shutdown cleanly, libvirt will *not* wait for any longer than it would
without this patch - qemuProcessKill polls the process and returns as
soon as it is gone.
This patch is based on an earlier patch by Eric Blake which was never
committed:
https://www.redhat.com/archives/libvir-list/2011-November/msg00243.html
Aside from rebasing, this patch only drops the driver lock once (prior
to the first time the function sleeps), then leaves it dropped until
it returns (Eric's patch would drop and re-acquire the lock around
each call to sleep).
At the time Eric sent his patch, the response (from Dan Berrange) was
that, while it wasn't a good thing to be holding the driver lock while
sleeping, we really need to rethink locking wrt the driver object,
switching to a finer-grained approach that locks individual items
within the driver object separately to allow for greater concurrency.
This is a good plan, and at the time it made sense to not apply the
patch because there was no known bug related to the driver lock being
held in this function.
However, we now know that the length of the wait in qemuProcessKill is
sometimes too short to allow the qemu process to fully flush its disk
cache before SIGKILL is sent, so we need to lengthen the timeout (in
order to improve the situation with management applications until they
can be updated to use the new VIR_DOMAIN_DESTROY_GRACEFUL flag added
in commit 72f8a7f197). But, if we
lengthen the timeout, we also lengthen the amount of time that all
other threads in libvirtd are essentially blocked from doing anything
(since just about everything needs to acquire the driver lock, if only
for long enough to get a pointer to a domain).
The solution is to modify qemuProcessKill to drop the driver lock
while sleeping, as proposed in Eric's patch. Then we can increase the
timeout with a clear conscience, and thus at least lower the chances
that someone running with existing management software will suffer the
consequence's of qemu's disk cache not being flushed.
In the meantime, we still should work on Dan's proposal to make
locking within the driver object more fine grained.
(NB: although I couldn't find any instance where qemuProcessKill() was
called with no jobs active for the domain (or some other guarantee
that the current thread had at least one refcount on the domain
object), this patch still follows Eric's method of temporarily adding
a ref prior to unlocking the domain object, because I couldn't
convince myself 100% that this was the case.)
In the future (my next patch in fact) we may want to make
decisions depending on qemu having a monitor command or not.
Therefore, we want to set qemuCaps flag instead of querying
on the monitor each time we are about to make that decision.
When blkdeviotune was first committed in 0.9.8, we had the limitation
that setting one value reset all others. But bytes and iops should
be relatively independent. Furthermore, setting tuning values on
a live domain followed by dumpxml did not output the new settings.
* src/qemu/qemu_driver.c (qemuDiskPathToAlias): Add parameter, and
update callers.
(qemuDomainSetBlockIoTune): Don't lose previous unrelated
settings. Make live changes reflect to dumpxml output.
* tools/virsh.pod (blkdeviotune): Update documentation.
Detected by valgrind. Leaks are introduced in commit c1b2264.
* src/remote/remote_driver.c (doRemoteOpen): free client program memory in failure path.
* How to reproduce?
% valgrind -v --leak-check=full virsh -c qemu:
* Actual result
==3969== 40 bytes in 1 blocks are definitely lost in loss record 8 of 28
==3969== at 0x4A04A28: calloc (vg_replace_malloc.c:467)
==3969== by 0x4C89C41: virAlloc (memory.c:101)
==3969== by 0x4D5A236: virNetClientProgramNew (virnetclientprogram.c:60)
==3969== by 0x4D47AB4: doRemoteOpen (remote_driver.c:658)
==3969== by 0x4D49FFF: remoteOpen (remote_driver.c:871)
==3969== by 0x4D13373: do_open (libvirt.c:1196)
==3969== by 0x4D14535: virConnectOpenAuth (libvirt.c:1422)
==3969== by 0x425627: main (virsh.c:18537)
==3969==
==3969== 40 bytes in 1 blocks are definitely lost in loss record 9 of 28
==3969== at 0x4A04A28: calloc (vg_replace_malloc.c:467)
==3969== by 0x4C89C41: virAlloc (memory.c:101)
==3969== by 0x4D5A236: virNetClientProgramNew (virnetclientprogram.c:60)
==3969== by 0x4D47AD7: doRemoteOpen (remote_driver.c:664)
==3969== by 0x4D49FFF: remoteOpen (remote_driver.c:871)
==3969== by 0x4D13373: do_open (libvirt.c:1196)
==3969== by 0x4D14535: virConnectOpenAuth (libvirt.c:1422)
==3969== by 0x425627: main (virsh.c:18537)
==3969==
==3969== LEAK SUMMARY:
==3969== definitely lost: 80 bytes in 2 blocks
Signed-off-by: Alex Jia <ajia@redhat.com>
The auto-generated WWN comply with the new addressing schema of WWN:
<quote>
the first nibble is either hex 5 or 6 followed by a 3-byte vendor
identifier and 36 bits for a vendor-specified serial number.
</quote>
We choose hex 5 for the first nibble. And for the 3-bytes vendor ID,
we uses the OUI according to underlying hypervisor type, (invoking
virConnectGetType to get the virt type). e.g. If virConnectGetType
returns "QEMU", we use Qumranet's OUI (00:1A:4A), if returns
ESX|VMWARE, we use VMWARE's OUI (00:05:69). Currently it only
supports qemu|xen|libxl|xenapi|hyperv|esx|vmware drivers. The last
36 bits are auto-generated.
Some audit records generated by libvirt contain fields enclosed by single
quotes. Since those fields are inside the msg field, which is enclosed by
single quotes, these records generated by libvirt are not correctly parsed by
libauparse.
Some tools, such as virt-manager, prefers having the default USB
controller explicit in the XML document. This patch makes sure there
is one. With this patch, it is now possible to switch from USB1 to
USB2 from the release 0.9.1 of virt-manager.
Fix tests to pass with this change.
virsh blkiotune dom --device-weights /dev/sda,400 --config
wasn't working correctly.
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Use
correct definition.
Now that no one is relying on the return value being a pointer to
somewhere inside of the passed-in argument, we can simplify the
callers to simply return success or failure. Also wrap some long
lines and add some const-correctness.
* src/util/sysinfo.c (virSysinfoParseBIOS, virSysinfoParseSystem)
(virSysinfoParseProcessor, virSysinfoParseMemory): Change return.
(virSysinfoRead): Adjust caller.
Reported by Alex Jia:
==21503== 112 (32 direct, 80 indirect) bytes in 1 blocks are
definitely lost in loss record 37 of 40
==21503== at 0x4A04A28: calloc (vg_replace_malloc.c:467)
==21503== by 0x4A8991: virAlloc (memory.c:101)
==21503== by 0x505A6C: x86DataCopy (cpu_x86.c:247)
==21503== by 0x507B34: x86Compute (cpu_x86.c:1225)
==21503== by 0x43103C: qemuBuildCommandLine (qemu_command.c:3561)
==21503== by 0x41C9F7: testCompareXMLToArgvHelper
(qemuxml2argvtest.c:183)
==21503== by 0x41E10D: virtTestRun (testutils.c:141)
==21503== by 0x41B942: mymain (qemuxml2argvtest.c:705)
==21503== by 0x41D7E7: virtTestMain (testutils.c:696)
In case the caller specifies that confined guests are required but the
security driver turns out to be 'none', we should return an error since
this driver clearly cannot meet that requirement. As a result of this
error, libvirtd fails to start when the host admin explicitly sets
confined guests are required but there is no security driver available.
Since security driver 'none' cannot create confined guests, we override
default confined setting so that hypervisor drivers do not thing they
should create confined guests.
Security label type 'none' requires relabel to be set to 'no' so there's
no reason to output this extra attribute. Moreover, since relabel is
internally stored in a negative from (norelabel), the default value for
relabel would be 'yes' in case there is no <seclabel> element in domain
configuration. In case VIR_DOMAIN_SECLABEL_DEFAULT turns into
VIR_DOMAIN_SECLABEL_NONE, we would incorrectly output relabel='yes' for
seclabel type 'none'.
Qemu uses non-blocking I/O which doesn't play nice with regular file
descriptors. We need to pass a pipe to qemu instead, which can easily be
done using iohelper.
virFileDirectFd was used for accessing files opened with O_DIRECT using
libvirt_iohelper. We will want to use the helper for accessing files
regardless on O_DIRECT and thus virFileDirectFd was generalized and
renamed to virFileWrapperFd.
dmidecode displays processor information, followed by BIOS, system and
memory-DIMM details.
Calls to virSysinfoParseBIOS(), virSysinfoParseSystem() would update
the buffer pointer 'base', so the processor information would be lost
before virSysinfoParseProcessor() was called. Sysinfo would therefore
not be able to display processor details -- It only described <bios>,
<system> and <memory_device> details.
This patch attempts to insulate sysinfo from ordering of dmidecode
output.
Before the fix:
---------------
virsh # sysinfo
<sysinfo type='smbios'>
<bios>
....
</bios>
<system>
....
</system>
<memory_device>
....
</memory_device>
After the fix:
-------------
virsh # sysinfo
<sysinfo type='smbios'>
<bios>
....
</bios>
<system>
....
</system>
<processor>
....
</processor>
<memory_device>
....
</memory_device>
Input to the volume cloning code is a source volume and an XML
descriptor for the new volume. It is possible for the new volume
to have a greater size than source volume, at which point libvirt
will just stick 0s on the end of the new image (for raw format
anyways).
Unfortunately a logic error messed up our tracking of the of the
excess amount that needed to be written: end result is that sparse
clones were made very much non-sparse, and cloning regular disk
images could end up excessively sized (though data unaltered).
Drop the 'remain' variable entriely here since it's redundant, and
track actual allocation directly against the desired 'total'.
gcc 4.7 complains:
util/virhashcode.c:49:17: error: always_inline function might not be inlinable [-Werror=attributes]
util/virhashcode.c:35:17: error: always_inline function might not be inlinable [-Werror=attributes]
Normal 'inline' is a hint that the compiler may ignore; the fact
that the function is static is good enough. We don't care if the
compiler decided not to inline after all.
* src/util/virhashcode.c (getblock, fmix): Relax attribute.
On CentOS5:
If "virsh edit $DOM" is used and an error happens (for example changing
any live cycle action to a non-existing value), libvirt forgets that
$DOM exists, since it is already removed from the internal hash tables,
which are used for domain lookup.
In once case (unreproducible) even the persistent configuration
/etc/xen/$DOM was deleted.
Instead of using the compound function xenXMConfigSaveFile() explicitly
use xenFomatXM() and virConfWriteFile() to distinguish between a failure
in converting the libvirt definition to the xen-xm format and a problem
when writing the file.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Commit b170eb99 introduced a bug: domains that had an explicit
<seclabel type='none'/> when started would not be reparsed if
libvirtd restarted. It turns out that our testsuite was not
exercising this because it never tried anything but inactive
parsing. Additionally, the live XML for such a domain failed
to re-validate. Applying just the tests/ portion of this patch
will expose the bugs that are fixed by the other two files.
* docs/schemas/domaincommon.rng (seclabel): Allow relabel under
type='none'.
* src/conf/domain_conf.c (virSecurityLabelDefParseXML): Per RNG,
presence of <seclabel> with no type implies dynamic. Don't
require sub-elements for type='none'.
* tests/qemuxml2xmltest.c (mymain): Add test.
* tests/qemuxml2argvtest.c (mymain): Likewise.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-none.xml: Add file.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-none.args: Add file.
Reported by Ansis Atteka.
On CentOS5 with xen-3.0.3:
Program received signal SIGSEGV, Segmentation fault.
virFree (ptrptr=0x8) at util/memory.c:310
310 free(*(void**)ptrptr);
(gdb) bt
#0 virFree (ptrptr=0x8) at util/memory.c:310
#1 0x00002aaaaae167c8 in xenXMDomainDefineXML (conn=0x694e80, xml=0x6b2ce0 "P\fk") at xen/xm_internal.c:1199
#2 0x00002aaaaae070d7 in xenUnifiedDomainDefineXML (conn=0x8,
xml=0x6ac040 "<domain type='xen'>\n <name>pv</name>\n <uuid>20291bc0-453a-4d6c-c6ac-4e5af63b932c</uuid>\n <memory>1048576</memory>\n <currentMemory>1048576</currentMemory>\n <vcpu>1</vcpu>\n <os>\n <type arch='x8"...) at xen/xen_driver.c:1524
#3 0x00002aaaaada7803 in virDomainDefineXML (conn=0x694e80,
xml=0x6ac040 "<domain type='xen'>\n <name>pv</name>\n <uuid>20291bc0-453a-4d6c-c6ac-4e5af63b932c</uuid>\n <memory>1048576</memory>\n <currentMemory>1048576</currentMemory>\n <vcpu>1</vcpu>\n <os>\n <type arch='x8"...) at libvirt.c:7823
#4 0x0000000000426173 in cmdEdit (ctl=0x7fffffffb8e0, cmd=<value optimized out>) at virsh.c:14882
#5 0x000000000041c9ce in vshCommandRun (ctl=0x7fffffffb8e0, cmd=0x658c50) at virsh.c:17712
#6 0x000000000042c3b9 in main (argc=1, argv=<value optimized out>) at virsh.c:19317
Signed-off-by: Philipp Hahn <hahn@univention.de>
Calling qemuDomainMigrateGraphicsRelocate notifies spice clients to
connect to destination qemu so that they can seamlessly switch streams
once migration is done. Unfortunately, current qemu is not able to
accept any connections while incoming migration connection is open.
Thus, we need to delay opening the migration connection to the point
spice client is already connected to the destination qemu.
Unlike .cvsignore under CVS, git allows for ignoring nested
names. We weren't very consistent where new tests were
being ignored (some in .gitignore, some in tests/.gitignore),
and I found it easier to just consolidate everything.
* .gitignore: Subsume entries from subdirectories.
* daemon/.gitignore: Delete.
* docs/.gitignore: Likewise.
* docs/devhelp/.gitignore: Likewise.
* docs/html/.gitignore: Likewise.
* examples/dominfo/.gitignore: Likewise.
* examples/domsuspend/.gitignore: Likewise.
* examples/hellolibvirt/.gitignore: Likewise.
* examples/openauth/.gitignore: Likewise.
* examples/domain-events/events-c/.gitignore: Likewise.
* include/libvirt/.gitignore: Likewise.
* src/.gitignore: Likewise.
* src/esx/.gitignore: Likewise.
* tests/.gitignore: Likewise.
* tools/.gitignore: Likewise.
This eliminates the warning message reported in:
https://bugzilla.redhat.com/show_bug.cgi?id=624447
It was caused by a failure to open an image file that is not
accessible by root (the uid libvirtd is running as) because it's on a
root-squash NFS share, owned by a different user, with permissions of
660 (or maybe 600).
The solution is to use virFileOpenAs() rather than open(). The
codepath that generates the error is during qemuSetupDiskCGroup(), but
the actual open() is in a lower-level generic function called from
many places (virDomainDiskDefForeachPath), so some other pieces of the
code were touched just to add dummy (or possibly useful) uid and gid
arguments.
Eliminating this warning message has the nice side effect that the
requested operation may even succeed (which in this case isn't
necessary, but shouldn't hurt anything either).
virFileOpenAs previously would only try opening a file as the current
user, or as a different user, but wouldn't try both methods in a
single call. This made it cumbersome to use as a replacement for
open(2). Additionally, it had a lot of historical baggage that led to
it being difficult to understand.
This patch refactors virFileOpenAs in the following ways:
* reorganize the code so that everything dealing with both the parent
and child sides of the "fork+setuid+setgid+open" method are in a
separate function. This makes the public function easier to understand.
* Allow a single call to virFileOpenAs() to first attempt the open as
the current user, and if that fails to automatically re-try after
doing fork+setuid (if deemed appropriate, i.e. errno indicates it
would now be successful, and the file is on a networkFS). This makes
it possible (in many, but possibly not all, cases) to drop-in
virFileOpenAs() as a replacement for open(2).
(NB: currently qemuOpenFile() calls virFileOpenAs() twice, once
without forking, then again with forking. That unfortunately can't
be changed without at least some discussion of the ramifications,
because the requested file permissions are different in each case,
which is something that a single call to virFileOpenAs() can't deal
with.)
* Add a flag so that any fchown() of the file to a different uid:gid
is explicitly requested when the function is called, rather than it
being implied by the presence of the O_CREAT flag. This just makes
for less subtle surprises to consumers. (Commit
b1643dc15c added the check for O_CREAT
before forcing ownership. This patch just makes that restriction
more explicit.)
* If either the uid or gid is specified as "-1", virFileOpenAs will
interpret this to mean "the current [gu]id".
All current consumers of virFileOpenAs should retain their present
behavior (after a few minor changes to their setup code and
arguments).
Rename the src/util/netlink files to src/util/virnetlink to
better fit the naming scheme. Also rename nlComm to virNetlinkCommand.
Signed-off-by: D. Herrendoerfer <d.herrendoerfer@herrendoerfer.name>
When libvirt's virDomainDestroy API is shutting down the qemu process,
it first sends SIGTERM, then waits for 1.6 seconds and, if it sees the
process still there, sends a SIGKILL.
There have been reports that this behavior can lead to data loss
because the guest running in qemu doesn't have time to flush its disk
cache buffers before it's unceremoniously whacked.
This patch maintains that default behavior, but provides a new flag
VIR_DOMAIN_DESTROY_GRACEFUL to alter the behavior. If this flag is set
in the call to virDomainDestroyFlags, SIGKILL will never be sent to
the qemu process; instead, if the timeout is reached and the qemu
process still exists, virDomainDestroy will return an error.
Once this patch is in, the recommended method for applications to call
virDomainDestroyFlags will be with VIR_DOMAIN_DESTROY_GRACEFUL
included. If that fails, then the application can decide if and when
to call virDomainDestroyFlags again without
VIR_DOMAIN_DESTROY_GRACEFUL (to force the issue with SIGKILL).
(Note that this does not address the issue of existing applications
that have not yet been modified to use VIR_DOMAIN_DESTROY_GRACEFUL.
That is a separate patch.)
Our HACKING discourages use of malloc and free, for at least
a couple of years now. But we weren't enforcing it, until now :)
For now, I've exempted python and tests, and will clean those up
in subsequent patches. Examples should be permanently exempt,
since anyone copying our examples won't have use of our
internal-only memory.h via libvirt_util.la.
* cfg.mk (sc_prohibit_raw_allocation): New rule.
(exclude_file_name_regexp--sc_prohibit_raw_allocation): and
exemptions.
* src/cpu/cpu.c (cpuDataFree): Avoid false positive.
* src/conf/network_conf.c (virNetworkDNSSrvDefParseXML): Fix
offenders.
* src/libxl/libxl_conf.c (libxlMakeDomBuildInfo, libxlMakeVfb)
(libxlMakeDeviceModelInfo): Likewise.
* src/rpc/virnetmessage.c (virNetMessageSaveError): Likewise.
* tools/virsh.c (_vshMalloc, _vshCalloc): Likewise.
Sometimes, its easier to run children with 2>&1 in shell notation,
and just deal with stdout and stderr interleaved. This was already
possible for fd handling; extend it to also work when doing string
capture of a child process.
* docs/internals/command.html.in: Document this.
* src/util/command.c (virCommandSetErrorBuffer): Likewise.
(virCommandRun, virExecWithHook): Implement it.
* tests/commandtest.c (test14): Test it.
* daemon/remote.c (remoteDispatchAuthPolkit): Use new command
feature.
This patch fixes the access of variable "con" in two files where the
variable was declared only on SELinux builds and thus the build failed
without SELinux. It's a rather nasty fix but helps fix the build
quickly and without any major changes to the code.
Detected by valgrind. Leak is introduced in commit 397e6a7.
* src/conf/domain_conf.c(virDomainDiskDefParseXML): fix memory leak.
How to reproduce?
% make -C tests check TESTS=qemuxml2argvtest
% cd tests && valgrind -v --leak-check=full ./qemuxml2argvtest
* Actual result:
==16352== 4 bytes in 1 blocks are definitely lost in loss record 12 of 147
==16352== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==16352== by 0x39D90A67DD: xmlStrndup (xmlstring.c:45)
==16352== by 0x4E83D5: virDomainDiskDefParseXML (domain_conf.c:2894)
==16352== by 0x4F542D: virDomainDefParseXML (domain_conf.c:7626)
==16352== by 0x4F8683: virDomainDefParseNode (domain_conf.c:8390)
==16352== by 0x4F904E: virDomainDefParse (domain_conf.c:8340)
==16352== by 0x41C626: testCompareXMLToArgvHelper (qemuxml2argvtest.c:105)
==16352== by 0x41DED1: virtTestRun (testutils.c:142)
==16352== by 0x418172: mymain (qemuxml2argvtest.c:486)
==16352== by 0x41D5C7: virtTestMain (testutils.c:697)
==16352== by 0x39CF01ECDC: (below main) (in /lib64/libc-2.12.so)
Signed-off-by: Alex Jia <ajia@redhat.com>
To allow the container to access /dev and /dev/pts when under
sVirt, set an explicit mount option. Also set a max size on
the /dev mount to prevent DOS on memory usage
* src/lxc/lxc_container.c: Set /dev mount context
* src/lxc/lxc_controller.c: Set /dev/pts mount context
For the sake of backwards compat, LXC guests are *not*
confined by default. This is because it is not practical
to dynamically relabel containers using large filesystem
trees. Applications can create confined containers though,
by giving suitable XML configs
* src/Makefile.am: Link libvirt_lxc to security drivers
* src/lxc/libvirtd_lxc.aug, src/lxc/lxc_conf.h,
src/lxc/lxc_conf.c, src/lxc/lxc.conf,
src/lxc/test_libvirtd_lxc.aug: Config file handling for
security driver
* src/lxc/lxc_driver.c: Wire up security driver functions
* src/lxc/lxc_controller.c: Add a '--security' flag to
specify which security driver to activate
* src/lxc/lxc_container.c, src/lxc/lxc_container.h: Set
the process label just before exec'ing init.
Curently security labels can be of type 'dynamic' or 'static'.
If no security label is given, then 'dynamic' is assumed. The
current code takes advantage of this default, and avoids even
saving <seclabel> elements with type='dynamic' to disk. This
means if you temporarily change security driver, the guests
can all still start.
With the introduction of sVirt to LXC though, there needs to be
a new default of 'none' to allow unconfined LXC containers.
This patch introduces two new security label types
- default: the host configuration decides whether to run the
guest with type 'none' or 'dynamic' at guest start
- none: the guest will run unconfined by security policy
The 'none' label type will obviously be undesirable for some
deployments, so a new qemu.conf option allows a host admin to
mandate confined guests. It is also possible to turn off default
confinement
security_default_confined = 1|0 (default == 1)
security_require_confined = 1|0 (default == 0)
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add new
seclabel types
* src/security/security_manager.c, src/security/security_manager.h:
Set default sec label types
* src/security/security_selinux.c: Handle 'none' seclabel type
* src/qemu/qemu.conf, src/qemu/qemu_conf.c, src/qemu/qemu_conf.h,
src/qemu/libvirtd_qemu.aug: New security config options
* src/qemu/qemu_driver.c: Tell security driver about default
config
This re-introduces parsing & formatting for per device seclabels.
There is a new virDomainDeviceSeclabelPtr struct and corresponding
APIs for parsing/formatting.
Revert parsing changes:
commit 302fe95ffa
Author: Eric Blake <eblake@redhat.com>
Date: Wed Jan 4 16:01:24 2012 -0700
seclabel: fix regression in libvirtd restart
commit b43432931a
Author: Eric Blake <eblake@redhat.com>
Date: Thu Dec 22 17:47:50 2011 -0700
seclabel: allow a seclabel override on a disk src
These two commits changed the sec label parsing code so that
the same code dealt with both the VM level sec label, and the
per device label. Unfortunately, as we add more options to the
VM level sec label, the logic required to use the same parsing
code for the per device label becomes unintelligible.
* src/conf/domain_conf.c: Remove support for parsing per
device sec labels
I slightly botched commit be9fb5a - I converted '--arg=value' to
'--arg value', which has no semantic change, but did trip up the
testsuite.
* src/network/bridge_driver.c (networkBuildDnsmasqArgv): Restore
expected output.
libvirt supports 4 different versions of the user-land XenD daemon. When
queried the daemon just returns its generation number, which is hard to
match to the version of the Xen tools.
Replace the magic generation numbers by named enum definitions to
improve code readability.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Detected by valgrind. Leaks introduced in commit 973af236.
* src/network/bridge_driver.c: fix memory leaks on failure and successful path.
* How to reproduce?
% make -C tests check TESTS=networkxml2argvtest
% cd tests && valgrind -v --leak-check=full ./networkxml2argvtest
* Actual result:
==2226== 3 bytes in 1 blocks are definitely lost in loss record 1 of 24
==2226== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==2226== by 0x39CF0FEDE7: __vasprintf_chk (in /lib64/libc-2.12.so)
==2226== by 0x41DFF7: virVasprintf (stdio2.h:199)
==2226== by 0x41E0B7: virAsprintf (util.c:1695)
==2226== by 0x41A2D9: networkBuildDhcpDaemonCommandLine (bridge_driver.c:545)
==2226== by 0x4145C8: testCompareXMLToArgvHelper (networkxml2argvtest.c:47)
==2226== by 0x4156A1: virtTestRun (testutils.c:141)
==2226== by 0x414332: mymain (networkxml2argvtest.c:123)
==2226== by 0x414D97: virtTestMain (testutils.c:696)
==2226== by 0x39CF01ECDC: (below main) (in /lib64/libc-2.12.so)
==2226==
==2226== 3 bytes in 1 blocks are definitely lost in loss record 2 of 24
==2226== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==2226== by 0x39CF0FEDE7: __vasprintf_chk (in /lib64/libc-2.12.so)
==2226== by 0x41DFF7: virVasprintf (stdio2.h:199)
==2226== by 0x41E0B7: virAsprintf (util.c:1695)
==2226== by 0x41A307: networkBuildDhcpDaemonCommandLine (bridge_driver.c:551)
==2226== by 0x4145C8: testCompareXMLToArgvHelper (networkxml2argvtest.c:47)
==2226== by 0x4156A1: virtTestRun (testutils.c:141)
==2226== by 0x414332: mymain (networkxml2argvtest.c:123)
==2226== by 0x414D97: virtTestMain (testutils.c:696)
==2226== by 0x39CF01ECDC: (below main) (in /lib64/libc-2.12.so)
==2226==
==2226== 5 bytes in 1 blocks are definitely lost in loss record 4 of 24
==2226== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==2226== by 0x39CF0FEDE7: __vasprintf_chk (in /lib64/libc-2.12.so)
==2226== by 0x41DFF7: virVasprintf (stdio2.h:199)
==2226== by 0x41E0B7: virAsprintf (util.c:1695)
==2226== by 0x41A2AB: networkBuildDhcpDaemonCommandLine (bridge_driver.c:539)
==2226== by 0x4145C8: testCompareXMLToArgvHelper (networkxml2argvtest.c:47)
==2226== by 0x4156A1: virtTestRun (testutils.c:141)
==2226== by 0x414332: mymain (networkxml2argvtest.c:123)
==2226== by 0x414D97: virtTestMain (testutils.c:696)
==2226== by 0x39CF01ECDC: (below main) (in /lib64/libc-2.12.so)
==2226==
==2226== LEAK SUMMARY:
==2226== definitely lost: 11 bytes in 3 blocks
Signed-off-by: Alex Jia <ajia@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
This is a trivial implementation, which works with the current
released qemu 1.0 with backports of preliminary block pull but
no partial rebase. Future patches will update the monitor handling
to support an optional parameter for partial rebase; but as qemu
1.1 is unreleased, it can be in later patches, designed to be
backported on top of the supported API.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Add parameter,
and adjust callers. Drop redundant check.
(qemuDomainBlockPull): Move guts...
(qemuDomainBlockRebase): ...to new function.
Qemu is adding the ability to do a partial rebase. That is, given:
base <- intermediate <- current
virDomainBlockPull will produce:
current
but qemu now has the ability to leave base in the chain, to produce:
base <- current
Note that current qemu can only do a forward merge, and only with
the current image as the destination, which is fully described by
this API without flags. But in the future, it may be possible to
enhance this API for additional scenarios by using flags:
Merging the current image back into a previous image (that is,
undoing a live snapshot), could be done by passing base as the
destination and flags with a bit requesting a backward merge.
Merging any other part of the image chain, whether forwards (the
backing image contents are pulled into the newer file) or backwards
(the deltas recorded in the newer file are merged back into the
backing file), could also be done by passing a new flag that says
that base should be treated as an XML snippet rather than an
absolute path name, where the XML could then supply the additional
instructions of which part of the image chain is being merged into
any other part.
* include/libvirt/libvirt.h.in (virDomainBlockRebase): New
declaration.
* src/libvirt.c (virDomainBlockRebase): Implement it.
* src/libvirt_public.syms (LIBVIRT_0.9.10): Export it.
* src/driver.h (virDrvDomainBlockRebase): New driver callback.
* src/rpc/gendispatch.pl (long_legacy): Add exemption.
* docs/apibuild.py (long_legacy_functions): Likewise.
This patch adds support for the new api into the qemu driver to support
modification and retrieval of domain description and title. This patch
does not add support for modifying the <metadata> element.
This patch adds API to modify domain metadata for running and stopped
domains. The api supports changing description, title as well as the
newly added <metadata> element. The API has support for storing data in
the metadata element using xml namespaces.
* include/libvirt/libvirt.h.in
* src/libvirt_public.syms
- add function headers
- add enum to select metadata to operate on
- export functions
* src/libvirt.c
- add public api implementation
* src/driver.h
- add driver support
* src/remote/remote_driver.c
* src/remote/remote_protocol.x
- wire up the remote protocol
* include/libvirt/virterror.h
* src/util/virterror.c
- add a new error message note that metadata for domain are
missing
This patch adds a new element <title> to the domain XML. This attribute
can hold a short title defined by the user to ease the identification of
domains. The title may not contain newlines and should be reasonably short.
*docs/formatdomain.html.in
*docs/schemas/domaincommon.rng
- add schema grammar for the new element and documentation
*src/conf/domain_conf.c
*src/conf/domain_conf.h
- add field to hold the new attribute
- add code to parse and create XML with the new attribute
This was forgotten when the function was originally written (not
noticed because it wasn't used at the time). It's required for
proper compilation with modules enabled after applying the recent
virStorageVolResize patches.
This was forgotten when the function was initially written (not
noticed because it wasn't used at the time). It's required for proper
compilation with modules enabled after applying the recent rawio
patches.
We already provide ways to detect when a domain has been paused as a
result of I/O error, but there was no way of getting the exact error or
even the device that experienced it. This new API may be used for both.
If we are building not on a WIN32 architecture and without HAVE_CAPNG
virSetCapabilities has unused argument and virClearCapabilities
is unused as well.
In qemuDomainShutdownFlags if we try to use guest agent,
which has error or is not configured, we jump go endjob
label even if we haven't started any job yet. This may
lead to the daemon crash:
1) virsh shutdown --mode agent on a domain without agent configured
2) wait until domain quits
3) virsh edit
Fix my typo at
commit 74e034964c
"disk->rawio == -1" indicates that this value is not
specified. So in case of this, domain must not
be tainted.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Using new function 'virTypedParameterArrayClear' to simplify block of codes.
* daemon/remote.c, src/remote/remote_driver.c: simplify codes.
Signed-off-by: Alex Jia <ajia@redhat.com>
This patch revises qemuProcessStart() function for qemu
processes to retain CAP_SYS_RAWIO if needed.
And in case of that, add taint flag to domain.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Shota Hirae <m11g1401@hibikino.ne.jp>
This patch introduces virSetCapabilities() function and implements
virCommandAllowCap() function.
Existing virClearCapabilities() is function to clear all capabilities.
Instead virSetCapabilities() is function to set arbitrary capabilities.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Shota Hirae <m11g1401@hibikino.ne.jp>
This patch adds a new attribute "rawio" to the "disk" element
of domain XML. Valid values of "rawio" attribute are "yes"
and "no".
rawio='yes' indicates the disk is desirous of CAP_SYS_RAWIO.
If you specify the following XML:
<disk type='block' device='lun' rawio='yes'>
...
</disk>
the domain will be granted CAP_SYS_RAWIO.
(of course, the domain have to be executed with root privilege)
NOTE:
- "rawio" attribute is only valid when device='lun'
- At the moment, any other disks you won't use rawio can use rawio.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Our existing virDomainBlockResize takes an unsigned long long
argument; if that command is later taught a DELTA and SHRINK flag,
we cannot change its type without breaking API (but at least such
a change would be ABI compatible). Meanwhile, the only time a
negative size makes sense is if both DELTA and SHRINK are used
together, but if we keep the argument unsigned, applications can
pass the positive delta amount by which they would like to shrink
the system, and have the flags imply the negative value. So,
since this API has not yet been released, and in the interest of
consistency with existing API, we swap virStorageVolResize to
always pass an unsigned value.
* include/libvirt/libvirt.h.in (virStorageVolResize): Use unsigned
argument.
* src/libvirt.c (virStorageVolResize): Likewise.
* src/driver.h (virDrvStorageVolUpload): Adjust clients.
* src/remote/remote_protocol.x (remote_storage_vol_resize_args):
Likewise.
* src/remote_protocol-structs: Regenerate.
Suggested by Daniel P. Berrange.
Fix several references to now renamed functions and parameters when the
functions were moved from src/xen/ to src/xenxs/.
Signed-off-by: Philipp Hahn <hahn@univention.de>
This patch addresses: https://bugzilla.redhat.com/show_bug.cgi?id=781562
Along with the "rombar" option that controls whether or not a boot rom
is made visible to the guest, qemu also has a "romfile" option that
allows specifying a binary file to present as the ROM BIOS of any
emulated or passthrough PCI device. This patch adds support for
specifying romfile to both passthrough PCI devices, and emulated
network devices that attach to the guest's PCI bus (just about
everything other than ne2k_isa).
One example of the usefulness of this option is described in the
bugzilla report: 82576 sriov network adapters don't provide a ROM BIOS
for the cards virtual functions (VF), but an image of such a ROM is
available, and with this ROM visible to the guest, it can PXE boot.
In libvirt's xml, the new option is configured like this:
<hostdev>
...
<rom file='/etc/fake/boot.bin'/>
...
</hostdev
(similarly for <interface>).
When support for the rombar option was added, it was only added for
PCI passthrough devices, configured with <hostdev>. The same option is
available for any network device that is attached to the guest's PCI
bus. This patch allows setting rombar for any PCI network device type.
After adding cases to test this to qemuxml2argv-hostdev-pci-rombar.*,
I decided to rename those files (to qemuxml2argv-pci-rom.*) to more
accurately reflect the additional tests, and also noticed that up to
now we've only been performing a domainschematest for that case, so I
added the "pci-rom" test to both qemuxml2argv and qemuxml2xml (and in
the process found some bugs whose fixes I squashed into previous
commits of this series).
Since these two items are now in the virDomainDeviceInfo struct, it
makes sense to parse/format them in the functions written to
parse/format that structure. Not all types of devices allow them, so
two internal flags are added to indicate when it is appropriate to do
so.
I was lucky - only one test case needed to be re-ordered!
To help consolidate the commonality between virDomainHostdevDef and
virDomainNetDef into as few members as possible (and because I
think it makes sense), this patch moves the rombar and bootIndex
members into the "info" member that is common to both (and to all the
other structs that use them).
It's a bit problematic that this gives rombar and bootIndex to many
device types that don't use them, but this is already the case for the
master and mastertype members of virDomainDeviceInfo, and is properly
commented as such in the definition.
Note that this opens the door to supporting rombar for other devices
that are attached to the guest PCI bus - virtio-blk-pci,
virtio-net-pci, various other network adapters - which which have that
capability in qemu, but previously had no support in libvirt.
Unlike other users of virTypedParameter with RPC, this interface
can return zero-filled entries because the interface assumes
2 dimensional array. We compress these entries out from the
server when generating the over-the-wire contents, then reconstitute
them in the client.
Signed-off-by: Eric Blake <eblake@redhat.com>
add new API virDomainGetCPUStats() for getting cpu accounting information
per real cpus which is used by a domain. The API is designed to allow
future extensions for additional statistics.
based on ideas by Lai Jiangshan and Eric Blake.
* src/libvirt_public.syms: add API for LIBVIRT_0.9.10
* src/libvirt.c: define virDomainGetCPUStats()
* include/libvirt/libvirt.h.in: add virDomainGetCPUStats() header
* src/driver.h: add driver API
* python/generator.py: add python API (as not implemented)
Signed-off-by: Eric Blake <eblake@redhat.com>
This API allows a domain to be put into one of S# ACPI states.
Currently, S3 and S4 are supported. These states are shared
with virNodeSuspendForDuration.
However, for now we don't support any duration other than zero.
The same apply for flags.
Add a new function to allow changing of capacity of storage volumes.
Plan out several flags, even if not all of them will be implemented
up front.
Expose the new command via 'virsh vol-resize'.
Signed-off-by: Eric Blake <eblake@redhat.com>
And hook it up for policykit auth. This allows virt-manager to detect
that the user clicked the policykit 'cancel' button and not throw
an 'authentication failed' error message at the user.
If yajl was not compiled in, we end up freeing an incoming
parameter, which leads to a bogus free later on. Regression
introduced in commit 6e769eb.
* src/qemu/qemu_capabilities.c (qemuCapsParseHelpStr): Avoid alloc
on failure path, which in turn fixes bogus free.
Reported by Cole Robinson.
The virEmitXMLWarning function should always have been in
the xml.[hc] files, and should use virXML as its name
prefix
* src/util/util.c, src/util/util.h: Remove virEmitXMLWarning
* src/util/xml.c, src/util/xml.h: Add virXMLEmitWarning
QEMU supports a bunch of CPUID features that are tied to the kvm CPUID
nodes rather than the processor's. They are "kvmclock",
"kvm_nopiodelay", "kvm_mmu", "kvm_asyncpf". These are not known to
libvirt and their CPUID leaf might move if (for example) the Hyper-V
extensions are enabled. Hence their handling would anyway require some
special-casing.
However, among these the most useful is kvmclock; an additional
"property" of this feature is that a <timer> element is a better model
than a CPUID feature. Although, creating part of the -cpu command-line
from something other than the <cpu> XML element introduces some
ugliness.
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add kvmclock timer to documentation, schema and parsers. Keep the
platform timer first since it is kind of special, and alphabetize
the others when possible (i.e. when it does not change the ABI).
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid creating an empty <cpu> element when the QEMU command-line simply
specifies the default "-cpu qemu32" or "-cpu qemu64".
This requires the previous patch, which lets us represent "-cpu qemu32"
as <os arch='i686'> in the generated XML.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The qemu32 CPU model is chosen based on the <os arch=...> name when
creating the QEMU command line for a 64-bit host. For the opposite
transformation we can test the guest CPU model for the "lm" feature.
If it is absent, def->os.arch needs to be corrected.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When running under KVM, the arch is usually set to i686 because
the name of the emulator is not qemu-system-x86_64. Use the host
arch instead.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Recently (or not so recently) QEMU added the kvm32 and kvm64
architectures, representing a least common denominator of all
hosts that can run KVM. Add them to the machine map.
Also, some features that TCG supports were added to qemu64.
Add them to the cpu_map.xml whenever KVM is guaranteed to support
those. We still have to leave some out, because they would not
be available to guests running on older hosts.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The qemu developers have made it clear that modern qemu will no
longer guarantee human monitor command stability; furthermore,
some features, such as async events, are only supported via qmp.
If we are compiled without support for handling JSON, we cannot
expect to sanely interact with modern qemu.
However, things must continue to build on RHEL 5, where qemu
is stuck at 0.10, and where yajl is not available.
Another benefit of this patch: future additions of new monitor
commands need only focus on qemu_monitor_json.c, instead of
also wasting time with qemu_monitor_text.c.
* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags): Report
error if yajl is missing but qemu requires qmp.
(qemuCapsParseHelpStr): Propagate error.
(qemuCapsExtractVersionInfo): Update caller.
* tests/qemuhelptest.c (testHelpStrParsing): Likewise.
I'm getting tired of remembering to backport RHEL-specific
patches when building upstream libvirt on RHEL 6.x or CentOS.
All the affected versions of RHEL qemu-kvm have backported
enough patches to a) make JSON useful, and b) modify the
-help text to mention libvirt as the preferred interface;
which means this string in the help output is a reliable
indicator that we can outsmart a strict version check,
even when upstream qemu 0.12 lacked the needed features.
* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags):
Recognize particular help string present when enough features were
backported to be worth using JSON.
* tests/qemuhelptest.c (mymain): Update tests accordingly.
Compare two filters' XML for equality and only rebuild/instantiate the new
filter if the new and current filters are found to be different. This
improves performance during an update of a filter with no obvious change
or the reloading of filters during a 'kill -SIGHUP'
Introduce a function that rebuilds all running VMs' filters. Call
this function when reloading the nwfilter driver.
This addresses a problem introduced by the 2nd patch that typically
causes no filters to be reinstantiate anymore upon driver reload
since their XML has not changed. Yet the current behavior is that
upon a SIGHUP all filters get reinstantiated.
QEMU always sends details about all available block devices as an answer
for "info block"/"query-block" command. On the other hand, our
qemuMonitorGetBlockInfo was made for a single block devices queries
only. Thus, when asking for multiple devices, we asked qemu multiple
times to always get the same answer from which different parts were
filtered. This patch makes qemuMonitorGetBlockInfo return a hash table
of all block devices, which may later be used for getting details about
specific devices.
Added a new field "vm-pid" to the VIRT_CONTROL audit record. This information
is useful to correlated another audit events to the events generated by
libvirt.
On RHEL5, I got:
util/virrandom.c:66: warning: nested extern declaration of '_gl_verify_function66' [-Wnested-externs]
The fix is to hoist the verify earlier. Also some other hodge-podge
fixes I noticed while reviewing Dan's recent series.
* .gitignore: Ignore new test.
* src/util/cgroup.c: Bump copyright year.
* src/util/virhash.c: Fix typo in description.
* src/util/virrandom.c (virRandomBits): Mark doc comment, and
hoist assert to silence older gcc.
Recent discussions have illustrated the potential for DOS attacks
with the hash table implementations used by most languages and
libraries.
https://lwn.net/Articles/474912/
libvirt has an internal hash table impl, and uses hash tables for
a variety of purposes. The hash key generation code is pretty
simple and thus not strongly collision resistant.
This patch replaces the current libvirt hash key generator with
the (public domain) Murmurhash3 code. In addition every hash
table now gets a random seed value which is used to perturb the
hashing code. This should make it impossible to mount any
practical attack against libvirt hashing code.
* bootstrap.conf: Import bitrotate module
* src/Makefile.am: Add virhashcode.[ch]
* src/util/util.c: Make virRandom() return a fixed 32 bit
integer value.
* src/util/hash.c, src/util/hash.h, src/util/cgroup.c: Replace
hash code generation with a call to virHashCodeGen()
* src/util/virhashcode.h, src/util/virhashcode.c: Add a new
virHashCodeGen() API using the Murmurhash3 algorithm.
In preparation for the patch to include Murmurhash3, which
introduces a virhashcode.h and virhashcode.c files, rename
the existing hash.h and hash.c to virhash.h and virhash.c
respectively.
In preparation for conversion over to use the Murmurhash3
algorithm, convert various virHash APIs to use size_t or
uint32 for their return values/parameters, instead of the
variable size 'unsigned long' or 'int' types
The old virRandom() API was not generating good random numbers.
Replace it with a new API virRandomBits which instead of being
told the upper limit, gets told the number of bits of randomness
required.
* src/util/virrandom.c, src/util/virrandom.h: Add virRandomBits,
and move virRandomInitialize
* src/util/util.h, src/util/util.c: Delete virRandom and
virRandomInitialize
* src/libvirt.c, src/security/security_selinux.c,
src/test/test_driver.c, src/util/iohelper.c: Update for
changes from virRandom to virRandomBits
* src/storage/storage_backend_iscsi.c: Remove bogus call
to virRandomInitialize & convert to virRandomBits
Currently, we support only filling a volume with zeroes on wiping.
However, it is not enough as data might still be readable by
experienced and equipped attacker. Many technical papers have been
written, therefore we should support other wiping algorithms.
In file included from ../gnulib/lib/unistd.h:51:0,
from ../src/util/util.h:30,
from rpc/virkeepalive.c:29:
/usr/x86_64-w64-mingw32/sys-root/mingw/include/winsock2.h:15:2: warning: #warning Please include winsock2.h before windows.h [-Wcpp]
Reported by Marc-André Lureau.
* src/util/threads-win32.h (includes): Pick up winsock2.h before
windows.h, as required by mingw64.
On F16 at least, empty volume groups don't have a directory under /dev.
The directory only appears once a logical volume is created.
This tickles some behavior in BackendStablePath which ends with
libvirt sleeping for 5 seconds while waiting for the directory to appear.
This causes all sorts of problems for the virStorageVolLookupByPath API
which virtinst uses, even if trying to resolve a path that is independent
of the logical pool.
In reality we don't even need to do that checking since logical pools
always have a stable target path. Short circuit the polling in that
case.
Fixes bug 782261
Systemd detects containers based on whether they have
an environment variable starting with 'container=lxc';
using a longer name fits the expectations, while also
allowing detection of who created the container.
Requested by Lennart Poettering, in response to
https://bugs.freedesktop.org/show_bug.cgi?id=45175
* src/lxc/lxc_container.c (lxcContainerBuildInitCmd): Add another
env-var.
The current setup code for LXC is bind mounting /dev/pts/ptmx
on top of a character device /dev/ptmx. This is denied by SELinux
policy and is just wrong. The target of a bind mount should just
be a plain file
* src/lxc/lxc_container.c: Don't bind /dev/pts/ptmx onto
a char device
Add a virFileTouch API which ensures that a file will always
exist, even if zero length
* src/util/virfile.c, src/util/virfile.h,
src/libvirt_private.syms: Introduce virFileTouch
Direct boot (using kernel, initrd, and command line) is used by
virt-install/virt-manager for network install. While any bootindex has
no direct effect since -kernel is always first, we need it as a hint for
SeaBIOS to present disks in the same order as they will be presented
during normal boot.
It's better to group all the metadata together. This is a
cosmetic output change; since the RNG allows interleave, it
doesn't matter where the user stuck it on input, and an XPath
query will find the same information when parsing the output.
* src/conf/domain_conf.c (virDomainDefFormatInternal): Output
metadata earlier.
* docs/formatdomain.html.in: Update documentation.
* tests/domainsnapshotxml2xmlout/metadata.xml: Update test.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-metadata.xml: Likewise.
Applications can now insert custom nodes and hierarchies into domain
configuration XML. Although currently not enforced, applications are
required to use their own namespaces on every custom node they insert,
with only one top-level element per namespace.
POLLIN and POLLHUP are not mutually exclusive. Currently the following
seems possible: the child writes 3K to its stdout or stderr pipe, and
immediately closes it. We get POLLIN|POLLHUP (I'm not sure that's possible
on Linux, but SUSv4 seems to allow it). We read 1K and throw away the
rest.
When poll() returns and we're about to check the /revents/ member in a
given array element, let's map all the revents bits to two (independent)
ideas: "let's attempt to read()", and "let's attempt to write()". This
should cover all errors, EOFs, and normal conditions; the read()/write()
call should report any pending error.
Under this approach, both POLLHUP and POLLERR are mapped to "needs read()"
if we're otherwise prepared for POLLIN. POLLERR also maps to "needs
write()" if we're otherwise prepared for POLLOUT. The rest of the mappings
(POLLPRI etc.) would be easy, but probably useless for pipes.
Additionally, SUSv4 doesn't appear to forbid POLLIN|POLLERR (or
POLLOUT|POLLERR) set simultaneously. One could argue that the read() or
write() call would return without blocking in these cases (with an error),
so POLLIN / POLLOUT would be justified beside POLLERR.
The code now penalizes POLLIN|POLLERR differently from plain POLLERR. The
former (ie. read() returning -1) is terminal and we jump to cleanup, while
plain POLLERR masks only the affected file descriptor for the future.
Let's unify those.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
This makes use of the QEMU guest agent to implement the
virDomainShutdownFlags and virDomainReboot APIs. With
no flags specified, it will prefer to use the agent, but
fallback to ACPI. Explicit choice can be made by using
a suitable flag
* src/qemu/qemu_driver.c: Wire up use of agent
Add a new API virDomainShutdownFlags and define:
VIR_DOMAIN_SHUTDOWN_DEFAULT = 0,
VIR_DOMAIN_SHUTDOWN_ACPI_POWER_BTN = (1 << 0),
VIR_DOMAIN_SHUTDOWN_GUEST_AGENT = (1 << 1),
Also define some flags for the reboot API
VIR_DOMAIN_REBOOT_DEFAULT = 0,
VIR_DOMAIN_REBOOT_ACPI_POWER_BTN = (1 << 0),
VIR_DOMAIN_REBOOT_GUEST_AGENT = (1 << 1),
Although these two APIs currently have the same flags, using
separate enums allows them to expand separately in the future.
Add stub impls of the new API for all existing drivers
There is now a standard QEMU guest agent that can be installed
and given a virtio serial channel
<channel type='unix'>
<source mode='bind' path='/var/lib/libvirt/qemu/f16x86_64.agent'/>
<target type='virtio' name='org.qemu.guest_agent.0'/>
</channel>
The protocol that runs over the guest agent is JSON based and
very similar to the JSON monitor. We can't use exactly the same
code because there are some odd differences in the way messages
and errors are structured. The qemu_agent.c file is based on
a combination and simplification of qemu_monitor.c and
qemu_monitor_json.c
* src/qemu/qemu_agent.c, src/qemu/qemu_agent.h: Support for
talking to the agent for shutdown
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Add thread
helpers for talking to the agent
* src/qemu/qemu_process.c: Connect to agent whenever starting
a guest
* src/qemu/qemu_monitor_json.c: Make variable static
When converting a linear enum to a string, we have checks in
place in the VIR_ENUM_IMPL macro to ensure that there is one
string for every value, which lets us quickly flag if a user
added a value but forgot to add a counterpart string. However,
this only works if we use the _LAST marker.
* cfg.mk (sc_require_enum_last_marker): New syntax check.
* src/conf/domain_conf.h (virDomainSnapshotState): Add new marker.
* src/conf/domain_conf.c (virDomainSnapshotState): Fix offender.
* src/qemu/qemu_monitor_json.c (qemuMonitorWatchdogAction)
(qemuMonitorIOErrorAction, qemuMonitorGraphicsAddressFamily):
Likewise.
* src/util/virtypedparam.c (virTypedParameter): Likewise.
Although this is a public API break, it only affects users that
were compiling against *_LAST values, and can be trivially
worked around without impacting compilation against older
headers, by the user defining VIR_ENUM_SENTINELS before using
libvirt.h. It is not an ABI break, since enum values do not
appear as .so entry points. Meanwhile, it prevents users from
using non-stable enum values without explicitly acknowledging
the risk of doing so.
See this list discussion:
https://www.redhat.com/archives/libvir-list/2012-January/msg00804.html
* include/libvirt/libvirt.h.in: Hide all sentinels behind
LIBVIRT_ENUM_SENTINELS, and add missing sentinels.
* src/internal.h (VIR_DEPRECATED): Allow inclusion after
libvirt.h.
(LIBVIRT_ENUM_SENTINELS): Expose sentinels internally.
* daemon/libvirtd.h: Use the sentinels.
* src/remote/remote_protocol.x (includes): Don't expose sentinels.
* python/generator.py (enum): Likewise.
* tests/cputest.c (cpuTestCompResStr): Silence compiler warning.
* tools/virsh.c (vshDomainStateReasonToString)
(vshDomainControlStateToString): Likewise.
While we still don't want to enable gcc's new -Wformat-literal
warning, I found a rather easy case where the warning could be
reduced, by getting rid of obsolete error-reporting practices.
This is the last place where we were passing the (unused) net
and conn arguments for constructing an error.
* src/util/virterror_internal.h (virErrorMsg): Delete prototype.
(virReportError): Delete macro.
* src/util/virterror.c (virErrorMsg): Make static.
* src/libvirt_private.syms (virterror_internal.h): Drop export.
* src/util/conf.c (virConfError): Convert to macro.
(virConfErrorHelper): New function, and adjust error calls.
* src/xen/xen_hypervisor.c (virXenErrorFunc): Delete.
(xenHypervisorGetSchedulerType)
(xenHypervisorGetSchedulerParameters)
(xenHypervisorSetSchedulerParameters)
(xenHypervisorDomainBlockStats)
(xenHypervisorDomainInterfaceStats)
(xenHypervisorDomainGetOSType)
(xenHypervisorNodeGetCellsFreeMemory, xenHypervisorGetVcpus):
Update callers.
Reusing common code makes things smaller; it also buys us some
additional safety, such as now rejecting duplicate parameters
during a set operation.
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters)
(qemuDomainSetMemoryParameters, qemuDomainSetNumaParameters)
(qemuSetSchedulerParametersFlags)
(qemuDomainSetInterfaceParameters, qemuDomainSetBlockIoTune)
(qemuDomainGetBlkioParameters, qemuDomainGetMemoryParameters)
(qemuDomainGetNumaParameters, qemuGetSchedulerParametersFlags)
(qemuDomainBlockStatsFlags, qemuDomainGetInterfaceParameters)
(qemuDomainGetBlockIoTune): Use new helpers.
* src/esx/esx_driver.c (esxDomainSetSchedulerParametersFlags)
(esxDomainSetMemoryParameters)
(esxDomainGetSchedulerParametersFlags)
(esxDomainGetMemoryParameters): Likewise.
* src/libxl/libxl_driver.c
(libxlDomainSetSchedulerParametersFlags)
(libxlDomainGetSchedulerParametersFlags): Likewise.
* src/lxc/lxc_driver.c (lxcDomainSetMemoryParameters)
(lxcSetSchedulerParametersFlags, lxcDomainSetBlkioParameters)
(lxcDomainGetMemoryParameters, lxcGetSchedulerParametersFlags)
(lxcDomainGetBlkioParameters): Likewise.
* src/test/test_driver.c (testDomainSetSchedulerParamsFlags)
(testDomainGetSchedulerParamsFlags): Likewise.
* src/xen/xen_hypervisor.c (xenHypervisorSetSchedulerParameters)
(xenHypervisorGetSchedulerParameters): Likewise.
Preparation for another patch that refactors common patterns
into the new file for fewer lines of code overall.
* src/util/util.h (virTypedParameterArrayClear): Move...
* src/util/virtypedparam.h: ...to new file.
(virTypedParameterArrayValidate, virTypedParameterAssign): New
prototypes.
* src/util/util.c (virTypedParameterArrayClear): Likewise.
* src/util/virtypedparam.c: New file.
* po/POTFILES.in: Mark file for translation.
* src/Makefile.am (UTIL_SOURCES): Build it.
* src/libvirt_private.syms (util.h): Split...
(virtypedparam.h): to new section.
(virkeycode.h): Sort.
* daemon/remote.c: Adjust callers.
* tools/virsh.c: Likewise.
Based on qemu changes made in commits ae523427 and 659ded58.
* src/lxc/lxc_driver.c (lxcSetSchedulerParametersFlags)
(lxcGetSchedulerParametersFlags, lxcDomainSetBlkioParameters)
(lxcDomainGetBlkioParameters): Use helpers.
(lxcDomainSetBlkioParameters): Allow setting live and config at
once.
We had a memory leak on a very arcane OOM situation (unlikely to ever
hit in practice, but who knows if libvirt.so would ever be linked
into some other program that exhausts all thread-local storage keys?).
I found it by code inspection, while analyzing a valgrind report
generated by Alex Jia.
* src/util/threads.h (virThreadLocalSet): Alter signature.
* src/util/threads-pthread.c (virThreadHelper): Reduce allocation
lifetime.
(virThreadLocalSet): Detect failure.
* src/util/threads-win32.c (virThreadLocalSet): Likewise.
(virCondWait): Fix caller.
* src/util/virterror.c (virLastErrorObject): Likewise.
The RPC generator transforms methods matching certain
patterns like 'id' or 'uuid', etc but does not anchor
its matches to the end of the word. So if a method
contains 'id' in the middle (eg virIdentity) then the
RPC generator munges that.
* src/rpc/gendispatch.pl: Anchor matches
To avoid a namespace clash with forthcoming identity APIs,
rename the virNet*GetLocalIdentity() APIs to have the form
virNet*GetUNIXIdentity()
* daemon/remote.c, src/libvirt_private.syms: Update
for renamed APIs
* src/rpc/virnetserverclient.c, src/rpc/virnetserverclient.h,
src/rpc/virnetsocket.c, src/rpc/virnetsocket.h: s/LocalIdentity/UNIXIdentity/
There was missing capability for blkiotune and thus specifying these
settings caused libvirt to run qemu with invalid parameters and then
reporting qemu error instead of the standard libvirt one. The support
for blkiotune setting was added in upstream qemu repo under commit
0563e191516289c9d2f282a8c50f2eecef2fa773.
Given an LXC guest with a root filesystem path of
/export/lxc/roots/helloworld/root
During startup, we will pivot the root filesystem to end up
at
/.oldroot/export/lxc/roots/helloworld/root
We then try to open
/.oldroot/export/lxc/roots/helloworld/root/dev/pts
Now consider if '/export/lxc' is an absolute symlink pointing
to '/media/lxc'. The kernel will try to open
/media/lxc/roots/helloworld/root/dev/pts
whereas it should be trying to open
/.oldroot//media/lxc/roots/helloworld/root/dev/pts
To deal with the fact that the root filesystem can be moved,
we need to resolve symlinks in *any* part of the filesystem
source path.
* src/libvirt_private.syms, src/util/util.c,
src/util/util.h: Add virFileResolveAllLinks to resolve
all symlinks in a path
* src/lxc/lxc_container.c: Resolve all symlinks in filesystem
paths during startup
pciTrySecondaryBusReset checks if there is active device on the
same bus, however, qemu driver doesn't maintain an effective
list for the inactive devices, and it passes meaningless argument
for parameter "inactiveDevs". e.g. (qemuPrepareHostdevPCIDevices)
if (!(pcidevs = qemuGetPciHostDeviceList(hostdevs, nhostdevs)))
return -1;
..skipped...
if (pciResetDevice(dev, driver->activePciHostdevs, pcidevs) < 0)
goto reattachdevs;
NB, the "pcidevs" used above are extracted from domain def, and
thus one won't be able to attach a device of which bus has other
device even detached from host (nodedev-detach). To see more
details of the problem:
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=773667
This patch is to resolve the problem by introducing an inactive
PCI device list (just like qemu_driver->activePciHostdevs), and
the whole logic is:
* Add the device to inactive list during nodedev-dettach
* Remove the device from inactive list during nodedev-reattach
* Remove the device from inactive list during attach-device
(for non-managed device)
* Add the device to inactive list after detach-device, only
if the device is not managed
With the above, we have a sufficient inactive PCI device list, and thus
we can use it for pciResetDevice. e.g.(qemuPrepareHostdevPCIDevices)
if (pciResetDevice(dev, driver->activePciHostdevs,
driver->inactivePciHostdevs) < 0)
goto reattachdevs;
This introduces new attribute wrpolicy with only supported
value as immediate. This will be an optional
attribute with no defaults. This helps specify whether
to skip the host page cache.
When wrpolicy is specified, meaning when wrpolicy=immediate
a writeback is explicitly initiated for the dirty pages in
the host page cache as part of the guest file write operation.
Usage:
<filesystem type='mount' accessmode='passthrough'>
<driver type='path' wrpolicy='immediate'/>
<source dir='/export/to/guest'/>
<target dir='mount_tag'/>
</filesystem>
Currently this only works with type='mount' for the QEMU/KVM driver.
Signed-off-by: Deepak C Shetty <deepakcs@linux.vnet.ibm.com>
In the past we didn't reserve 0:0:2.0 PCI address if there was no video
device assigned to a domain, which made it impossible to add a video
device later on. So we fixed it (commit v0.9.0-37-g7b2cac1) by always
reserving that address. However, that breaks existing domains without
video devices that already have another device assigned to the
problematic address.
This patch reserves address 0:0:2.0 only in case it was not explicitly
assigned to another device, which means libvirt will try to keep this
address free and will not automatically assign it new devices. But
existing domains for which older libvirt already assigned the address to
a non-video device will keep working as they used to work before 0.9.1.
Moreover, users who want to create a domain without a video device and
use its address for another device may do so by explicitly configuring
the PCI address in domain XML.
There are several reasons for doing this:
- the CPU specification is out of libvirt's control so we cannot
guarantee stable guest ABI
- not every feature of a CPU may actually work as expected when
advertised directly to a guest
- migration between two machines with exactly the same CPU may work but
no guarantees can be made
- this mode is not supported and its use is at one's own risk
VIR_DOMAIN_XML_UPDATE_CPU flag for virDomainGetXMLDesc may be used to
get updated custom mode guest CPU definition in case it depends on host
CPU. This patch implements the same behavior for host-model and
host-passthrough CPU modes.
The mode can be either of "custom" (default), "host-model",
"host-passthrough". The semantics of each mode is described in the
following examples:
- guest CPU is a default model with specified topology:
<cpu>
<topology sockets='1' cores='2' threads='1'/>
</cpu>
- guest CPU matches selected model:
<cpu mode='custom' match='exact'>
<model>core2duo</model>
</cpu>
- guest CPU should be a copy of host CPU as advertised by capabilities
XML (this is a short cut for manually copying host CPU specification
from capabilities to domain XML):
<cpu mode='host-model'/>
In case a hypervisor does not support the exact host model, libvirt
automatically falls back to a closest supported CPU model and
removes/adds features to match host. This behavior can be disabled by
<cpu mode='host-model'>
<model fallback='forbid'/>
</cpu>
- the same as previous returned by virDomainGetXMLDesc with
VIR_DOMAIN_XML_UPDATE_CPU flag:
<cpu mode='host-model' match='exact'>
<model fallback='allow'>Penryn</model> --+
<vendor>Intel</vendor> |
<topology sockets='2' cores='4' threads='1'/> + copied from
<feature policy='require' name='dca'/> | capabilities XML
<feature policy='require' name='xtpr'/> |
... --+
</cpu>
- guest CPU should be exactly the same as host CPU even in the aspects
libvirt doesn't model (such domain cannot be migrated unless both
hosts contain exactly the same CPUs):
<cpu mode='host-passthrough'/>
- the same as previous returned by virDomainGetXMLDesc with
VIR_DOMAIN_XML_UPDATE_CPU flag:
<cpu mode='host-passthrough' match='minimal'>
<model>Penryn</model> --+ copied from caps
<vendor>Intel</vendor> | XML but doesn't
<topology sockets='2' cores='4' threads='1'/> | describe all
<feature policy='require' name='dca'/> | aspects of the
<feature policy='require' name='xtpr'/> | actual guest CPU
... --+
</cpu>
In case a hypervisor doesn't support the exact CPU model requested by a
domain XML, we automatically fallback to a closest CPU model the
hypervisor supports (and make sure we add/remove any additional features
if needed). This patch adds 'fallback' attribute to model element, which
can be used to disable this automatic fallback.
Commit 5d784bd6d7 was a nice attempt to
clarify the semantics by requiring domain name from dxml to either match
original name or dname. However, setting dxml domain name to dname
doesn't really work since destination host needs to know the original
domain name to be able to use it in migration cookies. This patch
requires domain name in dxml to match the original domain name. The
change should be safe and backward compatible since migration would fail
just a bit later in the process.
There are three address validation routines that do nothing:
virDomainDeviceDriveAddressIsValid()
virDomainDeviceUSBAddressIsValid()
virDomainDeviceVirtioSerialAddressIsValid()
Remove them, and replace their call sites with "1" which is what they
currently return. In some cases this means we can remove an entire
if block.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
We can't call qemuCapsExtractVersionInfo() from test code, because it
expects to be able to call the emulator, and for testing we have fake
emulators that can't be executed. For that reason qemuxml2argvtest.c
doesn't call qemuDomainAssignPCIAddresses(), instead it open codes its
own version.
That means we can't call qemuDomainAssignAddresses() from the test code,
instead we need to manually call qemuDomainAssignSpaprVioAddresses().
Also add logic to cope with qemuDomainAssignSpaprVioAddresses() failing,
so that we can write a test that checks for a known failure in there.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
KVM will be able to use a PCI SCSI controller even on POWER. Let
the user specify the vSCSI controller by other means than a default.
After this patch, the QEMU driver will actually look at the model
and reject anything but auto, lsilogic and ibmvscsi.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit d09f6ba5fe introduced a regression in event
registration. virDomainEventCallbackListAddID() will only return a positive
integer if the type of event being registered is VIR_DOMAIN_EVENT_ID_LIFECYCLE.
For other event types, 0 is always returned on success. This has the
unfortunate side effect of not enabling remote event callbacks because
remoteDomainEventRegisterAny() uses the return value from the local call to
determine if an event callback needs to be registered on the remote end.
Make sure virDomainEventCallbackListAddID() returns the callback count for the
eventID being registered.
Signed-off-by: Adam Litke <agl@us.ibm.com>
The new introduced optional attribute "copy_on_read</code> controls
whether to copy read backing file into the image file. The value can
be either "on" or "off". Copy-on-read avoids accessing the same backing
file sectors repeatedly and is useful when the backing file is over a
slow network. By default copy-on-read is off.
Earlier, when the number of vcpus was greater than the topology allowed,
libvirt didn't raise an error and continued, resulting in running qemu
with parameters making no sense. Even though qemu did not report any
error itself, the number of vcpus was set to maximum allowed by the
topology.
Detected by Coverity. Although unlikely, if we are ever started
with stdin closed, we could reach a situation where we open a
uuid file but then fail to close it, making that file the new
stdin for the rest of the process.
* src/util/uuid.c (getDMISystemUUID): Allow for stdin.
Currently the LXC controller attempts to deal with EOF on a
tty by spawning a thread to do an edge triggered epoll_wait().
This avoids the normal event loop spinning on POLLHUP. There
is a subtle mistake though - even after seeing POLLHUP on a
master PTY, it is still perfectly possible & valid to write
data to the PTY. There is a buffer that can be filled with
data, even when no client is present.
The second mistake is that the epoll_wait() thread was not
looking for the EPOLLOUT condition, so when a new client
connects to the LXC console, it had to explicitly send a
character before any queued output would appear.
Finally, there was in fact no need to spawn a new thread to
deal with epoll_wait(). The epoll file descriptor itself
can be poll()'d on normally.
This patch attempts to deal with all these problems.
- The blocking epoll_wait() thread is replaced by a poll
on the epoll file descriptor which then does a non-blocking
epoll_wait() to handle events
- Even if POLLHUP is seen, we continue trying to write
any pending output until getting EAGAIN from write.
- Once write returns EAGAIN, we modify the epoll event
mask to also look for EPOLLOUT
* src/lxc/lxc_controller.c: Avoid stalled I/O upon
connected to an LXC console
If client stream does not have any data to sink and neither received
EOF, a dummy packet is sent to the daemon signalising client is ready to
sink some data. However, after we added event loop to client a race may
occur:
Thread 1 calls virNetClientStreamRecvPacket and since no data are cached
nor stream has EOF, it decides to send dummy packet to server which will
sent some data in turn. However, during this decision and actual message
exchange with server -
Thread 2 receives last stream data from server. Therefore an EOF is set
on stream and if there is a call waiting (which is not yet) it is woken
up. However, Thread 1 haven't sent anything so far, so there is no call
to be woken up. So this thread sent dummy packet to daemon, which
ignores that as no stream is associated with such packet and therefore
no reply will ever come.
This race causes client to hang indefinitely.
QEMU does not support security_model for anything but 'path' fs driver type.
Currently in libvirt, when security_model ( accessmode attribute) is not
specified it auto-generates it irrespective of the fs driver type, which
can result in a qemu error for drivers other than path. This patch ensures
that the qemu cmdline is correctly generated by taking into account the
fs driver type.
Signed-off-by: Deepak C Shetty <deepakcs@linux.vnet.ibm.com>
If a system has 64 or more VF's, it is quite tedious to mention each VF
in the interface pool.
The following modification will implicitly create an interface pool from
the SR-IOV PF.
This functions enables us to get the Virtual Functions attached to
a Physical function given the name of a SR-IOV physical functio.
In order to accomplish the task, added a getter function pciGetDeviceAddrString
to get the BDF of the Virtual Function in a char array.
The autobuilder pointed out an odd failure on mingw:
../../src/interface/netcf_driver.c:644:5: error: unknown field 'close_used_without_including_unistd_h' specified in initializer
cc1: warnings being treated as errors
This is because the gnulib headers #define close to different strings,
according to which headers are included, in order to work around some
odd mingw problems with close(), and these defines happen to also
affect field members declared with a name of struct foo.close. As long
as all headers are included before both the definition and use of the
struct, the various #define doesn't matter, but the netcf file hit
an instance where things were included in a different order. Fix this
for all clients that use a struct member named 'close'.
* src/driver.h: Include <unistd.h> before using 'close'.
For some weird reason, i686-pc-mingw32-gcc version 4.6.1 at -O2 complained:
../../src/conf/nwfilter_params.c: In function 'virNWFilterVarCombIterCreate':
../../src/conf/nwfilter_params.c:346:23: error: 'minValue' may be used uninitialized in this function [-Werror=uninitialized]
../../src/conf/nwfilter_params.c:319:28: note: 'minValue' was declared here
../../src/conf/nwfilter_params.c:344:23: error: 'maxValue' may be used uninitialized in this function [-Werror=uninitialized]
../../src/conf/nwfilter_params.c:319:18: note: 'maxValue' was declared here
cc1: all warnings being treated as errors
even though all paths of the preceding switch statement either
assign the variables or return.
* src/conf/nwfilter_params.c (virNWFilterVarCombIterAddVariable):
Initialize variables.
Address side effect of accessing a variable via an index: Filters
accessing a variable where an element is accessed that is beyond the
size of the list (for example $TEST[10] and only 2 elements are available)
cannot instantiate that filter. Test for this and report proper error
to user.
This patch adds access to single elements of variables via index. Example:
<rule action='accept' direction='in' priority='500'>
<tcp srcipaddr='$ADDR[1]' srcportstart='$B[2]'/>
</rule>
This patch introduces the capability to use a different iterator per
variable.
The currently supported notation of variables in a filtering rule like
<rule action='accept' direction='out'>
<tcp srcipaddr='$A' srcportstart='$B'/>
</rule>
processes the two lists 'A' and 'B' in parallel. This means that A and B
must have the same number of 'N' elements and that 'N' rules will be
instantiated (assuming all tuples from A and B are unique).
In this patch we now introduce the assignment of variables to different
iterators. Therefore a rule like
<rule action='accept' direction='out'>
<tcp srcipaddr='$A[@1]' srcportstart='$B[@2]'/>
</rule>
will now create every combination of elements in A with elements in B since
A has been assigned to an iterator with Id '1' and B has been assigned to an
iterator with Id '2', thus processing their value independently.
The first rule has an equivalent notation of
<rule action='accept' direction='out'>
<tcp srcipaddr='$A[@0]' srcportstart='$B[@0]'/>
</rule>
In this patch we introduce testing whether the iterator points to a
unique set of entries that have not been seen before at one of the previous
iterations. The point is to eliminate duplicates and with that unnecessary
filtering rules by preventing identical filtering rules from being
instantiated.
Example with two lists:
list1 = [1,2,1]
list2 = [1,3,1]
The 1st iteration would take the 1st items of each list -> 1,1
The 2nd iteration would take the 2nd items of each list -> 2,3
The 3rd iteration would take the 3rd items of each list -> 1,1 but
skip them since this same pair has already been encountered in the 1st
iteration
Implementation-wise this is solved by taking the n-th element of list1 and
comparing it against elements 1..n-1. If no equivalent is found, then there
is no possibility of this being a duplicate. In case an equivalent element
is found at position i, then the n-th element in the 2nd list is compared
against the i-th element in the 2nd list and if that is not the same, then
this is a unique pair, otherwise it is not unique and we may need to do
the same comparison on the 3rd list.
When sVirt is integrated with the LXC driver, it will be neccessary
to invoke the security driver APIs using only a virDomainDefPtr
since the lxc_container.c code has no virDomainObjPtr available.
Aside from two functions which want obj->pid, every bit of the
security driver code only touches obj->def. So we don't need to
pass a virDomainObjPtr into the security drivers, a virDomainDefPtr
is sufficient. Two functions also gain a 'pid_t pid' argument.
* src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c,
src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
src/security/security_apparmor.c,
src/security/security_dac.c,
src/security/security_driver.h,
src/security/security_manager.c,
src/security/security_manager.h,
src/security/security_nop.c,
src/security/security_selinux.c,
src/security/security_stack.c: Change all security APIs to use a
virDomainDefPtr instead of virDomainObjPtr
When disk snapshots were first implemented, libvirt blindly refused
to allow an external snapshot destination that already exists, since
qemu will blindly overwrite the contents of that file during the
snapshot_blkdev monitor command, and we don't like a default of
data loss by default. But VDSM has a scenario where NFS permissions
are intentionally set so that the destination file can only be
created by the management machine, and not the machine where the
guest is running, so that libvirt will necessarily see the destination
file already existing; adding a flag will allow VDSM to force the file
reuse without libvirt complaining of possible data loss.
https://bugzilla.redhat.com/show_bug.cgi?id=767104
* include/libvirt/libvirt.h.in (virDomainSnapshotCreateFlags): Add
VIR_DOMAIN_SNAPSHOT_CREATE_REUSE_EXT.
* src/libvirt.c (virDomainSnapshotCreateXML): Document it. Add
note about partial failure.
* tools/virsh.c (cmdSnapshotCreate, cmdSnapshotCreateAs): Add new
flag.
* tools/virsh.pod (snapshot-create, snapshot-create-as): Document
it.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotCreateXML): Implement the new flag.
We had loads of different styles in describing the @flags parameter
for various APIs, as well as several APIs that didn't list which
enums provided the bit values valid for the flags.
The end result is one of two formats:
@flags: bitwise-OR of vir...Flags
@flags: extra flags; not used yet, so callers should always pass 0
* src/libvirt.c: Use common sentences for flags. Also,
(virDomainGetBlockIoTune): Mention virTypedParameterFlags.
(virConnectOpenAuth): Mention virConnectFlags.
(virDomainMigrate, virDomainMigrate2, virDomainMigrateToURI)
(virDomainMigrateToURI2): Mention virDomainMigrateFlags.
(virDomainMemoryPeek): Mention virDomainMemoryFlags.
(virStoragePoolBuild): Mention virStoragePoolBuildFlags.
(virStoragePoolDelete): Mention virStoragePoolDeleteFlags.
(virStreamNew): Mention virStreamFlags.
(virDomainOpenGraphics): Mention virDomainOpenGraphicsFlags.
This *kind of* addresses:
https://bugzilla.redhat.com/show_bug.cgi?id=772395
(it doesn't eliminate the failure to start, but causes libvirt to give
a better idea about the cause of the failure).
If a guest uses a kvm emulator (e.g. /usr/bin/qemu-kvm) and the guest
is started when kvm isn't available (either because virtualization is
unavailable / has been disabled in the BIOS, or the kvm modules
haven't been loaded for some reason), a semi-cryptic error message is
logged:
libvirtError: internal error Child process (LC_ALL=C
PATH=/sbin:/usr/sbin:/bin:/usr/bin /usr/bin/qemu-kvm -device ? -device
pci-assign,? -device virtio-blk-pci,? -device virtio-net-pci,?) status
unexpected: exit status 1
This patch notices at process start that a guest needs kvm, and checks
for the presence of /dev/kvm (a reasonable indicator that kvm is
available) before trying to execute the qemu binary. If kvm isn't
available, a more useful (too verbose??) error is logged.
It should be a copy-paste error, the result is programming will result in an
infinite loop again due to without iterating 'j' variable.
* src/qemu/qemu_driver.c: fix a typo on qemuDomainSetBlkioParameters.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=770520
Signed-off-by: Alex Jia <ajia@redhat.com>
I previously mentioned [1] a PolicyKit issue where libvirt would
proceed with authentication even though polkit-auth failed:
testusr xen134:~> virsh list --all
Attempting to obtain authorization for org.libvirt.unix.manage.
polkit-grant-helper: given auth type (8 -> yes) is bogus
Failed to obtain authorization for org.libvirt.unix.manage.
Id Name State
----------------------------------
0 Domain-0 running
- sles11sp1-pv shut off
AFAICT, libvirt attempts to obtain a privilege it already has,
causing polkit-auth to fail with above message. Instead of calling
obtain and then checking auth, IMO the workflow should be for the
server to check auth first, and if that fails ask the client to
obtain it and check again. This workflow also allows for checking
only successful exit of polkit-auth in virConnectAuthGainPolkit().
[1] https://www.redhat.com/archives/libvir-list/2011-December/msg00837.html
In the past, generic SCSI commands issued from a guest to a virtio
disk were always passed through to the underlying disk by qemu, and
the kernel would also pass them on.
As a result of CVE-2011-4127 (see:
http://seclists.org/oss-sec/2011/q4/536), qemu now honors its
scsi=on|off device option for virtio-blk-pci (which enables/disables
passthrough of generic SCSI commands), and the kernel will only allow
the commands for physical devices (not for partitions or logical
volumes). The default behavior of qemu is still to allow sending
generic SCSI commands to physical disks that are presented to a guest
as virtio-blk-pci devices, but libvirt prefers to disable those
commands in the standard virtio block devices, enabling it only when
specifically requested (hopefully indicating that the requester
understands what they're asking for). For this purpose, a new libvirt
disk device type (device='lun') has been created.
device='lun' is identical to the default device='disk', except that:
1) It is only allowed if bus='virtio', type='block', and the qemu
version is "new enough" to support it ("new enough" == qemu 0.11 or
better), otherwise the domain will fail to start and a
CONFIG_UNSUPPORTED error will be logged).
2) The option "scsi=on" will be added to the -device arg to allow
SG_IO commands (if device !='lun', "scsi=off" will be added to the
-device arg so that SG_IO commands are specifically forbidden).
Guests which continue to use disk device='disk' (the default) will no
longer be able to use SG_IO commands on the disk; those that have
their disk device changed to device='lun' will still be able to use SG_IO
commands.
*docs/formatdomain.html.in - document the new device attribute value.
*docs/schemas/domaincommon.rng - allow it in the RNG
*tests/* - update the args of several existing tests to add scsi=off, and
add one new test that will test scsi=on.
*src/conf/domain_conf.c - update domain XML parser and formatter
*src/qemu/qemu_(command|driver|hotplug).c - treat
VIR_DOMAIN_DISK_DEVICE_LUN *almost* identically to
VIR_DOMAIN_DISK_DEVICE_DISK, except as indicated above.
Note that no support for this new device value was added to any
hypervisor drivers other than qemu, because it's unclear what it might
mean (if anything) to those drivers.
This patch adds two capabilities flags to deal with various aspects
of supporting SG_IO commands on virtio-blk-pci devices:
QEMU_CAPS_VIRTIO_BLK_SCSI
set if -device virtio-blk-pci accepts the scsi="on|off" option
When present, this is on by default, but can be set to off to disable
SG_IO functions.
QEMU_CAPS_VIRTIO_BLK_SG_IO
set if SG_IO commands are supported in the virtio-blk-pci driver
(present since qemu 0.11 according to a qemu developer, if I
understood correctly)
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=638633
Although scripts are not used by interfaces of type other than
"ethernet" in qemu, due to the fact that the parser stores the script
name in a union that is only valid when type is ethernet or bridge,
there is no way for anyone except the parser itself to catch the
problem of specifying an interface script for an inappropriate
interface type (by the time the parsed data gets back to the code that
called the parser, all evidence that a script was specified is
forgotten).
Since the parser itself should be agnostic to which type of interface
allows scripts (an example of why: a script specified for an interface
of type bridge is valid for xen domains, but not for qemu domains),
the solution here is to move the script out of the union(s) in the
DomainNetDef, always populate it when specified (regardless of
interface type), and let the driver decide whether or not it is
appropriate.
Currently the qemu, xen, libxml, and uml drivers recognize the script
parameter and do something with it (the uml driver only to report that
it isn't supported). Those drivers have been updated to log a
CONFIG_UNSUPPORTED error when a script is specified for an interface
type that's inappropriate for that particular hypervisor.
(NB: There was earlier discussion of solving this problem by adding a
VALIDATE flag to all libvirt APIs that accept XML, which would cause
the XML to be validated against the RNG files. One statement during
that discussion was that the RNG shouldn't contain hypervisor-specific
things, though, and a proper solution to this problem would require
that (again, because a script for an interface of type "bridge" is
accepted by xen, but not by qemu).
Commit ae523427 missed one pair of functions that could use
the helper routine.
* src/qemu/qemu_driver.c (qemuSetSchedulerParametersFlags)
(qemuGetSchedulerParametersFlags): Simplify.
On rawhide, gcc is new enough to output new DWARF information that
pdwtags has not yet learned, but the resulting 'make check' output
was rather confusing:
$ make -C src check
...
GEN virkeepaliveprotocol-structs
die__process_function: DW_TAG_INVALID (0x4109) @ <0x58c> not handled!
WARNING: your pdwtags program is too old
WARNING: skipping the virkeepaliveprotocol-structs test
WARNING: install dwarves-1.3 or newer
...
$ pdwtags --version
v1.9
I've filed the pdwtags deficiency as
https://bugzilla.redhat.com/show_bug.cgi?id=772358
* src/Makefile.am (PDWTAGS): Don't leave -t file behind on version
mismatch. Soften warning message, since 1.9 is newer than 1.3.
Don't leak stderr from broken version.
Commit db371a2 mistakenly added new functions inside a #ifndef WIN32
guard, even though they are needed on all platforms.
* src/util/command.c (virCommandFDSet): Move outside WIN32
conditional.
Detected by valgrind. Leak introduced in commit 5745dc1.
* src/qemu/qemu_command.c: fix memory leak on failure and successful path.
* How to reproduce?
% valgrind -v --leak-check=full ./qemuargv2xmltest
* Actual result:
==2196== 80 bytes in 1 blocks are definitely lost in loss record 3 of 4
==2196== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==2196== by 0x39CF07F6E1: strdup (in /lib64/libc-2.12.so)
==2196== by 0x419823: qemuParseRBDString (qemu_command.c:1657)
==2196== by 0x4221ED: qemuParseCommandLine (qemu_command.c:5934)
==2196== by 0x422AFB: qemuParseCommandLineString (qemu_command.c:7561)
==2196== by 0x416864: testCompareXMLToArgvHelper (qemuargv2xmltest.c:48)
==2196== by 0x417DB1: virtTestRun (testutils.c:141)
==2196== by 0x415CAF: mymain (qemuargv2xmltest.c:175)
==2196== by 0x4174A7: virtTestMain (testutils.c:696)
==2196== by 0x39CF01ECDC: (below main) (in /lib64/libc-2.12.so)
==2196==
==2196== LEAK SUMMARY:
==2196== definitely lost: 80 bytes in 1 blocks
Signed-off-by: Alex Jia <ajia@redhat.com>
When setting numa nodeset for a domain which has no nodeset set
before, libvirtd crashes by dereferencing the pointer to the old
nodemask which is null in that case.
Commit baade4d fixed a memory leak on failure, but in the process,
introduced a use-after-free on success, which can be triggered with:
1. set bandwidth with --live
2. query bandwidth
3. set bandwidth with --live
* src/qemu/qemu_driver.c (qemuDomainSetInterfaceParameters): Don't
free newBandwidth on success.
Reported by Hu Tao.
Commit b434329 has a logic bug: seclabel overrides don't set
def->type, but the default value is 0 (aka static). Restarting
libvirtd would thus reject the XML for any domain with an
override of <seclabel relabel='no'/> (which happens quite
easily if a disk image lives on NFS), with a message:
2012-01-04 22:29:40.949+0000: 6769: error : virSecurityLabelDefParseXMLHelper:2593 : XML error: security label is missing
Fix the logic to never read from an override's def->type, and
to allow a missing <label> subelement when relabel is no. There's
a lot of stupid double-negatives in the code (!norelabel) because
of the way that we want the zero-initialized defaults to behave.
* src/conf/domain_conf.c (virSecurityLabelDefParseXMLHelper): Use
type field from correct location.
Currently, virCommand implementation uses FD_ macros from
sys/select.h. However, those cannot handle more opened files
than FD_SETSIZE. Therefore switch to generalized implementation
based on array of integers.
xen-unstable c/s 23874:651aed73b39c added another member to
xen_domctl_getdomaininfo struct and bumped domctl version to 8.
Add a corresponding domctl v8 struct in xen hypervisor sub-driver
and detect domctl v8 during initialization.
The console path in xenstore is /local/domain/<id>/console/tty
for PV guests (PV console) and /local/domain/<id>/serial/0/tty
(serial console) for HVM guests. Similar to Xen's in-tree console
client, read the correct path for PV vs HVM.
It is a good practise to set revents to zero before doing any poll().
Moreover, we should check if event we waited for really occurred or
if any of fds we were polling on didn't encountered hangup.
Most severe here is a latent (but currently untriggered) memory leak
if any hypervisor ever adds a string interface property; the
remainder are mainly cosmetic.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_BANDWIDTH_*): Move
macros closer to interface that uses them, and document type.
* src/libvirt.c (virDomainSetInterfaceParameters)
(virDomainGetInterfaceParameters): Formatting tweaks.
* daemon/remote.c (remoteDispatchDomainGetInterfaceParameters):
Avoid memory leak.
* src/libvirt_public.syms (LIBVIRT_0.9.9): Sort lines.
* src/libvirt_private.syms (domain_conf.h): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSetInterfaceParameters): Fix
comments, break long lines.
Hi,
this is the fifth version of my SRV record for DNSMasq patch rebased
for the current codebase to the bridge driver and libvirt XML file to
include support for the SRV records in the DNS. The syntax is based on
DNSMasq man page and tests for both xml2xml and xml2argv were added as
well. There are some things written a better way in comparison with
version 4, mainly there's no hack in tests/networkxml2argvtest.c and
also the xPath context is changed to use a simpler query using the
virXPathInt() function relative to the current node.
Also, the patch is also fixing the networkxml2argv test to pass both
checks, i.e. both unit tests and also syntax check.
Please review,
Michal
Signed-off-by: Michal Novotny <minovotn@redhat.com>
Leak detected by Coverity, and introduced in commit 93ab585.
Reported by Alex Jia.
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Free
devices array on error.
The blocks to extract node information on a per-arch
basis wasn't well balanced leading to a compilation
failure if not on one of the handled arches (PCs and PPCs)
This wires up the XML changes in the previous patch to let SELinux
labeling honor user overrides, as well as affecting the live XML
configuration in one case where the user didn't specify anything
in the offline XML.
I noticed that the logs contained messages like this:
2011-12-05 23:32:40.382+0000: 26569: warning : SELinuxRestoreSecurityFileLabel:533 : cannot lookup default selinux label for /nfs/libvirt/images/dom.img
for all my domain images living on NFS. But if we would just remember
that on domain creation that we were unable to set a SELinux label (due to
NFSv3 lacking labels, or NFSv4 not being configured to expose attributes),
then we could avoid wasting the time trying to clear the label on
domain shutdown. This in turn is one less point of NFS failure,
especially since there have been documented cases of virDomainDestroy
hanging during an attempted operation on a failed NFS connection.
* src/security/security_selinux.c (SELinuxSetFilecon): Move guts...
(SELinuxSetFileconHelper): ...to new function.
(SELinuxSetFileconOptional): New function.
(SELinuxSetSecurityFileLabel): Honor override label, and remember
if labeling failed.
(SELinuxRestoreSecurityImageLabelInt): Skip relabeling based on
override.
Implement the parsing and formatting of the XML addition of
the previous commit. The new XML doesn't affect qemu command
line, so we can now test round-trip XML->memory->XML handling.
I chose to reuse the existing structure, even though per-device
override doesn't use all of those fields, rather than create a
new structure, in order to reuse more code.
* src/conf/domain_conf.h (_virDomainDiskDef): Add seclabel member.
* src/conf/domain_conf.c (virDomainDiskDefFree): Free it.
(virSecurityLabelDefFree): New function.
(virDomainDiskDefFormat): Print it.
(virSecurityLabelDefFormat): Reduce output if model not present.
(virDomainDiskDefParseXML): Alter signature, and parse seclabel.
(virSecurityLabelDefParseXML): Split...
(virSecurityLabelDefParseXMLHelper): ...into new helper.
(virDomainDeviceDefParse, virDomainDefParseXML): Update callers.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-dynamic-override.args:
New file.
* tests/qemuxml2xmltest.c (mymain): Enhance test.
* tests/qemuxml2argvtest.c (mymain): Likewise.
A future patch will parse and output <seclabel> in more than one
location in a <domain> xml; make it easier to reuse code.
* src/conf/domain_conf.c (virSecurityLabelDefFree): Rename...
(virSecurityLabelDefClear): ...and make static.
(virSecurityLabelDefParseXML): Alter signature.
(virDomainDefParseXML, virDomainDefFree): Adjust callers.
(virDomainDefFormatInternal): Split output...
(virSecurityLabelDefFormat): ...into new helper.
* daemon/remote.c: implement the server side support
* src/remote/remote_driver.c: implement the client side support
* src/remote/remote_protocol.x: definitions for the new entry points
* src/remote_protocol-structs: structure definitions
The APIs are used to set/get domain's network interface's parameters.
Currently supported parameters are bandwidth settings.
* include/libvirt/libvirt.h.in: new API and parameters definition
* python/generator.py: skip the Python API generation
* src/driver.h: add new entry to the driver structure
* src/libvirt_public.syms: export symbols
https://bugzilla.redhat.com/show_bug.cgi?id=770520
We had two nested loops both trying to use 'i' as the iteration
variable, which can result in an infinite loop when the inner
loop interferes with the outer loop. Introduced in commit 93ab585.
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Don't
reuse iteration variable across two loops.
This patch adds max_files option to qemu.conf which can be used to
override system default limit on number of opened files that are
allowed for qemu user.
Remove the requirement that DHCP messages have to be broadcasted.
DHCP requests are most often sent via broadcast but can be directed
towards a specific DHCP server. For example 'dhclient' takes '-s <server>'
as a command line parameter thus allowing DHCP requests to be sent to a
specific DHCP server.
Add logic to assign addresses for devices with spapr-vio addresses.
We also do validation of addresses specified by the user, ie. ensuring
that there are not duplicate addresses on the bus.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
For QEMU PPC64 we have a machine type ("pseries") which has a virtual
bus called "spapr-vio". We need to be able to create devices on this
bus, and as such need a way to specify the address for those devices.
This patch adds a new address type "spapr-vio", which achieves this.
The addressing is specified with a "reg" property in the address
definition. The reg is optional, if it is not specified QEMU will
auto-assign an address for the device.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Currently non-x86 guests must have <acpi/> defined in <features> to
prevent libvirt from running qemu with -no-acpi. Although it works, it
is a hack.
Instead add a capability flag which indicates whether qemu understands
the -no-acpi option. Use it to control whether libvirt emits -no-acpi.
Current versions of qemu always display -no-acpi in their help output,
so this patch has no effect. However the development version of qemu
has been modified such that -no-acpi is only displayed when it is
actually supported.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
The RPC code had several latent memory leaks and an attempt to
free the wrong string, but thankfully nothing triggered them
(blkiotune was the only one returning a string, and always as
the last parameter). Also, our cleanups for rpcgen ended up
nuking a line of code that renders VIR_TYPED_PARAM_INT broken,
because it was the only use of 'i' in a function, even though
it was a member usage rather than a standalone declaration.
* daemon/remote.c (remoteSerializeTypedParameters): Free the
correct array element.
(remoteDispatchDomainGetSchedulerParameters)
(remoteDispatchDomainGetSchedulerParametersFlags)
(remoteDispatchDomainBlockStatsFlags)
(remoteDispatchDomainGetMemoryParameters): Don't leak strings.
* src/rpc/genprotocol.pl: Don't nuke member-usage of 'buf' or 'i'.
The lifetime of the virDomainEventState object is tied to
the lifetime of the driver, which in stateless drivers is
tied to the lifetime of the virConnectPtr.
If we add & remove a timer when allocating/freeing the
virDomainEventState object, we can get a situation where
the timer still triggers once after virDomainEventState
has been freed. The timeout callback can't keep a ref
on the event state though, since that would be a circular
reference.
The trick is to only register the timer when a callback
is registered with the event state & remove the timer
when the callback is unregistered.
The demo for the bug is to run
while true ; do date ; ../tools/virsh -q -c test:///default 'shutdown test; undefine test; dominfo test' ; done
prior to this fix, it will frequently hang and / or
crash, or corrupt memory
Currently all drivers using domain events need to provide a callback
for handling a timer to dispatch events in a clean stack. There is
no technical reason for dispatch to go via driver specific code. It
could trivially be dispatched directly from the domain event code,
thus removing tedious boilerplate code from all drivers
Also fix the libxl & xen drivers to pass 'true' when creating the
virDomainEventState, since they run inside the daemon & thus always
expect events to be present.
* src/conf/domain_event.c, src/conf/domain_event.h: Internalize
dispatch of events from timer callback
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
src/qemu/qemu_domain.c, src/qemu/qemu_driver.c,
src/remote/remote_driver.c, src/test/test_driver.c,
src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c: Remove all timer dispatch functions
The virDomainEventCallbackList and virDomainEventQueue APIs are
now solely helpers used internally by virDomainEventState APIs.
Remove their decls from domain_event.h since no driver code should
need to use them any more.
* src/conf/domain_event.c: Make virDomainEventCallbackList and
virDomainEventQueue APIs static & remove some unused APIs
* src/conf/domain_event.h, src/libvirt_private.syms: Remove
virDomainEventCallbackList and virDomainEventQueue APIs
No caller of the domain events APIs should need to poke at the
struct internals. Thus they should all be removed from the
header file
* src/conf/domain_event.h: Remove struct definitions
* src/conf/domain_event.c: Add struct definitions
While virDomainEventState has APIs for managing removal of callbacks,
while locked, adding callbacks in the first place requires direct
access to the virDomainEventCallbackList structure. This is not
threadsafe since it is bypassing the virDomainEventState locks
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Add APIs for managing callbacks
via virDomainEventState.
When registering a callback for a particular event some callers
need to know how many callbacks already exist for that event.
While it is possible to ask for a count, this is not free from
race conditions when threaded. Thus the API for registering
callbacks should return the count of callbacks. Also rename
virDomainEventStateDeregisterAny to virDomainEventStateDeregisterID
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Return count of callbacks when
registering callbacks
* src/libxl/libxl_driver.c, src/libxl/libxl_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/remote/remote_driver.c, src/uml/uml_driver.c,
src/vbox/vbox_tmpl.c, src/xen/xen_driver.c: Update
for change in APIs
The Xen & VBox drivers deal with callbacks & dispatching of
events directly. All the other drivers use a timer to dispatch
events from a clean stack state, rather than deep inside the
drivers. Convert Xen & VBox over to virDomainEventState so
that they match behaviour of other drivers
* src/conf/domain_event.c: Return count of remaining
callbacks when unregistering event callback
* src/vbox/vbox_tmpl.c, src/xen/xen_driver.c,
src/xen/xen_driver.h: Convert to virDomainEventState
If only iptables rules are created then two unnecessary ebtables chains
are also created. This patch fixes this and prevents these chains from
being created. They have been cleaned up properly, though.
A generic error code was returned, if the user aborted a migration job.
This made it hard to distinguish between a user requested abort and an
error that might have occured. This patch introduces a new error code,
which is returned in the specific case of a user abort, while leaving
all other failures with their existing code. This makes it easier to
distinguish between failure while mirgrating and an user requested
abort.
* include/libvirt/virterror.h: - add new error code
* src/util/virterror.c: - add message for the new error code
* src/qemu/qemu_migration.h: - Emit operation aborted error instead of
operation failed, on migration abort
If managed save fails at the right point in time, then the save
image can end up with 0 bytes in length (no valid header), and
our attempts in commit 55d88def to detect and skip invalid save
files missed this case.
* src/qemu/qemu_driver.c (qemuDomainSaveImageOpen): Also unlink
empty file as corrupt. Reported by Dennis Householder.
Currently, on device detach, we parse given XML, find the device
in domain object, free it and try to restore security labels.
However, in some cases (e.g. usb hostdev) parsed XML contains
less information than freed device. In usb case it is bus & device
IDs. These are needed during label restoring as a symlink into
/dev/bus is generated from them. Therefore don't drop device
configuration until security labels are restored.
In commit 6f84e110 I mistakenly set default migration speed to
33554432 Mb! The units of migMaxBandwidth is Mb, with conversion
handled in qemuMonitor{JSON,Text}SetMigrationSpeed().
Also, remove definition of QEMU_DOMAIN_FILE_MIG_BANDWIDTH_MAX since
it is no longer used after reverting commit ef1065cf.
If an async job run on a domain will stop the domain at the end of the
job, a concurrently run query job can hang in qemu monitor and nothing
can be done with that domain from this point on. An attempt to start
such domain results in "Timed out during operation: cannot acquire state
change lock" error.
However, quite a few things have to happen at the right time... There
must be an async job running which stops a domain at the end. This race
was reported with dump --crash but other similar jobs, such as
(managed)save and migration, should be able to trigger this bug as well.
While this async job is processing its last monitor command, that is a
query-migrate to which qemu replies with status "completed", a new
libvirt API that results in a query job must arrive and stay waiting
until the query-migrate command finishes. Once query-migrate is done but
before the async job closes qemu monitor while stopping the domain, the
other thread needs to wake up and call qemuMonitorSend to send its
command to qemu. Before qemu gets a chance to respond to this command,
the async job needs to close the monitor. At this point, the query job
thread is waiting for a condition that no-one will ever signal so it
never finishes the job.
* src/qemu/qemu_hostdev.c (qemuDomainReAttachHostdevDevices):
pciDeviceListFree(pcidevs) in the end free()s the device even if
it's in use by other domain, which can cause a race.
How to reproduce:
<script>
virsh nodedev-dettach pci_0000_00_19_0
virsh start test
virsh attach-device test hostdev.xml
virsh start test2
for i in {1..5}; do
echo "[ -- ${i}th time --]"
virsh nodedev-reattach pci_0000_00_19_0
done
echo "clean up"
virsh destroy test
virsh nodedev-reattach pci_0000_00_19_0
</script>
Device pci_0000_00_19_0 dettached
Domain test started
Device attached successfully
error: Failed to start domain test2
error: Requested operation is not valid: PCI device 0000:00:19.0 is in use by domain test
[ -- 1th time --]
Device pci_0000_00_19_0 re-attached
[ -- 2th time --]
Device pci_0000_00_19_0 re-attached
[ -- 3th time --]
Device pci_0000_00_19_0 re-attached
[ -- 4th time --]
Device pci_0000_00_19_0 re-attached
[ -- 5th time --]
Device pci_0000_00_19_0 re-attached
clean up
Domain test destroyed
Device pci_0000_00_19_0 re-attached
The patch also fixes another problem, there won't be error like
"qemuDomainReAttachHostdevDevices: Not reattaching active
device 0000:00:19.0" in daemon log if some device is in active.
As pciResetDevice and pciReattachDevice won't be called for
the device anymore. This is sensible as we already reported
error when preparing the device if it's active. Blindly trying
to pciResetDevice & pciReattachDevice on the device and getting
an error is just redundant.
This patch fixes two problems:
1) The device will be reattached to host even if it's not
managed, as there is a "pciDeviceSetManaged".
2) The device won't be reattached to host with original
driver properly. As it doesn't honor the device original
properties which are maintained by driver->activePciHostdevs.
This chunk of code below repeated in several functions, factor it into
a helper method virDomainLiveConfigHelperMethod to eliminate duplicated code
based on Eric and Adam's suggestion. I have tested it for all the
relevant APIs changed.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
If the vol object is newly created, it increases the volumes count,
but doesn't decrease the volumes count when do cleanup. It can
cause libvirtd to crash when one trying to free the volume objects
like:
for (i = 0; i < pool->volumes.count; i++)
virStorageVolDefFree(pool->volumes.objs[i]);
It's more reliable if we add the newly created vol object in the
end.
When destroying a domain qemuDomainDestroy kills its qemu process and
starts a new job, which means it unlocks the domain object and locks it
again after some time. Although the object is usually unlocked for a
pretty short time, chances are another thread processing an EOF event on
qemu monitor is able to lock the object first and does all the cleanup
by itself. This leads to wrong shutoff reason and lifecycle event detail
and virDomainDestroy API incorrectly reporting failure to destroy an
inactive domain.
Reported by Charlie Smurthwaite.
Current "-ay | -an" has problems on pool starting/refreshing if
the volumes are clustered. Rommer has posted a patch to list 2
months ago.
https://www.redhat.com/archives/libvir-list/2011-October/msg01116.html
But IMO we shouldn't skip the inactived vols. So this is a squashed
patch by Rommer.
Signed-off-by: Rommer <rommer@active.by>
Network disks don't have paths to be resolved or files to be checked
for ownership. ee3efc41e6 checked this
for some image label functions, but was partially reverted in a
refactor. This finishes adding the check to each security driver's
set and restore label methods for images.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
This patch addresses https://bugzilla.redhat.com/show_bug.cgi?id=760442
When a network has any forward type other than route, nat or none, the
network configuration should be done completely external to libvirt -
libvirt only uses these types to allow configuring guests in a manner
that isn't tied to a specific host (all the host-specific information,
in particular interface names, port profile data, and bandwidth
configuration is in the network definition, and the guest
configuration only references it).
Due to a bug in the bridge network driver, libvirt was adding iptables
rules for networks with forward type='bridge' etc. any time libvirtd
was restarted while one of these networks was active.
This patch eliminates that error by only "reloading" iptables rules if
forward type is route, nat, or none.
Currently qemuDomainAssignPCIAddresses() is called to assign addresses
to PCI devices.
We need to do something similar for devices with spapr-vio addresses.
So create one place where address assignment will be done, that is
qemuDomainAssignAddresses().
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
For the PPC64 pseries machine type we need to add address information
for the spapr-vty device.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
In QEMU PPC64 we have a network device called "spapr-vlan". We can specify
this using the existing syntax for network devices, however libvirt
currently rejects "spapr-vlan" in virDomainNetDefParseXML() because of
the "-". Fix the code to accept "-".
* src/conf/domain_conf.c (virDomainNetDefParseXML): Allow '-' in
model name, and be more efficient.
* docs/schemas/domaincommon.rng: Limit valid model names to match code.
Based on a patch by Michael Ellerman.
When parsing ppc64 models on an x86 host an out-of-memory error message is displayed due
to it checking for retcpus being NULL. Fix this by removing the check whether retcpus is NULL
since we will realloc into this variable.
Also in the X86 model parser display the OOM error at the location where it happens.
Fix memory leak:
==27534== 24 bytes in 1 blocks are definitely lost in loss record 207 of 530
==27534== at 0x4A05E46: malloc (vg_replace_malloc.c:195)
==27534== by 0x38EC26EC37: vasprintf (in /lib64/libc-2.13.so)
==27534== by 0x4E998E6: virVasprintf (util.c:1677)
==27534== by 0x4E999F1: virAsprintf (util.c:1695)
==27534== by 0x4F1EAAC: nodeGetInfo (nodeinfo.c:593)
==27534== by 0x47948F: qemuCapsInitCPU (qemu_capabilities.c:855)
==27534== by 0x4796B1: qemuCapsInit (qemu_capabilities.c:915)
==27534== by 0x456550: qemuCreateCapabilities (qemu_driver.c:245)
==27534== by 0x4578C4: qemudStartup (qemu_driver.c:580)
==27534== by 0x4F20886: virStateInitialize (libvirt.c:852)
==27534== by 0x420E55: daemonRunStateInit (libvirtd.c:1156)
==27534== by 0x4E94C56: virThreadHelper (threads-pthread.c:157)
Mark this leaked variable as const char * when it is passed into another
function.
Pool creates new workers dynamically. However, it is possible
for a pool to have no workers. If we want to free that pool,
we don't want to wait on quit condition as it will never be
signaled.
Add support for newly supported Intel cpu features. Newly supported
flags are: pclmuldq, dtes64, smx, fma, pdcm, movbe, xsave, osxsave and
avx. This adds support for Intel's Sandy Bridge platform.
A preparatory patch for DHCP snooping where we want to be able to
differentiate between a VM's interface using the tuple of
<VM UUID, Interface MAC address>. We assume that MAC addresses could
possibly be re-used between different networks (VLANs) thus do not only
want to rely on the MAC address to identify an interface.
At the current 'final destination' in virNWFilterInstantiate I am leaving
the vmuuid parameter as ATTRIBUTE_UNUSED until the DHCP snooping patches arrive.
(we may not post the DHCP snooping patches for 0.9.9, though)
Mostly this is a pretty trivial patch. On the lowest layers, in lxc_driver
and uml_conf, I am passing the virDomainDefPtr around until I am passing
only the VM's uuid into the NWFilter calls.
This patch cleans up return codes in the nwfilter subsystem.
Some functions in nwfilter_conf.c (validators and formatters) are
keeping their bool return for now and I am converting their return
code to true/false.
All other functions now have failure return codes of -1 and success
of 0.
[I searched for all occurences of ' 1;' and checked all 'if ' and
adapted where needed. After that I did a grep for 'NWFilter' in the source
tree.]
In some error situations, the function testDomainRestoreFlags() could
unlock the test driver mutex without first locking it. This patch
moves the lock operation earlier, so that it occurs before any
potential jump down to the unlock call.
I found this problem while auditing the test driver lock usage to
determine the cause of a hang while running the following test:
cd tests; while true; do printf x; ./undefine; done
This patch *does not* solve that problem, but we now understand its
actual source, and danpb is working on a patch.
assumptions from generic code.
This implements the minimal set of changes needed in libvirt to launch a
PowerPC-KVM based guest.
It removes x86-specific assumptions about choice of serial driver backend
from generic qemu guest commandline generation code.
It also restricts the ACPI capability to be available for an x86 or
x86_64 domain.
This is not a complete solution -- it still does not guarantee libvirt
the capability to flag non-supported options in guest XML. (Eg, an ACPI
specification in a PowerPC guest XML will still get processed, even
though qemu-system-ppc64 does not support it while qemu-system-x86_64 does.)
This drawback exists because libvirt falls back on qemu to query supported
features, and qemu '-h' blindly lists all capabilities -- irrespective
of whether they are available while emulating a given architecture or not.
The long-term solution would be for qemu to list out capabilities based
on architecture and platform -- so that libvirt can cleanly make out what
devices are supported on an arch (say 'ppc64') and platform (say, 'mac99').
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
This enables libvirt to select the correct qemu binary (qemu-system-ppc64)
for a guest vm based on arch 'ppc64'.
Also, libvirt is enabled to correctly parse the list of supported PowerPC
CPUs, generated by running 'qemu-system-ppc64 -cpu ?'
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Acked-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
/proc/cpuinfo
Libvirt at present depends on /proc/cpuinfo to gather host
details such as CPUs, cores, threads, etc. This is an architecture-
dependent approach. An alternative is to use 'Sysfs', which provides
a platform-agnostic interface to parse host CPU topology.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
On RHEL 5, with libxml2-2.6.26, the build failed with:
virsh.c: In function 'vshNodeIsSuperset':
virsh.c:11951: warning: implicit declaration of function 'xmlChildElementCount'
(or if warnings aren't errors, a link failure later on).
* src/util/xml.h (virXMLChildElementCount): New prototype.
* src/util/xml.c (virXMLChildElementCount): New function.
* src/libvirt_private.syms (xml.h): Export it.
* tools/virsh.c (vshNodeIsSuperset): Use it.
When one thread passes the buck to another thread, it uses
virCondSignal to wake up the target thread. The variable
'haveTheBuck' is not updated in a race-free manner when
this occurs. The current thread sets it to false, and the
woken up thread sets it to true. There is a window where
a 3rd thread can come in and grab the buck.
Even if this didn't lead to crashes & deadlocks, this would
still result in unfairness in the buckpassing algorithm.
A better solution is to *never* set haveTheBuck to false
when we're passing the buck. Only set it to false when there
is no further thread waiting for the buck.
* src/rpc/virnetclient.c: Only set haveTheBuck to false
if no thread is waiting
Commit fd06692544 tried to fix
a race condition in
commit fa9595003d
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Fri Nov 11 15:28:41 2011 +0000
Explicitly track whether the buck is held in remote client
Unfortunately there is a second race condition whereby the
event loop can trigger due to incoming data to read. Revert
this fix, so a complete fix for the problem can be cleanly
applied
* src/rpc/virnetclient.c: Revert fd06692544
With security_driver set to "none" in /etc/libvirt/qemu.conf,
libvirtd would crash when attempted to attach to an existing
qemu process. Only copy the security model if it actually exists.
During virDomainDestroy, QEMU may emit SHUTDOWN event as a response to
SIGTERM and since domain object is still locked, the event is processed
after the domain is destroyed. We need to ignore this event in such case
to avoid changing domain state from shutoff to shutdown.
This patch is to expose the fabric_name of fc_host class, which
might be useful for users who wants to known which fabric the
(v)HBA connects to.
The patch also adds the missed capabilities' XML schema of scsi_host,
(of course, with fabric_wwn added), and update the documents
(docs/formatnode.html.in)
Currently if you try to connect to a local libvirtd when
libvirtd is not in $PATH, you'll get an error
error: internal error invalid use of command API
This is because remoteFindDaemonPath() returns NULL, which
causes us to pass NULL into virNetSocketConnectUNIX which
in turn causes us to pass NULL into virCommandNewArgList.
Adding missing error checks improves this to
error: internal error Unable to locate libvirtd daemon in $PATH
* src/remote/remote_driver.c: Report error if libvirtd
cannot be found
* src/rpc/virnetsocket.c: Report error if caller requested
spawning of daemon, but provided no binary path
The Mingw32 linker highlighted that the symbols for virtime.h
declared in libvirt_private.syms were incorrect
* src/libvirt_private.syms: Fix virtime.h symbols
When QEMU guest finishes its shutdown sequence, qemu stops virtual CPUs
and when started with -no-shutdown waits for us to kill it using
SGITERM. Since QEMU is flushing its internal buffers, some time may pass
before QEMU actually dies. We mistakenly used "paused" state (and
events) for this which is quite confusing since users may see a domain
going to pause while they expect it to shutdown. Since we already have
"shutdown" state with "the domain is being shut down" semantics, we
should use it for this state.
However, the state didn't have a corresponding event so I created one
and called its detail as VIR_DOMAIN_EVENT_SHUTDOWN_FINISHED (guest OS
finished its shutdown sequence) with the intent to add
VIR_DOMAIN_EVENT_SHUTDOWN_STARTED in the future if we have a
sufficiently capable guest agent that can notify us when guest OS starts
to shutdown.
Otherwise connections to older libvirt abort with:
$ virsh -c qemu+ssh://host.example.com/system list
error: invalid connection pointer in virDrvSupportsFeature
error: failed to connect to the hypervisor
Tested against 0.8.3 and 0.9.8-rc2.
https://bugzilla.redhat.com/show_bug.cgi?id=648855 mentioned a
misuse of 'an' where 'a' is proper; that has since been fixed,
but a search found other problems (some were a spelling error for
'and', while most were fixed by 'a').
* daemon/stream.c: Fix grammar.
* src/conf/domain_conf.c: Likewise.
* src/conf/domain_event.c: Likewise.
* src/esx/esx_driver.c: Likewise.
* src/esx/esx_vi.c: Likewise.
* src/rpc/virnetclient.c: Likewise.
* src/rpc/virnetserverprogram.c: Likewise.
* src/storage/storage_backend_fs.c: Likewise.
* src/util/conf.c: Likewise.
* src/util/dnsmasq.c: Likewise.
* src/util/iptables.c: Likewise.
* src/xen/xen_hypervisor.c: Likewise.
* src/xen/xend_internal.c: Likewise.
* src/xen/xs_internal.c: Likewise.
* tools/virsh.c: Likewise.
virBufferContentAndReset (intentionally) returns NULL for a buffer
with no content, but it is feasible to invoke a command with an
explicit empty string.
* src/util/command.c (virCommandAddEnvBuffer): Reject empty string.
(virCommandAddArgBuffer): Allow explicit empty argument.
* tests/commandtest.c (test9): Test it.
* tests/commanddata/test9.log: Adjust.
The RPC fixups needed on Linux are also needed on cygwin, and
worked without further tweaking to the list of fixups. Also,
unlike BSD, Cygwin exports 'struct ifreq', but unlike Linux,
Cygwin lacks the ioctls that we were using 'struct ifreq' to
access. This patch allows compilation under cygwin.
* src/rpc/genprotocol.pl: Also perform fixups on cygwin.
* src/util/virnetdev.c (HAVE_STRUCT_IFREQ): Also require AF_PACKET
definition.
* src/util/virnetdevbridge.c (virNetDevSetupControlFull): Only
compile if SIOCBRADDBR works.
The pathname for the pipe for tunnelled migration is unresolvable. The
libvirt apparmor driver therefore refuses access, causing migration to
fail. If we can't resolve the path, the worst that can happen is that
we should have given permission to the file but didn't. Otherwise
(especially since this is a /proc/$$/fd/N file) the file is already open
and libvirt won't be refused access by apparmor anyway.
Also adjust virt-aa-helper to allow access to the
*.tunnelmigrate.dest.name files.
For more information, see https://launchpad.net/bugs/869553.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Originaly, the code checked if another client is the queue and infered
ownership of the buck from that. Commit fa9595003d
added a separate variable to track the buck. That caused, that a new
call might enter claiming it has the buck, while another thread was
signalled to take the buck. This ends in two threads claiming they hold
the buck and entering poll(). This happens due to a race on waking up
threads on the client lock mutex.
This caused multi-threaded clients to hang, most prominently visible and
reproducible on python based clients, like virt-manager.
This patch causes threads, that have been signalled to take the buck to
re-check if buck is held by another thread.
This ought to fix the build if you have net/if.h but do
not have struct ifreq
* configure.ac: Check for struct ifreq in net/if.h
* src/util/virnetdev.c: Conditionalize to avoid use of
struct ifreq if it does not exist
probes.h can only be generated on Linux, and then only with dtrace
installed. If it is part of the tarball, then either 'make dist'
will fail if you don't have that setup, or we would have to start
keeping probes.h in libvirt.git. Since we only need it to be
generated when dtrace is in use, it's better to avoid shipping
it in the first place, and avoid tracking it in git.
Meanwhile, there is a build dependency - since the RPC code is
generated, it can be built early; but when dtrace is enabled, we
must ensure probes.h is built even earlier. Commit 1afcfbdd tried
to fix this, but did so in a way that added probes.h into the
tarball, and broke VPATH as well. Commit ecbca767 fixed VPATH,
but didn't fix the more fundamental problem. This patch solves
the issue by adding a dependency instead.
Tested with 'make dist' in a clean VPATH builds, for both
'./configure --without-dtrace' and './configure --with-dtrace';
all configurations were able to correctly build a tarball, and
the dtrace configuration no longer sticks probes.h in the tarball.
* src/Makefile.am (REMOTE_DRIVER_GENERATED): Don't ship probes.h;
rather, make it a dependency.
Fix a logic error, the initial value of ret = -1, if just set --config,
it will goto endjob directly without doing its really job here.
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
The glibc time.h header has an undocumented __isleap macro
that we are using. Since it is undocumented & does not appear
on any other OS, stop using it and just define the macro in
libvirt code instead.
* src/util/virtime.c: Remove __isleap usage
Only one IPv4 DHCP definition is supported. Originally the code checked
for a multiple definition and returned an error, but the new domain
definition was already added to networks. This patch moves the check
before the newly defined network is added to active networks.
*src/network/bridge_driver.c: networkDefine(): - move multiple IPv4
addresses check before
definition is used.
Detected by Coverity. Leak introduced in commit c1df2c1.
Two bugs here:
1. memory leak on successful parse
2. failure to parse still returned success
Signed-off-by: Alex Jia <ajia@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Detected by Coverity. Leak introduced in commit 8866eed.
Two bugs here:
1. logfd wasn't closed on all return paths
2. if we failed to mark a domain autodestroy, then the domain
was not made transient but we still returned success
Signed-off-by: Alex Jia <ajia@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Detected by Coverity. Leak introduced in commit 673adba.
Two separate bugs here:
1. call was not freed on all error paths
2. virCondDestroy was called even if virCondInit failed
Signed-off-by: Alex Jia <ajia@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
To add support for running libvirt on PowerPC, a CPU driver for the
PowerPC platform must be added.
Most generic cpu driver routines such as CPU compare, decode, etc
are based on CPUID comparison and are not relevant for non-x86
platforms.
Here, we introduce stubs for relevant PowerPC routines invoked by libvirt.
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@au.ibm.com>
filter 0-device-weight when:
- getting blkio parameters with --config
- starting up a domain
When testing with blkio, I found these issues:
(dom is down)
virsh blkiotune dom --device-weights /dev/sda,300,/dev/sdb,500
virsh blkiotune dom --device-weights /dev/sda,300,/dev/sdb,0
virsh blkiotune dom
weight : 800
device_weight : /dev/sda,200,/dev/sdb,0
# issue 1: shows 0 device weight of /dev/sdb that may confuse user
(continued)
virsh start dom
# issue 2: If /dev/sdb doesn't exist, libvirt refuses to bring the
# dom up because it wants to set the device weight to 0 of a
# non-existing device. Since 0 means no weight-limit, we really don't
# have to set it.
Prior to this patch, for a running dom, the commands:
$ virsh blkiotune dom --device-weights /dev/sda,502,/dev/sdb,498
$ virsh blkiotune dom --device-weights /dev/sda,503
$ virsh blkiotune dom
weight : 500
device_weight : /dev/sda,503
claim that /dev/sdb no longer has a non-default weight, but
directly querying cgroups says otherwise:
$ cat /cgroup/blkio/libvirt/qemu/dom/blkio.weight_device
8:0 503
8:16 498
After this patch, an explicit 0 is required to remove a device path
from the XML, and omitting a device path that was previously
specified leaves that device path untouched in the XML, to match
cgroups behavior.
* src/qemu/qemu_driver.c (parseBlkioWeightDeviceStr): Rename...
(qemuDomainParseDeviceWeightStr): ...and use correct type.
(qemuDomainSetBlkioParameters): After parsing string, modify
rather than replacing existing table.
* tools/virsh.pod (blkiotune): Tweak wording.
The next patch will make it possible to have virDomainSetBlkioParameters
leave device weights unchanged if they are not mentioned in the incoming
string, but this only works if the list of block weights does not allow
duplicate paths. Technically, a user can still confuse libvirt by
passing alternate spellings that resolve to the same device, but it
is not worth worrying about working around that kind of abuse.
* src/conf/domain_conf.c (virDomainDefParseXML): Require unique
paths.
Implement the block I/O throttle setting and getting support to qemu
driver.
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Enable block I/O throttle for per-disk in XML, as the first
per-disk IO tuning parameter.
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Support Block I/O Throttle setting and query to remote driver.
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The virTimestamp and virTimeMs functions in src/util/util.h
duplicate functionality from virtime.h, in a non-async signal
safe manner. Remove them, and convert all code over to the new
APIs.
* src/util/util.c, src/util/util.h: Delete virTimeMs and virTimestamp
* src/lxc/lxc_driver.c, src/qemu/qemu_domain.c,
src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c, src/util/event_poll.c: Convert to use
virtime APIs
The logging APIs need to be able to generate formatted timestamps
using only async signal safe functions. This rules out using
gmtime/localtime/malloc/gettimeday(!) and much more.
Introduce a new internal API which is async signal safe.
virTimeMillisNowRaw replacement for gettimeofday. Uses clock_gettime
where available, otherwise falls back to the unsafe
gettimeofday
virTimeFieldsNowRaw replacements for gmtime(), convert a timestamp
virTimeFieldsThenRaw into a broken out set of fields. No localtime()
replacement is provided, because converting to
local time is not practical with only async signal
safe APIs.
virTimeStringNowRaw replacements for strftime() which print a timestamp
virTimeStringThenRaw into a string, using a pre-determined format, with
a fixed size buffer (VIR_TIME_STRING_BUFLEN)
For each of these there is also a version without the Raw postfix
which raises a full libvirt error. These versions are not async
signal safe
* src/Makefile.am, src/util/virtime.c, src/util/virtime.h: New files
* src/libvirt_private.syms: New APis
* configure.ac: Check for clock_gettime in -lrt
* tests/virtimetest.c, tests/Makefile.am: Test new APIs
Fix the build on Mingw32 by removing the now obsolete
virGetPMCapabilities symbol from the private exports file
* src/libvirt_private.syms: Remove virGetPMCapabilities
If suspend failed for some reason (e.g. too short duration) then
subsequent attempts to trigger suspend were rejected because we
had already marked a suspend as being in progress
* src/util/virnodesuspend.c: Don't mark suspend as active
until we've successfully triggered it
The command name for the suspend action does not need to be
strdup'd. The constant string can be used directly. This
also means the code can be trivially rearranged to make the
switch clearer
* src/util/virnodesuspend.c: Remove strdup of cmdString
To avoid probing the host power management features on any
call to virInitialize, only initialize the mutex in
virNodeSuspendInit. Do lazy load of the supported PM target
mask when it is actually needed
* src/util/virnodesuspend.c: Lazy init of supported features
If we ensure that virNodeSuspendGetTargetMask always resets
*bitmask to zero upon failure, there is no need for the
powerMgmt_valid field.
* src/util/virnodesuspend.c: Ensure *bitmask is zero upon
failure
* src/conf/capabilities.c, src/conf/capabilities.h: Remove
powerMgmt_valid field
* src/qemu/qemu_capabilities.c: Remove powerMgmt_valid
The node suspend capabilities APIs should not have been put into
util.[ch]. Instead move them into virnodesuspend.[ch]
* src/util/util.c, src/util/util.h: Remove suspend capabilities APIs
* src/util/virnodesuspend.c, src/util/virnodesuspend.h: Add
suspend capabilities APIs
* src/qemu/qemu_capabilities.c: Include virnodesuspend.h
Rename virGetPMCapabilities to virNodeSuspendGetTargetMask and
virDiscoverHostPMFeature to virNodeSuspendSupportsTarget.
* src/util/util.c, src/util/util.h: Rename APIs
* src/qemu/qemu_capabilities.c, src/util/virnodesuspend.c: Adjust
for new names
Since virDiscoverHostPMFeature is just checking one feature,
there is no reason for it to return a bitmask. Change it to
return a boolean
* src/util/util.c, src/util/util.h: Make virDiscoverHostPMFeature
return a boolean
The virHostPMCapability enum helper was declared in util.h
but implemented in capabilities.c, which is in a completely
separate library at link time. Move the declaration into the
capabilities.c file and rename it to match normal conventions
* src/util/util.h: Remove virHostPMCapability enum decl
* src/conf/capabilities.c: Add virCapsHostPMTarget enum
The capabilities XML uses the x86 specific terms 'S3', 'S4'
and 'Hybrid-Syspend'. Switch it to use the same terminology
as the API constants and virsh options, eg 'suspend_mem'
'suspend_disk' and 'suspend_hybrid'
* docs/formatcaps.html.in, docs/schemas/capability.rng,
src/conf/capabilities.c: Rename suspend constants
The internal virHostPMCapability enum just duplicates the
public virNodeSuspendTarget enum, but with different names.
* src/util/util.c: Use VIR_NODE_SUSPEND_TARGET constants
* src/util/util.h: Remove virHostPMCapability enum
* src/conf/capabilities.c: Use VIR_NODE_SUSPEND_TARGET_LAST
The VIR_NODE_SUSPEND_TARGET constants are not flags, so they
should just be assigned straightforward incrementing values.
* include/libvirt/libvirt.h.in: Change VIR_NODE_SUSPEND_TARGET
values
* src/util/virnodesuspend.c: Fix suspend target checks
Detected by Coverity. the only case is caller passes a NULL to 'format' variable,
then taking 'if (format)' false branch, the function qcow2GetBackingStoreFormat
will directly dereferences the NULL 'format' pointer variable.
Signed-off-by: Alex Jia <ajia@redhat.com>
This patch add new pulic API virDomainSetBlockIoTune and
virDomainGetBlockIoTune.
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
This adds per-device weights to <blkiotune>. Note that the
cgroups implementation only supports weights per block device,
and not per-file within the device; hence this option must be
global to the domain definition rather than tied to individual
<devices>/<disk> entries:
<domain ...>
<blkiotune>
<device>
<path>/path/to/block</path>
<weight>1000</weight>
</device>
</blkiotune>
..
This patch also adds a parameter --device-weights to virsh command
blkiotune for setting/getting blkiotune.weight_device for any
hypervisor that supports it. All <device> entries under
<blkiotune> are concatenated into a single string attribute under
virDomain{Get,Set}BlkioParameters, named "device_weight".
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Without this, 'virsh blkiotune --live --config --weight=n'
only affected live.
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Allow
setting both configurations at once.
After the previous patch, there are now some redundant checks.
* src/qemu/qemu_driver.c (qemudDomainGetVcpuPinInfo)
(qemuGetSchedulerParametersFlags): Drop checks now guaranteed by
libvirt.c.
* src/lxc/lxc_driver.c (lxcGetSchedulerParametersFlags):
Likewise.
Drivers were inconsistent when presented both --live and --config
at once. For example, within qemu, getting memory parameters
favored live, getting blkio tuning favored config, and getting
scheduler parameters errored out. Also, some, but not all,
attempts to mix flags on query were filtered at the virsh level.
We shouldn't have to duplicate efforts in every client app, nor
in every driver. So, it is simpler to just enforce that the two
flags cannot both be used at once on query operations, which has
precedent in libvirt.c, and which matches the documentation of
virDomainModificationImpact.
* src/libvirt.c (virDomainGetMemoryParameters)
(virDomainGetBlkioParameters)
(virDomainGetSchedulerParametersFlags, virDomainGetVcpuPinInfo):
Borrow sanity checking from virDomainGetVcpusFlags.
It requires the domain is running, otherwise fails. Resize to a lower
size is supported, but should be used with extreme caution.
In order to prohibit the "size" overflowing after multiplied by
1024. We do checking in the codes. For QMP mode, the default units
is Bytes, the passed size needs to be multiplied by 1024, however,
for HMP mode, the default units is "Megabytes", the passed "size"
needs to be divided by 1024 then.
Implements functions for both HMP and QMP mode.
For HMP mode, qemu uses "M" as the units by default, so the passed "sized"
is divided by 1024.
For QMP mode, qemu uses "Bytes" as the units by default, the passed "sized"
is multiplied by 1024.
All of the monitor functions return -1 on failure, 0 on success, or -2 if
not supported.
The new API is named as "virDomainBlockResize", intending to add
support for qemu monitor command "block_resize" (both HMP and QMP).
Similar with APIs like "virDomainSetMemoryFlags", the units for
argument "size" is kilobytes.
Add the core functions that implement the functionality of the API.
Suspend is done by using an asynchronous mechanism so that we can return
the status to the caller before the host gets suspended. This asynchronous
operation is achieved by suspending the host in a separate thread of
execution. However, returning the status to the caller is only best-effort,
but not guaranteed.
To resume the host, an RTC alarm is set up (based on how long we want to
suspend) before suspending the host. When this alarm fires, the host
gets woken up.
Suspend-to-RAM operation on a host running Linux can take upto more than 20
seconds, depending on the load of the system. (Freezing of tasks, an operation
preceding any suspend operation, is given up after a 20 second timeout).
And Suspend-to-Disk can take even more time, considering the time required
for compaction, creating the memory image and writing it to disk etc.
So, we do not allow the user to specify a suspend duration of less than 60
seconds, to be on the safer side, since we don't want to prematurely declare
failure when we only had to wait for some more time.
Some systems support a feature known as 'Hybrid-Suspend', apart from the
usual system-wide sleep states such as Suspend-to-RAM (S3) or Suspend-to-Disk
(S4). Add the functionality to discover this power management feature and
export it in the capabilities XML under the <power_management> tag.
When another thread was dispatching while we wanted to send a
non-blocking call, we correctly queued the call and woke up the thread
but the thread just threw the call away since it forgot to recheck if
its socket was writable.
When spawning an ssh connection, the environment variables
DISPLAY, SSH_ASKPASS, ... are passed. However XAUTHORITY,
which is necessary if the .Xauthority is in a non default
place, was not passed.
Signed-off-by: Christian Franke <nobody@nowhere.ws>
virt-xml-validate fails when run on a domain XML file of type 'vbox'.
For failing test case, see https://bugzilla.redhat.com/show_bug.cgi?id=757097
This patch updates the XML schema to accept all valid hypervisor
types, as well as dropping hypervisor types that are not in use
by the current code base.
Signed-off-by: Eric Blake <eblake@redhat.com>
To make lxcSetContainerResources smaller, pull the mem tune
and device ACL setup code out into separate methods
* src/lxc/lxc_controller.c: Introduce lxcSetContainerMemTune
and lxcSetContainerDeviceACL
While LXC does not have the concept of VCPUS, so we can't do
per-VCPU pCPU placement, we can support the VM level CPU
placement. Todo this simply set the CPU affinity of the LXC
controller at startup. All child processes will inherit this
affinity.
* src/lxc/lxc_controller.c: Set process affinity
This partly reverts my previous patch f88de3eb. We need to
get file status after open, as given path could have been symlink,
so fstat() will operate on different file than lstat().
When aligning you need to clear the bits in the mask and leave the
others aside. Likely this code has never run, and will never run.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
One of my latest patches 2e37bf42d2
copy serial console definition. On domain shutdown we save this
info into state XML. However, later on the daemon start we simply
drop this info and since we are not re-reading qemu log,
vm->def->consoles[0] does not get populated with copy. Therefore
we need to avoid dropping console definition if it is just alias
for serial console.
If a connection to destination host is lost during peer-to-peer
migration (because keepalive protocol timed out), we won't be able to
finish the migration and it doesn't make sense to wait for qemu to
transmit all data. This patch automatically cancels such migration
without waiting for virDomainAbortJob to be called.
This API can be used to check if the socket associated with
virConnectPtr is still open or it was closed (probably because keepalive
protocol timed out). If there the connection is local (i.e., no socket
is associated with the connection, it is trivially always alive.
virConnectSetKeepAlive public API can be used by a client connecting to
remote server to start using keepalive protocol. The API is handled
directly by remote driver and not transmitted over the wire to the
server.
The keepalive program has two procedures: PING, and PONG.
Both are used only in asynchronous messages and the sender doesn't wait
for any reply. However, the party which receives PING messages is
supposed to react by sending PONG message the other party, but no
explicit binding between PING and PONG messages is made. For backward
compatibility neither server nor client are allowed to send keepalive
messages before checking that remote party supports them.
When virNetClientIOEventLoop is called for a non-blocking call and not
even a single byte can be sent from this call without blocking, we
properly reported that to the caller which properly frees the call. But
we never removed the call from a call queue.
If something fails while initializing qemu job object in
qemuDomainObjPrivateAlloc(), memory to the private pointer is freed, but
after that, the pointer is still dereferenced, which may result in a
segfault.
* qemuDomainObjPrivateAlloc() - Don't dereference NULL pointer.
Generally, functions which return malloc'd strings should be typed
as 'char *', not 'const char *', to make it obvious that the caller
is responsible to free things. free(const char *) fails to compile,
and although we have a cast embedded in VIR_FREE to work around poor
code that frees const char *, it's better to not rely on that hack.
* src/qemu/qemu_driver.c (qemuDiskPathToAlias): Change return type.
(qemuDomainBlockJobImpl): Update caller.
Given that we can now handle the target's disk shorthand, in addition
to an absolute path to the file or block device used on the host,
the term 'disk' fits a bit better as the parameter name than 'path'.
* include/libvirt/libvirt.h.in: Update some parameter names.
* src/libvirt.c (virDomainBlockStats, virDomainBlockStatsFlags)
(virDomainBlockPeek, virDomainGetBlockInfo, virDomainBlockJobAbort)
(virDomainGetBlockJobInfo, virDomainBlockJobSetSpeed)
(virDomainBlockPull): Likewise.
Commit 89b6284f made it possible to pass either a source name or
the target device to most API demanding a disk designation, but
forgot to update the documentation. It also failed to update
virDomainBlockStats to take both forms. This patch fixes both the
documentation and the remaining function.
Xen continues to use just device shorthand (that is, I did not
implement path lookup there, since xen does not track a domain_conf
to quickly tie a path back to the device shorthand).
* src/libvirt.c (virDomainBlockStats, virDomainBlockStatsFlags)
(virDomainGetBlockInfo, virDomainBlockPeek)
(virDomainBlockJobAbort, virDomainGetBlockJobInfo)
(virDomainBlockJobSetSpeed, virDomainBlockPull): Document
acceptable disk naming conventions.
* src/qemu/qemu_driver.c (qemuDomainBlockStats)
(qemuDomainBlockStatsFlags): Allow lookup by source name.
* src/test/test_driver.c (testDomainBlockStats): Likewise.
The WITH_VIRTUALPORT macro is defined to 0 when disabled, not
left undefined. So #if must be used instead of #ifdef
* src/util/virnetdevvportprofile.c: s/#ifdef/#if/
In preparation of DHCP Snooping and the detection of multiple IP
addresses per interface:
The hash table that is used to collect the detected IP address of an
interface can so far only handle one IP address per interface. With
this patch we extend this to allow it to handle a list of IP addresses.
Above changes the returned variable type of virNWFilterGetIpAddrForIfname()
from char * to virNWFilterVarValuePtr; adapt all existing functions calling
this function.
Signed-off-by: Eli Qiao <taget@linux.vnet.ibm.com>
When configuring a URI alias like this in 'libvirt.conf':
uri_aliases = [
"jj#j=qemu+ssh://root@127.0.0.1/system",
"sleet=qemu+ssh://root@sleet.cloud.example.com/system",
]
virsh -c jj#j
It will show this error message:
'no connection driver available for No connection for URI jj#j'
Actually,we expect this message below:
Malformed 'uri_aliases' config entry 'jj#j=qemu+ssh://root@127.0.0.1/system', aliases may only contain 'a-Z, 0-9, _, -'
Give this patch to fix this error.
This patch adds support for filtering of STP (spanning tree protocol) traffic
to the parser and makes us of the ebtables support for STP filtering. This code
now enables the filtering of traffic in chains with prefix 'stp'.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
With hunks borrowed from one of David Steven's previous patches, we now
add the capability of having a 'mac' chain which is useful to filter
for multiple valid MAC addresses.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
virStorageBackendLogicalDeleteVol() could not remove the lv with error
"could not remove open logical volume" sometimes. Generally it's caused
by the volume is still active, even if lvremove tries to remove it with
option "--force".
This patch is to fix it by disbale the lv first using "lvchange -aln"
and "lvremove -f" afterwards if the direct "lvremove -f" failed.
This patch exports KVM Host Power Management capabilities as XML so that
higher-level systems management software can make use of these features
available in the host.
The script "pm-is-supported" (from pm-utils package) is run to discover if
Suspend-to-RAM (S3) or Suspend-to-Disk (S4) is supported by the host.
If either of them are supported, then a new tag "<power_management>" is
introduced in the XML under the <host> tag.
However in case the query to check for power management features succeeded,
but the host does not support any such feature, then the XML will contain
an empty <power_management/> tag. In the event that the PM query itself
failed, the XML will not contain any "power_management" tag.
To use this, new APIs could be implemented in libvirt to exploit power
management features such as S3/S4.
None of the callers cared if str was updated to point to the next
byte after the parsed cpuset; simplifying this results in quite
a few code simplifications. Additionally, virCPUDefParseXML was
strdup()'ing a malloc()'d string; avoiding a memory copy resulted
in less code.
* src/conf/domain_conf.h (virDomainCpuSetParse): Alter signature.
* src/conf/domain_conf.c (virDomainCpuSetParse): Don't modify str.
(virDomainVcpuPinDefParseXML, virDomainDefParseXML): Adjust
callers.
* src/conf/cpu_conf.c (virCPUDefParseXML): Likewise.
* src/xen/xend_internal.c (sexpr_to_xend_topology): Likewise.
* src/xen/xm_internal.c (xenXMDomainPinVcpu): Likewise.
* src/xenxs/xen_sxpr.c (xenParseSxpr): Likewise.
* src/xenxs/xen_xm.c (xenParseXM): Likewise.
For direct attach devices, in qemuBuildCommandLine, we seem to be freeing
actual device on error path (with networkReleaseActualDevice). But the actual
device is not deleted.
qemuProcessStop eventually deletes the direct attach device and releases
actual device. But by the time qemuProcessStop is called qemuBuildCommandLine
has already freed actual device, leaving stray macvtap devices behind on error.
So the simplest fix is to remove the networkReleaseActualDevice in
qemuBuildCommandLine. This patch does just that.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Now, when we support multiple consoles per domain,
the vm->def->console[0] can still remain an alias
for vm->def->serial[0]; However, we need to copy
it's source definition as well otherwise we'll regress
on virDomainOpenConsole.
Mingw32 complains if you request export of a symbol which does
not in fact exist.
* src/libvirt_bridge.syms, src/libvirt_macvtap.syms: Delete
obsolete files
* src/libvirt_private.syms: Remove virNetServerGetDBusConn
* src/libvirt_dbus.syms: Add virNetServerGetDBusConn
lvs outputs "[$lvname_vorigin]" for the virtual snapshot lv
(created with "--virtualsize"), and the original device pointed
by "$lvname_vorigin" is just for lvm internal use, one should
never use it.
Per lvm's nameing rules, "[" is not valid as part of the vg/lv name.
(man 8 lvm).
<quote>
VALID NAMES
The following characters are valid for VG and LV names: a-z A-Z 0-9 + _
. -
VG and LV names cannot begin with a hyphen. There are also various
reserved names that are used internally by lvm that can not be used as
LV or VG names. A VG cannot be called anything that exists in /dev/ at
the time of creation, nor can it be called '.' or '..'. A LV cannot be
called '.' '..' 'snapshot' or 'pvmove'. The LV name may also not con‐
tain the strings '_mlog' or '_mimage'
</quote>
So we can skip the set the lv's backingStore by checking if the name
begins with a "[".
This patch adds support for filtering of VLAN (802.1Q) traffic to the
parser and makes us of the ebtables support for VLAN filtering. This code
now enables the filtering of traffic in chains with prefix 'vlan'.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Xen4.1 initializes some unspecified sexpr config items to an empty
string, unlike previous Xen versions that would leave the item unset.
E.g. the kernel item for an HVM guest (non-direct kernel boot):
Xen4.0 and earlier
...
(image
(hvm
(kernel )
...
Xen4.1
...
(image
(hvm
(kernel '')
...
The empty string for kernel causes some grief in subsequent parsing
where existence of specified kernel is checked, e.g.
if (!def->os.kernel)
...
This patch solves the problem in sexpr_node_copy() by not copying
a node containing an empty string.
This prepares for subsequent patches which introduce dependence
on cgroup cpuset. Enable cgroup cpuset by default so users don't
have to modify configuration file before encountering a cpuset
error.
This patch modifies the NWFilter parameter parser to support multiple
elements with the same name and to internally build a list of items.
An example of the XML looks like this:
<parameter name='TEST' value='10.1.2.3'/>
<parameter name='TEST' value='10.2.3.4'/>
<parameter name='TEST' value='10.1.1.1'/>
The list of values is then stored in the newly introduced data type
virNWFilterVarValue.
The XML formatter is also adapted to print out all items in alphabetical
order sorted by 'name'.
This patch also fixes a bug in the XML schema on the way.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
This patch extends the NWFilter driver for Linux (ebiptables) to create
rules for each member of a previously introduced list. If for example
an attribute value (internally) looks like this:
IP = [10.0.0.1, 10.0.0.2, 10.0.0.3]
then 3 rules will be generated for a rule accessing the variable 'IP',
one for each member of the list. The effect of this is that this now
allows for filtering for multiple values in one field. This can then be
used to support for filtering/allowing of multiple IP addresses per
interface.
An iterator is introduced that extracts each member of a list and
puts it into a hash table which then is passed to the function creating
a rule. For the above example the iterator would cause 3 loops.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
NWFilters can be provided name-value pairs using the following
XML notation:
<filterref filter='xyz'>
<parameter name='PORT' value='80'/>
<parameter name='VAL' value='abc'/>
</filterref>
The internal representation currently is so that a name is stored as a
string and the value as well. This patch now addresses the value part of it
and introduces a data structure for storing a value either as a simple
value or as an array for later support of lists.
This patch adjusts all code that was handling the values in hash tables
and makes it use the new data type.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The previous patch extends the priority of filtering rules into negative
numbers. We now use this possibility to interleave the jumping into
chains with filtering rules to for example create the 'root' table of
an interface with the following sequence of rules:
Bridge chain: libvirt-I-vnet0, entries: 6, policy: ACCEPT
-p IPv4 -j I-vnet0-ipv4
-p ARP -j I-vnet0-arp
-p ARP -j ACCEPT
-p 0x8035 -j I-vnet0-rarp
-p 0x835 -j ACCEPT
-j DROP
The '-p ARP -j ACCEPT' rule now appears between the jumps.
Since the 'arp' chain has been assigned priority -700 and the 'rarp'
chain -600, the above ordering can now be achieved with the following
rule:
<rule action='accept' direction='out' priority='-650'>
<mac protocolid='arp'/>
</rule>
This patch now sorts the commands generating the above shown jumps into
chains and interleaves their execution with those for generating rules.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
So far rules' priorities have only been valid in the range [0,1000].
Now I am extending their priority into the range [-1000, 1000] for subsequently
being able to sort rules and the access of (jumps into) chains following
priorities.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
This patch enables chains that have a known prefix in their name.
Known prefixes are: 'ipv4', 'ipv6', 'arp', 'rarp'. All prefixes
are also protocols that can be evaluated on the ebtables level.
Following the prefix they will be automatically connected to an interface's
'root' chain and jumped into following the protocol they evaluate, i.e.,
a table 'arp-xyz' will be accessed from the root table using
ebtables -t nat -A <iface root table> -p arp -j I-<ifname>-arp-xyz
thus generating a 'root' chain like this one here:
Bridge chain: libvirt-O-vnet0, entries: 5, policy: ACCEPT
-p IPv4 -j O-vnet0-ipv4
-p ARP -j O-vnet0-arp
-p 0x8035 -j O-vnet0-rarp
-p ARP -j O-vnet0-arp-xyz
-j DROP
where the chain 'arp-xyz' is accessed for filtering of ARP packets.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
This patch extends the filter XML to support priorities of chains
in the XML. An example would be:
<filter name='allow-arpxyz' chain='arp-xyz' priority='200'>
[...]
</filter>
The permitted values for priorities are [-1000, 1000].
By setting the priority of a chain the order in which it is accessed
from the interface root chain can be influenced.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Use the name of the chain rather than its type index (enum).
This pushes the later enablement of chains with user-given names
into the XML parser. For now we still only allow those names that
are well known ('root', 'arp', 'rarp', 'ipv4' and 'ipv6').
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Use scripts for the renaming and cleaning up of chains. This allows us to get
rid of some of the code that is only capable of renaming and removing chains
whose names are hardcoded.
A shell function 'collect_chains' is introduced that is given the name
of an ebtables chain and then recursively determines the names of all
chains that are accessed from this chain and its sub-chains using 'jumps'.
The resulting list of chain names is then used to delete all the found
chains by first flushing and then deleting them.
The same function is also used for renaming temporary filters to their final
names.
I tested this with the bash and dash as script interpreters.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Use the previously introduced chain priorities to sort the chains for access
from an interface's 'root' table and have them created in the proper order.
This gets rid of a lot of code that was previously creating the chains in a
more hardcoded way.
To determine what protocol a filter is used for evaluation do prefix-
matching, i.e., the filter 'arp' is used to filter for the 'arp' protocol,
'ipv4' for the 'ipv4' protocol and 'arp-xyz' will also be used to filter
for the 'arp' protocol following the prefix 'arp' in its name.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
For better handling of the sorting of chains introduce an internally used
priority. Use a lookup table to store the priorities. For now their actual
values do not matter just that the values cause the chains to be properly
sorted through changes in the following patches. However, the values are
chosen as negative so that once they are sorted along with filtering rules
(whose priority may only be positive for now) they will always be instantiated
before them (lower values cause instantiation before higher values). This
is done to maintain backwards compatibility.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Add a function to the virHashTable for getting an array of the hash table's
key-value pairs and have the keys (optionally) sorted.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Support creation of macvlan devices for LXC containers. Do not
allow setting of bandwidth controls or vport profiles due to the
complication that there is no host side visible device to work
with.
* src/lxc/lxc_driver.c: Support type=direct interfaces
Update virNetDevMacVLanCreateWithVPortProfile to allow creation
of plain macvlan devices, as well as macvtap devices. The former
is useful for LXC containers
* src/qemu/qemu_command.c: Explicitly request a macvtap device
* src/util/virnetdevmacvlan.c, src/util/virnetdevmacvlan.h: Add
new flag to allow switching between macvlan and macvtap
creation
The current lxcSetupInterfaces() method directly performs setup
of the bridge devices. Since it will shortly need to also create
macvlan devices, move the bridge related code into a separate
method
* src/lxc/lxc_driver.c: Split lxcSetupInterfaces() to create a
new lxcSetupInterfaceBridge()
The virDomainNetGetActualBridgeName and virDomainNetGetActualDirectDev
methods both return strings that point to data in the virDomainDefPtr
struct, and should therefore not be freed. The return values should
thus be 'const char *' not 'char *'.
* src/conf/domain_conf.c, src/conf/domain_conf.h: Mark const
* src/network/bridge_driver.c: Update to use a const char *
Move the ifaceMacvtapLinkDump and ifaceGetNthParent functions
into virnetdevvportprofile.c since they are specific to that
code. This avoids polluting the headers with the Linux specific
netlink data types
* src/util/interface.c, src/util/interface.h: Move
ifaceMacvtapLinkDump and ifaceGetNthParent functions and delete
remaining file
* src/util/virnetdevvportprofile.c: Add ifaceMacvtapLinkDump
and ifaceGetNthParent functions
* src/network/bridge_driver.c, src/nwfilter/nwfilter_gentech_driver.c,
src/nwfilter/nwfilter_learnipaddr.c, src/util/virnetdevmacvlan.c:
Remove include of interface.h
Rename ifaceIsVirtualFunction to virNetDevIsVirtualFunction,
ifaceGetVirtualFunctionIndex to virNetDevGetVirtualFunctionIndex
and ifaceGetPhysicalFunction to virNetDevGetPhysicalFunction
* src/util/interface.c, src/util/interface.h: Rename APIs
* src/util/virnetdevvportprofile.c: Update for API rename
Rename the ifaceCheck method to virNetDevValidateConfig and change
so that it always raises an error and returns -1 on error.
* src/util/interface.c, src/util/interface.h: Rename ifaceCheck
to virNetDevValidateConfig
* src/nwfilter/nwfilter_gentech_driver.c,
src/nwfilter/nwfilter_learnipaddr.c: Update for API rename
To match up with the existing virNetDevSetIPv4Address, rename
ifaceGetIPAddress to virNetDevGetIPv4Address
* util/interface.h, util/interface.c: Rename API
* network/bridge_driver.c: Update for API rename
Rename the ifaceGetIndex method to virNetDevGetIndex and
ifaceGetVlanID to virNetDevGetVLanID. Also change the error
reporting behaviour to always raise errors and return -1 on
failure
* util/interface.c, util/interface.h: Rename ifaceGetIndex
and ifaceGetVLAN
* nwfilter/nwfilter_gentech_driver.c, nwfilter/nwfilter_learnipaddr.c,
nwfilter/nwfilter_learnipaddr.c, util/virnetdevvportprofile.c: Update
for API renames and error handling changes
Move virNetDevReplaceMacAddress and virNetDevRestoreMacAddress
to the virnetdev.c file where they naturally belong
* util/interface.c, util/interface.h: Remove
virNetDevReplaceMacAddress and virNetDevRestoreMacAddress
* util/virnetdev.c, util/virnetdev.h: Add
virNetDevReplaceMacAddress and virNetDevRestoreMacAddress
Rename ifaceReplaceMacAddress to virNetDevReplaceMacAddress
and ifaceRestoreMacAddress to virNetDevRestoreMacAddress.
* util/interface.c, util/interface.h, util/virnetdevmacvlan.c:
Rename APIs
Move the low level macvlan creation APIs into the
virnetdevmacvlan.c file where they more naturally
belong
* util/interface.c, util/interface.h: Remove virNetDevMacVLanCreate
and virNetDevMacVLanDelete
* util/virnetdevmacvlan.c, util/virnetdevmacvlan.h: Add
virNetDevMacVLanCreate and virNetDevMacVLanDelete
Rename ifaceMacvtapLinkAdd to virNetDevMacVLanCreate and
ifaceLinkDel to virNetDevMacVLanDelete. Strictly speaking
the latter isn't restricted to macvlan devices, but that's
the only use libvirt has for it.
* util/interface.c, util/interface.h,
util/virnetdevmacvlan.c: Rename APIs
Rename virNetDevMacVLanCreate to virNetDevMacVLanCreateWithVPortProfile
and virNetDevMacVLanDelete to virNetDevMacVLanDeleteWithVPortProfile
To make way for renaming the other macvlan creation APIs in
interface.c
* util/virnetdevmacvlan.c, util/virnetdevmacvlan.h,
qemu/qemu_command.c, qemu/qemu_hotplug.c, qemu/qemu_process.c:
Rename APIs
Rename the macvtap.c file to virnetdevmacvlan.c to reflect its
functionality. Move the port profile association code out into
virnetdevvportprofile.c. Make the APIs available unconditionally
to callers
* src/util/macvtap.h: rename to src/util/virnetdevmacvlan.h,
* src/util/macvtap.c: rename to src/util/virnetdevmacvlan.c
* src/util/virnetdevvportprofile.c, src/util/virnetdevvportprofile.h:
Pull in vport association code
* src/Makefile.am, src/conf/domain_conf.h, src/qemu/qemu_conf.c,
src/qemu/qemu_conf.h, src/qemu/qemu_driver.c: Update include
paths & remove conditional compilation
In preparation for code re-organization, rename the Macvtap
management APIs to have the following patterns
virNetDevMacVLanXXXXX - macvlan/macvtap interface management
virNetDevVPortProfileXXXX - virtual port profile management
* src/util/macvtap.c, src/util/macvtap.h: Rename APIs
* src/conf/domain_conf.c, src/network/bridge_driver.c,
src/qemu/qemu_command.c, src/qemu/qemu_command.h,
src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c,
src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
src/qemu/qemu_process.h: Update for renamed APIs
Add routines to generate -numa QEMU command line option based on
<numa> ... </numa> XML specifications.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
This patch adds XML definitions for guest NUMA specification and contains
routines to parse the same. The guest NUMA specification looks like this:
<cpu>
...
<topology sockets='2' cores='4' threads='2'/>
<numa>
<cell cpus='0-7' memory='512000'/>
<cell cpus='8-15' memory='512000'/>
</numa>
...
</cpu>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
For whatever reason, the kernel allows you to create a regular
file named /dev/sdc.12345; although this file will disappear the
next time devtmpfs is remounted. If you let libvirt generate
the name of the external snapshot for a disk image originally
using the block device /dev/sdc, then the domain will be rendered
unbootable once the qcow2 file is lost on the next devtmpfs
remount. In this case, the user should have used 'virsh
snapshot-create --xmlfile' or 'virsh snapshot-create-as --diskspec'
to specify the name for the qcow2 file in a sane location, rather
than relying on libvirt generating a name that is most likely to
be wrong. We can help avoid naive mistakes by enforcing that
the user provide the external name for any backing file that is
not a regular file.
* src/conf/domain_conf.c (virDomainSnapshotAlignDisks): Only
generate names if backing file exists as regular file.
Reported by MATSUDA Daiki.
I missed adding virNetServerGetDBusConn() to libvirtd_private.syms
in commit b8adfcc6, which didn't cause a problem in 0.9.6 but
results in this build error in 0.9.7
libvirtd-remote.o: In function `remoteDispatchAuthPolkit':
remote.c:(.text+0x188dd): undefined reference to `virNetServerGetDBusConn'
Due to the asynchronous nature of streams, we might continue to
receive some stream packets from the server even after we have
shutdown the stream on the client side. These should be discarded
silently, rather than raising an error in the RPC layer.
* src/rpc/virnetclient.c: Discard stream data silently
Add a new virNetClientSendNonBlock which returns 2 on
full send, 1 on partial send, 0 on no send, -1 on error
If a partial send occurs, then a subsequent call to any
of the virNetClientSend* APIs will finish any outstanding
I/O.
TODO: the virNetClientEvent event handler could be used
to speed up completion of partial sends if an event loop
is present.
* src/rpc/virnetsocket.h, src/rpc/virnetsocket.c: Add new
virNetSocketHasPendingData() API to test for cached
data pending send.
* src/rpc/virnetclient.c, src/rpc/virnetclient.h: Add new
virNetClientSendNonBlock() API to send non-blocking API
Stop multiplexing virNetClientSend for two different purposes,
instead add virNetClientSendWithReply and virNetClientSendNoReply
* src/rpc/virnetclient.c, src/rpc/virnetclient.h: Replace
virNetClientSend with virNetClientSendWithReply and
virNetClientSendNoReply
* src/rpc/virnetclientprogram.c, src/rpc/virnetclientstream.c:
Update for new API names
Remove some duplication by pulling the code for passing the
buck out into a helper method
* src/rpc/virnetclient.c: Introduce virNetClientIOEventLoopPassTheBuck
Instead of inferring whether the buck is held from the waitDispatch
pointer, use an explicit 'bool haveTheBuck' field
* src/rpc/virnetclient.c: Explicitly track the buck
Directly messing around with the linked list is potentially
dangerous. Introduce some helper APIs to deal with list
manipulating the list
* src/rpc/virnetclient.c: Create linked list handlers
This improves the support for qemu rbd devices by adding support for a few
key features (e.g., authentication) and cleaning up the way in which
rbd configuration options are passed to qemu.
An <auth> member of the disk source xml specifies how librbd should
authenticate. The username attribute is the Ceph/RBD user to authenticate as.
The usage or uuid attributes specify which secret to use. Usage is an
arbitrary identifier local to libvirt.
The old RBD support relied on setting an environment variable to
communicate information to qemu/librbd. Instead, pass those options
explicitly to qemu. Update the qemu argument parsing and tests
accordingly.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Replacing the strchr call with two variables through a strstr call.
Calling strchr with two variables triggers a gcc 4.3/4.4
bug when used in combination with -Wlogical-op and at least -O1.
The ifaceSetMac and ifaceGetMac APIs duplicate the functionality
of the virNetDevSetMAC and virNetDevGetMAC APIs, but returning
errno's instead of raising errors.
* src/util/interface.c, src/util/interface.h: Remove
ifaceSetMac and ifaceGetMac APIs, adjusting callers
for new error behaviour
The ifaceUp, ifaceDown, ifaceCtrl & ifaceIsUp APIs can be replaced
with calls to virNetDevSetOnline and virNetDevIsOnline
* src/util/interface.c, src/util/interface.h: Delete ifaceUp,
ifaceDown, ifaceCtrl & ifaceIsUp
* src/nwfilter/nwfilter_gentech_driver.c, src/util/macvtap.c:
Update to use virNetDevSetOnline and virNetDevIsOnline
Move the virNetDevSetName and virNetDevSetNamespace APIs out
of LXC's veth.c and into virnetdev.c.
Move the remaining content of the file to src/util/virnetdevveth.c
* src/lxc/veth.c: Rename to src/util/virnetdevveth.c
* src/lxc/veth.h: Rename to src/util/virnetdevveth.h
* src/util/virnetdev.c, src/util/virnetdev.h: Add
virNetDevSetName and virNetDevSetNamespace
* src/lxc/lxc_container.c, src/lxc/lxc_controller.c,
src/lxc/lxc_driver.c: Update include paths
The src/lxc/veth.c file contains APIs for managing veth devices,
but some of the APIs duplicate stuff from src/util/virnetdev.h.
Delete thed duplicate APIs and rename the remaining ones to
follow virNetDevVethXXXX
* src/lxc/veth.c, src/lxc/veth.h: Rename APIs & delete duplicates
* src/lxc/lxc_container.c, src/lxc/lxc_controller.c,
src/lxc/lxc_driver.c: Update for API renaming
The src/util/network.c file is a dumping ground for many different
APIs. Split it up into 5 pieces, along functional lines
- src/util/virnetdevbandwidth.c: virNetDevBandwidth type & helper APIs
- src/util/virnetdevvportprofile.c: virNetDevVPortProfile type & helper APIs
- src/util/virsocketaddr.c: virSocketAddr and APIs
- src/conf/netdev_bandwidth_conf.c: XML parsing / formatting
for virNetDevBandwidth
- src/conf/netdev_vport_profile_conf.c: XML parsing / formatting
for virNetDevVPortProfile
* src/util/network.c, src/util/network.h: Split into 5 pieces
* src/conf/netdev_bandwidth_conf.c, src/conf/netdev_bandwidth_conf.h,
src/conf/netdev_vport_profile_conf.c, src/conf/netdev_vport_profile_conf.h,
src/util/virnetdevbandwidth.c, src/util/virnetdevbandwidth.h,
src/util/virnetdevvportprofile.c, src/util/virnetdevvportprofile.h,
src/util/virsocketaddr.c, src/util/virsocketaddr.h: New pieces
* daemon/libvirtd.h, daemon/remote.c, src/conf/domain_conf.c,
src/conf/domain_conf.h, src/conf/network_conf.c,
src/conf/network_conf.h, src/conf/nwfilter_conf.h,
src/esx/esx_util.h, src/network/bridge_driver.c,
src/qemu/qemu_conf.c, src/rpc/virnetsocket.c,
src/rpc/virnetsocket.h, src/util/dnsmasq.h, src/util/interface.h,
src/util/iptables.h, src/util/macvtap.c, src/util/macvtap.h,
src/util/virnetdev.h, src/util/virnetdevtap.c,
tools/virsh.c: Update include files
The virtual port profile parsing/formatting APIs do not
correctly handle unknown profile type strings/numbers.
They behave as a no-op, instead of raising an error
* src/util/network.c, src/util/network.h: Fix error
handling of port profile APIs
* src/conf/domain_conf.c, src/conf/network_conf.c: Update
for API changes
Rename the virVirtualPortProfileParams struct to be
virNetDevVPortProfile, and rename the APIs to match
this prefix.
* src/util/network.c, src/util/network.h: Rename port profile
APIs
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/conf/network_conf.c, src/conf/network_conf.h,
src/network/bridge_driver.c, src/qemu/qemu_hotplug.c,
src/util/macvtap.c, src/util/macvtap.h: Update for
renamed APIs/structs
Hi
Commit c31d23a787 removed the "conn"
parameter from qemuPhysIfaceConnect(), but it's still used if
WITH_MACVTAP is false. Also, it's still mentioned in the comment
above the function:
/**
* qemuPhysIfaceConnect:
* @def: the definition of the VM (needed by 802.1Qbh and audit)
* @conn: pointer to virConnect object
* @driver: pointer to the qemud_driver
* @net: pointer to he VM's interface description with direct device type
* @qemuCaps: flags for qemu
*
* Returns a filedescriptor on success or -1 in case of error.
*/
int
qemuPhysIfaceConnect(virDomainDefPtr def,
struct qemud_driver *driver,
virDomainNetDefPtr net,
virBitmapPtr qemuCaps,
enum virVMOperationType vmop)
{
int rc;
#if WITH_MACVTAP
[...]
#else
(void)def;
(void)conn;
(void)net;
(void)qemuCaps;
(void)driver;
(void)vmop;
qemuReportError(VIR_ERR_INTERNAL_ERROR,
"%s", _("No support for macvtap device"));
rc = -1;
#endif
return rc;
}
--
Michael Wood <esiotrot@gmail.com>
From f4fc43b4111a4c099395c55902e497b8965e2b53 Mon Sep 17 00:00:00 2001
From: Michael Wood <esiotrot@gmail.com>
Date: Sat, 12 Nov 2011 13:37:53 +0200
Subject: [PATCH] Fix build without MACVTAP.
which would blow away all volumes. Honor VIR_STORAGE_POOL_BUILD_OVERWRITE
to force a rebuild.
This was caught by libvirt-tck's storage/110-disk-pool.t.
Qemu will be the first driver to make use of a typed string in the
next round of additions. Separate out the trivial addition.
* src/qemu/qemu_driver.c (qemudSupportsFeature): Advertise feature.
(qemuDomainGetBlkioParameters, qemuDomainGetMemoryParameters)
(qemuGetSchedulerParametersFlags, qemudDomainBlockStatsFlags):
Allow typed strings flag where trivially supported.
Send and receive string typed parameters across RPC. This also
completes the back-compat mentioned in the previous patch - the
only time we have an older client talking to a newer server is
if RPC is in use, so filtering out strings during RPC prevents
returning an unknown type to the older client.
* src/remote/remote_protocol.x (remote_typed_param_value): Add
another union value.
* daemon/remote.c (remoteDeserializeTypedParameters): Handle
strings on rpc.
(remoteSerializeTypedParameters): Likewise; plus filter out
strings when replying to older clients. Adjust callers.
* src/remote/remote_driver.c (remoteFreeTypedParameters)
(remoteSerializeTypedParameters)
(remoteDeserializeTypedParameters): Handle strings on rpc.
* src/rpc/gendispatch.pl: Properly clean up typed arrays.
* src/remote_protocol-structs: Update.
Based on an initial patch by Hu Tao, with feedback from
Daniel P. Berrange.
Signed-off-by: Eric Blake <eblake@redhat.com>
This allows strings to be transported between client and server
in the context of name-type-value virTypedParameter functions.
For compatibility,
o new clients will not send strings to old servers, based on
a feature check
o new servers will not send strings to old clients without the
flag VIR_TYPED_PARAM_STRING_OKAY; this will be enforced at
the RPC layer in the next patch, so that drivers need not
worry about it in general. The one exception is that
virDomainGetSchedulerParameters lacks a flags argument, so
it must not return a string; drivers that forward that
function on to virDomainGetSchedulerParametersFlags will
have to pay attention to the flag.
o the flag VIR_TYPED_PARAM_STRING_OKAY is set automatically,
based on a feature check (so far, no driver implements it),
so clients do not have to worry about it
Future patches can then enable the feature on a per-driver basis.
This patch also ensures that drivers can blindly strdup() field
names (previously, a malicious client could stuff 80 non-NUL bytes
into field and cause a read overrun).
* src/libvirt_internal.h (VIR_DRV_FEATURE_TYPED_PARAM_STRING): New
driver feature.
* src/libvirt.c (virTypedParameterValidateSet)
(virTypedParameterSanitizeGet): New helper functions.
(virDomainSetMemoryParameters, virDomainSetBlkioParameters)
(virDomainSetSchedulerParameters)
(virDomainSetSchedulerParametersFlags)
(virDomainGetMemoryParameters, virDomainGetBlkioParameters)
(virDomainGetSchedulerParameters)
(virDomainGetSchedulerParametersFlags, virDomainBlockStatsFlags):
Use them.
* src/util/util.h (virTypedParameterArrayClear): New helper
function.
* src/util/util.c (virTypedParameterArrayClear): Implement it.
* src/libvirt_private.syms (util.h): Export it.
Based on an initial patch by Hu Tao, with feedback from
Daniel P. Berrange.
Signed-off-by: Eric Blake <eblake@redhat.com>
Add virnetdev.h,virnetdevbridge.h,virnetdevtap.h to private symbols,
since debian linker no longer allows transitive link resolution
Signed-off-by: Eli Qiao <taget@linux.vnet.ibm.com>
This reverts commit ef1065cf5ac; see also this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=751900
In qemu 0.15.1 and earlier, during migration to file, the
qemu_savevm_state_begin and qemu_savevm_state_iterate methods
will both process as much migration data as possible until either
1. The file descriptor returns EAGAIN
2. The bandwidth rate limit is reached
If we set the rate limit to ULONG_MAX, test 2 never becomes true. We're
passing a plain file descriptor to QEMU and POSIX does not support EAGAIN on
regular files / block devices, so test 1 never becomes true either.
In the 'virsh save --bypass-cache' case, we pass a pipe instead of a
regular fd, but using a pipe adds I/O overhead, so always passing a
pipe just so qemu can see EAGAIN doesn't seem nice.
The ultimate fix needs to come from qemu - background migration must
respect asynchronous abort requests, or else periodically return
control to the main handling loop without an EAGAIN and without
waiting to hit an insanely large amount of data. But until a
version of qemu is fixed to support "unlimited" data rates while
still allowing cancellation, the best we can do is avoid the
automatic use of unlimited rates from within libvirt (users can
still explicitly change the migration rates, if they are aware that
they are giving up the ability to cancel a job).
Reverting the lone use of QEMU_DOMAIN_FILE_MIG_BANDWIDTH_MAX is
the simplest patch; this slows migration back down to a default
32M/sec cap, but also ensures that the main qemu processing loop
will still be responsive to cancellation requests. Hopefully
upstream qemu will provide us a means of safely using unlimited
speed, including a runtime probe of that capability.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Revert attempt
to use unlimited migration bandwidth when migrating to file.
Signed-off-by: Daniel Veillard <veillard@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
steps to reproduce:
1. having a network xml file(named default.xml) like this one:
<network>
<name>default</name>
<uuid>c5322c4c-81d0-4985-a363-ad6389780d89</uuid>
<bridge name="virbr0" />
<forward/>
<ip address="192.168.122.1" netmask="255.255.255.0">
<dhcp>
<range start="192.168.122.2" end="192.168.122.254" />
</dhcp>
</ip>
</network>
in /etc/libvirt/qemu/networks/, and mark it as autostart:
$ ls -l /etc/libvirt/qemu/networks/autostart
total 0
lrwxrwxrwx 1 root root 14 Oct 12 14:02 default.xml -> ../default.xml
2. start libvirtd and the device virbr0 is not automatically up.
The reason is that the function virNetDevExists is now returns 1 if
the device exists, comparing to the former one returns 0 if the device
exists. But with only this fix will cause a segmentation fault(the same
steps as above) that is fixed by the second chunk of code.
CC libvirt_driver_xenapi_la-xenapi_driver.lo
xenapi/xenapi_driver.c: In function 'xenapiDomainGetVcpus':
xenapi/xenapi_driver.c:1209:21: error: variable 'cpus' set but not used [-Werror=unused-but-set-variable]
* src/xenapi/xenapi_driver.c (xenapiDomainGetVcpus): Silence
compiler warning.
It's not worth even worrying about a temporary file, unless we
ever expect the script to exceed maximum command-line argument
length limits.
* src/nwfilter/nwfilter_ebiptables_driver.c (ebiptablesExecCLI):
Run the commands as an argument to /bin/sh, rather than worrying
about a temporary file.
(ebiptablesWriteToTempFile): Delete unused function.
If /tmp is mounted with the noexec flag (common on security-conscious
systems), then nwfilter will fail to initialize, because we cannot
run any temporary script via virRun("/tmp/script"); but we _can_
use "/bin/sh /tmp/script". For that matter, using /tmp risks collisions
with other unrelated programs; we already have /var/run/libvirt as a
dedicated temporary directory for use by libvirt.
* src/nwfilter/nwfilter_ebiptables_driver.c
(ebiptablesWriteToTempFile): Use internal directory, not /tmp;
drop attempts to make script executable; and detect close error.
(ebiptablesExecCLI): Switch to virCommand, and invoke the shell to
read the script, rather than requiring an executable script.
The socket address APIs in src/util/network.h either take the
form virSocketAddrXXX, virSocketXXX or virSocketXXXAddr.
Sanitize this so everything is virSocketAddrXXXX, and ensure
that the virSocketAddr parameter is always the first one.
* src/util/network.c, src/util/network.h: Santize socket
address API naming
* src/conf/domain_conf.c, src/conf/network_conf.c,
src/conf/nwfilter_conf.c, src/network/bridge_driver.c,
src/nwfilter/nwfilter_ebiptables_driver.c,
src/nwfilter/nwfilter_learnipaddr.c,
src/qemu/qemu_command.c, src/rpc/virnetsocket.c,
src/util/dnsmasq.c, src/util/iptables.c,
src/util/virnetdev.c, src/vbox/vbox_tmpl.c: Update for
API renaming
Following the renaming of the bridge management APIs, we can now
split the source file into 3 corresponding pieces
* src/util/virnetdev.c: APIs for any type of network interface
* src/util/virnetdevbridge.c: APIs for bridge interfaces
* src/util/virnetdevtap.c: APIs for TAP interfaces
* src/util/virnetdev.c, src/util/virnetdev.h,
src/util/virnetdevbridge.c, src/util/virnetdevbridge.h,
src/util/virnetdevtap.c, src/util/virnetdevtap.h: Copied
from bridge.{c,h}
* src/util/bridge.c, src/util/bridge.h: Split into 3 pieces
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
src/openvz/openvz_driver.c, src/qemu/qemu_command.c,
src/qemu/qemu_conf.h, src/uml/uml_conf.c, src/uml/uml_conf.h,
src/uml/uml_driver.c: Update #include directives
Convert the virNetDevBridgeSetSTP and virNetDevBridgeSetSTPDelay
to use ioctls instead of spawning brctl.
Implement the virNetDevBridgeGetSTP and virNetDevBridgeGetSTPDelay
methods which were declared in the header but never existed
* src/util/bridge.c: Convert to use bridge ioctls instead of brctl
The MTU management APIs are useful to other code inside libvirt,
so should be exposed as non-static APIs.
* src/util/bridge.c, src/util/bridge.h: Expose virNetDevSetMTU,
virNetDevSetMTUFromDevice & virNetDevGetMTU
The existing brXXX APIs in src/util/bridge.h are renamed to
follow one of three different conventions
- virNetDevXXX - operations for any type of interface
- virNetDevBridgeXXX - operations for bridge interfaces
- virNetDevTapXXX - operations for tap interfaces
* src/util/bridge.h, src/util/bridge.c: Rename all APIs
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
src/qemu/qemu_command.c, src/uml/uml_conf.c,
src/uml/uml_driver.c: Update for API renaming
Currently every caller of the brXXX APIs has to store the returned
errno value and then raise an error message. This results in
inconsistent error messages across drivers, additional burden on
the callers and makes the error reporting inaccurate since it is
hard to distinguish different scenarios from 1 errno value.
* src/util/bridge.c: Raise errors instead of returning errnos
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
src/qemu/qemu_command.c, src/uml/uml_conf.c,
src/uml/uml_driver.c: Remove error reporting code
The bridge management APIs in src/util/bridge.c require a brControl
object to be passed around. This holds the file descriptor for the
control socket. This extra object complicates use of the API for
only a minor efficiency gain, which is in turn entirely offset by
the need to fork/exec the brctl command for STP configuration.
This patch removes the 'brControl' object entirely, instead opening
the control socket & closing it again within the scope of each method.
The parameter names for the APIs are also made to consistently use
'brname' for bridge device name, and 'ifname' for an interface
device name. Finally annotations are added for non-NULL parameters
and return check validation
* src/util/bridge.c, src/util/bridge.h: Remove brControl object
and update API parameter names & annotations.
* src/lxc/lxc_driver.c, src/network/bridge_driver.c,
src/uml/uml_conf.h, src/uml/uml_conf.c, src/uml/uml_driver.c,
src/qemu/qemu_command.c, src/qemu/qemu_conf.h,
src/qemu/qemu_driver.c: Remove reference to 'brControl' object
MacOS lacks ptsname_r, and gnulib doesn't (yet) provide it.
But we can avoid it altogether, by using gnulib openpty()
instead. Note that we do _not_ want the pt_chown module;
gnulib uses it only to implement a replacement openpty() if
the system lacks both openpty() and granpt(), but all
systems that we currently port to either have at least one of
openpty() and/or grantpt(), or lack ptys altogether. That is,
we aren't porting to any system that requires us to deal with
the hassle of installing a setuid pt_chown helper just to use
gnulib's ability to provide openpty() on obscure platforms.
* .gnulib: Update to latest, for openpty fixes
* bootstrap.conf (gnulib_modules): Add openpty, ttyname_r.
(gnulib_tool_option_extras): Exclude pt_chown module.
* src/util/util.c (virFileOpenTty): Rewrite in terms of openpty
and ttyname_r.
* src/util/util.h (virFileOpenTtyAt): Delete dead prototype.
Every instance of virCapsPtr must have the defaultConsoleTargetType
field set.
* src/security/virt-aa-helper.c: Add defaultConsoleTargetType to
virCapsPtr
The code calling sendfd/recvfd was mistakenly assuming those
calls would never block. They can in fact return EAGAIN and
this is causing us to drop the client connection when blocking
ocurrs while sending/receiving FDs.
Fixing this is a little hairy on the incoming side, since at
the point where we see the EAGAIN, we already thought we had
finished receiving all data for the packet. So we play a little
trick to reset bufferOffset again and go back into polling for
more data.
* src/rpc/virnetsocket.c, src/rpc/virnetsocket.h: Update
virNetSocketSendFD/RecvFD to return 0 on EAGAIN, or 1
on success
* src/rpc/virnetclient.c: Move decoding of header & fds
out of virNetClientCallDispatch and into virNetClientIOHandleInput.
Handling blocking when sending/receiving FDs
* src/rpc/virnetmessage.h: Add a 'donefds' field to track
how many FDs we've sent / received
* src/rpc/virnetserverclient.c: Handling blocking when
sending/receiving FDs
Building on 64-bit FreeBSD 8.2 complained about a cast between
a pointer and a smaller integer. Going through an intermediate
cast shuts up the compiler.
* src/util/threads-pthread.c (virThreadSelfID): Silence a warning.
While building on FreeBSD (and after fixing a ptsname_r link error),
I got this failure:
./.libs/libvirt_util.a(libvirt_util_la-threads.o)(.text+0x240): In function `virThreadCreate':
util/threads-pthread.c:185: undefined reference to `pthread_create'
It turns out that gnulib used only pthread_join for LIB_PTHREAD,
but on FreeBSD, libc provides that (as a stub function); whereas
the more complex pthread_create really does require -pthread,
which gnulib tracked under [LT]LIBMULTITHREAD.
* configure.ac (LIBS): Check LIBMULTITHREAD alongside LIB_PTHREAD.
* src/Makefile.am (THREAD_LIBS): New variable.
(libvirt_util_la_LIBADD, libvirt_lxc_LDADD): Use it.
I got this weird failure:
error: Failed to start domain simple
error: internal error cannot mix caller fds with blocking execution
and tracked it down to a use-after-free - virCommandSetOutputFD
was storing the address of a stack-local variable, which then
went out of scope before the virCommandRun that dereferenced it.
Bug introduced in commit 451cfd05 (0.9.2).
* src/lxc/lxc_driver.c (lxcBuildControllerCmd): Move log fd
registration...
(lxcVmStart): ...to caller.
All constants related to events should have a prefix of
VIR_DOMAIN_EVENT_
* include/libvirt/libvirt.h.in, src/qemu/qemu_domain.c:
Rename VIR_DOMAIN_DISK_CHANGE_MISSING_ON_START to
VIR_DOMAIN_EVENT_DISK_CHANGE_MISSING_ON_START
I ran into the following build failure:
$ mkdir -p build1 build2/a/very/deep/hierarcy
$ cd build2/a/very/deep/hierarcy
$ ../../../../../configure && make
$ cd ../../../../build1
$ ../configure && make
...
../../src/remote/remote_protocol.c:7:55: fatal error: ../../../../../src/remote/remote_protocol.h: No such file or directory
Turns out that we were sometimes generating the remote_protocol.c
file with information from the VPATH build, which is bad, since
any file shipped in the tarball should be idempotent no matter how
deep the VPATH build tree that created it.
* src/rpc/genprotocol.pl: Don't embed VPATH into generated file.
Based on a Coverity report - the return value of waitpid() should
always be checked, to avoid problems with leaking resources.
* src/lxc/lxc_controller.c (lxcControllerRun): Use simpler virPidAbort.
The default console type may vary based on the OS type. ie a Xen
paravirt guests wants a 'xen' console, while a fullvirt guests
wants a 'serial' console.
A plain integer default console type in the capabilities does
not suffice. Instead introduce a callback that is passed the
OS type.
* src/conf/capabilities.h: Use a callback for default console
type
* src/conf/domain_conf.c, src/conf/domain_conf.h: Use callback
for default console type. Add missing LXC/OpenVZ console types.
* src/esx/esx_driver.c, src/libxl/libxl_conf.c,
src/lxc/lxc_conf.c, src/openvz/openvz_conf.c,
src/phyp/phyp_driver.c, src/qemu/qemu_capabilities.c,
src/uml/uml_conf.c, src/vbox/vbox_tmpl.c,
src/vmware/vmware_conf.c, src/xen/xen_hypervisor.c,
src/xenapi/xenapi_driver.c: Set default console type callback
To allow virDomainOpenConsole to access non-primary consoles,
device aliases are required to be set. Until now only the QEMU
driver has done this. Update LXC & UML to set aliases for any
console devices
* src/lxc/lxc_driver.c, src/uml/uml_driver.c: Set aliases
for console devices
When no <target> element was set at all, the default console
target type was not being honoured
* src/conf/domain_conf.c: Set default target type for consoles
with no <target>
Currently the LXC controller only supports setup of a single
text console. This is wired up to the container init's stdio,
as well as /dev/console and /dev/tty1. Extending support for
multiple consoles, means wiring up additional PTYs to /dev/tty2,
/dev/tty3, etc, etc. The LXC controller is passed multiple open
file handles, one for each console requested.
* src/lxc/lxc_container.c, src/lxc/lxc_container.h: Wire up
all the /dev/ttyN links required to symlink to /dev/pts/NN
* src/lxc/lxc_container.h: Open more container side /dev/pts/NN
devices, and adapt event loop to handle I/O from all consoles
* src/lxc/lxc_driver.c: Setup multiple host side PTYs
The current I/O code for LXC uses a hand crafted event loop
to forward I/O between the container & host app, based on
epoll to handle EOF on PTYs. This event loop is not easily
extensible to add more consoles, or monitor other types of
file descriptors.
Remove the custom event loop and replace it with a normal
libvirt event loop. When detecting EOF on a PTY, disable
the event watch on that FD, and fork off a background thread
that does a edge-triggered epoll() on the FD. When the FD
finally shows new incoming data, the thread re-enables the
watch on the FD and exits.
When getting EOF from a read() on the PTY, the existing code
would do waitpid(WNOHANG) to see if the container had exited.
Unfortunately there is a race condition, because even though
the process has closed its stdio handles, it might still
exist.
To deal with this the new event loop uses a SIG_CHILD handler
to perform the waitpid only when the container is known to
have actually exited.
* src/lxc/lxc_controller.c: Rewrite the event loop to use
the standard APIs.
qemuBuildVirtioSerialPortDevStr was mistakenly accessing the
target.name field in the virDomainChrDef object for chardevs
belonging to a console. Those chardevs only have port set,
and if there's > 1 console, the > 1port number results in
trying to access a target.name with address 0x1
* src/qemu/qemu_command.c: Fix target.name handling and
make code more robust wrt error reporting
* src/qemu/qemu_command.c: Conditionally access target.name
While Xen only has a single paravirt console, UML, and
QEMU both support multiple paravirt consoles. The LXC
driver can also be trivially made to support multiple
consoles. This patch extends the XML to allow multiple
<console> elements in the XML. It also makes the UML
and QEMU drivers support this config.
* src/conf/domain_conf.c, src/conf/domain_conf.h: Allow
multiple <console> devices
* src/lxc/lxc_driver.c, src/xen/xen_driver.c,
src/xenxs/xen_sxpr.c, src/xenxs/xen_xm.c: Update for
internal API changes
* src/security/security_selinux.c, src/security/virt-aa-helper.c:
Only label consoles that aren't a copy of the serial device
* src/qemu/qemu_command.c, src/qemu/qemu_driver.c,
src/qemu/qemu_process.c, src/uml/uml_conf.c,
src/uml/uml_driver.c: Support multiple console devices
* tests/qemuxml2xmltest.c, tests/qemuxml2argvtest.c: Extra
tests for multiple virtio consoles. Set QEMU_CAPS_CHARDEV
for all console /channel tests
* tests/qemuxml2argvdata/qemuxml2argv-channel-virtio-auto.args,
tests/qemuxml2argvdata/qemuxml2argv-channel-virtio.args
tests/qemuxml2argvdata/qemuxml2argv-console-virtio.args: Update
for correct chardev syntax
* tests/qemuxml2argvdata/qemuxml2argv-console-virtio-many.args,
tests/qemuxml2argvdata/qemuxml2argv-console-virtio-many.xml: New
test file
Allow the user to call with nparams too small, per API documentation.
* src/xen/xen_hypervisor.c (xenHypervisorGetSchedulerParameters):
Allow fewer than max.
* src/xen/xend_internal.c (xenDaemonGetSchedulerParameters):
Likewise.
libvirt.c guarantees that nparams is non-zero for scheduler parameters.
* src/test/test_driver.c (testDomainGetSchedulerParamsFlags): Drop
redundant check. Avoid strcpy.
Allow the user to call with nparams too small, per API documentation.
Also, libvirt.c filters out nparams of 0 for scheduler parameters.
* src/lxc/lxc_driver.c (lxcDomainGetMemoryParameters): Allow fewer
than max.
(lxcGetSchedulerParametersFlags): Drop redundant check.
Allow the user to call with nparams too small, per API documentation.
* src/libxl/libxl_driver.c
(libxlDomainGetSchedulerParametersFlags): Allow fewer than max.
Allow the user to call with nparams too small, per API documentation.
* src/esx/esx_driver.c (esxDomainGetMemoryParameters): Drop
redundant check.
(esxDomainGetSchedulerParametersFlags): Allow fewer than max.
Document the parameter names that will be used by
virDomain{Get,Set}SchedulerParameters{,Flags}, rather than
hard-coding those names in each driver, to match what is
done with memory, blkio, and blockstats parameters.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_SCHEDULER_CPU_SHARES)
(VIR_DOMAIN_SCHEDULER_VCPU_PERIOD)
(VIR_DOMAIN_SCHEDULER_VCPU_QUOTA, VIR_DOMAIN_SCHEDULER_WEIGHT)
(VIR_DOMAIN_SCHEDULER_CAP, VIR_DOMAIN_SCHEDULER_RESERVATION)
(VIR_DOMAIN_SCHEDULER_LIMIT, VIR_DOMAIN_SCHEDULER_SHARES): New
field name macros.
* src/qemu/qemu_driver.c (qemuSetSchedulerParametersFlags)
(qemuGetSchedulerParametersFlags): Use new defines.
* src/test/test_driver.c (testDomainGetSchedulerParamsFlags)
(testDomainSetSchedulerParamsFlags): Likewise.
* src/xen/xen_hypervisor.c (xenHypervisorGetSchedulerParameters)
(xenHypervisorSetSchedulerParameters): Likewise.
* src/xen/xend_internal.c (xenDaemonGetSchedulerParameters)
(xenDaemonSetSchedulerParameters): Likewise.
* src/lxc/lxc_driver.c (lxcSetSchedulerParametersFlags)
(lxcGetSchedulerParametersFlags): Likewise.
* src/esx/esx_driver.c (esxDomainGetSchedulerParametersFlags)
(esxDomainSetSchedulerParametersFlags): Likewise.
* src/libxl/libxl_driver.c (libxlDomainGetSchedulerParametersFlags)
(libxlDomainSetSchedulerParametersFlags): Likewise.
The field 'mon' in 'struct tm' gives months 0-11, where as
humans tend to expect months 1-12. Thus the month number
needing adjusting by 1
* src/util/logging.c: Use human friendly month number
commit 27908453 introduces a regression, and it will
cause libvirt crashed when starting network.
The reason is that tapfd may be NULL, but we dereference
it without checking whether it is NULL.
Since all virTypedParameter APIs allow us to return the number
of slots we actually populated, we should allow the user to
call with nparams too small (without overrunning their array)
or too large (ignoring the tail of the array that we can't fill),
rather than requiring that they get things exactly right.
Making this change will make it easier for a future patch to
introduce VIR_TYPED_PARAM_STRING, with filtering in libvirt.c
rather than in every single driver, since users already have
to be prepared for *nparams to be smaller on exit than on entry.
* src/qemu/qemu_driver.c (qemuDomainGetBlkioParameters)
(qemuDomainGetMemoryParameters): Allow variable nparams on entry.
(qemuGetSchedulerParametersFlags): Drop redundant check.
(qemudDomainBlockStats, qemudDomainBlockStatsFlags): Rename...
(qemuDomainBlockStats, qemuDomainBlockStatsFlags): ...to this.
Don't return unavailable stats.
virDomainBlockStatsFlags was missing a check that was present in
virDomainGetMemoryParameters. Additionally, I found that the
existing descriptions were a bit hard to read. A later patch
will fix qemu to return fewer than max parameters if @nparams
was too small on input.
* src/libvirt.c (virDomainGetMemoryParameters)
(virDomainGetBlkioParameters, virDomainGetSchedulerParameters)
(virDomainGetSchedulerParametersFlags):
Tweak documentation wording.
(virDomainBlockStatsFlags): Likewise, and add sanity check.
If an LXC VM fails to start, quite a few cleanup paths will
result in the original error message being overwritten. Some
other cleanup paths also forgot to actually terminate the VM.
* src/lxc/lxc_driver.c: Ensure VM is terminated on startup
failure and preserve original error
The LXC code for mounting container filesystems from block devices
tries all filesystems in /etc/filesystems and possibly those in
/proc/filesystems. The regular mount binary, however, first tries
using libblkid to detect the format. Add support for doing the same
in libvirt, since Fedora's /etc/filesystems is missing many formats,
most notably ext4 which is the default filesystem Fedora uses!
* src/Makefile.am: Link libvirt_lxc to libblkid
* src/lxc/lxc_container.c: Probe filesystem format with libblkid
If we looped through /etc/filesystems trying to mount with each
type and failed all options, we forget to actually raise an
error message.
* src/lxc/lxc_container.c: Raise error if unable to detect
the filesystems. Also fix existing error message
The kernel automounter is mostly broken wrt to containers. Most
notably if you start a new filesystem namespace and then attempt
to unmount any autofs filesystem, it will typically fail with a
weird error message like
Failed to unmount '/.oldroot/sys/kernel/security':Too many levels of symbolic links
Attempting to detach the autofs mount using umount2(MNT_DETACH)
will also fail with the same error. Therefore if we get any error on
unmount()ing a filesystem from the old root FS when starting a
container, we must immediately break out and detach the entire
old root filesystem (ignoring any mounts below it).
This has the effect of making the old root filesystem inaccessible
to anything inside the container, but at the cost that the mounts
live on in the kernel until the container exits. Given that SystemD
uses autofs by default, we need LXC to be robust this scenario and
thus this tradeoff is worthwhile.
* src/lxc/lxc_container.c: Detach root filesystem if any umount
operation fails.
The /etc/filesystems file can contain a '*' on the last line to
indicate that /proc/filessystems should be tried next. We have
a check that this '*' only occurs on the last line. Unfortunately
when we then start reading /proc/filesystems, we mistakenly think
we've seen '*' in /proc/filesystems and fail
* src/lxc/lxc_container.c: Skip '*' validation when we're reading
/proc/filesystems
Only some of the return paths of lxcContainerWaitForContinue will
have set errno. In other paths we need to set it manually to avoid
the caller getting a random stale errno value
* src/lxc/lxc_container.c: Set errno in lxcContainerWaitForContinue
We already have a /var/lib/libvirt/images for OS install images.
We need a separate /var/lib/libvirt/filesystems for OS install
trees, since SELinux labelling will be different
* libvirt.spec.in: Add /var/lib/libvirt/filesystems
* src/Makefile.am: Create /var/lib/libvirt/filesystems
Allow the datacenter and compute resource parts of the path
to be prefixed with folders. Therefore, the way the path is
parsed has changed. Before, it was split in 2 or 3 items and
the items' meanings were determined by their positions. Now
the path can have 2 or more items and the the vCenter server
is asked whether a folder, datacenter of compute resource
with the specified name exists at the current hierarchy level.
Before the datacenter and compute resource lookup automatically
traversed folders during lookup. This is logic got removed
and folders have to be specified explicitly.
The proper datacenter path including folders is now used when
accessing a datastore over HTTPS. This makes virsh dumpxml
and define work for datacenters in folders.
https://bugzilla.redhat.com/show_bug.cgi?id=732676
with /etc/libvirt/libvirt.conf below:
uri_aliases = [
"hail=qemu:///system",
"sleet=qemu+ssh://root 9 115 122 57/system",
"sam=qemu+unix:///system?socket=/var/run/libvirt/libvirt-sock",
]
Neither "virsh -c hailly" nor "hai" should result in matching "hail=qemu:///system"
Fix URI alias prefix matching when connecting
Signed-off-by: Wen Ruo Lv <lvroyce@linux.vnet.ibm.com>
If daemon is using SASL it reads client data into a cache. This cache is
big (usually 65KB) and can thus contain 2 or more messages. However,
on socket event we can dispatch only one message. So if we read two
messages at once, the second will not be dispatched as the socket event
goes away with filling the cache.
Moreover, when dispatching the cache we need to remember to take care
of client max requests limit.
If we are comparing storage pools we must skip comparing with
ourself, so that re-defining an existing pool works
* conf/storage_conf.c: Skip self when comparing
The qemu RBD driver needs access to the conn in order to get the secret
needed for connecting to the ceph cluster.
Signed-off-by: Sage Weil <sage@newdream.net>
To support "managed" mode of host PCI device, we record the original
states (unbind_from_stub, remove_slot, and reprobe) so that could
reattach the device to host with original driver. But there is no XML
for theses attrs, and thus after daemon is restarted, we lose the
original states. It's easy to reproduce:
1) virsh start domain
2) virsh attach-device dom hostpci.xml (in 'managed' mode)
3) service libvirtd restart
4) virsh destroy domain
You will see the device won't be bound to the original driver
if there was one.
This patch is to solve the problem by introducing internal XML
(won't be dumped to user, only dumped to status XML). The XML is:
<origstates>
<unbind/>
<remove_slot/>
<reprobe/>
</origstates>
Which will be child node of <hostdev><source>...</souce></hostdev>.
(only for PCI device).
A new struct "virDomainHostdevOrigStates" is introduced for the XML,
and the according members are updated when preparing the PCI device.
And function "qemuUpdateActivePciHostdevs" is modified to honor
the original states. Use of qemuGetPciHostDeviceList is removed
in function "qemuUpdateActivePciHostdevs", and the "managed" value of
the device config is honored by the change. This fixes another problem
alongside:
qemuGetPciHostDeviceList set the device as "managed" force
regardless of whether the device is configured as "managed='yes'"
or not in XML, which is not right.
Deal with the incompatible changes in the VirtualBox 4.1 API.
INetworkAdapter has its different AttachTo* method replaced by
a settable attachmentType property.
The maximum number of network adapters is now requestable per
chipset type.
The OpenMedium method got a bool parameter to request opening
a medium under a new IID.
privP->session->error_description is a list and in order to get the
complete error message all parts of the list should be concatenated.
xenapiSessionErrorHandler does this when its third parameter is NULL.
The current code discards all but the first part of the error message
resulting in a potentially incomplete error message.
This partly reverts 006be75ee2, that tried to avoid reporting
a (null) in the error message. The actual problem is more general in
returnErrorFromSession that might return NULL if there is no error.
Make sure that returnErrorFromSession return non-NULL always. Also
don't skip the last error message part.
- changed some return 1's to return -1
- changed if (rc) error checks to if (rc < 0)
- fixed some other minor convention violations
I might have missed some. Can fix in another patch or can respin
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Reported-by: Eric Blake <eblake@redhat.com>
Reported-by: Laine Stump <laine@laine.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
When using the xml as below:
------------------------------------------------------
<devices>
<emulator>/home/soulxu/data/work-code/qemu-kvm/x86_64-softmmu/qemu-system-x86_64</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2'/>
<source file='/home/soulxu/data/VM/images/linux.img'/>
<target dev='vda' bus='virtio'/>
<address type='drive' controller='0' bus='0' unit='0'/>
</disk>
<input type='mouse' bus='ps2'/>
<graphics type='vnc' port='-1' autoport='yes'/>
<video>
<model type='cirrus' vram='9216' heads='1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
<memballoon model='virtio'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</memballoon>
</devices>
------------------------------------------------------
Then can't startup qemu, the error message as below:
virsh # start test-vm
error: Failed to start domain test-vm
error: internal error process exited while connecting to monitor: qemu-system-x86_64: -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3: PCI: slot 3 function 0 not available for virtio-balloon-pci, in use by virtio-blk-pci
qemu-system-x86_64: -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3: Device 'virtio-balloon-pci' could not be initialized
So adding check for bus type and address type. Only the address of pci type support by virtio bus.
Signed-off-by: Xu He Jie <xuhj@linux.vnet.ibm.com>
Add additional fields to let you specify the how to authenticate with a disk.
The secret to use may be referenced by a usage string or a UUID, i.e.:
<auth username='myuser'>
<secret type='ceph' usage='secretname'/>
</auth>
or
<auth username='myuser'>
<secret type='ceph' uuid='0a81f5b2-8403-7b23-c8d6-21ccc2f80d6f'/>
</auth>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Add a new secret type to store a Ceph authentication key. The name
is simply an identifier for easy human reference.
The xml looks like this:
<secret ephemeral='no' private='no'>
<uuid>0a81f5b2-8403-7b23-c8d6-21ccc2f80d6f</uuid>
<usage type='ceph'>
<name>mycluster_admin</name>
</usage>
</secret>
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.net>
Leak introduced in commit c1bc3d89.
Detected by valgrind:
==18462== 1,100 bytes in 1 blocks are definitely lost in loss record 183 of 184
==18462== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==18462== by 0x4A06167: realloc (vg_replace_malloc.c:525)
==18462== by 0x4AADBB: virReallocN (memory.c:161)
==18462== by 0x4A975E: virBufferGrow (buf.c:117)
==18462== by 0x4A9D92: virBufferVasprintf (buf.c:290)
==18462== by 0x4A9EF7: virBufferAsprintf (buf.c:263)
==18462== by 0x429488: qemuBuildControllerDevStr (qemu_command.c:1993)
==18462== by 0x42C4B6: qemuBuildCommandLine (qemu_command.c:3803)
==18462== by 0x41A604: testCompareXMLToArgvHelper (qemuxml2argvtest.c:124)
==18462== by 0x41BB81: virtTestRun (testutils.c:141)
==18462== by 0x416DFF: mymain (qemuxml2argvtest.c:369)
==18462== by 0x41B277: virtTestMain (testutils.c:696)
==18462==
==18462== LEAK SUMMARY:
==18462== definitely lost: 1,100 bytes in 1 blocks
==18462== indirectly lost: 0 bytes in 0 blocks
* src/qemu/qemu_command.c (qemuBuildCommandLine): Clean up on success.
Signed-off-by: Alex Jia <ajia@redhat.com>
Detected by Coverity. The fix in 2c27dfa didn't catch all bad
instances of memcpy(). Thankfully, on further analysis, all of
the problematic uses are only triggered by old qemu that lacks
-device.
* src/qemu/qemu_hotplug.c (qemuDomainAttachPciDiskDevice)
(qemuDomainAttachNetDevice, qemuDomainAttachHostPciDevice): Init
all fields since monitor only populates some of them.
Since it needs to access file descriptors passed in the msg,
the RPC driver for virDomainOpenGraphics needs to be manually
implemented.
* daemon/remote.c: RPC server dispatcher
* src/remote/remote_driver.c: RPC client dispatcher
* src/remote/remote_protocol.x: Define protocol
The RPC server classes are extended to allow FDs to be received
from clients with calls. There is not currently any way for a
procedure to pass FDs back to the client with replies
* daemon/remote.c, src/rpc/gendispatch.pl: Change virNetMessageHeaderPtr
param to virNetMessagePtr in dispatcher impls
* src/rpc/virnetserver.c, src/rpc/virnetserverclient.c,
src/rpc/virnetserverprogram.c, src/rpc/virnetserverprogram.h:
Extend to support FD passing
Extend the RPC client code to allow file descriptors to be sent
to the server with calls, and received back with replies.
* src/remote/remote_driver.c: Stub extra args
* src/libvirt_private.syms, src/rpc/virnetclient.c,
src/rpc/virnetclient.h, src/rpc/virnetclientprogram.c,
src/rpc/virnetclientprogram.h: Extend APIs to allow
FD passing
Define two new RPC message types VIR_NET_CALL_WITH_FDS and
VIR_NET_REPLY_WITH_FDS. These message types are equivalent
to VIR_NET_CALL and VIR_NET_REPLY, except that between the
message header, and payload there is a 32-bit integer field
specifying how many file descriptors have been passed.
The actual file descriptors are sent/recv'd out of band.
* src/rpc/virnetmessage.c, src/rpc/virnetmessage.h,
src/libvirt_private.syms: Add support for handling
passed file descriptors
* src/rpc/virnetprotocol.x: Extend protocol for FD
passing
Add APIs to the virNetSocket object, to allow file descriptors
to be sent/received over UNIX domain socket connections
* src/rpc/virnetsocket.c, src/rpc/virnetsocket.h,
src/libvirt_private.syms: Add APIs for FD send/recv
The QEMU monitor command 'add_client' can be used to connect to
a VNC or SPICE graphics display. This allows for implementation
of the virDomainOpenGraphics API
* src/qemu/qemu_driver.c: Implement virDomainOpenGraphics
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h:
Add binding for 'add_client' command
Not all VNC/SPICE servers use a TCP socket for their connections.
It is possible to configure a UNIX socket server. The graphics
event must thus include a UNIX socket address type.
* include/libvirt/libvirt.h.in: Add UNIX socket address type
for graphics event
* src/qemu/qemu_monitor_json.c: Add 'unix' string to address
type enum
The virDomainOpenGraphics API allows a libvirt client to pass in
a file descriptor for an open socket pair, and get it connected
to the graphics display of the guest. This is limited to working
with local libvirt hypervisors connected over a UNIX domain
socket, since it will use UNIX FD passing
* include/libvirt/libvirt.h.in: Define virDomainOpenGraphics
* src/driver.h: Define driver for virDomainOpenGraphics
* src/libvirt_public.syms, src/libvirt.c: Entry point for
virDomainOpenGraphics
* src/libvirt_internal.h: VIR_DRV_FEATURE_FD_PASSING
This refactors the TAP creation code out of brAddTap into a new
function brCreateTap to allow it to be used on its own. I have also
changed ifSetInterfaceMac to brSetInterfaceMac and exported it since
it is will be needed by code outside of util/bridge.c in the next
patch.
AUTHORS | 1 +
src/libvirt_bridge.syms | 2 +
src/util/bridge.c | 116 +++++++++++++++++++++++++++++++----------------
src/util/bridge.h | 9 ++++
4 files changed, 89 insertions(+), 39 deletions(-)
Every time we write XML into a file we call virEmitXMLWarning to write a
warning that the file is automatically generated. virXMLSaveFile
simplifies this into a single step and makes rewriting existing XML file
safe by using virFileRewrite internally.
When saving config files we just overwrite old content of the file. In
case something fails during that process (e.g. disk gets full) we lose
both old and new content. This patch makes the process more robust by
writing the new content into a separate file and only if that succeeds
the original file is atomically replaced with the new one.
This change adds some systemtap/dtrace probes to the QEMU monitor
client code. In particular it allows watching of all operations
for a VM
* examples/systemtap/qemu-monitor.stp: Watch all monitor commands
* src/Makefile.am: Passing libdir/bindir/sbindir to dtrace2systemtap.pl
* src/dtrace2systemtap.pl: Accept libdir/bindir/sbindir as args
and look for '# binary:' comment to mark probes against libvirtd
vs libvirt.so
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor_json.c,
src/qemu/qemu_monitor_text.c: Add probes for key functions
Previous commit clears number of items alocated in lxcSetupLoopDevices
if VIR_REALLOC_N fails. In that case, the pointer is not NULL, and
causes leaking FDs that have been allocated.
* src/lxc/lxc_controller.c: revert zeroing array size
If the function lxcSetupLoopDevices(def, &nloopDevs, &loopDevs) failed,
the variable loopDevs will keep a initial NULL value, however, the
function VIR_FORCE_CLOSE(loopDevs[i]) will directly deref it.
This patch also fixes returning a bogous number of devices from
lxcSetupLoopDevices on an error path.
* rc/lxc/lxc_controller.c: fixed a null pointer dereference.
Signed-off-by: Alex Jia <ajia@redhat.com>
Cppcheck detected a syntaxError on lxcDomainInterfaceStats.
* src/lxc/lxc_driver.c: fixed missing '{' in the function lxcDomainInterfaceStats.
Signed-off-by: Alex Jia <ajia@redhat.com>
Rather than making all clients of monitor commands that are JSON-only
check whether yajl support was compiled in, it is simpler to just
avoid setting the capability bit up front if we can't use the capability.
* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags): Only set
capability bit if we also have yajl library to use it.
* src/qemu/qemu_driver.c (qemuDomainReboot): Drop #ifdefs.
* src/qemu/qemu_process.c (qemuProcessStart): Likewise.
* tests/qemuhelptest.c (testHelpStrParsing): Pass test even
without yajl.
* tests/qemuxml2argvtest.c (mymain): Simplify use of json flag.
* tests/qemuxml2argvdata/qemuxml2argv-disk-drive-error-*.args:
Update expected results to match.
Break some long lines, and use more efficient functions when possible,
such as relying on virBufferEscapeString to skip output on a NULL arg.
Ensure that output does not embed newlines, since auto-indent won't
work in those situations.
* src/conf/domain_conf.c (virDomainTimerDefFormat): Break output lines.
(virDomainDefFormatInternal, virDomainDiskDefFormat)
(virDomainActualNetDefFormat, virDomainNetDefFormat)
(virDomainHostdevDefFormat): Minor cleanups.
Fixing this involved some refactoring of common code out of
domain_conf and nwfilter_conf into nwfilter_params.
* src/conf/nwfilter_params.h (virNWFilterFormatParamAttributes):
Adjust signature.
* src/conf/nwfilter_params.c (_formatParameterAttrs)
(virNWFilterFormatParamAttributes): Adjust indentation handling,
and handle filterref here.
(formatterParam): Delete unused struct.
* src/conf/domain_conf.c (virDomainNetDefFormat): Adjust caller.
* src/conf/nwfilter_conf.c (virNWFilterIncludeDefFormat): Likewise.
Detected by Coverity. Only possible if qemu-img gives bogus output,
but we might as well be robust.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Check for strstr failure.
If a disk source gets dropped because it is not accessible,
mgmt application might want to be informed about this. Therefore
we need to emit an event. The event presented in this patch
is however a bit superset of what written above. The reason is simple:
an intention to be easily expanded, e.g. on 'user ejected disk
in guest' events. Therefore, callback gets source string and disk alias
(which should be unique among a domain) and reason (an integer);
This patch implements on_missing feature in qemu driver.
Upon qemu startup process an accessibility of CDROMs
and floppy disks is checked. The source might get dropped
if unavailable and on_missing is set accordingly.
No event is emit thought. Look for follow up patch.
This patch is rather cosmetic as it only moves device alias
assignation from command line construction just before that.
However, it is needed in connotation of previous and next patch.
This attribute says what to do with cdrom (or floppy) if
the source is missing. It accepts:
- mandatory - fail if missing for any reason (the default)
- requisite - fail if missing on boot up, drop if missing on
migrate/restore/revert
- optional - drop if missing at any start attempt.
However, this patch introduces only XML part of this new
functionality.
Splitting into two functions allows the user to call the right
function, rather than having to remember that a *Free function is
an exception to the rule.
* src/conf/storage_conf.h (virStoragePoolSourceClear): New function.
* src/libvirt_private.syms (storage_conf.h): Export it.
* src/conf/storage_conf.c (virStoragePoolSourceFree): Split...
(virStoragePoolSourceClear): ...into new function.
(virStoragePoolDefFree, virStoragePoolDefParseSourceString):
Update callers.
* src/test/test_driver.c (testStorageFindPoolSources): Likewise.
* src/storage/storage_backend_fs.c
(virStorageBackendFileSystemNetFindPoolSourcesFunc)
(virStorageBackendFileSystemNetFindPoolSources): Likewise.
* src/storage/storage_backend_iscsi.c
(virStorageBackendISCSIFindPoolSources): Likewise.
* src/storage/storage_backend_logical.c
(virStorageBackendLogicalFindPoolSources): Likewise.
Detected by Coverity. virStoragePoolSourceFree does not free the
actual passed-in pointer. A bigger patch would be to rename it
virStoragePoolSourceClear to match behavior, or even split it into
two functions depending on needed behavior; but this is the minimal
fix to the one location out of eight that leaked memory.
* src/storage/storage_backend_iscsi.c
(virStorageBackendISCSIFindPoolSources): Free memory.
Based on a report by Coverity. waitpid() can leak resources if it
fails with EINTR, so it should never be used without checking return
status. But we already have a helper function that does that, so
use it in more places.
* src/lxc/lxc_container.c (lxcContainerAvailable): Use safer
virWaitPid.
* daemon/libvirtd.c (daemonForkIntoBackground): Likewise.
* tests/testutils.c (virtTestCaptureProgramOutput, virtTestMain):
Likewise.
* src/libvirt.c (virConnectAuthGainPolkit): Simplify with virCommand.
Detected by Coverity. Both text and JSON monitors set only the
bus and unit fields, which means driveAddr.controller spends
life as garbage on the stack, and is then memcpy()'d into the
in-memory representation which the user can see via dumpxml.
* src/qemu/qemu_hotplug.c (qemuDomainAttachSCSIDisk): Only copy
defined fields.
More simplifications possible due to auto-indent. Also,
<bandwidth> within <actual> was only using 6 instead of 8 spaces.
* src/util/network.h (virVirtualPortProfileFormat)
(virBandwidthDefFormat): Alter signature.
* src/util/network.c (virVirtualPortProfileFormat)
(virBandwidthDefFormat): Alter indentation.
(virBandwidthChildDefFormat): Tweak to make use easier.
* src/conf/network_conf.c (virPortGroupDefFormat)
(virNetworkDefFormat): Adjust callers.
* src/conf/domain_conf.c (virDomainNetDefFormat): Likewise.
(virDomainActualNetDefFormat): Likewise, and fix bandwidth
indentation.
Auto-indent makes life a bit easier; this patch also drops unused
arguments and replaces a misspelled flag name with two entry points
instead, so that callers don't have to worry about how much spacing
is present when embedding cpu elements.
* src/conf/cpu_conf.h (virCPUFormatFlags): Delete.
(virCPUDefFormat): Drop unused argument.
(virCPUDefFormatBuf): Alter signature.
(virCPUDefFormatBufFull): New prototype.
* src/conf/cpu_conf.c (virCPUDefFormatBuf): Split...
(virCPUDefFormatBufFull): ...into new function.
(virCPUDefFormat): Adjust caller.
* src/conf/domain_conf.c (virDomainDefFormatInternal): Likewise.
* src/conf/capabilities.c (virCapabilitiesFormatXML): Likewise.
* src/cpu/cpu.c (cpuBaselineXML): Likewise.
* tests/cputest.c (cpuTestCompareXML): Likewise.
The improvements to virBuffer, along with a paradigm shift to pass
the original buffer through rather than creating a second buffer,
allow us to shave off quite a few lines of code.
* src/util/sysinfo.h (virSysinfoFormat): Alter signature.
* src/util/sysinfo.c (virSysinfoFormat, virSysinfoBIOSFormat)
(virSysinfoSystemFormat, virSysinfoProcessorFormat)
(virSysinfoMemoryFormat): Change indentation parameter.
* src/conf/domain_conf.c (virDomainSysinfoDefFormat): Adjust
caller.
* src/qemu/qemu_driver.c (qemuGetSysinfo): Likewise.
Add a test for the simple parts of my indentation changes, and
fix the fallout.
* tests/domainsnapshotxml2xmltest.c: New test.
* tests/Makefile.am (domainsnapshotxml2xmltest_SOURCES): Build it.
* src/conf/domain_conf.c (virDomainSnapshotDefFormat): Avoid NULL
deref, match documented order.
* src/conf/domain_conf.h (virDomainSnapshotDefFormat): Add const.
* tests/domainsnapshotxml2xmlout/all_parameters.xml: Tweak output.
* tests/domainsnapshotxml2xmlout/disk_snapshot.xml: Likewise.
* tests/domainsnapshotxml2xmlout/full_domain.xml: Likewise.
* .gitignore: Exempt new binary.
<domainsnapshot> is the first public instance of <domain> being
used as a sub-element, although we have two other private uses
(runtime state, and migration cookie). Although indentation has
no effect on XML parsing, using it makes the output more consistent.
This uses virBuffer auto-indentation to obtain the effect, for all
but the portions of <domain> that are not generated a line at a
time into the same virBuffer. Further patches will clean up the
remaining problems.
* src/conf/domain_conf.h (virDomainDefFormatInternal): New prototype.
* src/conf/domain_conf.c (virDomainDefFormatInternal): Export.
(virDomainObjFormat, virDomainSnapshotDefFormat): Update callers.
* src/libvirt_private.syms (domain_conf.h): Add new export.
* src/qemu/qemu_migration.c (qemuMigrationCookieXMLFormat): Use
new function.
(qemuMigrationCookieXMLFormatStr): Update caller.
Rather than having to adjust all callers in a chain to deal with
indentation, it is nicer to have virBuffer do auto-indentation.
* src/util/buf.h (_virBuffer): Increase size.
(virBufferAdjustIndent, virBufferGetIndent): New prototypes.
* src/libvirt_private.syms (buf.h): Export new functions.
* src/util/buf.c (virBufferAdjustIndent, virBufferGetIndent): New
functions.
(virBufferSetError, virBufferAdd, virBufferAddChar)
(virBufferVasprintf, virBufferStrcat, virBufferURIEncodeString):
Implement auto-indentation.
* tests/virbuftest.c (testBufAutoIndent): Test it.
(testBufInfiniteLoop): Don't rely on internals.
Idea by Daniel P. Berrange.
The next patch wants to add some sanity checking, which would
be a different error than ENOMEM. Many existing callers blindly
report OOM failure if virBuf reports an error, and this will be
wrong in the (unlikely) case that they actually had a usage error
instead; but since the most common error really is ENOMEM, I'm
not going to fix all callers. Meanwhile, new discriminating
callers can react differently depending on what failure happened.
* src/util/buf.c (virBufferSetError): Add parameter.
(virBufferGrow, virBufferVasprintf, virBufferEscapeString)
(virBufferEscapeSexpr, virBufferEscapeShell): Adjust callers.
Although the compiler wasn't complaining (since it was the pointer,
rather than what was being pointed to, that was actually const), it
looks quite suspicious to call a function with an argument labeled
const when the nature of the pointer (virBufferPtr) is hidden behind
a typedef. Dropping const makes the function declarations easier
to read.
* src/util/buf.h: Drop const from all functions that modify buffer
argument.
* src/util/buf.c (virBufferSetError, virBufferAdd)
(virBufferContentAndReset, virBufferFreeAndReset)
(virBufferAsprintf, virBufferVasprintf, virBufferEscapeString)
(virBufferEscapeSexpr, virBufferEscape): Fix fallout.
There is a little difference between the output of domxml-to-native and the actual commandline.
No matter qemu is in control or readline mode, domxml-to-native always converts it to readline mode.
That is because the parameter "monitor_json" for qemuBuildCommandLine() is always set to false
in qemuDomainXMLToNative().
Signed-off-by: tangchen <tangchen@cn.fujitsu.com>
The glibc ones (intentionally) cannot handle ptys opened in a
devpts not mounted at /dev/pts.
Drop the (un-exported, unused) virFileOpenTtyAt.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The XML parser for the qemu specific extensions expects the qemu name-space
to be bound to the 'qemu' prefix. This is too strict, since the name of the
name-space-prefix is only meant as an internal lookup key. Only the associated
URI is relevant.
<domain>...
<qemu:commandline xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0">
...</qemu:commandline>
</domain>
<domain xmlns:ns0="http://libvirt.org/schemas/domain/qemu/1.0">...
<ns0:commandline>
...</ns0:commandline>
</domain>
<domain xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0">
<qemu:commandline xmlns:qemu="urn:foo">
...</qemu:commandline>
</domain>
Remove the test for checking the name-space binding on the top-level <domain>
element. Registering the name-space with XPath is enough.
Signed-off-by: Philipp Hahn <hahn@univention.de>
When I compile libvirt with gcc-4.6.1 in ubuntu 11.10, got error as below:
CCLD libvirtd
/usr/bin/ld: ../src/.libs/libvirt_driver_qemu.a(libvirt_driver_qemu_la-qemu_migration.o): undefined reference to symbol 'gnutls_x509_crt_get_dn@@GNUTLS_1_4'
/usr/bin/ld: note: 'gnutls_x509_crt_get_dn@@GNUTLS_1_4' is defined in DSO /usr/lib/x86_64-linux-gnu/libgnutls.so so try adding it to the linker command line
/usr/lib/x86_64-linux-gnu/libgnutls.so: could not read symbols: Invalid operation
collect2: ld returned 1 exit status
make[3]: *** [libvirtd] Error 1
It can compile with gcc-4.5.2 in ubuntu 11.04, but it can not compile with gcc-4.6.1 in ubuntu 11.10.
I didn't find reason. Does Anyone know the reason or the different between gcc-4.5.2 and gcc-4.6.1?
I still provide a patch for this. Just make it is working now.
Signed-off-by: soulxu <soulxu@soulxu-ThinkPad-T410.(none)>
This adds support for a libvirt client configuration file
either /etc/libvirt/libvirt.conf for privileged clients,
or $HOME/.libvirt/libvirt.conf for unprivileged clients.
It allows one parameter
uri_aliases = [
"hail=qemu+ssh://root@hail.cloud.example.com/system",
"sleet=qemu+ssh://root@sleet.cloud.example.com/system",
]
Any call to virConnectOpen with a non-NULL URI will first
attempt to match against the uri_aliases list. An application
can disable this by using VIR_CONNECT_NO_ALIASES
* docs/uri.html.in: Document URI aliases
* include/libvirt/libvirt.h.in: Add VIR_CONNECT_NO_ALIASES
* libvirt.spec.in, mingw32-libvirt.spec.in: Add /etc/libvirt/libvirt.conf
* src/Makefile.am: Install default config file
* src/libvirt.c: Add support for URI aliases
* src/remote/remote_driver.c: Don't try to handle URIs
with no scheme and which clearly are not paths
* src/util/conf.c: Don't raise error on virConfFree(NULL)
* src/xen/xen_driver.c: Don't raise error on URIs
with no scheme
We recently added support for VIR_DOMAIN_START_AUTODESTROY and
an impl to the QEMU driver. It is very desirable to support in
other drivers, so this adds it to LXC and UML
* src/lxc/lxc_conf.h, src/lxc/lxc_driver.c,
src/uml/uml_conf.h, src/uml/uml_driver.c: Wire up autodestroy
functions
Noticed when testing new libvirt against old qemu that lacked the
snapshot_blkdev HMP command. Libvirt was mistakenly treating the
command as successful, and re-writing the domain XML to use the
just-created 0-byte file, rendering the domain broken on restart.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextDiskSnapshot):
Notice another possible error message.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Don't keep 0-byte file
on failure.
compile error:
./src/.libs/libvirt_driver_qemu.a(libvirt_driver_qemu_la-qemu_hostdev.o): In function `qemuPrepareHostdevPCIDevices':
/home/soulxu/data/work-code/libvirt/src/qemu/qemu_hostdev.c:183: undefined reference to `pciDeviceListFind'
/home/soulxu/data/work-code/libvirt/src/qemu/qemu_hostdev.c:230: undefined reference to `pciDeviceListFind'
./src/.libs/libvirt_driver_qemu.a(libvirt_driver_qemu_la-qemu_hostdev.o): In function `qemuGetActivePciHostDeviceList':
/home/soulxu/data/work-code/libvirt/src/qemu/qemu_hostdev.c:102: undefined reference to `pciDeviceListFind'
./src/.libs/libvirt_driver_qemu.a(libvirt_driver_qemu_la-qemu_hostdev.o): In function `qemuDomainReAttachHostdevDevices':
/home/soulxu/data/work-code/libvirt/src/qemu/qemu_hostdev.c:370: undefined reference to `pciDeviceListFind'
Signed-off-by: Xu He Jie <xuhj@linux.vnet.ibm.com>
probes.h is generated in build directory; setting a dependency on
probes.h from source directory doesn't work well in VPATH builds. Caused
by commit 1afcfbdda0
When I run 'make dist', I receive the following error messages:
make[1]: Entering directory `/home/wency/source/libvirt/src'
GEN remote/remote_protocol.h
GEN remote/remote_protocol.c
GEN remote/qemu_protocol.h
GEN remote/qemu_protocol.c
GEN remote/qemu_client_bodies.h
CC libvirt_driver_remote_la-remote_protocol.lo
In file included from ./remote/remote_protocol.h:16,
from ./remote/remote_protocol.c:7:
/internal.h:249:23: error: probes.h: No such file or directory
make[1]: *** [libvirt_driver_remote_la-remote_protocol.lo] Error 1
make[1]: Leaving directory `/home/wency/source/libvirt/src'
make: *** [distdir] Error 1
The reason is that we use probes.h before generating it.
BZ# https://bugzilla.redhat.com/show_bug.cgi?id=736214
The problem is caused by the original info of domain's PCI dev is
maintained by qemu_driver->activePciHostdevs list, (E.g. dev->reprobe,
which stands for whether need to reprobe driver for the dev when do
reattachment). The fields (dev->reprobe, dev->unbind_from_stub, and
dev->remove_slot) are initialized properly when preparing the PCI
device for managed attachment. However, when do reattachment, it
construct a complete new "pciDevice" without honoring the original
dev info, and thus the dev won't get the original driver or can get
other problem.
This patch is to fix the problem by get the devs from list
driver->activePciHostdevs.
Tested with following 3 scenarios:
* the PCI was bound to some driver not pci-stub before attaching
result: the device will be bound to the original driver
* the PCI was bound to pci-stub before attaching
result: no driver reprobing, and still bound to pci-stub
* The PCI was not bound to any driver
result: no driver reprobing, and still not bound to any driver.
Commit 0472f39 plugged a leak, but introduced another bug:
Actually looks like physfndev is conditionally allocated in getPhysfnDev
Its better to modify getPhysfnDev to allocate physfndev every time.
When failing on starting a domain, it tries to reattach all the PCI
devices defined in the domain conf, regardless of whether the devices
are still used by other domain. This will cause the devices to be deleted
from the list qemu_driver->activePciHostdevs, thus the devices will be
thought as usable even if it's not true. And following commands
nodedev-{reattach,reset} will be successful.
How to reproduce:
1) Define two domains with same PCI device defined in the confs.
2) # virsh start domain1
3) # virsh start domain2
4) # virsh nodedev-reattach $pci_device
You will see the device will be reattached to host successfully.
As pciDeviceReattach just check if the device is still used by
other domain via checking if the device is in list driver->activePciHostdevs,
however, the device is deleted from the list by step 2).
This patch is to prohibit the bug by:
1) Prohibit a domain starting or device attachment right at
preparation period (qemuPrepareHostdevPCIDevices) if the
device is in list driver->activePciHostdevs, which means
it's used by other domain.
2) Introduces a new field for struct _pciDevice, (const char *used_by),
it will be set as the domain name at preparation period,
(qemuPrepareHostdevPCIDevices). Thus we can prohibit deleting
the device from driver->activePciHostdevs if it's still used by
other domain when stopping the domain process.
* src/pci.h (define two internal functions, pciDeviceSetUsedBy and
pciDevceGetUsedBy)
* src/pci.c (new field "const char *used_by" for struct _pciDevice,
implementations for the two new functions)
* src/libvirt_private.syms (Add the two new internal functions)
* src/qemu_hostdev.h (Modify the definition of functions
qemuPrepareHostdevPCIDevices, and qemuDomainReAttachHostdevDevices)
* src/qemu_hostdev.c (Prohibit preparation and don't delete the
device from activePciHostdevs list if it's still used by other domain)
* src/qemu_hotplug.c (Update function usage, as the definitions are
changed)
Signed-off-by: Eric Blake <eblake@redhat.com>
virInitialize() → xenRegister() → xenhypervisorInit() determines the
version of the Hypervisor. This breaks xencapstest when building as root
on a dom0 system, since xenHypervisorBuildCapabilities() adds the "hap"
and "viridian" features based on the detected version.
Add an optional parameter to xenhypervisorInit() to disable automatic
detection of the Hypervisor version. The passed in arguments are used
instead.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Calling virInitialize() → xenRegister() → xenhypervisorInit() directly
opens a connection to the Xen Hypervisor, which breaks some unit tests.
Move all static variables into a struct to make it easier to override
them when testing.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Coverity detected that the only way to get to the cleanup label
is if objectSpec had been successfully allocated, so the null
check was dead code.
* src/esx/esx_vi.c (esxVI_LookupObjectContentByType): Drop
redundant null check.
Detected by Coverity. Leak present since commit 874e65a; and
while commit d50bb45 tried to fix the issue, it missed a path.
* src/conf/domain_conf.c (virDomainDefParseBootXML): Always clean
up useserial.
Setting a hostname that cannot be resolved is not the best configuration
but since virGetHostname only calls getaddrinfo to get host's canonical
name and we do not fail if the returned canonical name is NULL or
"localhost", there is no reason why we should fail if getaddrinfo itself
fails.
Detected by Coverity. p (the pointer to the string) is always true;
when in reality, we wanted to know whether the integer value of the
just-parsed string is '0' or '1'. Logic bug since commit b1b5b51.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextGetBlockInfo): Set
results to proper value.
Detected by Coverity. If, for some reason, our text monitor input
does not match our assumptions, we end up incrementing p while it
is NULL, then dereferencing the pointer 0x1, which will fault.
* src/qemu/qemu_monitor_text.c
(qemuMonitorTextGetBlockStatsParamsNumber): Rewrite to avoid
deref of strchr failure. Fix indentation.
Coverity complained that most, but not all, clients of virUUIDParse
were checking for errors. Silence those coverity warnings by
explicitly marking the cases where we trust the input, and fixing
one instance that really should have been checking. In particular,
this silences a rather large percentage of the warnings I saw on my
most recent Coverity analysis run.
* src/util/uuid.h (virUUIDParse): Enforce rules.
* src/util/uuid.c (virUUIDParse): Drop impossible check; at least
Coverity will detect if we break rules and pass NULL.
* src/xenapi/xenapi_driver.c (xenapiDomainCreateXML)
(xenapiDomainLookupByID, xenapiDomainLookupByName)
(xenapiDomainDefineXML): Ignore return when we trust data source.
* src/vbox/vbox_tmpl.c (nsIDtoChar, vboxIIDToUUID_v3_x)
(vboxCallbackOnMachineStateChange)
(vboxCallbackOnMachineRegistered, vboxStoragePoolLookupByName):
Likewise.
* src/node_device/node_device_hal.c (gather_system_cap): Likewise.
* src/xenxs/xen_sxpr.c (xenParseSxpr): Check for errors.
In virFDStreamOpenFileInternal(), a errfd pipe is opened by
virCommandRunAsync() and given to virFDStreamOpenInternal().
It seems virFDStream should close errfd, just like the other
fd it is given.
This fixes screenshots leaking FDs:
http://bugzilla.redhat.com/show_bug.cgi?id=745761
virCommandTransferFD promises that the fd is no longer owned by
the caller. Normally, we want the fd to remain open until the
child runs, but in error situations, we must close it earlier.
* src/util/command.c (virCommandTransferFD): Close fd now if we
can't track it to close later.
(virCommandKeepFD): Adjust helper to make this easier.
As this is needed. Although some functions check for domain
being active before obtaining job, we need to check it after,
because obtaining job unlocks domain object, during which
a state of domain can be changed.
Currently, push & pop from event queue (both server & client side)
rely on lock from higher levels, e.g. on driver lock (qemu),
private_data (remote), ...; This alone is not sufficient as not
every function that interacts with this queue can/does lock,
esp. in client where we have a different approach, "passing
the buck".
Therefore we need a separate lock just to protect event queue.
For more info see:
https://bugzilla.redhat.com/show_bug.cgi?id=743817
This patch extends qemudDomainCoreDump so it supports new VIR_DUMP_RESET
flag. If this flag is set, domain is reset on successful dump. However,
this is needed to be done after we start CPUs.
With the recent refactoring of qemu snapshot relationships, it
is now trivial to filter on leaves.
* src/conf/domain_conf.c (virDomainSnapshotObjListCount)
(virDomainSnapshotObjListCopyNames): Handle new flag.
* src/qemu/qemu_driver.c (qemuDomainSnapshotListNames)
(qemuDomainSnapshotNum, qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren): Pass new flag through.
For some versions of Xen the difference between "tap" and "tap2" is
important. When converting back from xen-sxpr to libvirt-xml, that
information is lost, which breaks re-defining the domain using that
data.
Explicitly return "tap2" for disks defined as "device/tap2".
Signed-off-by: Philipp Hahn <hahn@univention.de>
When PyGrub is used as the bootloader in Xen, it gets passed the first
bootable disk. Xend supports a "bootable"-flag for this, which isn't
explicitly supported by libvirt.
When converting libvirt-xml to xen-sxpr the "bootable"-flag gets
implicitly set by xen.xend.XenConfig.device_add() for the first disk
(marked as "Compat hack -- mark first disk bootable").
When converting back xen-sxpr to libvirt-xml, the disks are returned in
the internal order used by Xend ignoring the "bootable"-flag, which
loses the original order. When the domain is then re-defined, the order
of disks is changed, which breaks PyGrub, since a different disk gets
passed.
When converting xen-sxpr to libvirt-xml, use the "bootable"-flag to
determine the first disk.
This isn't perfect, since several disks can be marked as bootable using
the Xend-API, but that is not supported by libvirt. In all known cases
relevant to libvirt exactly one disk is marked as bootable.
Signed-off-by: Philipp Hahn <hahn@univention.de>
VirtFS allows the user to choose between path/handle based fs driver.
As of now, libvirt hardcoded path based driver only. This patch provides
a solution to allow user to choose between path/handle based fs driver.
Sample:
<filesystem type='mount'>
<driver type='handle'/>
<source dir='/folder/to/share1'/>
<target dir='mount_tag1'/>
</filesystem>
<filesystem type='mount'>
<driver type='path'/>
<source dir='/folder/to/share2'/>
<target dir='mount_tag2'/>
</filesystem>
Signed-off-by: Harsh Prateek Bora <harsh@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Implement a generic helper to escape a given set of characters with a
leading '\'. Generalizes virBufferEscapeSexpr().
Signed-off-by: Sage Weil <sage@newdream.net>
The previous optimizations lead to some follow-on cleanups.
* src/conf/domain_conf.c (virDomainSnapshotForEachChild)
(virDomainSnapshotForEachDescendant): Drop dead parameter.
(virDomainSnapshotActOnDescendant)
(virDomainSnapshotObjListNumFrom)
(virDomainSnapshotObjListGetNamesFrom): Update callers.
* src/qemu/qemu_driver.c (qemuDomainSnapshotNumChildren)
(qemuDomainSnapshotListChildrenNames, qemuDomainSnapshotDelete):
Likewise.
* src/conf/domain_conf.h: Update prototypes.
Among other improvements, virDomainSnapshotForEachDescendant is
changed from iterative O(n^2) to recursive O(n). A bit better
than the O(n^3) implementation in virsh snapshot-list!
* src/conf/domain_conf.c (virDomainSnapshotObjListNum)
(virDomainSnapshotObjListNumFrom)
(virDomainSnapshotObjeListGetNames, virDomainSnapshotForEachChild)
(virDomainSnapshotForEachDescendant): Optimize.
(virDomainSnapshotActOnDescendant): Tweak.
(virDomainSnapshotActOnChild, virDomainSnapshotMarkDescendant):
Delete, now that they are unused.
Maintain the parent/child relationships of all qemu snapshots.
* src/qemu/qemu_driver.c (qemuDomainSnapshotLoad): Populate
relationships after loading.
(qemuDomainSnapshotCreateXML): Set relations on creation; tweak
redefinition to reuse existing object.
(qemuDomainSnapshotReparentChildren, qemuDomainSnapshotDelete):
Clear relations on delete.
No one was using virDomainSnapshotHasChildren, but that was an
O(n) function. Exposing and tracking a bit more metadata for each
snapshot will allow the same query to be made with an O(1) query
of the member field. For single snapshot operations (create,
delete), callers can be trusted to maintain the metadata themselves,
but for reloading, we can't compute parents as we go since there
is no guarantee that parents were parsed before children, so we also
provide a function to refresh the relationships, and which can
be used to detect if the user has ignored our warnings and been
directly modifying files in /var/lib/libvirt/qemu/snapshot. This
patch only adds metadata; later patches will actually use it.
This layout intentionally hardcodes the size of each snapshot struct,
by tracking sibling pointers, rather than having to deal with the
headache of yet more memory management by directly sticking a
dynamically sized child[] on each parent.
* src/conf/domain_conf.h (_virDomainSnapshotObj)
(_virDomainSnapshotObjList): Add members.
(virDomainSnapshotUpdateRelations, virDomainSnapshotDropParent):
New prototypes.
(virDomainSnapshotHasChildren): Delete.
* src/conf/domain_conf.c (virDomainSnapshotSetRelations)
(virDomainSnapshotUpdateRelations, virDomainSnapshotDropParent):
New functions.
(virDomainSnapshotHasChildren): Drop unused function.
* src/libvirt_private.syms (domain_conf): Update exports.
To date, JSON disk snapshots worked by accident, as they were always
using hmp fallback due to a typo in commit e702b5b not picking up
on the (intentional) difference in command names between the two
monitor protocols.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot):
Spell QMP command correctly.
Reported by Luiz Capitulino.
Detected by autogen.sh on a cross-mingw build:
Creating library file: .libs/libvirt.dll.a
Cannot export virNetSASLContextCheckIdentity: symbol not defined
Cannot export virNetSASLContextNewServer: symbol not defined
...
* src/libvirt_private.syms (virnetsaslcontext.h): Move symbols...
* src/libvirt_sasl.syms: ...to new file.
* src/Makefile.am (USED_SYM_FILES) [HAVE_SASL]: Use new file.
(EXTRA_DIST): Ship it.
I got these distcheck failures with sanlock enabled:
ERROR: files left in build directory after distclean:
./tools/virt-sanlock-cleanup
./src/locking/qemu-sanlock.conf
* src/Makefile.am (DISTCLEANFILES) [HAVE_SANLOCK]: Clean built
file.
* tools/Makefile.am (DISTCLEANFILES): Likewise.
The libvirtd daemon had a few crude system tap probes. Some of
these were broken during the RPC rewrite. The new modular RPC
code is structured in a way that allows much more effective
tracing. Instead of trying to hook up the original probes,
define a new set of probes for the RPC and event code.
The master probes file is now src/probes.d. This contains
probes for virNetServerClientPtr, virNetClientPtr, virSocketPtr
virNetTLSContextPtr and virNetTLSSessionPtr modules. Also add
probes for the poll event loop.
The src/dtrace2systemtap.pl script can convert the probes.d
file into a libvirt_probes.stp file to make use from systemtap
much simpler.
The src/rpc/gensystemtap.pl script can generate a set of
systemtap functions for translating RPC enum values into
printable strings. This works for all RPC header enums (program,
type, status, procedure) and also the authentication enum
The PROBE macro will automatically generate a VIR_DEBUG
statement, so any place with a PROBE can remove any existing
manual DEBUG statements.
* daemon/libvirtd.stp, daemon/probes.d: Remove obsolete probing
* daemon/libvirtd.h: Remove probe macros
* daemon/Makefile.am: Remove all probe buildings/install
* daemon/remote.c: Update authentication probes
* src/dtrace2systemtap.pl, src/rpc/gensystemtap.pl: Scripts
to generate STP files
* src/internal.h: Add probe macros
* src/probes.d: Master list of probes
* src/rpc/virnetclient.c, src/rpc/virnetserverclient.c,
src/rpc/virnetsocket.c, src/rpc/virnettlscontext.c,
src/util/event_poll.c: Insert probe points, removing any
DEBUG statements that duplicate the info
Pull the call to gnutls_x509_crt_get_dn up into a higher function
so that the 'dname' variable will be available for probe points
* src/rpc/virnettlscontext.c: Pull gnutls_x509_crt_get_dn up
one level
If we receive an error on the stream, set the EOF marker so
that any further (bogus) incoming data is dropped.
* src/rpc/virnetclientstream.c: Set EOF on stream
To avoid static linking libvirtd to the RPC server code, which
then prevents sane introduction of DTrace probes, put it all
in the libvirt.so, and export it
* daemon/Makefile.am: Don't link to RPC libraries
* src/Makefile.am: Link all RPC libraries to libvirt.so
* src/libvirt_private.syms: Export all RPC functions
It was fairly trivial to return snapshot listing based on a
point in the hierarchy, rather than starting at all roots.
* src/esx/esx_driver.c (esxDomainSnapshotNumChildren)
(esxDomainSnapshotListChildrenNames): New functions.
Not too hard to wire up. The trickiest part is realizing that
listing children of a snapshot cannot use SNAPSHOT_LIST_ROOTS,
and that we overloaded that bit to also mean SNAPSHOT_LIST_DESCENDANTS;
we use that bit to decide which iteration to use, but don't want
the existing counting/listing functions to see that bit.
* src/conf/domain_conf.h (virDomainSnapshotObjListNumFrom)
(virDomainSnapshotObjListGetNamesFrom): New prototypes.
* src/conf/domain_conf.c (virDomainSnapshotObjListNumFrom)
(virDomainSnapshotObjListGetNamesFrom): New functions.
* src/libvirt_private.syms (domain_conf.h): Export them.
* src/qemu/qemu_driver.c (qemuDomainSnapshotNumChildren)
(qemuDomainSnapshotListChildrenNames): New functions.
Very mechanical. I'm so glad we've automated the generation of things,
compared to what it was in 0.8.x days, where this would be much longer.
* src/remote/remote_protocol.x
(REMOTE_PROC_DOMAIN_SNAPSHOT_NUM_CHILDREN)
(REMOTE_PROC_DOMAIN_SNAPSHOT_LIST_CHILDREN_NAMES): New rpcs.
(remote_domain_snapshot_num_children_args)
(remote_domain_snapshot_num_children_ret)
(remote_domain_snapshot_list_children_names_args)
(remote_domain_snapshot_list_children_names_ret): New structs.
* src/remote/remote_driver.c (remote_driver): Use it.
* src/remote_protocol-structs: Update.
The previous API addition allowed traversal up the hierarchy;
this one makes it easier to traverse down the hierarchy.
In the python bindings, virDomainSnapshotNumChildren can be
generated, but virDomainSnapshotListChildrenNames had to copy
from the hand-written example of virDomainSnapshotListNames.
* include/libvirt/libvirt.h.in (virDomainSnapshotNumChildren)
(virDomainSnapshotListChildrenNames): New prototypes.
(VIR_DOMAIN_SNAPSHOT_LIST_DESCENDANTS): New flag alias.
* src/libvirt.c (virDomainSnapshotNumChildren)
(virDomainSnapshotListChildrenNames): New functions.
* src/libvirt_public.syms: Export them.
* src/driver.h (virDrvDomainSnapshotNumChildren)
(virDrvDomainSnapshotListChildrenNames): New callbacks.
* python/generator.py (skip_impl, nameFixup): Update lists.
* python/libvirt-override-api.xml: Likewise.
* python/libvirt-override.c
(libvirt_virDomainSnapshotListChildrenNames): New wrapper function.
On xen 4.1 I observed configurations that look like:
(image
(hvm
(kernel '')
(loader '/foo/bar')
))
The kernel element is there but unset. This leads to an empty <kernel/>
element in the XML and even worse makes us skip the boot order parsing
and therefore not emit a <boot device='$dev>'/> element which breaks CD
booting.
otherwise a missing UUID in a domain config just shows:
error: An error occurred, but the cause is unknown
Now we have:
error: configuration file syntax error: config value uuid was missing
* src/storage/storage_backend_logical.c:
If a logical vol is created as striped. (e.g. --stripes 3),
the "device" field of lvs output will have multiple fileds which are
seperated by comma. Thus the RE we write in the codes will not
work well anymore. E.g. (lvs output for a stripped vol, uses "#" as
seperator here):
test_stripes##fSLSZH-zAS2-yAIb-n4mV-Al9u-HA3V-oo9K1B#\
/dev/sdc1(10240),/dev/sdd1(0)#42949672960#4194304
The RE we use:
const char *regexes[] = {
"^\\s*(\\S+),(\\S*),(\\S+),(\\S+)\\((\\S+)\\),(\\S+),([0-9]+),?\\s*$"
};
Also the RE doesn't match the "devices" field of striped vol properly,
it contains multiple "device path" and "offset".
This patch mainly does:
1) Change the seperator into "#"
2) Change the RE for "devices" field from "(\\S+)\\((\\S+)\\)"
into "(\\S+)".
3) Add two new options for lvs command, (segtype, stripes)
4) Extend the RE to match the value for the two new fields.
5) Parse the "devices" field seperately in virStorageBackendLogicalMakeVol,
multiple "extents" info are generated if the vol is striped. The
number of "extents" is equal to the stripes number of the striped vol.
A incidental fix: (virStorageBackendLogicalMakeVol)
Free "vol" if it's new created and there is error.
Demo on striped vol with the patch applied:
% virsh vol-dumpxml /dev/test_vg/vol_striped2
<volume>
<name>vol_striped2</name>
<key>QuWqmn-kIkZ-IATt-67rc-OWEP-1PHX-Cl2ICs</key>
<source>
<device path='/dev/sda5'>
<extent start='79691776' end='88080384'/>
</device>
<device path='/dev/sda6'>
<extent start='62914560' end='71303168'/>
</device>
</source>
<capacity>8388608</capacity>
<allocation>8388608</allocation>
<target>
<path>/dev/test_vg/vol_striped2</path>
<permissions>
<mode>0660</mode>
<owner>0</owner>
<group>6</group>
<label>system_u:object_r:fixed_disk_device_t:s0</label>
</permissions>
</target>
</volume>
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=727474
If parsing qemu command line fails (e.g. because of non-existing
process number supplied), we jump to cleanup label where we free
pidfile. Therefore it needs to be initialized. Otherwise we free
random pointer.
Coverity complained that 4 out of 5 callers to virJSONValueObjectGetBoolean
checked for errors. But we documented that we don't care in this case.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONGetBlockInfo): Use
ignore_value.
Detected by Coverity. We want to increment the size_t counter,
not the pointer to the counter. Bug present since 5f5c6fde (0.9.5).
* src/lxc/lxc_controller.c (lxcSetupLoopDevices): Use correct
precedence.
If we send back an unknown program error for async messages,
we will confuse the client because they only expect replies
for method calls. Just log & drop any invalid async messages
* src/rpc/virnetserver.c: Don't send error for async messages
Commit 597fe3cee6 accidentally
introduced a deadlock when reporting an unknown RPC program.
The virNetServerDispatchNewMessage method is called with
the client locked, and must therefore not attempt to send
any RPC messages back to the client. Only once the incoming
message is passed off to the virNetServerHandleJob worker
is it safe to start sending messages back
* src/rpc/virnetserver.c: Delay checking for unknown RPC
program until in worker thread
Redefining disk-only snapshot xml should work even if the user
did not explicitly pass VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY;
the flag is only required for conditions where the <state>
subelement is not already present in parsing (that is, defining
a new snapshot).
Also, fix the error code of some user-visible errors (the remaining
VIR_ERR_INTERNAL_ERROR should not be user-visible, since parsing
of <active> is only done from internal code).
* src/conf/domain_conf.c (virDomainSnapshotDefParseString): Allow
disks during redefinition of disk snapshot.
I am getting this failure with 'make distcheck':
GEN ../../src/remote_protocol-structs
/bin/sh: ../../src/remote_protocol-structs-t: Permission denied
make[4]: *** [../../src/remote_protocol-structs] Error 1
since it attempts a sub-run of a VPATH 'make check' where $(srcdir)
is intentionally read-only. I'm not sure which commit introduced
the problem, although I suspect it was around 62dee6f when I
refactored protocol struct checking to be more powerful.
$(@F) is required by POSIX, and although it is not yet portable
to all make implementations, we already require GNU make.
* src/Makefile.am (PDWTAGS): Generate temp file into current
directory, since $(srcdir) is read-only during distcheck.
Previously libvirt's disk device XML only had a single attribute,
error_policy, to control both read and write error policy, but qemu
has separate options for controlling read and write. In one case
(enospc) a policy is allowed for write errors but not read errors.
This patch adds a separate attribute that sets only the read error
policy. If just error_policy is set, it will apply to both read and
write error policy (previous behavior), but if the new rerror_policy
attribute is set, it will override error_policy for read errors only.
Possible values for rerror_policy are "stop", "report", and "ignore"
("report" is the qemu-controlled default for rerror_policy when
error_policy isn't specified).
For consistency, the value "report" has been added to the possible
values for error_policy as well.
commit 12062ab set rerror=ignore when error_policy="enospace" was
selected (since the rerror option in qemu doesn't accept "enospc", as
the werror option does).
After that patch was already pushed, Paolo Bonzini noticed it and
commented that leaving rerror at the default ("report") would be a
better choice. This patch corrects the problem - if error_policy =
"enospace" is given, rerror is left off the qemu commandline,
effectively setting it to "report". For other values, rerror is still
set to match werror.
Additionally, the parsing of error_policy was changed to no longer
erroneously allow "default" as a choice - as with most other
attributes, if you want the default setting, just don't specify an
error_policy.
Finally, two ommissions in the first patch were corrected - a
long-dormant qemuxml2argv test for enospace was enabled, and fixed to
pass, and the argv2xml parser in qemu_command.c was updated to
recognize the different spelling on the qemu commandline.
Now that RHEL 6.2 Beta is out, it would be nice to test multifunction
devices on that platform. This changes things so that the multifunction
cap bit can be set in two different ways: by version comparison (needed
for qemu 0.13 which lacked a -device query), and by -device query
(provided by qemu.git and backported to the RHEL beta build of
qemu-kvm which still claims to be a modified 0.12, and therefore needed
for RHEL).
* src/qemu/qemu_capabilities.c (qemuCapsParseDeviceStr): Allow
second method of setting multifunction cap bit.
* tests/qemuhelptest.c (mymain): Test it.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel62-beta: New file.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel62-beta-device: Likewise.
If using one of the new non-NAT/routed virtual network
configurations, the LXC driver would not know how to
setup the VETH devices. Adding in calls to setup the
"actual" network configuration at VM startup and cleanup
when shutting down fixes this.
* src/lxc/lxc_driver.c: Setup/cleanup actual net devs
Implements the documentation for snapshot revert vs. force.
Part of the patch tightens existing behavior (previously, reverting
to an old snapshot without <domain> was blindly attempted, now it
requires force), while part of it relaxes behavior (previously, it
was not possible to revert an active domain to an ABI-incompatible
active snapshot, now force allows this transition).
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Check for
risky situations, and allow force to get past them.
Once we know which set of disks belong to a snapshot, reverting or
deleting that snapshot should visit just those disks, rather than
also visiting disks that were hot-plugged in the meantime or
skipping disks that were hot-unplugged in the meantime.
* src/qemu/qemu_domain.c (qemuDomainSnapshotForEachQcow2): Use
snapshot domain details when available. Avoid NULL deref.
Although reverting to a snapshot is a form of data loss, this is
normally expected. However, there are two cases where additional
surprises (failure to run the reverted state, or a break in
connectivity to the domain) can come into play. Requiring extra
acknowledgment in these cases will make it less likely that
someone can get into an unrecoverable state due to a default revert.
Also create a new error code, so users can distinguish when forcing
would make a difference, rather than having to blindly request force.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_SNAPSHOT_REVERT_FORCE):
New flag.
* src/libvirt.c (virDomainRevertToSnapshot): Document it.
* include/libvirt/virterror.h (VIR_ERR_SNAPSHOT_REVERT_RISKY): New
error value.
* src/util/virterror.c (virErrorMsg): Implement it.
* tools/virsh.c (cmdDomainSnapshotRevert): Add --force to virsh.
* tools/virsh.pod (snapshot-revert): Document it.
Commit 9f5e53e introduced the ability to filter snapshots to
just roots, but it was never implemented for VBox until now.
The VBox implementation prohibits deletion of a snapshot with
multiple children. Hence, there can only be at most one root,
which is found by searching for the snapshot with a NULL uuid.
Prior to 4.0, snapshotGet looked up by UUID, and snapshotFind
looked up by name; after that point, snapshotGet disappeared
and snapshotFind handles uuid or name.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotNum)
(vboxDomainSnapshotListNames): Implement limiting list to root.
Qemu driver tries to update balloon data in virDomainGetInfo and if it
can't do so because there is another monitor job running, it just
reports what's known in domain def. However, if there was no job running
but getting the data from qemu fails, we would fail the whole API. This
doesn't make sense. Let's make the failure nonfatal.
No need to request the parent of a snapshot if we aren't going to use it.
* src/esx/esx_vi.c (esxVI_GetSnapshotTreeByName): Make parent
optional.
* src/esx/esx_driver.c (esxDomainSnapshotCreateXML)
(esxDomainSnapshotLookupByName, esxDomainRevertToSnapshot)
(esxDomainSnapshotDelete): Simplify accordingly.
Commit 9f5e53e introduced the ability to filter snapshots to
just roots, but it was never implemented for ESX until now.
* src/esx/esx_vi.h (esxVI_GetNumberOfSnapshotTrees)
(esxVI_GetSnapshotTreeNames): Add parameter.
* src/esx/esx_vi.c (esxVI_GetNumberOfSnapshotTrees)
(esxVI_GetSnapshotTreeNames): Allow choice of recursion or not.
* src/esx/esx_driver.c (esxDomainSnapshotNum)
(esxDomainSnapshotListNames): Use it to limit to roots.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=730909
When support for setting the qemu disk error policy to "enospc" was
added, it was inadvertently spelled "enospace". This patch corrects
that on the qemu commandline (while retaining the "enospace" spelling
for libvirt's XML).
Also, while examining the qemu source, I found that "enospc" is not
allowed for the read error policy, only for write error policy (makes
sense). Since libvirt currently only has a single error policy
setting, when "enospace" is selected, the read error policy is set to
"ignore".
Destination libvirtd remembers the original name in the prepare phase
and clears it in the finish phase. The original name is used when
comparing domain name in migration cookie.
When booting a virtual machine with a kernel/initrd it is possible
to pass command line arguments using the <cmdline>...args...</cmdline>
element in the guest XML. These appear to the kernel / init process
in /proc/cmdline.
When booting a container we do not have a custom /proc/cmdline,
but we can easily set an environment variable for it. Ideally
we could pass individual arguments to the init process as a
regular set of 'char *argv[]' parameters, but that would involve
libvirt parsing the <cmdline> XML text. This can easily be added
later, even if we add the env variable now
* docs/drvlxc.html.in: Document env variables passed to LXC
* src/conf/domain_conf.c: Add <cmdline> to be parsed for
guests of type='exe'
* src/lxc/lxc_container.c: Set LIBVIRT_LXC_CMDLINE env var
This patch is a fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=743176
which was discovered by Dan Berrange while making bandwidth
configuration work for LXC guests.
Background: Although virtportprofile data from a network portgroup is
only applicable for direct mode interfaces, the code that copies
bandwidth data from the portgroup was also only being executed in the
case of direct mode interfaces. The result was that interfaces using
traditional virtual networks (forward mode='nat|route|none'), and
those using a host bridge for forwarding, would not pick up bandwidth
data from a portgroup defined in the network.
This patch moves that code outside the conditional, so that bandwidth
information is *alway* copied from the appropriate portgroup (unless
the <interface> definition itself already has bandwidth information,
which would take precedence over what's in the portgroup anyway).
Code altered so that it is consistent with the associated comment. The
'autoconf' variable is forced to zero.
Signed-off-by: Neil Wilson <neil@brightbox.co.uk>
Do not crash if virStreamFinish is called after error.
==11000== Invalid read of size 4
==11000== at 0x373A8099A0: pthread_mutex_lock (pthread_mutex_lock.c:51)
==11000== by 0x4C7CADE: virMutexLock (threads-pthread.c:85)
==11000== by 0x4D57C31: virNetClientStreamRaiseError (virnetclientstream.c:203)
==11000== by 0x4D385E4: remoteStreamFinish (remote_driver.c:3541)
==11000== by 0x4D182F9: virStreamFinish (libvirt.c:14157)
==11000== by 0x40FDC4: cmdScreenshot (virsh.c:3075)
==11000== by 0x42BA40: vshCommandRun (virsh.c:14922)
==11000== by 0x42ECCA: main (virsh.c:16381)
==11000== Address 0x59b86c0 is 16 bytes inside a block of size 216 free'd
==11000== at 0x4A06928: free (vg_replace_malloc.c:427)
==11000== by 0x4C69E2B: virFree (memory.c:310)
==11000== by 0x4D57B56: virNetClientStreamFree (virnetclientstream.c:184)
==11000== by 0x4D3DB7A: remoteDomainScreenshot (remote_client_bodies.h:1812)
==11000== by 0x4CFD245: virDomainScreenshot (libvirt.c:2903)
==11000== by 0x40FB73: cmdScreenshot (virsh.c:3029)
==11000== by 0x42BA40: vshCommandRun (virsh.c:14922)
==11000== by 0x42ECCA: main (virsh.c:16381)
When support for was added for PCI multifunction cards (in commit
9f8baf, first included in libvirt 0.9.3), it was done by always
turning on the multifunction bit for all PCI devices. Since that time
it has been realized that this is not an ideal solution, and that the
multifunction bit must be selectively turned on. For example, see
https://bugzilla.redhat.com/show_bug.cgi?id=728174
and the discussion before and after
https://www.redhat.com/archives/libvir-list/2011-September/msg01036.html
This patch modifies multifunction support so that the multifunction=on
option is only added to the qemu commandline for a device if its PCI
<address> definition has the attribute "multifunction='on'", e.g.:
<address type='pci' domain='0x0000' bus='0x00'
slot='0x04' function='0x0' multifunction='on'/>
In practice, the multifunction bit should only be turned on if
function='0' AND other functions will be used in the same slot - it
usually isn't needed for functions 1-7 (although there are apparently
some exceptions, e.g. the Intel X53 according to the QEMU source
code), and should never be set if only function 0 will be used in the
slot. The test cases have been changed accordingly to illustrate.
With this patch in place, if a user attempts to assign multiple
functions in a slot without setting the multifunction bit for function
0, libvirt will issue an error when the domain is defined, and the
define operation will fail. In the future, we may decide to detect
this situation and automatically add multifunction=on to avoid the
error; even then it will still be useful to have a manual method of
turning on multifunction since, as stated above, there are some
devices that excpect it to be turned on for all functions in a slot.
A side effect of this patch is that attempts to use the same PCI
address for two different devices will now log an error (previously
this would cause the domain define operation to fail, but there would
be no log message generated). Because the function doing this log was
almost completely rewritten, I didn't think it worthwhile to make a
separate patch for that fix (the entire patch would immediately be
obsoleted).
If the regexes supported (?:pvs)?, then we could handle this by
optionally matching but not returning the initial command name. But it
doesn't. So add a new char* argument to
virStorageBackendRunProgRegex(). If that argument is NULL then we act
as usual. Otherwise, if the string at that argument is found at the
start of a returned line, we drop that before running the regex.
With this patch, virt-manager shows me lvs with command_names 1 or 0.
The definitions of PVS_BASE etc may want to be moved into the configure
scripts (though given how PVS is found, IIUC that could only happen if
pvs was a link to pvs_real), but in any case no sense dealing with that
until we're sure this is an ok way to handle it.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, qemuDomainGetXMLDesc and qemudDomainGetInfo check for
outstanding synchronous job before (eventual) monitor entering.
However, there can be already async job set, e.g. migration.
Before, URIs such as hyperv+ssh:// have been declined by the Hyper-V
driver resulting in the remote driver trying to connect to an
non-existing libvirtd.
Now such URIs trigger an error in the yper-V driver suggesting to
try again without the transport part in the scheme.
Before, URIs such as esx+ssh:// have been declined by the ESX driver
resulting in the remote driver trying to connect to an non-existing
libvirtd.
Now such URIs trigger an error in the ESX driver suggesting to try
again without the transport part in the scheme.
If the daemon is restarted so we reconnect to monitor, cdrom media
can be ejected. In that case we don't want to show it in domain xml,
or require it on migration destination.
To check for disk status use 'info block' monitor command.
* src/qemu/qemu_migration.c: if 'vmdef' is NULL, the function
virDomainSaveConfig still dereferences it, it doesn't make
sense, so should add return value check to make sure 'vmdef'
is non-NULL before calling virDomainSaveConfig, in addition,
in order to debug later, also should record error information
into log.
Signed-off-by: Alex Jia <ajia@redhat.com>
First hypervisor implementation of the new API.
Allows 'virsh snapshot-list --tree' to be more efficient.
* src/qemu/qemu_driver.c (qemuDomainSnapshotGetParent): New
function.
Mostly straight-forward, although this is the first API that
returns a new snapshot based on a snapshot rather than a domain.
* src/remote/remote_protocol.x
(REMOTE_PROC_DOMAIN_SNAPSHOT_GET_PARENT): New rpc.
(remote_domain_snapshot_get_parent_args)
(remote_domain_snapshot_get_parent_ret): New structs.
* src/rpc/gendispatch.pl: Adjust generator.
* src/remote/remote_driver.c (remote_driver): Use it.
* src/remote_protocol-structs: Update.
Although a client can already obtain a snapshot's parent by
dumping and parsing the xml, then doing a snapshot lookup by
name, it is more efficient to get the parent in one step, which
in turn will make operations that must traverse a snapshot
hierarchy easier to perform.
* include/libvirt/libvirt.h.in (virDomainSnapshotGetParent):
Declare.
* src/libvirt.c (virDomainSnapshotGetParent): New function.
* src/libvirt_public.syms: Export it.
* src/driver.h (virDrvDomainSnapshotGetParent): New callback.
This patch fixes the regression with using named pipes for qemu serial
devices noted in:
https://bugzilla.redhat.com/show_bug.cgi?id=740478
The problem was that, while new code in libvirt looks for a single
bidirectional fifo of the name given in the config, then relabels that
and continues without looking for / relabelling the two unidirectional
fifos named ${name}.in and ${name}.out, qemu looks in the opposite
order. So if the user had naively created all three fifos, libvirt
would relabel the bidirectional fifo to allow qemu access, but qemu
would attempt to use the two unidirectional fifos and fail (because it
didn't have proper permissions/rights).
This patch changes the order that libvirt looks for the fifos to match
what qemu does - first it looks for the dual fifos, then it looks for
the single bidirectional fifo. If it finds the dual unidirectional
fifos first, it labels/chowns them and ignores any possible
bidirectional fifo.
(Note commit d37c6a3a (which first appeared in libvirt-0.9.2) added
the code that checked for a bidirectional fifo. Prior to that commit,
bidirectional fifos for serial devices didn't work because libvirt
always required the ${name}.(in|out) fifos to exist, and qemu would
always prefer those.
If a domain started with -no-shutdown shuts down while libvirtd is not
running, it will be seen as paused when libvirtd reconnects to it. Use
the paused reason to detect if a domain was stopped because of shutdown
and finish the process just as if a SHUTDOWN event is delivered from
qemu.
The AppArmor security driver adds only the path specified in the domain
XML for character devices of type 'pipe'. It should be using <path>.in
and <path>.out. We do this by creating a new vah_add_file_chardev() and
use it for char devices instead of vah_add_file(). Also adjust
valid_path() to accept S_FIFO (since qemu chardevs of type 'pipe' use
fifos). This is https://launchpad.net/bugs/832507
This patch was made in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=738095
In short, qemu's default for the rombar setting (which makes the
firmware ROM of a PCI device visible/not on the guest) was previously
0 (not visible), but they recently changed the default to 1
(visible). Unfortunately, there are some PCI devices that fail in the
guest when rombar is 1, so the setting must be exposed in libvirt to
prevent a regression in behavior (it will still require explicitly
setting <rom bar='off'/> in the guest XML).
rombar is forced on/off by adding:
<rom bar='on|off'/>
inside a <hostdev> element that defines a PCI device. It is currently
ignored for all other types of devices.
At the moment there is no clean method to determine whether or not the
rombar option is supported by QEMU - this patch uses the advice of a
QEMU developer to assume support for qemu-0.12+. There is currently a
patch in the works to put this information in the output of "qemu-kvm
-device pci-assign,?", but of course if we switch to keying off that,
we would lose support for setting rombar on all the versions of qemu
between 0.12 and whatever version gets that patch.
SIGTERM handling for -no-shutdown is already fixed in qemu git and
libvirt can safely use it. The downside is that 0.15.50 version of qemu
can be any qemu compiled from git, even that without the fix for
SIGTERM. However, I think this patch is worth it since excluding 0.15.50
from the check makes testing current qemu with libvirt much easier and
someone running qemu from git should be able to rebuild fixed qemu from
git if they hit the problem with a hang on shutdown.
* src/storage/storage_driver.c: As virStorageVolLookupByPath lookups
all the pool objs of the drivers, breaking when failing on getting
the stable path of the pool will just breaks the whole lookup process,
it can cause the API fails even if the vol exists indeed. It won't get
any benefit. This patch is to fix it.
QEMU 0.13 introduced cache=unsafe for -drive, this patch exposes
it in the libvirt layer.
* Introduced a new QEMU capability flag ($prefix_CACHE_UNSAFE),
as even if $prefix_CACHE_V2 is set, we can't know if unsafe
is supported.
* Improved the reliability of qemu cache type detection.
commit 984840a2c2 removed the
notification of waiting calls when VIR_NET_CONTINUE messages
arrive. This was to fix the case of a virStreamAbort() call
being prematurely notified of completion.
The problem is that sometimes there are dummy calls from a
virStreamRecv() call waiting that *do* need to be notified.
These dummy calls should have a status VIR_NET_CONTINUE. So
re-add the notification upon VIR_NET_CONTINUE, but only if
the waiter also has a status of VIR_NET_CONTINUE.
* src/rpc/virnetclient.c: Notify waiting call if stream data
arrives
* src/rpc/virnetclientstream.c: Mark dummy stream read packet
with status VIR_NET_CONTINUE
If a domain has inactive XML we want to transfer it to destination
when migrating with VIR_MIGRATE_PERSIST_DEST. In order to harm
the migration protocol as least as possible, a optional cookie was
chosen.
The previous patch removed all snapshots, but not the directory
where the snapshots lived, which is still a form of stale data.
* src/qemu/qemu_domain.c (qemuDomainRemoveInactive): Wipe any
snapshot directory.
Commit 282fe1f0 documented that transient domains will auto-delete
any snapshot metadata when the last reference to the domain is
removed, and that management apps are in charge of grabbing any
snapshot metadata prior to that point. However, this was not
actually implemented for qemu until now.
* src/qemu/qemu_driver.c (qemudDomainCreate)
(qemuDomainDestroyFlags, qemuDomainSaveInternal)
(qemudDomainCoreDump, qemuDomainRestoreFlags, qemudDomainDefine)
(qemuDomainUndefineFlags, qemuDomainMigrateConfirm3)
(qemuDomainRevertToSnapshot): Clean up snapshot metadata.
* src/qemu/qemu_migration.c (qemuMigrationPrepareAny)
(qemuMigrationPerformJob, qemuMigrationPerformPhase)
(qemuMigrationFinish): Likewise.
* src/qemu/qemu_process.c (qemuProcessHandleMonitorEOF)
(qemuProcessReconnect, qemuProcessReconnectHelper)
(qemuProcessAutoDestroyDom): Likewise.
This patch is mostly code motion - moving some functions out
of qemu_driver and into qemu_domain so they can be reused by
multiple qemu_* files (since qemu_driver.h must not grow).
It also adds a new helper function, qemuDomainRemoveInactive,
which will be used in the next patch.
* src/qemu/qemu_domain.h (qemuFindQemuImgBinary)
(qemuDomainSnapshotWriteMetadata, qemuDomainSnapshotForEachQcow2)
(qemuDomainSnapshotDiscard, qemuDomainSnapshotDiscardAll)
(qemuDomainRemoveInactive): New prototypes.
(struct qemu_snap_remove): New struct.
* src/qemu/qemu_domain.c (qemuDomainRemoveInactive)
(qemuDomainSnapshotDiscardAllMetadata): New functions.
(qemuFindQemuImgBinary, qemuDomainSnapshotWriteMetadata)
(qemuDomainSnapshotForEachQcow2, qemuDomainSnapshotDiscard)
(qemuDomainSnapshotDiscardAll): Move here...
* src/qemu/qemu_driver.c (qemuFindQemuImgBinary)
(qemuDomainSnapshotWriteMetadata, qemuDomainSnapshotForEachQcow2)
(qemuDomainSnapshotDiscard, qemuDomainSnapshotDiscardAll): ...from
here.
(qemuDomainUndefineFlags): Update caller.
* src/conf/domain_conf.c (virDomainRemoveInactive): Doc fixes.
Commit 19f8c98 introduced VIR_DOMAIN_UNDEFINE_SNAPSHOTS_METADATA,
with the intent that omitting the flag makes undefine fail, and
including the flag deletes metadata. But it used the wrong logic.
Also, hoist the transient domain sooner, so that we don't
accidentally remove metadata of a transient domain.
* src/qemu/qemu_driver.c (qemuDomainUndefineFlags): Check correct
flag value.
Detected by Coverity. The only way to get to error_unlink is if
path was successfully assigned, so the if was useless. Meanwhile,
there was a return statement that did not free path.
* src/locking/lock_driver_sanlock.c
(virLockManagerSanlockSetupLockspace): Fix mem-leak, and drop
useless if.
Related #BZ: https://bugzilla.redhat.com/show_bug.cgi?id=702260.
There are two problems described in the BZ:
1) "Can't remove open logical volume".
2) "Unable to deactivate logical volume "foo""
This patch just intends to fix 2), as 1) is expected if the vol
is still used by something, and you never known if "lvchange -an"
will fail or not either (sometime, it will succeed, sometimes not).
We'd better not look for trouble, :-)
For 2), that's caused by race between lvremove and udev event handling,
the only workable way now is to wait the events handling are finished,
though it might introduce latencies, as "udevadmin settle" exits
after *all* events are handled, it's the only way we can fix
the racing in libvirt layer.
See https://bugzilla.redhat.com/show_bug.cgi?id=570359 for more
details.
* src/qemu/qemu_process.c: Taking if (qemuDomainObjEndJob(driver, obj) == 0)
true branch then 'obj' is NULL, virDomainObjIsActive(obj) and
virDomainObjUnref(obj) will dereference NULL pointer.
Signed-off-by: Alex Jia <ajia@redhat.com>
Once virDomainReboot is called for a domain, guest OS initiated shutdown
would always result in reboot instead of shutdown. Only
virDomainShutdown would actually shutd such domain down. That's because
we forgot to reset fakeReboot flag once we asked the domain to reboot.
The commit that prevents disk corruption on domain shutdown
(96fc478417) causes regression with QEMU
0.14.* and 0.15.* because of a regression bug in QEMU that was fixed
only recently in QEMU git. The affected versions of QEMU do not quit on
SIGTERM if started with -no-shutdown, which we use to implement fake
reboot. Since -no-shutdown tells QEMU not to quit automatically on guest
shutdown, domains started using the affected QEMU cannot be shutdown
properly and stay in a paused state.
This patch disables fake reboot feature on such QEMU by not using
-no-shutdown, which makes shutdown work as expected. However,
virDomainReboot will not work in this case and it will report "Requested
operation is not valid: Reboot is not supported with this QEMU binary".
The next patch will add a syntax check that flags this usage in xen
as awkward - while it was valid memory management, it was very hard
to maintain. Swapping to a more traditional allocation may be a bit
slower, but easier to understand.
* src/xen/xend_internal.c (xenDaemonListDomainsOld): Use two-level
allocation, rather than abusing allocation function.
(xenDaemonLookupByUUID): Update caller.
gcc warns when building libvirt 0.9.5 on a 32-bit machine:
qemu/qemu_migration.c: In function 'qemuMigrationToFile':
qemu/qemu_migration.c:2727:38: error: large integer implicitly truncated to unsigned type [-Woverflow]
* src/qemu/qemu_domain.h (QEMU_DOMAIN_FILE_MIG_BANDWIDTH_MAX): Cap
to long when building for 32-bit platform.
Inexplicably the sanlock code all got placed under the GPLv2-only,
so libvirt's use of sanlock introduces a license incompatibility.
The sanlock developers have now rearranged the code such that there
is a 'sanlock_client.so' which is LGPLv2+ while their daemon remains
GPLv2-only. To use the new client library we need to call the new
sanlock_init and sanlock_align APIs instead of sanlock_direct_init
and sanlock_direct_align. These APIs calls are now routed via the
sanlock daemon, instead of doing direct I/O calls to disk.
For all this we require sanlock >= 1.8
* configure.ac: Check for sanlock_client.so instead of sanlock.so
and fix various comments
* libvirt.spec.in: Mandate sanlock >= 1.8
* src/Makefile.am: Link to -lsanlock_client
* src/locking/lock_driver_sanlock.c: Use sanlock_init and
sanlock_align
Libvirt loads the domain conf from status XML if it's running when
starting up. The problem is there is no record of the original conf.
(dom->newDef is NULL here).
So libvirt won't be able to restore the domain conf to original one
when destroying/shutdown. E.g.
1) attach a device without "--persistent"
2) restart libvirtd
3) destroy domain
4) start domain
One will see the the disk still exists.
This patch is to fix the peoblem by assigning persistent domain conf
to dom->newDef if it's NULL and the domain is running.
Virsh man page lists driver types to be used with attach-device
command, but does not specify that those are usable only with the XEN
Hypervisor.
This patch adds statement, that those options specified are applicable
only on the Xen hypervisor and adds option usable with qemu emulator.
This patch also changes type of error returned by QEMU driver if the
user specifies incompatible driver type from VIR_ERR_INTERNAL_ERROR to
VIR_ERR_CONFIG_UNSUPPORTED.
* src/vmx/vmx.c: fix memory leak, 'def' has a initial value 'NULL', so
'goto cleanup' is perfected instead of adding a virConfFree before
'return NULL'.
Signed-off-by: Alex Jia <ajia@redhat.com>
Leak present since introduction of remoteDomainBuildEventGraphics
in commit 987e31e.
* src/remote/remote_driver.c: fix memory leak.
Signed-off-by: Alex Jia <ajia@redhat.com>
For all types of disks other than qcow2, we were requesting that
SELinux labeling visit the new file as if it were qcow2, which
means labeling would try to find the backing files of an empty file.
And for a pre-existing qcow2 disk, we were passing NULL, which meant
that labelling tried to probe the file type (and if probing is
disabled, per the default qemu.conf, this made snapshots fail).
What we really want is to make SELinux labeling visit the new
file as raw; it will later be converted to qcow2 if qemu successfully
made the snapshot.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Force SELinux labeling
to avoid probe of new file.
For external snapshots to be useful on persistent domains, we must
alter the persistent definition alongside the running definition.
Thanks to the possibility of disk hotplug as well as of edits that
only affect the persistent xml, we can't assume that vm->def and
vm->newDef have the same disk at the same index, so we can only
update the persistent copy if the device destination matches up.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive)
(qemuDomainSnapshotCreateSingleDiskActive): Also affect newDef, if
present.
When libvirt calls virInitialize it creates a thread local
for the virErrorPtr storage, and registers a callback to
cleanup memory when a thread exits. When libvirt is dlclose()d
or otherwise made non-resident, the callback function is
removed from memory, but the thread local may still exist
and if a thread later exists, it will invoke the callback
and SEGV. There may also be other thread locals with callbacks
pointing to libvirt code, so it is in general never safe to
unload libvirt.so from memory once initialized.
To allow dlclose() to succeed, but keep libvirt.so resident
in memory, link with '-z nodelete'. This issue was first
found with the libvirt CIM provider, but can potentially
hit many of the dynamic language bindings which all ultimately
involve dlopen() in some way, either on libvirt.so itself,
or on the glue code for the binding which in turns links
to libvirt
* configure.ac, src/Makefile.am: Ensure libvirt.so is linked
with -z nodelete
* cfg.mk, .gitignore, tests/Makefile.am, tests/shunloadhelper.c,
tests/shunloadtest.c: A test case to unload libvirt while
a thread is still running.
Qemu sends STOP event as part of the shutdown process. Detect such STOP
event and consider shutdown to be reason of emitting such event. That's
the best we can do until qemu provides us the reason directly in STOP
event. This allows us to report shutdown reason for paused state so that
apps can detect domains that failed to finish the shutdown process
(e.g., because qemu is buggy and doesn't exit on SIGTERM or it is
blocked in flushing disk buffers).
Ever since we introduced fake reboot, we call qemuProcessKill as a
reaction to SHUTDOWN event. Unfortunately, qemu doesn't guarantee it
flushed all internal buffers before sending SHUTDOWN, in which case
killing the process forcibly may result in (virtual) disk corruption.
By sending just SIGTERM without SIGKILL we give qemu time to to flush
all buffers and exit. Once qemu exits, we will see an EOF on monitor
connection and tear down the domain. In case qemu ignores SIGTERM or
just hangs there, the process stays running but that's not any different
from a possible hang anytime during the shutdown process so I think it's
just fine.
Also qemu (since 0.14 until it's fixed) has a bug in SIGTERM processing
which causes it not to exit but instead send new SHUTDOWN event and keep
waiting. I think the best we can do is to ignore duplicate SHUTDOWN
events to avoid a SHUTDOWN-SIGTERM loop and leave the domain in paused
state.
When a domain is rebooted using libvirt API, we use fake reboot
consisting of shutting down and resetting the domain. Thus we see a
SHUTDOWN event and set gotShutdown flag. But we never reset it back and
if the domain crashes after it was rebooted this way, we consider it was
a normal shutdown and not a crash.
Commit 4454a9efc7 changed shutoff reason
from VIR_DOMAIN_SHUTOFF_CRASHED to VIR_DOMAIN_SHUTOFF_FAILED in case we
see an unexpected EOF on monitor connection. But FAILED reason is
dedicated for domains that fail to start. CRASHED reason is the right
one to use in this situation.
Libvirt special-cases a specific VIR_ERR_RPC from the remote driver
back into VIR_ERR_NO_SUPPORT on the client, so that clients can
handle missing rpc functions the same whether the hypervisor driver
is local or remote. However, commit c1b22644 introduced a regression:
VIR_FROM_THIS changed from VIR_FROM_REMOTE to VIR_FROM_RPC, so the
special casing no longer works if the server uses the newer error
domain.
* src/rpc/virnetclientprogram.c
(virNetClientProgramDispatchError): Also cater to 0.9.3 and newer.
This patch fixes the bug shown in bugzilla 738778. It's not an nwfilter problem but a connection sharing / closure issue.
https://bugzilla.redhat.com/show_bug.cgi?id=738778
Depending on the speed / #CPUs of the machine you are using you may not see this bug all the time.
* conf/domain_conf.c: allocate memory to def->redirdevs in
virDomainDefParseXML such as VIR_ALLOC_N(def->redirdevs, n),
however, virDomainDefFree(def) hasn't released these memory.
* Detected in valgrind run:
==19820== 209 (16 direct, 193 indirect) bytes in 1 blocks are definitely lost in loss record 25 of 26
==19820== at 0x4A04A28: calloc (vg_replace_malloc.c:467)
==19820== by 0x4A13AF: virAllocN (memory.c:129)
==19820== by 0x4D4A0E: virDomainDefParseXML (domain_conf.c:7258)
==19820== by 0x4D4C93: virDomainDefParseNode (domain_conf.c:7512)
==19820== by 0x4D562F: virDomainDefParse (domain_conf.c:7465)
==19820== by 0x415863: testCompareXMLToXMLFiles (qemuxml2xmltest.c:35)
==19820== by 0x415982: testCompareXMLToXMLHelper (qemuxml2xmltest.c:80)
==19820== by 0x416D31: virtTestRun (testutils.c:140)
==19820== by 0x415604: mymain (qemuxml2xmltest.c:192)
==19820== by 0x416437: virtTestMain (testutils.c:689)
==19820== by 0x3CA7A1ECDC: (below main) (in /lib64/libc-2.12.so)
==19820==
==19820== LEAK SUMMARY:
==19820== definitely lost: 16 bytes in 1 blocks
==19820== indirectly lost: 193 bytes in 5 blocks
==19820== possibly lost: 0 bytes in 0 blocks
==19820== still reachable: 1,054 bytes in 21 blocks
* How to reproduce?
% valgrind -v --leak-check=full ./tests/qemuxml2xmltest
Signed-off-by: Alex Jia <ajia@redhat.com>
Mac OS X 10.6. Snow Leopard and probably other do not provide a mkfs
command to create filesystems. Macro MKFS then remained undefined and
did not provide any substitute, so that build failed on a missing
argument.
Struct virStoragePoolProbeResult was compiled in conditionaly, but
virStorageBackendFileSystemProbe used it unconditionaly. This patch
exempts the struct from conditional include.
Documentation did not specify, that some permissions are required on
target path for coredump for the user running the hypervisor.
Diff to v1:
- reword statements
The new doc text had a few readability issues. Also, the
monitor command text copied a bit too much from the attach case.
* src/libvirt-qemu.c (virDomainQemuMonitorCommand)
(virDomainQemuAttach): Fix typos and grammar.
Adjust qemuMigrationRun() to use migMaxBandwidth in qemuDomainObjPrivate
structure when setting qemu migration speed. Caller-specified 'resource'
parameter overrides migMaxBandwidth.
The qemu migration speed default is 32MiB/s as defined in migration.c
/* Migration speed throttling */
static int64_t max_throttle = (32 << 20);
There's no need to throttle migration when targeting a file, so set migration
speed to unlimited prior to migration, and restore to libvirt default value
after migration.
Default units is MB for migrate_set_speed monitor command, so
(INT64_MAX / (1024 * 1024)) is used for unlimited migration speed.
Tested with both json and text monitors.
Now that migration speed is stored in qemuDomainObjPrivate structure,
save the new value when invoking qemuDomainMigrateSetMaxSpeed().
Allow setting migration speed on inactive domain too.
The maximum bandwidth that can be consumed when migrating a domain
is better classified as an operational vs configuration parameter of
the dommain. As such, store this parameter in qemuDomainObjPrivate
structure.
Commit c246b025 added new functions, but forgot to export them,
resulting in a build failure when using modules.
* src/libvirt_private.syms (network.h): Export new functions.
Commit 973fcd8f introduced the ability for qemu to reject snapshot
reversion on an ABI incompatibility; but the very example that was
first proposed on-list[1] as a demonstration of an ABI incompatibility,
namely that of changing the max memory allocation, was not being
checked for, resulting in a cryptic failure when running with larger
max mem than what the snapshot was created with:
error: operation failed: Error -22 while loading VM state
This commit merely protects the three variables within mem that are
referenced by qemu_command.c, rather than all 7 (the other 4 variables
affect cgroup handling, but as far as I can tell, have no visible effect
to the qemu guest). This also affects migration and save file handling,
which are other places where we perform ABI compatibility checks.
[1] https://www.redhat.com/archives/libvir-list/2010-December/msg00331.html
* src/conf/domain_conf.c (virDomainDefCheckABIStability): Add
memory sizing checks.
Commit 498d783 cleans up some of virtual file names for parsing strings
in memory. This patch cleans up (hopefuly) the rest forgotten by the
first patch.
This patch also changes all of the previously modified "filenames" to
valid URI's replacing spaces for underscores.
Changes to v1:
- Replace all spaces for underscores, so that the strings form valid
URI's
- Replace spaces in places changed by commit 498d783
Regression introduced in commit 3881a470, due to an improper rebase
of a cleanup written beforehand but only applied after a rebased of
a refactoring that created a new function in commit 25fb3ef.
Also avoids passing NULL to printf %s.
* src/qemu/qemu_driver.c: In qemuDomainSnapshotForEachQcow2()
it free up the memory of qemu_driver->qemuImgBinary in the
cleanup tag which leads to the garbage value of qemuImgBinary
in qemu_driver struct and libvirtd crash when running
"virsh snapshot-create" command a second time.
Signed-off-by: Eric Blake <eblake@redhat.com>
'+' in strings get translated to ' ' when editing domains.
While xenDaemonDomainCreateXML() did URL-escape the sexpr,
xenDaemonDomainDefineXML() did not.
Remove the explicit urlencode() in xenDaemonDomainCreateXML() and add
the direct encoding calls to xend_op_ext() because it calls xend_post()
which uses "Content-Type: application/x-www-form-urlencoded". According
to <http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4.1> this
requires all parameters to be url-encoded as specified in rfc1738.
Notice: virBufferAsprintf(..., "%s=%s", ...) is again replaced by three
calls to virBufferURIEncodeString() and virBufferAddChar() because '='
is a "reserved" character, which would get escaped by
virBufferURIEncodeString(), which - by the way - escapes anything not
c_isalnum().
Signed-off-by: Philipp Hahn <hahn@univention.de>
While parsing XML strings from memory, the previous convention in
libvirt was to set the virtual file name to "domain.xml" or something
similar. This could potentialy trick the user into looking for a file
named domain.xml on the disk in an attempt to fix the error.
This patch changes these filenames to something that can't be as easily
confused for a valid filename.
Examples of error messages:
---------------------------
Error while loading file from disk:
15:07:59.015: 527: error : catchXMLError:709 : /path/to/domain.xml:1: StartTag: invalid element name
<domain type='kvm'><
--------------------^
Error while parsing definition in memory:
15:08:43.581: 525: error : catchXMLError:709 : (domain definition):2: error parsing attribute name
<name>vm1</name>
--^
Regression introduced in commit d6f6b2d194. Running
'virsh snapshot-create dom' would mistakenly report that
disks can only be specified for disk snapshots.
* src/conf/domain_conf.c (virDomainSnapshotDefParseString): Only
give error about no disk support when <disk> was found.
These functions access internals of the opaque object, and do
not need any rpc counterpart. It could be argued that we should
have provided these when snapshot objects were first introduced,
since all the other vir*Ptr objects have at least a GetName accessor.
* include/libvirt/libvirt.h.in (virDomainSnapshotGetName)
(virDomainSnapshotGetDomain, virDomainSnapshotGetConnect): Declare.
* src/libvirt.c (virDomainSnapshotGetName)
(virDomainSnapshotGetDomain, virDomainSnapshotGetConnect): New
functions.
* src/libvirt_public.syms: Export them.
Xen PV domU's have no PCI bus. node_device_udev.c calls pci_system_init
which looks for /sys/bus/pci. If it does not find /sys/bus/pci (which it
won't in a Xen PV domU) it returns unsuccesfully (ENOENT), which libvirt
considers fatal. This makes libvirt unusable in this environment, even
though there are plenty of valid virtualisation options that work
there (LXC, UML, and QEmu spring to mind)
https://bugzilla.redhat.com/show_bug.cgi?id=709471
Signed-off-by: Soren Hansen <soren@linux2go.dk>
* src/rpc/virnettlscontext.c: fix memory leak on
virNetTLSContextValidCertificate.
* Detected in valgrind run:
==25667==
==25667== 6,085 (44 direct, 6,041 indirect) bytes in 1 blocks are definitely
lost in loss record 326 of 351
==25667== at 0x4005447: calloc (vg_replace_malloc.c:467)
==25667== by 0x4F2791F3: _asn1_add_node_only (structure.c:53)
==25667== by 0x4F27997A: _asn1_copy_structure3 (structure.c:421)
==25667== by 0x4F276A50: _asn1_append_sequence_set (element.c:144)
==25667== by 0x4F2743FF: asn1_der_decoding (decoding.c:1194)
==25667== by 0x4F22B9CC: gnutls_x509_crt_import (x509.c:229)
==25667== by 0x805274B: virNetTLSContextCheckCertificate
(virnettlscontext.c:1009)
==25667== by 0x804DE32: testTLSSessionInit (virnettlscontexttest.c:693)
==25667== by 0x804F14D: virtTestRun (testutils.c:140)
==25667==
==25667== 23,188 (88 direct, 23,100 indirect) bytes in 11 blocks are definitely
lost in loss record 346 of 351
==25667== at 0x4005447: calloc (vg_replace_malloc.c:467)
==25667== by 0x4F22B841: gnutls_x509_crt_init (x509.c:50)
==25667== by 0x805272B: virNetTLSContextCheckCertificate
(virnettlscontext.c:1003)
==25667== by 0x804DDD1: testTLSSessionInit (virnettlscontexttest.c:673)
==25667== by 0x804F14D: virtTestRun (testutils.c:140)
* How to reproduce?
% cd libvirt && ./configure && make && make -C tests valgrind
or
% valgrind -v --leak-check=full ./tests/virnettlscontexttest
Signed-off-by: Alex Jia <ajia@redhat.com>
Variable 'l_disk' initialized to a null pointer value, control jumps to 'case
VIR_DOMAIN_DISK_DEVICE_DISK and then taking false branch, Within the expansion
of the macro 'libxlError': Field access results in a dereference of a null
pointer (loaded from variable 'l_disk').
* src/libxl/libxl_driver.c: Field access results in a dereference of a null
pointer (loaded from variable 'l_disk')
Signed-off-by: Alex Jia <ajia@redhat.com>
Regression introduced in commit 89b6284fd, due to an incorrect
conversion to the new means of converting disk names back to
the correct object.
* src/qemu/qemu_driver.c (qemuDomainGetBlockInfo): Avoid NULL deref.
Although we were initializing worker threads during pool creating,
we missed this during virThreadPoolSendJob. This bug led to segmenation
fault as worker thread free() given argument.
This patch enables modifying network device configuration using the
virUpdateDeviceFlags API method. Matching of devices is accomplished
using MAC addresses.
While updating live configuration of a running domain, the user is
allowed only to change link state of the interface. Additional
modifications may be added later. For now the code checks for
unsupported changes and thereafter changes the link state, if
applicable.
When updating persistent configuration of guest's network interface the
whole configuration (except for the MAC address) may be modified and
is stored for the next startup.
* src/qemu/qemu_driver.c - Add dispatching of virUpdateDevice for
network devices update (live/config)
* src/qemu/qemu_hotplug.c - add setting of initial link state on live
device addition
- add function to change network device
configuration. By now it supports only
changing of link state
* src/qemu/qemu_hotplug.h - Headers to above functions
* src/qemu/qemu_process.c - set link states before virtual machine
start. Qemu does not support setting of
this on the command line.
This patch adds handlers for modification of guest's interface
link state. Both HMP and QMP commands are supported, but as the
link state functionality is from the beginning supported in QMP
the HMP code will probably never be used.
A new element is introduced to XML that allows to control
state of virtual network interfaces in hypervisors.
Live modification of the link state allows networking tools
propagate topology changes to guest OS or testing of
scenarios in complex (virtual) networks.
This patch adds elements to XML grammars and parsing and generating
code.
This patch adds functions to compare structures containing network
device configuration for equality. They serve for the purpose of
disallowing unsupported changes to live network devices.
This patch modifies error handling function for the XML parser provided
by libxml2.
Originaly only a line number and error message were logged. With this
new error handler function, the user is provided with a more complex
description of the parsing error.
Context of the error is printed in libXML2 style and filename of the
file, that caused the error is printed. Example of an parse error:
13:41:36.262: 16032: error : catchXMLError:706 :
/etc/libvirt/qemu/rh_bad.xml:58: Opening and ending tag mismatch: name
line 2 and domain
</domain>
---------^
Context of the error gives the user hints that may help to quickly
locate a corrupt xml file.
fixes BZs:
----------
Bug 708735 - [RFE] Show column and line on XML parsing error
https://bugzilla.redhat.com/show_bug.cgi?id=708735
Bug 726771 - libvirt does not specify problem file if persistent xml is
invalid
https://bugzilla.redhat.com/show_bug.cgi?id=726771
It is important to be able to attach USB redirected devices to a
particular controller (one that supports USB2 for instance).
Without this patch, only the default bus was used.
<redirdev bus='usb' type='spicevmc'>
<address type='usb' bus='0' port='4'/>
</redirdev>
The mainly changes are:
1) Update qemuMonitorGetBlockStatsInfo and it's children (Text/JSON)
functions to return the value of new latency fields.
2) Add new function qemuMonitorGetBlockStatsParamsNumber, which is
to count how many parameters the underlying QEMU supports.
3) Update virDomainBlockStats in src/qemu/qemu_driver.c to be
compatible with the changes by 1).
If libvirt daemon gets restarted and there is (at least) one
unresponsive qemu, the startup procedure hangs up. This patch creates
one thread per vm in which we try to reconnect to monitor. Therefore,
blocking in one thread will not affect other APIs.
This patch creates an optional BeginJob queue size limit. When
active, all other attempts above level will fail. To set this
feature assign desired value to max_queued variable in qemu.conf.
Setting it to 0 turns it off.
This patch annotates APIs with low or high priority.
In low set MUST be all APIs which might eventually access monitor
(and thus block indefinitely). Other APIs may be marked as high
priority. However, some must be (e.g. domainDestroy).
For high priority calls (HPC), there are some high priority workers
(HPW) created in the pool. HPW can execute only HPC, although normal
worker can process any call regardless priority. Therefore, only those
APIs which are guaranteed to end in reasonable small amount of time
can be marked as HPC.
The size of this HPC pool is static, because HPC are expected to end
quickly, therefore jobs assigned to this pool will be served quickly.
It can be configured in libvirtd.conf via prio_workers variable.
Default is set to 5.
To mark API with low or high priority, append priority:{low|high} to
it's comment in src/remote/remote_protocol.x. This is similar to
autogen|skipgen. If not marked, the generator assumes low as default.
With this, it is now possible to create external snapshots even
when SELinux is enforcing, and to protect the new file with a
lock manager.
* src/qemu/qemu_driver.c
(qemuDomainSnapshotCreateSingleDiskActive): Create and register
new file with proper permissions and locks.
(qemuDomainSnapshotCreateDiskActive): Update caller.
Lots of earlier patches led up to this point - the qemu snapshot_blkdev
monitor command can now be controlled by libvirt! Well, insofar as
SELinux doesn't prevent qemu from open(O_CREAT) on the files. There's
still some followup work before things work with SELinux enforcing,
but this patch is big enough to post now.
There's still room for other improvements, too (for example, taking a
disk snapshot of an inactive domain, by using qemu-img for both internal
and external snapshots; wiring up delete and revert control, including
additional flags from my RFC; supporting active QED disk snapshots;
supporting per-storage-volume snapshots such as LVM or btrfs snapshots;
etc.). But this patch is the one that proves the new XML works!
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Wire in
active disk snapshots.
(qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotCreateDiskActive)
(qemuDomainSnapshotCreateSingleDiskActive): New functions.
No one uses this yet, but it will be important once
virDomainSnapshotCreateXML learns a VIR_DOMAIN_SNAPSHOT_DISK_ONLY
flag, and the xml allows passing in the new file names.
* src/qemu/qemu_monitor.h (qemuMonitorDiskSnapshot): New prototype.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): New
function.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot):
Likewise.
Snapshots alter the set of disk image files opened by qemu, so
they must be audited. But they don't involve a full disk definition
structure, just the new filename. Make the next patch easier by
refactoring the audit routines to just operate on file name.
* src/conf/domain_audit.h (virDomainAuditDisk): Update prototype.
* src/conf/domain_audit.c (virDomainAuditDisk): Act on strings,
not definition structures.
(virDomainAuditStart): Update caller.
* src/qemu/qemu_hotplug.c (qemuDomainChangeEjectableMedia)
(qemuDomainAttachPciDiskDevice, qemuDomainAttachSCSIDisk)
(qemuDomainAttachUsbMassstorageDevice)
(qemuDomainDetachPciDiskDevice, qemuDomainDetachDiskDevice):
Likewise.
My RFC for snapshot support [1] proposes several rules for when it is
safe to delete or revert to an external snapshot, predicated on
the existence of new API flags. These will be incrementally added
in future patches, but until then, blindly mishandling a disk
snapshot risks corrupting internal state, so it is better to
outright reject the attempts until the other pieces are in place,
thus incrementally relaxing the restrictions added in this patch.
[1] https://www.redhat.com/archives/libvir-list/2011-August/msg00361.html
* src/qemu/qemu_driver.c (qemuDomainSnapshotCountExternal): New
function.
(qemuDomainUndefineFlags, qemuDomainSnapshotDelete): Use it to add
safety valve.
(qemuDomainRevertToSnapshot, qemuDomainSnapshotCreateXML): Add safety
valve.
Prior to this patch, <domainsnapshot>/<disks> was ignored. This
changes it to be an error unless an explicit disk snapshot is
requested (a future patch may relax things if it turns out to
be useful to have a <disks> specification alongside a system
checkpoint).
* include/libvirt/libvirt.h.in
(VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY): New flag.
* src/libvirt.c (virDomainSnapshotCreateXML): Document it.
* src/esx/esx_driver.c (esxDomainSnapshotCreateXML): Disk
snapshots not supported yet.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotCreateXML): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Likewise.
I got confused when 'virsh domblkinfo dom disk' required the
path to a disk (which can be ambiguous, since a single file
can back multiple disks), rather than the unambiguous target
device name that I was using in disk snapshots. So, in true
developer fashion, I went for the best of both worlds - all
interfaces that operate on a disk (aka block) now accept
either the target name or the unambiguous path to the backing
file used by the disk.
* src/conf/domain_conf.h (virDomainDiskIndexByName): Add
parameter.
(virDomainDiskPathByName): New prototype.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/conf/domain_conf.c (virDomainDiskIndexByName): Also allow
searching by path, and decide whether ambiguity is okay.
(virDomainDiskPathByName): New function.
(virDomainDiskRemoveByName, virDomainSnapshotAlignDisks): Update
callers.
* src/qemu/qemu_driver.c (qemudDomainBlockPeek)
(qemuDomainAttachDeviceConfig, qemuDomainUpdateDeviceConfig)
(qemuDomainGetBlockInfo, qemuDiskPathToAlias): Likewise.
* src/qemu/qemu_process.c (qemuProcessFindDomainDiskByPath):
Likewise.
* src/libxl/libxl_driver.c (libxlDomainAttachDeviceDiskLive)
(libxlDomainDetachDeviceDiskLive, libxlDomainAttachDeviceConfig)
(libxlDomainUpdateDeviceConfig): Likewise.
* src/uml/uml_driver.c (umlDomainBlockPeek): Likewise.
* src/xen/xend_internal.c (xenDaemonDomainBlockPeek): Likewise.
* docs/formatsnapshot.html.in: Update documentation.
* tools/virsh.pod (domblkstat, domblkinfo): Likewise.
* docs/schemas/domaincommon.rng (diskTarget): Tighten pattern on
disk targets.
* docs/schemas/domainsnapshot.rng (disksnapshot): Update to match.
* tests/domainsnapshotxml2xmlin/disk_snapshot.xml: Update test.
Adds an optional element to <domainsnapshot>, which will be used
to give user control over external snapshot filenames on input,
and specify generated filenames on output.
For now, no driver accepts this element; that will come later.
<domainsnapshot>
...
<disks>
<disk name='vda' snapshot='no'/>
<disk name='vdb' snapshot='internal'/>
<disk name='vdc' snapshot='external'>
<driver type='qcow2'/>
<source file='/path/to/new'/>
</disk>
</disks>
<domain>
...
<devices>
<disk ...>
<driver name='qemu' type='raw'/>
<target dev='vdc'/>
<source file='/path/to/old'/>
</disk>
</devices>
</domain>
</domainsnapshot>
* src/conf/domain_conf.h (_virDomainSnapshotDiskDef): New type.
(_virDomainSnapshotDef): Add new elements.
(virDomainSnapshotAlignDisks): New prototype.
* src/conf/domain_conf.c (virDomainSnapshotDiskDefClear)
(virDomainSnapshotDiskDefParseXML, disksorter)
(virDomainSnapshotAlignDisks): New functions.
(virDomainSnapshotDefParseString): Parse new fields.
(virDomainSnapshotDefFree): Clean them up.
(virDomainSnapshotDefFormat): Output them.
* src/libvirt_private.syms (domain_conf.h): Export new function.
* docs/schemas/domainsnapshot.rng (domainsnapshot, disksnapshot):
Add more xml.
* docs/formatsnapshot.html.in: Document it.
* tests/domainsnapshotxml2xmlin/disk_snapshot.xml: New test.
* tests/domainsnapshotxml2xmlout/disk_snapshot.xml: Update.
In order to distinguish disk snapshots from system checkpoints, a
new state value that is only valid for snapshots is helpful.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_LAST): New placeholder.
* src/conf/domain_conf.h (virDomainSnapshotState): New enum mapping.
(VIR_DOMAIN_DISK_SNAPSHOT): New internal enum value.
* src/conf/domain_conf.c (virDomainState): Use placeholder.
(virDomainSnapshotState): Extend mapping by one for use in snapshot.
(virDomainSnapshotDefParseString, virDomainSnapshotDefFormat):
Handle new state.
(virDomainObjSetState, virDomainStateReasonToString)
(virDomainStateReasonFromString): Avoid compiler warnings.
* tools/virsh.c (vshDomainState, vshDomainStateReasonToString):
Likewise.
* src/libvirt_private.syms (domain_conf.h): Export new functions.
* docs/schemas/domainsnapshot.rng: Tighten state definition.
* docs/formatsnapshot.html.in: Document it.
* tests/domainsnapshotxml2xmlout/disk_snapshot.xml: New test.
Since a snapshot is fully recoverable, it is useful to have a
snapshot as a means of hibernating a guest, then reverting to
the snapshot to wake the guest up. This mode of usage is
similar to 'virsh save/virsh restore', except that virsh
save uses an external file while virsh snapshot keeps the
vm state internal to a qcow2 file. However, it only works on
persistent domains.
In the usage pattern of snapshot/revert for hibernating a guest,
there is no need to keep the guest running between the two points
in time, especially since that would generate runtime state that
would just be discarded. Add a flag to make it possible to
stop the domain after the snapshot has completed.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_SNAPSHOT_CREATE_HALT):
New flag.
* src/libvirt.c (virDomainSnapshotCreateXML): Document it.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML)
(qemuDomainSnapshotCreateActive): Implement it.
Reverting to a state prior to an external snapshot risks
corrupting any other branches in the snapshot hierarchy that
were using the snapshot as a read-only backing file. So
disk snapshot code will default to preventing reverting to
a snapshot that has any children, meaning that deleting just
the children of a snapshot becomes a useful operation in
preparing that snapshot for being a future reversion target.
The code for the new flag is simple - it's one less deletion,
plus a tweak to keep the current snapshot correct.
* include/libvirt/libvirt.h.in
(VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN_ONLY): New flag.
* src/libvirt.c (virDomainSnapshotDelete): Document it, and
enforce mutual exclusion.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDelete): Implement
it.
The previous patch introduced new config, but if a hypervisor does
not support that new config, someone can write XML that does not
behave as documented. This prevents some of those cases by
explicitly rejecting transient disks for several hypervisors.
Disk snapshots will require a new flag to actually affect a snapshot
creation, so there's not much to reject there.
* src/qemu/qemu_command.c (qemuBuildDriveStr): Reject transient
disks for now.
* src/libxl/libxl_conf.c (libxlMakeDisk): Likewise.
* src/xenxs/xen_sxpr.c (xenFormatSxprDisk): Likewise.
* src/xenxs/xen_xm.c (xenFormatXMDisk): Likewise.
As discussed here:
https://www.redhat.com/archives/libvir-list/2011-August/msg00361.htmlhttps://www.redhat.com/archives/libvir-list/2011-August/msg00552.html
Adds snapshot attribute and transient sub-element:
<devices>
<disk type=... snapshot='no|internal|external'>
...
<transient/>
</disk>
</devices>
* docs/schemas/domaincommon.rng (snapshot): New define.
(disk): Add snapshot and persistent attributes.
* docs/formatdomain.html.in: Document them.
* src/conf/domain_conf.h (virDomainDiskSnapshot): New enum.
(_virDomainDiskDef): New fields.
* tests/qemuxml2argvdata/qemuxml2argv-disk-transient.xml: New
test of rng, no args counterpart until qemu support is complete.
* tests/qemuxml2argvdata/qemuxml2argv-disk-snapshot.args: New
file, snapshot attribute does not affect args.
* tests/qemuxml2argvdata/qemuxml2argv-disk-snapshot.xml: Likewise.
* tests/qemuxml2argvtest.c (mymain): Run new test.
Fix bug #611823 storage driver should prohibit pools with duplicate
underlying storage.
Add internal API virStoragePoolSourceFindDuplicate() to do uniqueness
check based on source location infomation for pool type.
* AUTHORS: add Lei Li
At least Xen-3.4.3 translates the /vm/localtime SXPR value to
/domain/platform/localtime and /domain/image/{linux,hvm}/localtime when
the domain is defined. When reading back that information libvirt only
handles HVM domains, but not PV domains: This results in libvirtd always
returning
<clock offset="utc"/>
while Xend used (localtime 1).
For PV domains use /domain/image/linux/localtime.
When reverting to a snapshot, the inactive domain configuration
has to be rolled back to what it was at the time of the snapshot.
Additionally, if the VM is active and the snapshot was active,
this now adds a failure if the two configurations are ABI
incompatible, rather than risking qemu confusion.
A future patch will add a VIR_DOMAIN_SNAPSHOT_FORCE flag, which
will be required for two risky code paths - reverting to an
older snapshot that lacked full domain information, and reverting
from running to a live snapshot that requires starting a new qemu
process. Any reverting that stops a running vm is also a form
of data loss (discarding the current running state to go back in
time), but as that is what reversion usually implies, it is
probably not worth requiring a force flag.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Copy out
domain.
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot): Perform
ABI compatibility checks.
Commit 69278878 fixed one direction of arbitrarily-named snapshots,
but not the round trip path. While auditing domain_conf, I found
a couple other instances that weren't escaping arbitrary strings.
* src/conf/domain_conf.c (virDomainFSDefFormat)
(virDomainGraphicsListenDefFormat, virDomainSnapshotDefFormat):
Escape arbitrary strings.
Just like VM saved state images (virsh save), snapshots MUST
track the inactive domain xml to detect any ABI incompatibilities.
The indentation is not perfect, but functionality comes before form.
Later patches will actually supply a full domain; for now, this
wires up the storage to support one, but doesn't ever generate one
in dumpxml output.
Happily, libvirt.c was already rejecting use of VIR_DOMAIN_XML_SECURE
from read-only connections, even though before this patch, there was
no information to be secured by the use of that flag.
And while we're at it, mark the libvirt snapshot metadata files
as internal-use only.
* src/libvirt.c (virDomainSnapshotGetXMLDesc): Document flag.
* src/conf/domain_conf.h (_virDomainSnapshotDef): Add member.
(virDomainSnapshotDefParseString, virDomainSnapshotDefFormat):
Update signature.
* src/conf/domain_conf.c (virDomainSnapshotDefFree): Clean up.
(virDomainSnapshotDefParseString): Optionally parse domain.
(virDomainSnapshotDefFormat): Output full domain.
* src/esx/esx_driver.c (esxDomainSnapshotCreateXML)
(esxDomainSnapshotGetXMLDesc): Update callers.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotCreateXML)
(vboxDomainSnapshotGetXMLDesc): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML)
(qemuDomainSnapshotLoad, qemuDomainSnapshotGetXMLDesc)
(qemuDomainSnapshotWriteMetadata): Likewise.
* docs/formatsnapshot.html.in: Rework doc example.
Based on a patch by Philipp Hahn.
Minor semantic change - allow domain xml to be generated in place
within a larger buffer, rather than having to go through a
temporary string.
* src/conf/domain_conf.c (virDomainDefFormatInternal): Add
parameter.
(virDomainDefFormat, virDomainObjFormat): Update callers.
Migration is another case of stranding metadata. And since
snapshot metadata is arbitrarily large, there's no way to
shoehorn it into the migration cookie of migration v3.
This patch consolidates two existing locations for migration
validation into one helper function, then enhances that function
to also do the new checks. If we could always trust the source
to validate migration, then the destination would not have to
do anything; but since older servers that did not do checking
can migrate to newer destinations, we have to repeat some of
the same checks on the destination; meanwhile, we want to
detect failures as soon as possible. With migration v2, this
means that validation will reject things at Prepare on the
destination if the XML exposes the problem, otherwise at Perform
on the source; with migration v3, this means that validation
will reject things at Begin on the source, or if the source
is old and the XML exposes the problem, then at Prepare on the
destination.
This patch is necessarily over-strict. Once a later patch
properly handles auto-cleanup of snapshot metadata on the
death of a transient domain, then the only time we actually
need snapshots to prevent migration is when using the
--undefinesource flag on a persistent source domain.
It is possible to recreate snapshot metadata on the destination
with VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE and
VIR_DOMAIN_SNAPSHOT_CREATE_CURRENT. But for now, that is limited,
since if we delete the snapshot metadata prior to migration,
then we won't know the name of the current snapshot to pass
along; and if we delete the snapshot metadata after migration
and use the v3 migration cookie to pass along the name of the
current snapshot, then we need a way to bypass the fact that
this patch refuses migration with snapshot metadata present.
So eventually, we may have to introduce migration protocol v4
that allows feature negotiation and an arbitrary number of
handshake exchanges, so as to pass as many rpc calls as needed
to transfer all the snapshot xml hierarchy.
But all of that is thoughts for the future; for now, the best
course of action is to quit early, rather than get into a
funky state of stale metadata; then relax restrictions later.
* src/qemu/qemu_migration.h (qemuMigrationIsAllowed): Make static.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Alter
signature, and allow checks for both outgoing and incoming.
(qemuMigrationBegin, qemuMigrationPrepareAny)
(qemuMigrationPerformJob): Update callers.
A nice benefit of deleting all snapshots at undefine time is that
you don't have to do any reparenting or subtree identification - since
everything goes, this is an O(n) process, whereas using multiple
virDomainSnapshotDelete calls would be O(n^2) or worse. But it is
only doable for snapshot metadata, where we are in control of the
data being deleted; for the actual snapshots, there's too much
likelihood of something going wrong, and requiring even more API
calls to figure out what failed in the meantime, so callers are
better off deleting the snapshot data themselves one snapshot at
a time where they can deal with failures as they happen.
* src/qemu/qemu_driver.c (qemuDomainUndefineFlags): Honor new flags.
As more clients start to want to know this information, doing
a PATH stat walk and malloc for every client adds up.
We are only caching the location, not the capabilities, so even
if qemu-img is updated in the meantime, it will still probably
live in the same location. So there is no need to worry about
clearing this particular cache.
* src/qemu/qemu_conf.h (qemud_driver): Add member.
* src/qemu/qemu_driver.c (qemudShutdown): Cleanup.
(qemuFindQemuImgBinary): Add an argument, and cache result.
(qemuDomainSnapshotForEachQcow2, qemuDomainSnapshotDiscard)
(qemuDomainSnapshotCreateInactive, qemuDomainSnapshotRevertInactive)
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot): Update
callers.
Just as leaving managed save metadata behind can cause problems
when creating a new domain that happens to collide with the name
of the just-deleted domain, the same is true of leaving any
snapshot metadata behind. For safety sake, extend the semantic
change of commit b26a9fa9 to also cover snapshot metadata as a
reason to reject undefining an inactive domain. A future patch
will make sure that shutdown of a transient domain automatically
deletes snapshot metadata (whether by destroy, shutdown, or
guest-initiated action). Management apps of transient domains
should take care to capture xml of snapshots, if it is necessary
to recreate the snapshot metadata on a later transient domain
with the same name and uuid.
This also documents a new flag that hypervisors can choose to
support as a shortcut for taking care of the metadata as part of
the undefine process; however, nontrivial driver support for these
flags will be deferred to future patches.
Note that ESX and VBox can never be transient; therefore, they
do not have to worry about automatic cleanup after shutdown
(the persistent domain still remains); likewise they never
store snapshot metadata, so the undefine flag is trivial.
The nontrivial work remaining is thus in the qemu driver.
* include/libvirt/libvirt.h.in
(VIR_DOMAIN_UNDEFINE_SNAPSHOTS_METADATA): New flag.
* src/libvirt.c (virDomainUndefine, virDomainUndefineFlags):
Document new limitations and flag.
* src/esx/esx_driver.c (esxDomainUndefineFlags): Trivial
implementation.
* src/vbox/vbox_tmpl.c (vboxDomainUndefineFlags): Likewise.
* src/qemu/qemu_driver.c (qemuDomainUndefineFlags): Enforce
the limitations.
Redefining a qemu snapshot requires a bit of a tweak to the common
snapshot parsing code, but the end result is quite nice.
Be careful that redefinitions do not introduce circular parent
chains. Also, we don't want to allow conversion between online
and offline existing snapshots. We could probably do some more
validation for snapshots that don't already exist to make sure
they are even feasible, by parsing qemu-img output, but that
can come later.
* src/conf/domain_conf.h (virDomainSnapshotParseFlags): New
internal flags.
* src/conf/domain_conf.c (virDomainSnapshotDefParseString): Alter
signature to take internal flags.
* src/esx/esx_driver.c (esxDomainSnapshotCreateXML): Update caller.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotCreateXML): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Support
new public flags.
Supporting NO_METADATA on snapshot creation is interesting - we must
still return a valid opaque snapshot object, but the user can't get
anything out of it (unless we add a virDomainSnapshotGetName()),
since it is no longer registered with the domain.
Also, virsh now tries to query for secure xml, in anticipation of
when we store <domain> xml inside <domainsnapshot>; for now, we
can trivially support it, since we have nothing secure.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Support
new flag.
(qemuDomainSnapshotGetXMLDesc): Trivially support VIR_DOMAIN_XML_SECURE.
The first two flags are essential for being able to replicate
snapshot hierarchies across multiple hosts, which will come in
handy for supervised migrations. It also allows a management app
to take a snapshot of a transient domain, save the metadata, stop
the domain, recreate a new transient domain by the same name,
redefine the snapshot, then revert to it.
This is not quite as convenient as leaving the metadata behind
after a domain is no longer around, but doing that has a few
problems: 1. the libvirt API can only delete snapshot metadata
if there is a valid domain handle to use to get to that snapshot
object - if stale data is left behind without a domain, there is
no way to request that the data be cleaned up. 2. creating a new
domain with the same name but different uuid than the older
domain where a snapshot existed cannot use the older snapshot
data; this risks confusing libvirt, and forbidding the stale
data is similar to the recent patch to forbid stale managed save.
The first two flags might be useful on hypervisors with no metadata,
but only for modifying the notion of the current snapshot;
however, I don't know how to do that for ESX or VBox.
The third flag is a convenience option, to combine a creation with
a delete metadata into one step. It is trivial for hypervisors
with no metadata.
The qemu changes will be involved enough to warrant a separate patch.
* include/libvirt/libvirt.h.in
(VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE)
(VIR_DOMAIN_SNAPSHOT_CREATE_CURRENT)
(VIR_DOMAIN_SNAPSHOT_CREATE_NO_METADATA): New flags.
* src/libvirt.c (virDomainSnapshotCreateXML): Document them, and
enforce mutual exclusion.
* src/esx/esx_driver.c (esxDomainSnapshotCreateXML): Trivial
implementation.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotCreateXML): Likewise.
* docs/formatsnapshot.html.in: Document re-creation.
To make it easier to know when undefine will fail because of existing
snapshot metadata, we need to know how many snapshots have metadata.
Also, it is handy to filter the list of snapshots to just those that
have no parents; document that flag now, but implement it in later patches.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_SNAPSHOT_LIST_ROOTS)
(VIR_DOMAIN_SNAPSHOT_LIST_METADATA): New flags.
* src/libvirt.c (virDomainSnapshotNum)
(virDomainSnapshotListNames): Document them.
* src/esx/esx_driver.c (esxDomainSnapshotNum)
(esxDomainSnapshotListNames): Implement trivial flag.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotNum)
(vboxDomainSnapshotListNames): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotNum)
(qemuDomainSnapshotListNames): Likewise.
Adding this was trivial compared to the previous patch for fixing
qemu snapshot deletion in the first place.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiscard): Add
parameter.
(qemuDomainSnapshotDiscardDescendant, qemuDomainSnapshotDelete):
Update callers.
A future patch will make it impossible to remove a domain if it
would leave behind any libvirt-tracked metadata about snapshots,
since stale metadata interferes with a new domain by the same name.
But requiring snaphot contents to be deleted before removing a
domain is harsh; with qemu, qemu-img can still make use of the
contents after the libvirt domain is gone. Therefore, we need
an option to get rid of libvirt tracking information, but not
the actual contents. For hypervisors that do not track any
metadata in libvirt, the implementation is trivial; all remaining
hypervisors (really, just qemu) will be dealt with separately.
* include/libvirt/libvirt.h.in
(VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY): New flag.
* src/libvirt.c (virDomainSnapshotDelete): Document it.
* src/esx/esx_driver.c (esxDomainSnapshotDelete): Trivially
supported when there is no libvirt metadata.
* src/vbox/vbox_tmpl.c (vboxDomainSnapshotDelete): Likewise.
Similar to the last patch in isolating the filtering from the
client actions, so that clients don't have to reinvent the
filtering.
* src/conf/domain_conf.h (virDomainSnapshotForEachChild): New
prototype.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/conf/domain_conf.c (virDomainSnapshotActOnChild)
(virDomainSnapshotForEachChild): New functions.
(virDomainSnapshotCountChildren): Delete.
(virDomainSnapshotHasChildren): Simplify.
* src/qemu/qemu_driver.c (qemuDomainSnapshotReparentChildren)
(qemuDomainSnapshotDelete): Likewise.
Deleting a snapshot and all its descendants had problems with
tracking the current snapshot. The deletion does not necessarily
proceed in depth-first order, so a parent could be deleted
before a child, wreaking havoc on passing the notion of the
current snapshot to the parent. Furthermore, even if traversal
were depth-first, doing multiple file writes to pass current up
the chain one snapshot at a time is wasteful, comparing to a
single update to the current snapshot at the end of the algorithm.
* src/qemu/qemu_driver.c (snap_remove): Add field.
(qemuDomainSnapshotDiscard): Add parameter.
(qemuDomainSnapshotDiscardDescendant): Adjust accordingly.
(qemuDomainSnapshotDelete): Properly reset current.
This one's nasty. Ever since we fixed virHashForEach to prevent
nested hash iterations for safety reasons (commit fba550f6),
virDomainSnapshotDelete with VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN
has been broken for qemu: it deletes children, while leaving
grandchildren intact but pointing to a no-longer-present parent.
But even before then, the code would often appear to succeed to
clean up grandchildren, but risked memory corruption if you have
a large and deep hierarchy of snapshots.
For acting on just children, a single virHashForEach is sufficient.
But for acting on an entire subtree, it requires iteration; and
since we declared recursion as invalid, we have to switch to a
while loop. Doing this correctly requires quite a bit of overhaul,
so I added a new helper function to isolate the algorithm from the
actions, so that callers do not have to reinvent the iteration.
Note that this _still_ does not handle CHILDREN correctly if one
of the children is the current snapshot; that will be next.
* src/conf/domain_conf.h (_virDomainSnapshotDef): Add mark.
(virDomainSnapshotForEachDescendant): New prototype.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/conf/domain_conf.c (virDomainSnapshotMarkDescendant)
(virDomainSnapshotActOnDescendant)
(virDomainSnapshotForEachDescendant): New functions.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiscardChildren):
Replace...
(qemuDomainSnapshotDiscardDescenent): ...with callback that
doesn't nest hash traversal.
(qemuDomainSnapshotDelete): Use new function.
Each snapshot lookup was iterating over the entire hash table, O(n),
instead of honing in directly on the hash key, amortized O(1).
Besides, fixing this means that virDomainSnapshotFindByName can now
be used inside another virHashForeach iteration (without this patch,
attempts to lookup a snapshot by name during a hash iteration will
fail due to nested iteration).
* src/conf/domain_conf.c (virDomainSnapshotFindByName): Simplify.
(virDomainSnapshotObjListSearchName): Delete unused function.
For a system checkpoint of a running or paused domain, it's fairly
easy to honor new flags for altering which state to use after the
revert. For an inactive snapshot, the revert has to be done while
there is no qemu process, so do back-to-back transitions; this also
lets us revert to inactive snapshots even for transient domains.
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Support new
flags.
Commit 5e47785 broke reverts to offline system checkpoint snapshots
with older qemu, since there is no longer any code path to use
qemu -loadvm on next boot. Meanwhile, reverts to offline system
checkpoints have been broken for newer qemu, both before and
after that commit, since -loadvm no longer works to revert to
disk state without accompanying vm state. Fix both of these by
using qemu-img to revert disk state.
Meanwhile, consolidate the (now 3) clients of a qemu-img iteration
over all disks of a VM into one function, so that any future
algorithmic fixes to the FIXMEs in that function after partial
loop iterations are dealt with at once. That does mean that this
patch doesn't handle partial reverts very well, but we're not
making the situation any worse in this patch.
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Use
qemu-img rather than 'qemu -loadvm' to revert to offline snapshot.
(qemuDomainSnapshotRevertInactive): New helper.
(qemuDomainSnapshotCreateInactive): Factor guts...
(qemuDomainSnapshotForEachQcow2): ...into new helper.
(qemuDomainSnapshotDiscard): Use it.
If you take a checkpoint snapshot of a running domain, then pause
qemu, then restore the snapshot, the result should be a running
domain, but the code was leaving things paused. Furthermore, if
you take a checkpoint of a paused domain, then run, then restore,
there was a brief but non-deterministic window of time where the
domain was running rather than paused. Fix both of these
discrepancies by always pausing before restoring.
Also, check that the VM is active every time lock is dropped
between two monitor calls.
Finally, straighten out the events that get emitted on each
transition.
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Always
pause before reversion, and improve events.
Implement the new running/paused overrides for saved state management.
Unfortunately, for virDomainSaveImageDefineXML, the saved state
updates are write-only - I don't know of any way to expose a way
to query the current run/pause setting of an existing save image
file to the user without adding a new API or modifying the domain
xml of virDomainSaveImageGetXMLDesc to include a new element to
reflect the state bit encoded into the save image. However, I
don't think this is a show-stopper, since the API is designed to
leave the state bit alone unless an explicit flag is used to
change it.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal)
(qemuDomainSaveImageOpen): Adjust signature.
(qemuDomainSaveFlags, qemuDomainManagedSave)
(qemuDomainRestoreFlags, qemuDomainSaveImageGetXMLDesc)
(qemuDomainSaveImageDefineXML, qemuDomainObjRestore): Adjust
callers.
While it is nice that snapshots and saved images remember whether
the domain was running or paused, sometimes the restoration phase
wants to guarantee a particular state (paused to allow hot-plugging,
or running without needing to call resume). This introduces new
flags to allow the control, and a later patch will implement the
flags for qemu.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_SAVE_RUNNING)
(VIR_DOMAIN_SAVE_PAUSED, VIR_DOMAIN_SNAPSHOT_REVERT_RUNNING)
(VIR_DOMAIN_SNAPSHOT_REVERT_PAUSED): New flags.
* src/libvirt.c (virDomainSaveFlags, virDomainRestoreFlags)
(virDomainManagedSave, virDomainSaveImageDefineXML)
(virDomainRevertToSnapshot): Document their use, and enforce
mutual exclusion.
There are two classes of management apps that track events - one
that only cares about on/off (and only needs to track EVENT_STARTED
and EVENT_STOPPED), and one that cares about paused/running (also
tracks EVENT_SUSPENDED/EVENT_RESUMED). To keep both classes happy,
any transition that can go from inactive to paused must emit two
back-to-back events - one for started and one for suspended (since
later resuming of the domain will only send RESUMED, but the first
class isn't tracking that).
This also fixes a bug where virDomainCreateWithFlags with the
VIR_DOMAIN_START_PAUSED flag failed to start paused when restoring
from a managed save image.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_EVENT_SUSPENDED_RESTORED)
(VIR_DOMAIN_EVENT_SUSPENDED_FROM_SNAPSHOT)
(VIR_DOMAIN_EVENT_RESUMED_FROM_SNAPSHOT): New sub-events.
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Use them.
(qemuDomainSaveImageStartVM): Likewise, and add parameter.
(qemudDomainCreate, qemuDomainObjStart): Send suspended event when
starting paused.
(qemuDomainObjRestore): Add parameter.
(qemuDomainObjStart, qemuDomainRestoreFlags): Update callers.
* examples/domain-events/events-c/event-test.c
(eventDetailToString): Map new detail strings.
QEMU uses USB bus name "usb.0" when using the legacy -usb argument.
If we want to allow USB devices to specify their addresses with legacy
-usb, we should either in case of legacy bus name drop the 0 from the
address bus, or just drop the 0 from device id. This patch does the
later.
Another solution would be to permit addressing on non-legacy USB
controllers only.
So that devices can be attached to hubs. Example, to attach to first
port of a usb-hub on port 1.
<hub type='usb'>
<address type='usb' bus='0' port='1'/>
</hub>
<input type='mouse' type='usb'>
<address type='usb' bus='0' port='1.1'/>
</hub>
also add a test entry
Commit 6766ff10 introduced a corner case bug with snapshot creation:
if a snapshot is created, but then we hit OOM while trying to
create the return value of the function, then we have polluted the
internal directory with the snapshot metadata with no way to clean
it up from the running libvirtd.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Don't
write metadata file on OOM condition.
Newer QEMU introduced cache=directsync for -drive, this patchset
is to expose it in libvirt layer.
* Introduced a new QEMU capability flag ($prefix_CACHE_DIRECTSYNC),
As even $prefix_CACHE_V2 is set, we can't known if directsync
is supported.
This patch adds the ability to make the filesystem for a filesystem
pool during a pool build.
The patch adds two new flags, no overwrite and overwrite, to control
when mkfs gets executed. By default, the patch preserves the
current behavior, i.e., if no flags are specified, pool build on a
filesystem pool only makes the directory on which the filesystem
will be mounted.
If the no overwrite flag is specified, the target device is checked
to determine if a filesystem of the type specified in the pool is
present. If a filesystem of that type is already present, mkfs is
not executed and the build call returns an error. Otherwise, mkfs
is executed and any data present on the device is overwritten.
If the overwrite flag is specified, mkfs is always executed, and any
existing data on the target device is overwritten unconditionally.
Several users have reported problems with 'virsh start' failing because
it was encountering a managed save situation where the managed save file
was incomplete. Be more robust to this by using two different magic
numbers, so that newer libvirt can gracefully handle an incomplete file
differently than a complete one, while older libvirt will at least fail
up front rather than trying to load only to have qemu fail at the end.
Managed save is a convenience - it exists to preserve as much state
as possible; if the state was not preserved, it is reasonable to just
log that fact, then proceed with a fresh boot. On the other hand,
user saves are under user control, so we must fail, but by making
the failure message distinct, the user can better decide how to handle
the situation of an incomplete save file.
* src/qemu/qemu_driver.c (QEMUD_SAVE_PARTIAL): New define.
(qemuDomainSaveInternal): Use it to mark incomplete images.
(qemuDomainSaveImageOpen, qemuDomainObjRestore): Add parameter
that controls what to do with partial images.
(qemuDomainRestoreFlags, qemuDomainSaveImageGetXMLDesc)
(qemuDomainSaveImageDefineXML, qemuDomainObjStart): Update callers.
Based on an initial idea by Osier Yang.
In a SELinux or root-squashing NFS environment, libvirt has to go
through some hoops to create a new file that qemu can then open()
by name. Snapshots are a case where we want to guarantee an empty
file that qemu can open; also, reopening a save file to convert it
from being marked partial to complete requires a reopen to avoid
O_DIRECT headaches. Refactor some existing code to make it easier
to reuse in later patches.
* src/qemu/qemu_migration.h (qemuMigrationToFile): Drop parameter.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Let cgroup do
the stat, rather than asking caller to do it and pass info down.
* src/qemu/qemu_driver.c (qemuOpenFile): New function, pulled from...
(qemuDomainSaveInternal): ...here.
(doCoreDump, qemuDomainSaveImageOpen): Use it here as well.
After supporting multi function pci device, we only reserve function 1 on slot 1.
The user can use the other function on slot 1 in the xml config file. We should
detect this wrong usage.
Currently, the lxc implementation invokes 'ip' and 'ifconfig' commands
inside a container using 'virRun'. That has the side effect of requiring
those commands to be present and to function in a manner consistent with
the usage. Some small roots (such as ttylinux) may not have 'ip' or
'ifconfig'.
This patch replaces the use of these commands with usage of
netdevice. The result is that lxc containers do not have to implement
those commands, and lxc in libvirt is only dependent on the netdevice
interface.
I've tested this patch locally against the ubuntu libvirt version enough
to verify its generally sane. I attempted to build upstream today, but
failed with:
/usr/bin/ld:
../src/.libs/libvirt_driver_qemu.a(libvirt_driver_qemu_la-qemu_domain.o):
undefined reference to symbol 'xmlXPathRegisterNs@@LIBXML2_2.4.30
Thats probably a local issue only, but I wanted to get this patch up and
see what others thought of it. This is ubuntu bug
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/828211 .
Hi,
I'm seeing an issue with udev and libvirt-lxc. Libvirt-lxc creates
/dev/ptmx as a symlink to /dev/pts/ptmx. When udev starts up, it
checks the device type, sees ptmx is 'not right', and replaces it
with a 'proper' ptmx.
In lxc, /dev/ptmx is bind-mounted from /dev/pts/ptmx instead of being
symlinked, so udev sees the right device type and leaves it alone.
A patch like the following seems to work for me. Would there be
any objections to this?
>From 4c5035de52de7e06a0de9c5d0bab8c87a806cba7 Mon Sep 17 00:00:00 2001
From: Ubuntu <ubuntu@domU-12-31-39-14-F0-B3.compute-1.internal>
Date: Wed, 31 Aug 2011 18:15:54 +0000
Subject: [PATCH 1/1] make ptmx a bind mount rather than symlink
udev on some systems checks the device type of /dev/ptmx, and replaces it if
not as expected. The symlink created by libvirt-lxc therefore gets replaced.
By creating it as a bind mount, the device type is correct and udev leaves it
alone.
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>
The libvirt BlockPull API supports the use of an initial bandwidth limit but the
qemu block_stream API does not. To get the desired behavior we use the two APIs
strung together: first BlockPull, then BlockJobSetSpeed. We can do this at the
driver level to avoid duplicated code in each monitor path.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Due to an unfortunate precedent in qemu, the units for the bandwidth parameter
to block_job_set_speed are different between the text monitor and the qmp
monitor. While the qmp monitor uses bytes/s, the text monitor expects MB/s.
Correct the units for the text interface.
Signed-off-by: Adam Litke <agl@us.ibm.com>
On systems with many pcpus, the sexpr returned by xend can be quite
large for dom0 when it is configured to have #vcpus = #pcpus (default).
E.g. on a 80 pcpu system, where dom0 had 80 vcpus, the sexpr details
for dom0 was 73817 bytes! Increase maximum buffer size to 256k.
xenDaemonDomainFetch() was overwriting errors reported by
xend_get() and xend_req(). E.g. without patch
error: failed Xen syscall xenDaemonDomainFetch failed to find this domain
with patch
error: internal error Xend returned HTTP Content-Length of 73817, which exceeds
maximum of 65536
Commit 2c85644b0b attempted to
fix a problem with tracking RPC messages from streams by doing
- if (msg->header.type == VIR_NET_REPLY) {
+ if (msg->header.type == VIR_NET_REPLY ||
+ (msg->header.type == VIR_NET_STREAM &&
+ msg->header.status != VIR_NET_CONTINUE)) {
client->nrequests--;
In other words any stream packet, with status NET_OK or NET_ERROR
would cause nrequests to be decremented. This is great if the
packet from from a synchronous virStreamFinish or virStreamAbort
API call, but wildly wrong if from a server initiated abort.
The latter resulted in 'nrequests' being decremented below zero.
This then causes all I/O for that client to be stopped.
Instead of trying to infer whether we need to decrement the
nrequests field, from the message type/status, introduce an
explicit 'bool tracked' field to mark whether the virNetMessagePtr
object is subject to tracking.
Also add a virNetMessageClear function to allow a message
contents to be cleared out, without adversely impacting the
'tracked' field as a naive memset() would do
* src/rpc/virnetmessage.c, src/rpc/virnetmessage.h: Add
a 'bool tracked' field and virNetMessageClear() API
* daemon/remote.c, daemon/stream.c, src/rpc/virnetclientprogram.c,
src/rpc/virnetclientstream.c, src/rpc/virnetserverclient.c,
src/rpc/virnetserverprogram.c: Switch over to use
virNetMessageClear() and pass in the 'bool tracked' value
when creating messages.
Parted does not report disk size in 512 byte units, but
rather the disks' logical sector size, which with modern
drives might be 4k.
* src/storage/parthelper.c: Remove hardcoded 512 byte sector
size
Introduced by 5e495c8b, except the ones for checking if numa
is supported by host, all the NO_SUPPORT are changed back. For
the ones about numa checking, change them into INTERNAL_ERROR.
If the libxl driver is compiled in, then everytime libvirtd
starts up on a non-Xen Dom0 host, it logs a error message.
Since this is an expected condition, we should not log at
'error' level, only 'info'.
* src/libxl/libxl_driver.c: Lower log level for certain
expected errors during driver init
It is possible (expected/likely in Fedora 15) for a cgroup controller
to be mounted in multiple locations at the same time, due to bind
mounts. Currently we leak memory if this happens, because we overwrite
the previous 'mountPoint' string. Instead just accept the first match
we find.
* src/util/cgroup.c: Only accept first match for a cgroup
controller mount
The virSecurityManagerSetProcessFDLabel method was introduced
after a mis-understanding from a conversation about SELinux
socket labelling. The virSecurityManagerSetSocketLabel method
should have been used for all such scenarios.
* src/security/security_apparmor.c, src/security/security_apparmor.c,
src/security/security_driver.h, src/security/security_manager.c,
src/security/security_manager.h, src/security/security_selinux.c,
src/security/security_stack.c: Remove SetProcessFDLabel driver
It is not possible to change the label of a TCP socket once it
has been opened. When creating a TCP socket care must be taken
to ensure the socket creation label is set & then cleared.
Remove the bogus call to virSecurityManagerSetProcessFDLabel
from the lock driver guest setup code and instead make use of
virSecurityManagerSetSocketLabel
The code for creating a sanlock lockspace accidentally used
SANLK_NAME_LEN instead of SANLK_PATH_LEN for a size check.
This meant disk paths were limited to 48 bytes !
* src/locking/lock_driver_sanlock.c: Fix disk path length
check
There is no reason to forbid pausing an autodestroy domain
(not to mention that 'virsh start --paused --autodestroy'
succeeds in creating a paused autodestroy domain).
Meanwhile, qemu was failing to enforce the API documentation that
autodestroy domains cannot be saved. And while the original
documentation only mentioned save/restore, snapshots are another
form of saving that are close enough in semantics as to make no
sense on one-shot domains.
* src/qemu/qemu_driver.c (qemudDomainSuspend): Drop bogus check.
(qemuDomainSaveInternal, qemuDomainSnapshotCreateXML): Forbid
saves of autodestroy domains.
* src/libvirt.c (virDomainCreateWithFlags, virDomainCreateXML):
Document snapshot interaction.
According to qemu-kvm/qerror.c all messages start with a capital
"Device ", but the current code only scans for the lower case "device ".
This results in "virDomainUpdateDeviceFlags()" to not detect locked
CD-ROMs and reporting success even in the case of a failure:
# virsh qemu-monitor-command "$VM" change\ drive-ide0-0-0\ \"/var/lib/libvirt/images/ucs_2.4-0-sec4-20110714145916-dvd-amd64.iso\"
Device 'drive-ide0-0-0' is locked
# virsh update-device "$VM" /dev/stdin <<<"<disk type='file' device='cdrom'><driver name='qemu' type='raw'/><source file='/var/lib/libvirt/images/ucs_2.4-0-sec4-20110714145916-dvd-amd64.iso'/><target dev='hda' bus='ide'/><readonly/><alias name='ide0-0-0'/><address type='drive' controller='0' bus='0' unit='0'/></disk>"
Device updated successfully
Signed-off-by: Philipp Hahn <hahn@univention.de>
There have been several instances of people having problems with
a broken managed save file, and not aware that they could use
'virsh managedsave-remove dom' to fix things. Making it possible
to do this as part of starting a domain makes the same functionality
easier to find, and one less API call.
* include/libvirt/libvirt.h.in (VIR_DOMAIN_START_FORCE_BOOT): New
flag.
* src/libvirt.c (virDomainCreateWithFlags): Document it.
* src/qemu/qemu_driver.c (qemuDomainObjStart): Alter signature.
(qemuAutostartDomain, qemuDomainStartWithFlags): Update callers.
* tools/virsh.c (cmdStart): Expose it in virsh.
* tools/virsh.pod (start): Document it.
Back in 2008 when this line of util.h was written, gnulib's verify
module didn't allow the use of multiple verify() in one file
in combination with our choice of gcc -W options. But that has
since been fixed in gnulib, and newer gnulib even maps verify()
to the C1x feature of _Static_assert, which gives even nicer
diagnostics with a new enough compiler, so we might as well go
with the simpler verify().
* src/util/util.h (VIR_ENUM_IMPL): Use simpler verify, now that
gnulib module is smarter.
Commit 3261761 made it possible to use pipes instead of sockets
for outgoing tunneled migration; however, it caused a regression
because the pipe was never given a SELinux label.
* src/qemu/qemu_migration.c (doTunnelMigrate): Label outgoing pipe.
The bufferOffset has been initialized to zero in virNetMessageEncodePayloadRaw(),
so, we use bufferLength to represent the length of message which is going to be
sent to client side.
From: Matthias Bolte <matthias.bolte@googlemail.com>
Tested-by: Jim Fehlig <jfehlig@novell.com>
Matthias provided this patch to fix an issue I encountered in the
generator with APIs containing call-by-ref long type, e.g.
int virDomainMigrateGetMaxSpeed(virDomainPtr domain,
unsigned long *bandwidth,
unsigned int flags);
Domain listing, basic information retrieval and domain life cycle
management is implemented. But currently the domain XML output
lacks the complete devices section.
The driver uses OpenWSMAN to directly communicate with a Hyper-V
server over its WS-Management interface exposed via Microsoft WinRM.
The driver is based on the work of Michael Sievers. This started in
the same master program project group at the University of Paderborn
as the ESX driver.
See Michael's blog for details: http://hyperv4libvirt.wordpress.com/
Add a generator script to generate the structs and serialization
information for OpenWSMAN.
openwsman.h collects workarounds for problems in OpenWSMAN <= 2.2.6.
There are also disabled sections that would use ws_serializer_free_mem
but can't because it's broken in OpenWSMAN <= 2.2.6. Patches to fix
this have been posted upstream.
When a user migrates a domain by command as
libvirt saves vm's domain XML config in destination host after migration.
But it saves vm->def. Then, the saved XML contains some garbage.
<domain type='kvm' id='50'>
^^^^^^^^
...
<console type='pty' tty='/dev/pts/5'>
^^^^^^^^^^^^^^^^^
Avoid saving unnecessary things by saving persistent vm definition.
In case we add a new program in the future (we did that in the past and
we are going to do it again soon) current daemon will behave badly with
new client that wants to use the new program. Before the RPC rewrite we
used to just send an error reply to any request with unknown program.
With the RPC rewrite in 0.9.3 the daemon just closes the connection
through which such request was sent. This patch fixes this regression.
If users wants to connect to remote unix socket, e.g.
'qemu+unix://<remote>/system' currently the <remote> part is ignored,
ending up connecting to localhost. Connecting to remote socket is not
supported and user should have used TLS/TCP/SSH instead.
On success, the 'sendkey' command does not return any data, so
any data in the reply should be considered to be an error
message
* src/qemu/qemu_monitor_text.c: Treat non-"" reply data as an
error message for 'sendkey' command
The QEMU 'sendkey' command expects keys to be encoded in the same
way as the RFB extended keycode set. Specifically it wants extended
keys to have the high bit of the first byte set, while the Linux
XT KBD driver codeset uses the low bit of the second byte. To deal
with this we introduce a new keymap 'RFB' and use that in the QEMU
driver
* include/libvirt/libvirt.h.in: Add VIR_KEYCODE_SET_RFB
* src/qemu/qemu_driver.c: Use RFB keycode set instead of XT KBD
* src/util/virkeycode-mapgen.py: Auto-generate the RFB keycode
set from the XT KBD set
* src/util/virkeycode.c: Add RFB keycode entry to table. Add a
verify check on cardinality of the codeOffset table
This API labels all sockets created until ClearSocketLabel is called in
a way that a vm can access them (i.e., they are labeled with svirt_t
based label in SELinux).
The APIs are designed to label a socket in a way that the libvirt daemon
itself is able to access it (i.e., in SELinux the label is virtd_t based
as opposed to svirt_* we use for labeling resources that need to be
accessed by a vm). The new name reflects this.
When virStreamAbort is called on a stream that has not been used yet,
quite confusing error is returned: "this function is not supported by
the connection driver". Let's just ignore such streams as there's
nothing to abort anyway.
If migration failed on source daemon, the migration is automatically
canceled by the daemon itself. Thus we don't need to call
virDomainMigrateConfirm3(cancelled=1). Calling it doesn't cause any harm
but the resulting error message printed in logs may confuse people.
Audit all changes to the qemu vm->current_snapshot, and make them
update the saved xml file for both the previous and the new
snapshot, so that there is always at most one snapshot with
<active>1</active> in the xml, and that snapshot is used as the
current snapshot even across libvirtd restarts.
This patch does not fix the case of virDomainSnapshotDelete(,CHILDREN)
where one of the children is the current snapshot; that will be later.
* src/conf/domain_conf.h (_virDomainSnapshotDef): Alter member
type and name.
* src/conf/domain_conf.c (virDomainSnapshotDefParseString)
(virDomainSnapshotDefFormat): Update clients.
* docs/schemas/domainsnapshot.rng: Tighten rng.
* src/qemu/qemu_driver.c (qemuDomainSnapshotLoad): Reload current
snapshot.
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot)
(qemuDomainSnapshotDiscard): Track current snapshot.
Changing the current vm, and writing that change to the file
system, all before a new qemu starts, is risky; it's hard to
roll back if starting the new qemu fails for some reason.
Instead of abusing vm->current_snapshot and making the command
line generator decide whether the current snapshot warrants
using -loadvm, it is better to just directly pass a snapshot all
the way through the call chain if it is to be loaded.
This frees up the last use of snapshot->def->active for qemu's
use, so the next patch can repurpose that field for tracking
which snapshot is current.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Don't use active
field of snapshot.
* src/qemu/qemu_process.c (qemuProcessStart): Add a parameter.
* src/qemu/qemu_process.h (qemuProcessStart): Update prototype.
* src/qemu/qemu_migration.c (qemuMigrationPrepareAny): Update
callers.
* src/qemu/qemu_driver.c (qemudDomainCreate)
(qemuDomainSaveImageStartVM, qemuDomainObjStart)
(qemuDomainRevertToSnapshot): Likewise.
(qemuDomainSnapshotSetCurrentActive)
(qemuDomainSnapshotSetCurrentInactive): Delete unused functions.
https://bugzilla.redhat.com/show_bug.cgi?id=727709
mentions that if qemu fails to create the snapshot (such as what
happens on Fedora 15 qemu, which has qmp but where savevm is only
in hmp, and where libvirt is old enough to not try the hmp fallback),
then 'virsh snapshot-list dom' will show a garbage snapshot entry,
and the libvirt internal directory for storing snapshot metadata
will have a bogus file.
This fixes the fallout bug of polluting the snapshot-list with
garbage on failure (the root cause of the F15 bug of not having
fallback to hmp has already been fixed in newer libvirt releases).
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Allocate
memory before making snapshot, and cleanup on failure. Don't
dereference NULL if transient domain exited during snapshot creation.
* src/qemu/qemu_migration.c: avoid dead 'ret' assignment and silence
clang warning.
Detected by ccc-analyzer:
libvirt.c:4277:5: warning: Value stored to 'ret' is never read
ret = domain->conn->driver->domainMigrateConfirm3
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* src/qemu/qemu_migration.c: avoid dead 'ret' assignment and silence
clang warning.
Detected by ccc-analyzer:
CC libvirt_driver_qemu_la-qemu_migration.lo
qemu/qemu_migration.c:2046:5: warning: Value stored to 'ret' is never read
ret = qemuMigrationConfirm(driver, sconn, vm,
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
virFileOpenAs takes desired uid:gid as arguments, and not only uses
them for a fork/setuid/setgid when retrying failed open operations,
but additionally always forces the opened file to be owned by the
given uid:gid.
One example of the problems this causes is that, when restoring a
domain from a file that is owned by the qemu user, opening the file
chowns it to root. if dynamic_ownership=1 this is coincidentally
expected, but if dynamic_ownership=0, no existing file should ever
have its ownership changed.
This patch adds an extra check before calling fchown() - it only does
it if O_CREAT was passed to virFileOpenAs() in the openflags.
pciDeviceListSteal(pcidevs, dev) removes dev from pcidevs reducing
the length of pcidevs, so moving onto what was the next dev is wrong.
Instead callers should pop entry 0 repeatedly until pcidevs is empty.
Signed-off-by: Steve Hodgson <shodgson@solarflare.com>
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
I was testing a virsh patch, and wanted to see if I had passed the
flags I thought. But with LIBVIRT_DEBUG in the environment, I just
saw:
14:24:52.359: 15022: debug : virDomainSnapshotNum:15586 : dom=0xc9c180, (VM: name=rhel_6-64, uuid=48f8e8e7-e14f-0e14-02f0-ce71997bdcab),
including a trailing space. This fixes the issues.
* src/libvirt.c: Log flag parameters, even if currently unused.
(VIR_DOMAIN_DEBUG_0): Drop trailing comma in log.
(VIR_DOMAIN_DEBUG_1): Split guts into...
(VIR_DOMAIN_DEBUG_2): ...new macro.
* src/qemu/qemu_monitor_json.c: Handle error "CommandNotFound" and
report the error.
* src/qemu/qemu_monitor_text.c: If a sub info command is not found,
it prints the output of "help info", for other commands,
"unknown command" is printed.
Without this patch, libvirt always report:
An error occurred, but the cause is unknown
This patch was adapted from a patch by Osier Yang <jyang@redhat.com> to
break out detection of unrecognized text monitor commands into a separate
function.
Signed-off-by: Adam Litke <agl@us.ibm.com>
s/VIR_ERR_NO_SUPPORT/VIR_ERR_OPERATION_INVALID/
Special case is changes on lxcDomainInterfaceStats, if it's not
implemented on the platform, prints error like:
lxcError(VIR_ERR_OPERATION_INVALID, "%s",
_("interface stats not implemented on this platform"));
As the function is supported by driver actually, error like
VIR_ERR_NO_SUPPORT is confused.
Now, bad key-code in send-key can cause segmentation fault in libvirt.
(example)
% virsh send-key --codeset win32 12
error: End of file while reading data: Input/output error
This is caused by overrun at scanning keycode array.
Fix it.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Often, we want to use XPath functions on the just-parsed document;
fold this into the parser function for convenience.
* src/util/xml.h (virXMLParseHelper): Add argument.
(virXMLParseStrHelper, virXMLParseFileHelper): Delete.
(virXMLParseCtxt, virXMLParseStringCtxt, virXMLParseFileCtxt): New
macros.
* src/libvirt_private.syms (xml.h): Remove deleted functions.
* src/util/xml.c (virXMLParseHelper): Add argument.
(virXMLParseStrHelper, virXMLParseFileHelper): Delete.
ACK was given too soon. According to the code, the xm driver is
only used for inactive domains, and has no notion of an active
domain, thus, it cannot support undefine of a running domain.
The real fix for xen needs to be in the unified driver and/or
the xend level.
This reverts commit 49186deda6.
* src/qemu/qemu_monitor_text.c: BALLOON_PREFIX was defined as
"balloon: actual=", which cause "actual=" is stripped early before
the real parsing. This patch changes BALLOON_PREFIX into "balloon: ",
and modifies related functions, also renames
"qemuMonitorParseExtraBalloonInfo" to "qemuMonitorParseBalloonInfo",
as after the changing, it parses all the info returned by "info balloon".
Although we are flushing cache after some critical writes (e.g.
volume creation), after some others we do not (e.g. volume cloning).
This patch fix this issue. That is for volume cloning, writing
header of logical volume, and storage wipe.
When spice_tls is set but listen_tls is not, we don't initialize
GnuTLS library. So any later gnutls call (e.g. during migration,
where we initialize a certificate) will access uninitialized GnuTLS
internal structs and throws an error.
Although, we might now initialize GnuTLS twice, it is safe according
to the documentation:
This function can be called many times,
but will only do something the first time.
This patch creates 2 functions: virNetTLSInit and virNetTLSDeinit
with respect to written above.
Regression introduced in commit b7e5ca4.
Mingw lacks kill(), but we were only using it for a sanity check;
so we can go with one less check.
Also, on OOM error, this function should outright fail rather than
claim that the pid file was successfully read.
* src/util/virpidfile.c (virPidFileReadPathIfAlive): Skip kill
call where unsupported, and report error on OOM.
If a client had initiated a stream abort, it will have a call
waiting for a reply in the queue. If more data continues to
arrive on the stream, the abort command could mistakenly get
signalled as complete. Remove the code from async data processing
that looked for waiting calls. Add a sanity check to ensure no
async call can ever be marked as needing a reply
* src/rpc/virnetclient.c: Ensure async data packets can't
trigger a reply
A virsh command like:
migrate --live --copy-storage-all Guest qemu+ssh://user@host/system
--persistent --verbose
shows
Migration: [ 0 %]
during the storage copy and does not start counting
until the ram transfer starts
Fix this by scraping optional disk transfer status, and adding it
into the progress meter.
Otherwise the device will still be bound to pci-stub driver even
it's set as "managed=yes" when do detaching. Of course, it won't
triger any driver reprobing too.
If a stream gets a server initiated abort, the client may still
send an abort request before it receives the server side abort.
This causes the server to send back another abort for the
stream. Since the protocol defines that abort is the last thing
to be sent, the client gets confused by this second abort from
the server. If the stream is already shutdown, just drop any
client requested abort, rather than sending back another message.
This fixes the regression from previous versions.
Tested as follows
In one virsh session
virsh # start foo
virsh # console foo
In other virsh session
virsh # destroy foo
The first virsh session should be able to continue issuing
commands without error. Prior to this patch it saw
virsh # list
error: Failed to list active domains
error: An error occurred, but the cause is unknown
virsh # list
error: Failed to list active domains
error: no call waiting for reply with prog 536903814 vers 1 serial 9
* src/rpc/virnetserverprogram.c: Drop abort requests
for streams which no longer exist
Every active stream results in a reference being held on the
virNetServerClientPtr object. This meant that if a client quit
with any streams active, although all I/O was stopped the
virNetServerClientPtr object would leak. This causes libvirtd
to leak any file handles associated with open streams when a
client quit
To fix this, when we call virNetServerClientClose there is a
callback invoked which lets the daemon release the streams
and thus the extra references
* daemon/remote.c: Add a hook to close all streams
* daemon/stream.c, daemon/stream.h: Add API for releasing
all streams
* src/rpc/virnetserverclient.c, src/rpc/virnetserverclient.h:
Allow registration of a hook to trigger when closing client
Get rid of the #if __linux__ check in virPidFileReadPathIfAlive that
was preventing a check of a symbolic link in /proc/<pid>/exe on
non-linux platforms against an expected executable. Replace
this with a run-time check testing whether the /proc/<pid>/exe is a
symbolic link and if so call the function doing the comparison
against the expected file the link is supposed to point to.
This patch renames getPhysfn to getPhysfnDev and adds code to get the
Physical function and Virtual Function index of the direct attach linkdev (if
the direct attach interface is a SRIOV VF). The idea is to send the port
profile message to a PF if the direct attach interface is a SRIOV VF.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
This patch adds the following functions to get PF/VF relationship of an SRIOV
network interface:
ifaceIsVirtualFunction: Function to check if a network interface is a SRIOV VF
ifaceGetVirtualFunctionIndex: Function to get VF index if a network interface is a SRIOV VF
ifaceGetPhysicalFunction: Function to get the PF net interface name of a SRIOV VF net interface
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
This patch adds the following helper functions:
pciDeviceIsVirtualFunction: Function to check if a pci device is a sriov VF
pciGetVirtualFunctionIndex: Function to get the VF index of a sriov VF
pciDeviceNetName: Function to get the network device name of a pci device
pciConfigAddressCompare: Function to compare pci config addresses
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
This patch moves some of the sriov related pci code from node_device driver
to src/util/pci.[ch]. Some functions had to go thru name and argument list
change to accommodate the move.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
In some versions of qemu, both virtio-blk-pci and virtio-net-pci
devices can have an event_idx setting that determines some details of
event processing. When it is enabled, it "reduces the number of
interrupts and exits for the guest". qemu will automatically enable
this feature when it is available, but there may be cases where this
new feature could actually make performance worse (NB: no such case
has been found so far).
As a safety switch in case such a situation is encountered in the
field, this patch adds a new attribute "event_idx" to the <driver>
element of both disk and interface devices. event_idx can be set to
"on" (to force event_idx on in case qemu has it disabled by default)
or "off" (for force event_idx off). In the case that event_idx support
isn't present in qemu, the attribute is ignored (this on the advice of
the qemu developer).
docs/formatdomain.html.in: document the new flag (marking it as
"don't mess with this!"
docs/schemas/domain.rng: add event_idx in appropriate places
src/conf/domain_conf.[ch]: add event_idx to parser and formatter
src/libvirt_private.syms: export
virDomainVirtioEventIdx(From|To)String
src/qemu/qemu_capabilities.[ch]: detect and report event_idx in
disk/net
src/qemu/qemu_command.c: add event_idx parameter to qemu commandline
when appropriate.
tests/qemuxml2argvdata/qemuxml2argv-event_idx.args,
tests/qemuxml2argvdata/qemuxml2argv-event_idx.xml,
tests/qemuxml2argvtest.c,
tests/qemuxml2xmltest.c: test cases for event_idx.
By opening a connection to remote qemu process ourselves and passing the
socket to qemu we get much better errors than just "migration failed"
when the connection is opened by qemu.
The core of these two functions is very similar and most of it is even
exactly the same. Factor out the core functionality into a separate
function to remove code duplication and make further changes easier.
With gcc 4.5.1:
util/virpidfile.c: In function 'virPidFileAcquirePath':
util/virpidfile.c:308:66: error: nested extern declaration of '_gl_verify_function2' [-Wnested-externs]
Then in tests/commandtest.c, the new virPidFile APIs need to be used.
* src/util/virpidfile.c (virPidFileAcquirePath): Move verify to
top level.
* tests/commandtest.c: Use new pid APIs.
In daemons using pidfiles to protect against concurrent
execution there is a possibility that a crash may leave a stale
pidfile on disk, which then prevents later restart of the daemon.
To avoid this problem, introduce a pair of APIs which make
use of virFileLock to ensure crash-safe & race condition-safe
pidfile acquisition & releae
* src/libvirt_private.syms, src/util/virpidfile.c,
src/util/virpidfile.h: Add virPidFileAcquire and virPidFileRelease
In some cases the caller of virPidFileRead might like extra checks
to determine whether the pid just read is really the one they are
expecting. This adds virPidFileReadIfAlive which will check whether
the pid is still alive with kill(0, -1), and (on linux only) will
look at /proc/$PID/path
* libvirt_private.syms, util/virpidfile.c, util/virpidfile.h: Add
virPidFileReadIfValid and virPidFileReadPathIfValid
* network/bridge_driver.c: Use new APIs to check PID validity
The functions for manipulating pidfiles are in util/util.{c,h}.
We will shortly be adding some further pidfile related functions.
To avoid further growing util.c, this moves the pidfile related
functions into a dedicated virpidfile.{c,h}. The functions are
also all renamed to have 'virPidFile' as their name prefix
* util/util.h, util/util.c: Remove all pidfile code
* util/virpidfile.c, util/virpidfile.h: Add new APIs for pidfile
handling.
* lxc/lxc_controller.c, lxc/lxc_driver.c, network/bridge_driver.c,
qemu/qemu_process.c: Add virpidfile.h include and adapt for API
renames
Add some simple wrappers around the fcntl() discretionary file
locking capability.
* src/util/util.c, src/util/util.h, src/libvirt_private.syms: Add
virFileLock and virFileUnlock APIs
We forgot to add virDomainUndefineFlags for a couple of hypervisors.
This wires up trivial versions (since neither hypervisor supports
managed save yet, they do not need to support any flags).
* src/vbox/vbox_tmpl.c (vboxDomainCreateXML): Update caller.
(vboxDomainUndefine): Move guts...
(vboxDomainUndefineFlags): ...to new function.
* src/xenapi/xenapi_driver.c (xenapiDomainUndefine)
(xenapiDomainUndefineFlags): Likewise.
Our logic throws off analyzer tools:
ptr var = NULL;
if (flags == 0) flags = live ? _LIVE : _CONFIG;
if (flags & _LIVE) do stuff
if (flags & _CONFIG) var = non-null;
if (flags & _LIVE) do more stuff
else if (flags & _CONFIG) use var
the tools keep thinking that var can still be NULL in the last
if clause, adding the hint shuts them up.
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Add a
static analysis hint.
While the first encountered dns host record is being parsed, it's
possible for virNetworkDef::hosts to point to memory that has been
allocated, but virNetworkDef::nhosts to still be 0. If there is a
failure during that time, virNetworkDef::hosts will be leaked.
Although this isn't currently the case for virNetworkDef::txtrecords,
it could become that way through future re-factoring, and it hurts
nothing to restructure the freeing of txtrecord data to match that of
hosts data.
The following XML:
<serial type='udp'>
<source mode='connect' service='9999'/>
</serial>
is accepted by domain_conf.c but maps to the qemu command line:
-chardev udp,host=127.0.0.1,port=2222,localaddr=(null),localport=(null)
qemu can cope with everything omitting except the connection port, which
seems to also be the intent of domain_conf validation, so let's not
generate bogus command lines for that case.
The defaults are empty strings for addresses and 0 for the localport
Additionally, tweak the qemu cli parsing to handle omitted host
parameters
for -serial udp
Transient domains reject attempts to set autostart, and using
virDomainCreate to restart a domain only works on persistent
domains. Therefore, managed save makes no sense on transient
domains, and should be rejected up front rather than creating
an otherwise unrecoverable managed save file.
Besides, transient domains imply that a lot more management is
being done by the upper layer; this includes the assumption
that the upper layer is okay managing the saved state file
created by virDomainSave, and does not need to use managed save.
* src/libvirt.c: Document that transient domains are incompatible
with managed save.
* src/qemu/qemu_driver.c (qemuDomainManagedSave): Enforce it.
* src/libxl/libxl_driver.c (libxlDomainManagedSave): Likewise.
I noticed some inconsistent use of 'else'.
* src/qemu/qemu_driver.c (qemuCPUCompare)
(qemuDomainSnapshotCreateXML, qemuDomainRevertToSnapshot)
(qemuDomainSnapshotDiscard): Match coding conventions.
If a snapshot with the name already exists, virDomainSnapshotAssignDef()
just returns NULL, in which case the snapshot definition is leaked.
Currently this leak is not a big problem, since qemuDomainSnapshotLoad()
is only called once during initial startup of libvirtd.
Signed-off-by: Philipp Hahn <hahn@univention.de>
A previous commit gave the LXC driver the ability to mount
block devices for the container filesystem. Through use of
the loopback device functionality, we can build on this to
support use of plain file images for LXC filesytems.
By setting the LO_FLAGS_AUTOCLEAR flag we can ensure that
the loop device automatically disappears when the container
dies / shuts down
* src/lxc/lxc_container.c: Raise error if we see a file
based filesystem, since it should have been turned into
a loopback device already
* src/lxc/lxc_controller.c: Rewrite any filesystems of
type=file, into type=block, by binding the file image
to a free loop device
Currently the LXC driver can only populate filesystems from
host filesystems, using bind mounts. This patch allows host
block devices to be mounted. It autodetects the filesystem
format at mount time, and adds the block device to the cgroups
ACL. Example usage is
<filesystem type='block' accessmode='passthrough'>
<source dev='/dev/sda1'/>
<target dir='/home'/>
</filesystem>
* src/lxc/lxc_container.c: Mount block device filesystems
* src/lxc/lxc_controller.c: Add block device filesystems
to cgroups ACL
An application container shouldn't get a private /dev. Fix
the regression from 6d37888e6a
* src/lxc/lxc_container.c: Don't mount /dev for app containers
Detected by ccc-analyzer, reported by Alex Jia.
qemuProcessStart always calls qemuProcessWaitForMonitor with a
non-negative position, but qemuProcessAttach always calls with -1.
In the latter case, there is no log file we can scrape, so we
also should not be trying to scrape the logs if the qemu process
died at the very end.
* src/qemu/qemu_process.c (qemuProcessWaitForMonitor): Don't try
to read from log in qemuProcessAttach case.
This addresses https://bugzilla.redhat.com/show_bug.cgi?id=713728
When "defining" a new network (or one that exists but isn't currently
active) the new definition is stored in network->def, but for a
network that already exists and is active, the new definition is
stored in network->newDef, and then moved over to network->def as soon
as the network is destroyed.
However, the code that writes the dhcp and dns hosts files used by
dnsmasq was always using network->def for its information, even when
the new data was actually in network->newDef, so the hosts files
always lagged one edit behind the definition.
This patch changes the code to keep the pointer to the new definition
after it's been assigned into the network, and use it directly
(regardless of whether it's stored in network->newDef or network->def)
to construct the hosts files.
Value stored to 'ret' is never read, so remove this dead assignment.
* src/qemu/qemu_monitor_text.c: kill dead assignment.
Signed-off-by: Alex Jia <ajia@redhat.com>
Value stored to 'ret' is never read, in fact, 'cleanup' section will
directly return -1 when function is fail, so remove this dead assignment.
* src/qemu/qemu_process.c: kill dead assignment.
Signed-off-by: Alex Jia <ajia@redhat.com>
When trying to use any SASL authentication for TCP sockets by
setting auth_tls = "sasl" in libvirtd.conf on server side, the
client will hang because of the sasl session relocking other than
dropping the lock when exiting virNetSASLSessionExtKeySize()
* src/rpc/virnetsaslcontext.c: virNetSASLSessionExtKeySize drop the
lock on exit
This patch introduces a internal RPC API "virNetServerClose", which
is standalone with "virNetServerFree". it closes all the socket fds,
and unlinks the unix socket paths, regardless of whether the socket
is still referenced or not.
This is to address regression bug:
https://bugzilla.redhat.com/show_bug.cgi?id=725702
Detection based on gnutls_session doesn't work because GnuTLS 2.x.y
comes with a compat.h that defines gnutls_session to gnutls_session_t.
Instead detect this based on LIBGNUTLS_VERSION_MAJOR. Move this from
configure/config.h to gnutls_1_0_compat.h and make sure that all users
include gnutls_1_0_compat.h properly.
Also fix header guard in gnutls_1_0_compat.h.
Leak detected by Coverity; only possible on unlikely ptsname_r
failure. Additionally, the man page for ptsname_r states that
failure is merely non-zero, not necessarily -1.
* src/util/util.c (virFileOpenTtyAt): Avoid leak on ptsname_r
failure.
Coverity detected that ifaceGetNthParent had already dereferenced
'nth' prior to the conditional; all callers already complied with
passing a non-NULL pointer so make this part of the contract.
* src/util/interface.h (ifaceGetNthParent): Add annotations.
* src/util/interface.c (ifaceGetNthParent): Drop useless null check.
In virNetServerNew, Coverity didn't realize that srv->mdsnGroupName
can only be non-NULL if mdsnGroupName was non-NULL.
In virNetServerRun, Coverity didn't realize that the array is non-NULL
if the array count is non-zero.
* src/rpc/virnetserver.c (virNetServerNew): Use alternate pointer.
(virNetServerRun): Give coverity a hint.
Coverity complained that 395 out of 409 virAsprintf calls are
checked, and therefore assumed that the remaining cases are bugs
waiting to happen. But in each of these cases, a failed virAsprintf
will properly set the target string to NULL, and pass on that
failure to the caller, without wasting efforts to check the call.
Adding the ignore_value silences Coverity.
* src/conf/domain_audit.c (virDomainAuditGetRdev): Ignore
virAsprintf return value, when it behaves like we need.
* src/network/bridge_driver.c (networkDnsmasqLeaseFileNameDefault)
(networkRadvdConfigFileName, networkBridgeDummyNicName)
(networkRadvdPidfileBasename): Likewise.
* src/util/storage_file.c (absolutePathFromBaseFile): Likewise.
* src/openvz/openvz_driver.c (openvzGenerateContainerVethName):
Likewise.
* src/util/command.c (virCommandTranslateStatus): Likewise.
Quite a few leaks detected by coverity. For chr, the leaks were
close enough to the allocations to plug in place; for disk, the
leaks were separated from the allocation by enough other lines with
intermediate failure cases that I refactored the cleanup instead.
* src/qemu/qemu_command.c (qemuParseCommandLine): Plug leaks.
Warning detected by Coverity. No need for the NULL check, and
removing it silences the warning without any semantic change.
* src/qemu/qemu_migration.c (qemuMigrationFinish): All entries to
endjob had non-NULL vm.
Detected by Coverity. Freeing the wrong variable results in both
a memory leak and the likelihood of the caller dereferencing through
a freed pointer.
* src/rpc/virnettlscontext.c (virNetTLSSessionNew): Free correct
variable.
Coverity detected that 5 of 6 callers of virJSONValueArrayGet checked
for a NULL return; and that by not checking we risk a null deref
during an error. The error is unlikely since the prior call to
virJSONValueArraySize would probably have already caught any botched
JSON array parse, but better safe than sorry.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONGetBlockJobInfo):
Check for NULL.
(qemuMonitorJSONExtractPtyPaths): Fix typo.
Detected by Coverity. We want to compare the result of fnmatch 'rv',
not our pre-set return value 'ret'.
* src/rpc/virnetsaslcontext.c (virNetSASLContextCheckIdentity):
Check correct variable.
Revert 6a1f5f568f. Now that libvirt_iohelper takes fds by
inheritance rather than by open() (commit 1eb66479), there is
no longer a race where the parent can unlink() a file prior to
the iohelper open()ing the same file. From there, it makes
more sense to have the callers both create and unlink, rather
than the caller create and the stream unlink, since the latter
was only needed when iohelper had to do the unlink.
* src/fdstream.h (virFDStreamOpenFile, virFDStreamCreateFile):
Callers are responsible for deletion.
* src/fdstream.c (virFDStreamOpenFileInternal): Don't leak created
file on failure.
(virFDStreamOpenFile, virFDStreamCreateFile): Drop parameter.
* src/lxc/lxc_driver.c (lxcDomainOpenConsole): Update callers.
* src/qemu/qemu_driver.c (qemuDomainScreenshot)
(qemuDomainOpenConsole): Likewise.
* src/storage/storage_driver.c (storageVolumeDownload)
(storageVolumeUpload): Likewise.
* src/uml/uml_driver.c (umlDomainOpenConsole): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Likewise.
* src/xen/xen_driver.c (xenUnifiedDomainOpenConsole): Likewise.
The previous qemu patch could end up calling unlink(tmp) before
tmp was the name of a valid file (unlinking a fileXXXXXX template
instead), or calling unlink(tmp) twice on success (once here,
and once at the end of the stream). Meanwhile, vbox also suffered
from the same leaked tmp file bug.
* src/qemu/qemu_driver.c (qemuDomainScreenshot): Don't unlink on
success, or on invalid name.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Don't leak temp file.
Spotted by Coverity. Gnutls documents that buffer must be NULL
if gnutls_x509_crt_get_key_purpose_oid is to be used to determine
the correct size needed for allocating a buffer.
* src/rpc/virnettlscontext.c
(virNetTLSContextCheckCertKeyPurpose): Initialize buffer.
Spotted by coverity. If pipe2 fails, then we attempt to close
uninitialized fds, which may result in a double-close.
* src/rpc/virnetserver.c (virNetServerSignalSetup): Initialize fds.
Steps to reproduce this problem (vm1 is not running):
for i in `seq 50`; do virsh managedsave vm1& done; killall virsh
Pre-patch, virNetServerClientClose could end up setting client->sock
to NULL prior to other cleanup functions trying to use client->sock.
This fixes things by checking for NULL in more places, and by deferring
the cleanup until after all queued messages have been served.
* src/rpc/virnetserverclient.c (virNetServerClientRegisterEvent)
(virNetServerClientGetFD, virNetServerClientIsSecure)
(virNetServerClientLocalAddrString)
(virNetServerClientRemoteAddrString): Check for closed socket.
(virNetServerClientClose): Rearrange close sequence.
Analysis from Wen Congyang.
This patch adds an internal function openvzGetVEStatus to
get the real state of the domain. This function is used in
various places in the driver, in particular to detect when
the domain has been shut down by the user with the "halt"
command.
Currently, we attempt to run sync job and async job at the same time. It
means that the monitor commands for two jobs can be run in any order.
In the function qemuDomainObjEnterMonitorInternal():
if (priv->job.active == QEMU_JOB_NONE && priv->job.asyncJob) {
if (qemuDomainObjBeginNestedJob(driver, obj) < 0)
We check whether the caller is an async job by priv->job.active and
priv->job.asynJob. But when an async job is running, and a sync job is
also running at the time of the check, then priv->job.active is not
QEMU_JOB_NONE. So we cannot check whether the caller is an async job
in the function qemuDomainObjEnterMonitorInternal(), and must instead
put the burden on the caller to tell us when an async command wants
to do a nested job.
Once the burden is on the caller, then only async monitor enters need
to worry about whether the VM is still running; for sync monitor enter,
the internal return is always 0, so lots of ignore_value can be dropped.
* src/qemu/THREADS.txt: Reflect new rules.
* src/qemu/qemu_domain.h (qemuDomainObjEnterMonitorAsync): New
prototype.
* src/qemu/qemu_process.h (qemuProcessStartCPUs)
(qemuProcessStopCPUs): Add parameter.
* src/qemu/qemu_migration.h (qemuMigrationToFile): Likewise.
(qemuMigrationWaitForCompletion): Make static.
* src/qemu/qemu_domain.c (qemuDomainObjEnterMonitorInternal): Add
parameter.
(qemuDomainObjEnterMonitorAsync): New function.
(qemuDomainObjEnterMonitor, qemuDomainObjEnterMonitorWithDriver):
Update callers.
* src/qemu/qemu_driver.c (qemuDomainSaveInternal)
(qemudDomainCoreDump, doCoreDump, processWatchdogEvent)
(qemudDomainSuspend, qemudDomainResume, qemuDomainSaveImageStartVM)
(qemuDomainSnapshotCreateActive, qemuDomainRevertToSnapshot):
Likewise.
* src/qemu/qemu_process.c (qemuProcessStopCPUs)
(qemuProcessFakeReboot, qemuProcessRecoverMigration)
(qemuProcessRecoverJob, qemuProcessStart): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationToFile)
(qemuMigrationWaitForCompletion, qemuMigrationUpdateJobStatus)
(qemuMigrationJobStart, qemuDomainMigrateGraphicsRelocate)
(doNativeMigrate, doTunnelMigrate, qemuMigrationPerformJob)
(qemuMigrationPerformPhase, qemuMigrationFinish)
(qemuMigrationConfirm): Likewise.
* src/qemu/qemu_hotplug.c: Drop unneeded ignore_value.
whether or not previous return value is -1, the following codes will be
executed for a inactive guest in src/qemu/qemu_driver.c:
ret = virDomainSaveConfig(driver->configDir, persistentDef);
and if everything is okay, 'ret' is assigned to 0, the previous 'ret'
will be overwritten, this patch will fix this issue.
* src/qemu/qemu_driver.c: avoid return value is overwritten when give a argument
in out of blkio weight range for a inactive guest.
* how to reproduce?
% virsh blkiotune ${guestname} --weight 10
% echo $?
Note: guest must be inactive, argument 10 in out of blkio weight range,
and can get a error information by checking libvirtd.log, however,
virsh hasn't raised any error information, and return value is 0.
https://bugzilla.redhat.com/show_bug.cgi?id=726304
Signed-off-by: Alex Jia <ajia@redhat.com>
whether or not previous return value is -1, the following codes will be
executed for a inactive guest in qemuDomainSetMemoryParameters:
ret = virDomainSaveConfig(driver->configDir, persistentDef);
and if everything is okay, 'ret' is assigned to 0, the previous 'ret'
will be overwritten, this patch will fix this issue.
* src/qemu/qemu_driver.c: avoid return value is overwritten when set
min_guarante value to a inactive guest.
* how to reproduce?
% virsh memtune ${guestname} --min_guarante 1024
% echo $?
Note: guest must be inactive, in fact, 'min_guarante' hasn't been implemented
in memory tunable, and I can get the error when check actual libvirtd.log,
however, virsh hasn't raised any error information, and return value is 0.
Signed-off-by: Alex Jia <ajia@redhat.com>
Introduced by f9a837da73, the condition is not changed after
the else clause is removed. So now it quit with "domain is not
running" when the domain is running. However, when the domain is
not running, it reports "no job is active".
How to reproduce:
1)
% virsh start $domain
% virsh domjobabort $domain
error: Requested operation is not valid: domain is not running
2)
% virsh destroy $domain
% virsh domjobabort $domain
error: Requested operation is not valid: no job is active on the domain
3)
% virsh save $domain /tmp/$domain.save
Before above commands finished, try to abort job in another terminal
% virsh domabortjob $domain
error: Requested operation is not valid: domain is not running
Originally noticed by comparing the xml generated by virDomainSave
with the xml produced by reparsing and redumping that xml, but I
also did an audit of every last use of VIR_DOMAIN_XML_INACTIVE in
domain_conf.c to ensure that no other discrepancies exist.
* src/conf/domain_conf.c (virDomainDeviceInfoIsSet): Add
parameter, and update all callers. Make static.
(virDomainNetDefFormat): Skip generated ifname.
(virDomainDefFormatInternal): Skip default <seclabel>.
(virDomainChrSourceDefParseXML): Skip generated pty path, and add
parameter. Update callers.
* src/conf/domain_conf.h (virDomainDeviceInfoIsSet): Delete.
* src/libvirt_private.syms (domain_conf.h): Update.
Using a macro ensures that all the code is looking for the same
prefix.
* src/conf/domain_conf.h (VIR_NET_GENERATED_PREFIX): New macro.
* src/conf/domain_conf.c (virDomainNetDefParseXML): Use it.
* src/uml/uml_conf.c (umlConnectTapDevice): Likewise.
* src/qemu/qemu_command.c (qemuNetworkIfaceConnect): Likewise.
Suggested by Laine Stump.
This is in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=723862
which points out that a guest on an "isolated" network could
potentially exploit the DNS forwarding provided by dnsmasq to create a
communication channel to the outside.
This patch eliminates that possibility by adding the "--no-resolv"
argument to the dnsmasq commandline, which tells dnsmasq to not
forward on any requests that it can't resolve itself (by looking at
its own static hosts files and runtime list of dhcp clients), but to
instead return a failure for those requests.
This shouldn't cause any undesirable change from current
behavior, even in the case where a guest is currently configured with
multiple interfaces, one of them being connected to an isolated
network, and another to a network that does have connectivity to the
outside. If the isolated network's DNS server is queried for a name
it doesn't know, it will return "Refused" rather than "Unknown", which
indicates to the guest that it should query other servers, so it then
queries the connected DNS server, and gets the desired response.
Without this, cygwin failed to compile:
In file included from ../src/rpc/virnetmessage.h:24,
from ../src/rpc/virnetclient.h:27,
from remote/remote_driver.c:31:
../src/rpc/virnetprotocol.h:9:21: error: rpc/rpc.h: No such file or directory
With that fixed, compilation warned:
rpc/virnetsocket.c: In function 'virNetSocketNewListenUNIX':
rpc/virnetsocket.c:347: warning: format '%d' expects type 'int', but argument 8 has type 'gid_t' [-Wformat]
rpc/virnetsocket.c: In function 'virNetSocketGetLocalIdentity':
rpc/virnetsocket.c:743: warning: pointer targets in passing argument 5 of 'getsockopt' differ in signedness
* src/Makefile.am (libvirt_driver_remote_la_CFLAGS)
(libvirt_net_rpc_client_la_CFLAGS)
(libvirt_net_rpc_server_la_CFLAGS): Include XDR_CFLAGS, for rpc
headers on cygwin.
* src/rpc/virnetsocket.c (virNetSocketNewListenUNIX)
(virNetSocketGetLocalIdentity): Avoid compiler warnings.
Commit 3709a386 ported hooks codes to new command execution API,
together with the useful error message removed. Though we can't
get "errbuf" from the new command execution API anymore, still
we can give a more useful error.
https://bugzilla.redhat.com/show_bug.cgi?id=726398